Molecular Systematics of Fishes
Edited by
Thomas D. Kocher Department of Zoology University of New Hampshire Durham, New Hampshire
Carol A. Stepien Department of Biology Case Western Reserve University Cleveland, Ohio
Academic Press San Diego London Boston New York Sydney Tokyo Toronto
This book is printed on acid-free paper.
Copyright © 1997 by ACADEMIC PRESS
All Rights Reserved. No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopy, recording, or any information storage and retrieval system, without permission in writing from the publisher.
Academic Press, a division of Harcourt Brace & Company
525 B Street, Suite 1900, San Diego, California 92101-4495, USA
http://www.apnet.com
Academic Press Limited
24-28 Oval Road, London NW1 7DX, UK
http://www.hbuk.co.uk/ap/
Library of Congress Cataloging-in-Publication Data
Molecular systematics of fishes / edited by Thomas D. Kocher, Carol A. Stepien.
p. cm.
Includes bibliographical references and index.
ISBN 0-12-417540-6 (alk. paper)
1. Fishes--Phylogeny. 2. Fishes--Molecular aspects. I. Kocher, Thomas D. II. Stepien, Carol A.
QL618.2.M65 1997 96-49199 597.13'8--dc21 CIP
PRINTED IN THE UNITED STATES OF AMERICA
97 98 99 00 01 02 EB 9 8 7 6 5 4 3 2 1
Contents

Contributors
Preface

CHAPTER 1
Molecules and Morphology in Studies of Fish Evolution
Carol A. Stepien and Thomas D. Kocher
I. Introduction
II. History of Molecular Techniques
III. Controversy over Analytical Methods
IV. Achievements and Failures of Molecular Systematics
V. Eight Promising Directions for Future Research
VI. A New Age of Synthesis
References

CHAPTER 2
Base Substitution in Fish Mitochondrial DNA: Patterns and Rates
Thomas D. Kocher and Karen L. Carleton
I. Introduction
II. Simple Models of Substitution
III. Evolution of Real Sequences
IV. Implications for Phylogenetic Reconstruction
V. Conclusions
References

CHAPTER 3
Molecular Systematics of a Rapidly Evolving Species Flock: The mbuna of Lake Malawi and the Search for Phylogenetic Signal
Irv Kornfield and Alex Parker
I. Introduction
II. Molecular Investigations
III. Mitochondrial DNA and Ancestral Polymorphisms
IV. Alternate Molecular Approaches
V. Microsatellite Loci
VI. A Test of the Phylogenetic Potential of Microsatellites
VII. Materials and Methods
VIII. Results
IX. Discussion
X. Summary
References

CHAPTER 4
Reconstruction of Cichlid Phylogeny Using Nuclear DNA Markers
Holger Sültmann and Werner E. Mayer
I. Introduction
II. Methods Used for Reconstructing Cichlid Phylogeny
III. Random Amplification of Polymorphic DNA (RAPD)
IV. Allele Size Frequencies at Dinucleotide Microsatellite Loci
V. Critical Evaluation Using RAPD and Microsatellite Allele Frequencies for the Reconstruction of Cichlid Fish Phylogeny
References

CHAPTER 5
Biogeographic Analysis of Pacific Trout (Oncorhynchus mykiss) in California and Mexico Based on Mitochondrial DNA and Nuclear Microsatellites
Jennifer L. Nielsen, Monique C. Fountain and Jonathan M. Wright
I. Introduction
II. Materials and Methods
III. Results
IV. Discussion
References
Appendices

CHAPTER 6
Mitochondrial DNA Sequence Variation among the Sand Darters (Percidae: Teleostei)
E. O. Wiley and Robert H. Hagen
I. Introduction
II. Systematics of Sand Darters
III. Methods and Materials
IV. Results
V. Discussion
VI. Summary
References
Appendices

CHAPTER 7
Phylogeographic Patterns in Populations of Cichlid Fishes from Rocky Habitats in Lake Tanganyika
Christian Sturmbauer, Erik Verheyen, Lukas Rüber and Axel Meyer
I. Lake Tanganyika and Its Cichlid Species Flock
II. Speciation and DNA
III. From Patterns toward an Understanding of Processes
IV. Conclusions
References

CHAPTER 8
Fish Biogeography and Molecular Clocks: Perspectives from the Panamanian Isthmus
Eldredge Bermingham, S. Shawn McCafferty and Andrew P. Martin
I. Introduction
II. Temporal Scaling: The Panama Isthmus and Molecular Clocks
III. Geographic Scaling: The Panama Isthmus and Caribbean Fish
IV. Geographic Scaling: The Panama Isthmus and the Circumtropical Abudefduf (Teleostei: Pomacentridae) Species Group
V. Geographic Scaling: The Panama Isthmus and Neotropical Freshwater Fishes
VI. Concluding Remarks
References

CHAPTER 9
The Utility of Mitochondrial DNA Control Region Sequences for Analyzing Phylogenetic Relationships among Populations, Species, and Genera of the Percidae
Joseph E. Faber and Carol A. Stepien
I. Introduction
II. Materials and Methods
III. Results
IV. Discussion
V. Material Examined
References

CHAPTER 10
Phylogenetic Relationships among the Salmoninae Based on Nuclear and Mitochondrial DNA Sequences
Ruth B. Phillips and Todd H. Oakley
I. Introduction
II. Conclusions
References

CHAPTER 11
Combining Molecular and Morphological Data in Fish Systematics: Examples from the Cyprinodontiformes
Alex Parker
I. Introduction
II. Analysis of Combined Data: Justification
III. Analysis of Combined Data: Methods
IV. Consensus Approaches: Justification
V. Consensus Methods
VI. Analysis of Cyprinodontiform Data
VII. Methods
VIII. Results and Discussion
IX. Conclusions
References
Appendices

CHAPTER 12
Molecular Phylogeny of the Fundulidae (Teleostei, Cyprinodontiformes) Based on the Cytochrome b Gene
Giacomo Bernardi
I. Introduction
II. Morphology
III. Allozymes and DNA
IV. Fish Samples
V. DNA Sequences
VI. Phylogenetic Relationships
VII. Conclusion
References

CHAPTER 13
Interrelationships of Lamniform Sharks: Testing Phylogenetic Hypotheses with Sequence Data
Gavin J. P. Naylor, Andrew P. Martin, Erik G. Mattison and Wesley M. Brown
I. Introduction
II. Materials and Methods
III. Results and Discussion
References
Appendix

CHAPTER 14
Radiation of Characiform Fishes: Evidence from Mitochondrial and Nuclear DNA Sequences
Guillermo Ortí
I. Introduction
II. Materials and Methods
III. Results and Discussion
References
Appendix

CHAPTER 15
The Evolution of Blennioid Fishes Based on an Analysis of Mitochondrial 12S rDNA
Carol A. Stepien, Alison K. Dillon, Meriel J. Brooks, Kristen L. Chase and Allyson N. Hubers
I. Introduction
II. Materials and Methods
III. Results
IV. Discussion
V. Summary
References

CHAPTER 16
Major Histocompatibility Complex Genes in the Study of Fish Phylogeny
Jan Klein, Dagmar Klein, Felipe Figueroa, Akie Sato and Colm O'hUigin
I. Introduction
II. Major Histocompatibility Complex (Mhc) Structure and Function
III. Mhc as a Source of Systematic Information
IV. Sequences as a Source of Phylogenetic and Systematic Information
V. Cladistic Analysis with Macromutations
VI. Mhc Gene Frequencies in Populations Undergoing Adaptive Radiation
VII. Conclusion
References

CHAPTER 17
The Phylogenetic Utility of the Mitochondrial Cytochrome b Gene for Inferring Relationships among Actinopterygian Fishes
Charles Lydeard and Kevin J. Roe
I. Introduction
II. Materials and Methods
III. Results and Discussion
References

Taxonomic Index
Subject Index
Contributors

Numbers in parentheses indicate the pages on which the authors' contributions begin.

Eldredge Bermingham (113) Smithsonian Tropical Research Institute, Balboa, Republic of Panama.
Giacomo Bernardi (189) Department of Biology, University of California, Santa Cruz, Santa Cruz, California 95064.
Meriel J. Brooks (245) Department of Science, Notre Dame College, South Euclid, Ohio 44121.
Wesley M. Brown (199) Department of Biology, University of Michigan, Ann Arbor, Michigan 48109.
Karen L. Carleton (13) Department of Zoology, University of New Hampshire, Durham, New Hampshire 03824.
Kristen L. Chase (245) Department of Biology, Case Western Reserve University, Cleveland, Ohio 44106.
Alison K. Dillon (245) Department of Biology, Case Western Reserve University, Cleveland, Ohio 44106.
Joseph E. Faber (129) Department of Biology, Case Western Reserve University, Cleveland, Ohio 44106.
Felipe Figueroa (271) Max-Planck-Institut für Biologie, Abteilung Immungenetik, D-72076 Tübingen, Germany.
Monique C. Fountain (53) USDA Forest Service, Pacific Southwest Research Station and Hopkins Marine Station, Department of Biology, Stanford University, Pacific Grove, California 93950.
Robert H. Hagen (75) Department of Entomology, University of Kansas, Lawrence, Kansas 66045.
Allyson N. Hubers (245) Department of Biology, Case Western Reserve University, Cleveland, Ohio 44106.
Dagmar Klein (271) Department of Microbiology and Immunology, University of Miami School of Medicine, Miami, Florida 33136.
Jan Klein (271) Max-Planck-Institut für Biologie, Abteilung Immungenetik, D-72076 Tübingen, Germany and Department of Microbiology and Immunology, University of Miami School of Medicine, Miami, Florida 33136.
Thomas D. Kocher (1, 13) Department of Zoology, University of New Hampshire, Durham, New Hampshire 03824.
Irv Kornfield (25) Department of Zoology and School of Marine Sciences, University of Maine, Orono, Maine 04469.
Charles Lydeard (285) Aquatic Biology Program, University of Alabama, Department of Biological Sciences, Tuscaloosa, Alabama 35487.
Andrew P. Martin (113, 199) Smithsonian Tropical Research Institute, Balboa, Republic of Panama and Department of Biological Sciences, University of Nevada Las Vegas, Las Vegas, Nevada 89154.
Erik G. Mattison (199) Department of Biology, University of Michigan, Ann Arbor, Michigan 48109.
Werner E. Mayer (39) Max-Planck-Institut für Biologie, Abteilung Immungenetik, D-72076 Tübingen, Germany.
S. Shawn McCafferty (113) Smithsonian Tropical Research Institute, Balboa, Republic of Panama.
Axel Meyer (97) Department of Ecology and Evolution, State University of New York at Stony Brook, Stony Brook, New York 11794.
Gavin J. P. Naylor (199) Department of Biology, Osborn Memorial Laboratory, Yale University, New Haven, Connecticut 06520.
Jennifer L. Nielsen (53) USDA Forest Service, Pacific Southwest Research Station and Hopkins Marine Station, Department of Biology, Stanford University, Pacific Grove, California 93950.
Todd H. Oakley (145) Department of Biological Sciences, University of Wisconsin-Milwaukee, Milwaukee, Wisconsin 53201.
Colm O'hUigin (271) Max-Planck-Institut für Biologie, Abteilung Immungenetik, D-72076 Tübingen, Germany.
Guillermo Ortí (219) Department of Genetics, University of Georgia, Athens, Georgia 30602.
Alex Parker (25, 163) Department of Zoology and School of Marine Sciences, University of Maine, Orono, Maine 04469.
Ruth B. Phillips (145) Department of Biological Sciences, University of Wisconsin-Milwaukee, Milwaukee, Wisconsin 53201.
Kevin J. Roe (285) Aquatic Biology Program, University of Alabama, Department of Biological Sciences, Tuscaloosa, Alabama 35487.
Lukas Rüber (97) Zoological Museum of the University of Zürich, Switzerland.
Akie Sato (271) Max-Planck-Institut für Biologie, Abteilung Immungenetik, D-72076 Tübingen, Germany.
Carol A. Stepien (1, 129, 245) Department of Biology, Case Western Reserve University, Cleveland, Ohio 44106.
Christian Sturmbauer (97) Department of Zoology, University of Innsbruck, A-6020 Innsbruck, Austria.
Holger Sültmann (39) Max-Planck-Institut für Biologie, Abteilung Immungenetik, D-72076 Tübingen, Germany.
Erik Verheyen (97) Royal Belgium Institute of Natural Sciences, B-1000 Brussels, Belgium.
E. O. Wiley (75) Museum of Natural History and Department of Systematics and Ecology, University of Kansas, Lawrence, Kansas 66045.
Jonathan M. Wright (53) Marine Gene Probe Laboratory, Department of Biology, Dalhousie University, Halifax, Nova Scotia B3H 4J1, Canada.
Preface

Fishes are the most diverse group of extant vertebrates, and yet our knowledge of the evolutionary relationships among them is largely incomplete. Over the past few years, molecular genetic methods, particularly PCR amplification and DNA sequencing, have become widely used to study the evolutionary history of fishes. Because of the strong tradition of morphological systematics of fishes, this group is uniquely suitable for testing and evaluating the efficacy of different approaches to elucidating the relationships among taxa.

This book surveys the use of these new methods at many taxonomic levels, from the structure of local populations to the relationships among the deepest branches of the piscine family tree. The authors bring a diversity of experience and approaches to their analyses, and the result is a collective evaluation of the utility of these techniques for understanding evolutionary patterns and processes. Although this book focuses on fishes, the conclusions should be broadly applicable to the molecular systematics of other groups.

We thank the authors for seeing this project through to completion. We are indebted to a host of anonymous individuals for constructive critical reviews of each chapter in manuscript form. In an increasingly busy world, it was a delight to see that many careful reviewers are still willing to take the time to coax a higher quality manuscript from their colleagues. In addition to these reviewers, Raymond R. Wilson, Joseph E. Faber, Allyson N. Hubers, Mark D. Chandler, Rachel A. Bartholomew, Rachael A. Callcut, and Gary R. Kutsikovich reviewed the entire volume at various stages. We owe special thanks to Rachel A. Bartholomew and Rachael A. Callcut for helping to prepare the indices, Karen L. Carleton for work on the references, and Craig Albertson for the artwork on the cover jacket.

Our work on the molecular systematics of fishes has been generously funded by grants from the National Science Foundation, the Alfred Sloan Foundation, the National Geographic Society, the National Research Council, and the NOAA Sea Grant Program. We especially thank our families and students for their patience and understanding during the many periods that our work has required us to be elsewhere, in body or in thought.

This volume is dedicated to our mentors (especially Richard Rosenblatt, David Hillis, Allan Wilson, and Jeff Mitton) who encouraged, critiqued, and shaped our ideas in molecular systematics. We hope that this volume will contribute to the preservation of fish species so that future generations will be able to wonder at the beauty and diversity of fishes in their natural habitats.

Thomas D. Kocher, University of New Hampshire
Carol A. Stepien, Case Western Reserve University
CHAPTER 1
Molecules and Morphology in Studies of Fish Evolution

CAROL A. STEPIEN
Department of Biology, Case Western Reserve University, Cleveland, Ohio 44106

THOMAS D. KOCHER
Department of Zoology, University of New Hampshire, Durham, New Hampshire 03824

I. Introduction
Fishes are the most diverse group of living vertebrates, with more than 24,600 extant species currently known (Nelson, 1994). For more than a century, systematists have sought to organize this diversity by studying aspects of their external and internal morphology. Their patient counting and dissection have achieved remarkable success in identifying groups of evolutionarily related species and provide the foundation and starting point for all current work on the systematics of fishes (for summaries of the present status of morphological systematics of fishes see Nelson, 1994; Stiassny et al., 1996).

The development of molecular techniques has helped invigorate studies of fish systematics. The realm of methods developed for molecular systematics (Hillis et al., 1996; Ferraris and Palumbi, 1996) offers new suites of characters for analyzing relationships among fishes (Carvalho and Pitcher, 1995), and these methods have been effectively applied from the level of populations to orders. It is hoped that this book illustrates the broad utility of molecular approaches for addressing fish systematic questions.

Morphological studies have been especially successful in defining species and in organizing these species into genera. These groupings have usually been confirmed when examined with molecular approaches. Molecular characters have revealed some cryptic species (reviewed by Avise, 1994) and identified some incorrectly split groups (e.g., species in the clinid kelpfish genus Gibbonsia by Stepien and Rosenblatt, 1991; Stepien et al., Chapter 15). In general, the overall concordance between morphological and molecular studies has been good. Testing for congruence of relationships derived from independent data sets is a particularly robust approach to systematic problems (Miyamoto and Fitch, 1995).

Although morphological studies have generally been successful in defining genera, it is rare to find studies that present a hypothesis of relationship above the level of the species comprising a genus, primarily due to a lack of congruence of characters. Fortunately, this is one of the strengths of molecular data, and inter- and intrageneric relationships are now being rapidly tested and elucidated. Molecular data are also the primary means used to assess the phylogeographic relationships among populations, examining questions of zoogeographic subdivision and relationships among areas (see Chapter 5 by Nielsen et al., Chapter 8 by Bermingham et al., and Chapter 9 by Faber and Stepien). Studies at these lower systematic levels are shedding more light on the mechanisms underlying the diversity of fishes.

Both morphological and molecular studies have had particular difficulty discerning higher-level relationships. In both types of data, the central problems are identifying homologous characters and finding a sufficient number of synapomorphies to identify lineages with statistical confidence. Although great strides have been made in identifying appropriate molecules and refining analytical techniques, interpreting relationships among the deepest clades of the piscine phylogeny is still problematic.

This book is arranged in approximate order of the primary phylogenetic problems addressed, ranging from lower (relationships among populations and closely related species) to higher-level systematic questions. The first set of chapters focuses primarily on population- and species-level problems in relation to phylogeography and includes Chapter 3 by Kornfield and Parker (mbuna species flock), Chapter 4 by Sültmann and Mayer (cichlid adaptive radiation), Chapter 5 by Nielsen et al. (Pacific trout Oncorhynchus), Chapter 6 by Wiley and Hagen (sand darters Ammocrypta), Chapter 7 by Sturmbauer et al. (cichlids), and Chapter 8 by Bermingham et al. (biogeographic patterns involving fishes of the Panamanian Isthmus). The next set of chapters addresses the resolution of DNA for testing middle-level systematic problems (species through family-level questions) and discriminating among morphology-based hypotheses, including Chapter 9 by Faber and Stepien (Percidae), Chapter 10 by Phillips and Oakley (Salmoninae), Chapter 11 by Parker (Cyprinodontiformes), and Chapter 12 by Bernardi (Fundulidae, Cyprinodontiformes). The final set of chapters focuses on the resolving power of genes to address higher-level systematic questions and on evaluating the level of maximum phylogenetic utility. These include Chapter 13 by Naylor et al. (lamniform sharks), Chapter 14 by Ortí (Characiformes), Chapter 15 by Stepien et al. (Blennioidei), Chapter 16 by Klein et al. (Cichlidae), and Chapter 17 by Lydeard and Roe (Actinopterygii).
II. History of Molecular Techniques

An increasingly sophisticated realm of techniques has been developed since the mid-1970s to study the molecular similarities of organisms. Although preceded by protein sequencing and immunology, the widespread use of molecular techniques in fish systematics really began with the discovery of allozyme polymorphisms.

A. Allozyme Studies

Allozyme/isozyme studies involve identifying protein polymorphisms by comparing their similarities and differences in net electric charge. Allozyme and isozyme studies have been one of the most popular approaches in examining population genetic and stock divergence questions in fishes. They have also been especially useful in identifying cryptic species and in testing biogeographic hypotheses. Allozyme/isozyme electrophoresis has the advantage of being relatively rapid, cost effective, and efficient. Another advantage is that the sampling is spread over a variety of presumably independent gene loci.

The chief disadvantage of using an allozyme approach is that bands (alleles) that have the same electric charge and migrate to the same point in the gel may not be homologous (i.e., evolutionary convergence). The scoring of gels is often somewhat subjective and bands are difficult to interpret when weak or close together. Variants have traditionally been assumed to be selectively neutral, enabling hypotheses of separation time to be tested. However, several studies have shown that some allozyme variants are not neutral markers and are under selection (Avise, 1994; Pogson et al., 1995; Powers and Shulte, 1996). Our view is that increasing evidence shows that most (if not all) "neutral" genetic markers, including allozymes, mtDNA, and microsatellites, are indeed subject to varying amounts of selective constraint. The possibility that loci are under selection does not eliminate their utility in systematics, however. For example, morphologists regularly utilize characters that are the products of selection. In this volume, Nielsen et al. (Chapter 5; Salmonidae) and Stepien et al. (Chapter 15; Blennioidei) examine the congruence of hypotheses derived from allozyme data with other molecular data sets.

B. Mitochondrial DNA
The mitochondrial (mt) genome has many properties that make it useful for reconstructing recent phylogenetic history (reviewed by Wilson et al., 1985; Avise, 1994; Simon et al., 1994). The most important feature is its clonal inheritance. Fish mitochondrial genomes are haploid and apparently nonrecombining. The evolution of the molecule therefore corresponds exactly to the model of bifurcating evolutionary trees. Second, mtDNA evolves more quickly than most nuclear genes, allowing the identification of informative phylogenetic characters among even closely related species and populations.

Two other features of mtDNA are typically listed as advantages for phylogenetic analysis. First, mtDNA is maternally inherited. Although it is true that mtDNA is predominantly maternally inherited, several instances of heteroplasmy of distinct mitochondrial lineages suggest that this is not strictly or universally correct (Magoulas and Zouros, 1993). Second, it may no longer be appropriate to consider that substitutions in mtDNA accumulate according to a strictly neutral process. Patterns of sequence differentiation suggest that selective sweeps may be common (Ballard and Kreitman, 1994), and laboratory experiments have suggested competitive differences among mitochondrial haplotypes (Hutter and Rand, 1995). Whether these departures from neutral evolution invalidate the concept of molecular clocks remains to be seen.

Many studies of mtDNA have analyzed restriction fragment length polymorphisms (RFLPs). Whole mtDNA can be digested with specific endonucleases, and the products are then separated by size using gel electrophoresis. In the most comprehensive studies, restriction sites are mapped and their presence or absence (rather than mere sharing of fragment lengths) is scored (Dowling et al., 1990). RFLP studies have been a popular approach in quantifying the degree of divergence within and among populations. In applying this approach to species and higher-level systematic questions, the homology of restriction site characters becomes less certain. A better approach for these comparisons involves direct analysis of DNA sequences.
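The scoring of restriction sites as present or absent, described above, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the three haplotype sequences are invented, the sequences are assumed to be aligned without gaps so that a site at the same coordinate is treated as the same character, and the function names are hypothetical. The recognition sequences shown for EcoRI, HindIII, and BamHI are the standard ones; in practice, sites are mapped from gel fragment patterns rather than read from known sequence.

```python
# Recognition sequences of three commonly used restriction endonucleases.
ENZYMES = {"EcoRI": "GAATTC", "HindIII": "AAGCTT", "BamHI": "GGATCC"}


def site_positions(seq, site):
    """Return the 0-based start of every occurrence of a recognition site."""
    seq = seq.upper()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)
    return positions


def presence_absence(samples, enzymes=ENZYMES):
    """Score each (enzyme, position) site as 1 (present) or 0 (absent) per sample.

    Assumes the sequences share one coordinate system (aligned, no gaps).
    """
    found = {
        name: {
            (enzyme, pos)
            for enzyme, site in enzymes.items()
            for pos in site_positions(seq, site)
        }
        for name, seq in samples.items()
    }
    columns = sorted(set().union(*found.values()))
    matrix = {
        name: [1 if col in sites else 0 for col in columns]
        for name, sites in found.items()
    }
    return columns, matrix


# Invented haplotypes: hapB has lost the HindIII site, hapC the EcoRI site.
haplotypes = {
    "hapA": "ATGAATTCCGTAAGCTTGCA",
    "hapB": "ATGAATTCCGTAAGCCTGCA",
    "hapC": "ATGAGTTCCGTAAGCTTGCA",
}
columns, matrix = presence_absence(haplotypes)
print(columns)
for name, row in matrix.items():
    print(name, row)
```

Each column of the resulting matrix is one mapped site, which is the kind of character that can then be compared among samples; shared fragment lengths alone would not distinguish which site was gained or lost.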
C. Polymerase Chain Reaction and DNA Sequencing

Until the development of the polymerase chain reaction (PCR) (Saiki et al., 1988), sequencing of genes for phylogenetic analysis was rarely performed because of the huge investment required to clone homologous genes from multiple samples. The introduction of primer sequences with wide phylogenetic utility ("universal primers"; e.g., Kocher et al., 1989) allowed the rapid amplification of particular sequences from a large number of samples and helped create an explosion of studies using DNA sequences to examine phylogenetic questions.

DNA sequence data have a number of inherent advantages over other kinds of systematic data. First, an essentially unlimited number of sequence characters are potentially available. Fish genomes typically contain on the order of a billion nucleotide pairs, each of which is potentially informative for phylogenetic analysis. Second, these characters are useful for studying relationships among both close and distant relatives. Each gene, as well as individual sites within a gene, evolves at a unique rate because of variation in the level of functional constraint. Slowly evolving genes such as nuclear 18S rDNA may be useful for discerning relationships among highly divergent groups (Hillis and Dixon, 1991). More rapidly evolving areas, such as the mtDNA control region, may be useful for discerning lower-level systematic relationships, such as among populations and species, as shown for percid relationships in the study by Faber and Stepien (Chapter 9).

In coding regions, the variation in DNA sequences may be evaluated among first, second, and third codon positions and at the amino acid level in order to increase potential phylogenetic utility at higher systematic levels. The relative strength of the phylogenetic signal among codon positions and between the nucleotide and amino acid levels is critically evaluated by Naylor et al. (Chapter 13) and Lydeard and Roe (Chapter 17).
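As a rough illustration of partitioning variation by codon position, the following Python sketch counts the differences between two aligned coding sequences at first, second, and third positions. The sequences are invented, the function name is hypothetical, and the sketch assumes that both sequences begin in frame at a first codon position and contain no gaps; it is not the procedure used in any chapter of this volume.

```python
def differences_by_codon_position(seq1, seq2):
    """Count mismatches at codon positions 1, 2 and 3.

    Assumes two aligned, gap-free coding sequences of equal length that
    start at the first position of a codon.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    counts = {1: 0, 2: 0, 3: 0}
    for index, (a, b) in enumerate(zip(seq1.upper(), seq2.upper())):
        if a != b:
            counts[index % 3 + 1] += 1
    return counts


# Invented fragments of a protein-coding gene from two taxa.
taxon_a = "ATGGCACTTCGA"
taxon_b = "ATGGCCTTACGG"
print(differences_by_codon_position(taxon_a, taxon_b))
```

Tallies of this kind make it easy to see whether most of the observed change is concentrated at third positions, which is one common reason for weighting or excluding particular positions in deeper comparisons.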
D. Mitochondrial DNA Sequence Regions

Mitochondrial DNA regions have been well studied in fishes, and knowledge of universal primer sequences (e.g., Kocher et al., 1989; Meyer et al., 1990; Simon et al., 1994; Palumbi, 1996) for amplification by PCR and sequencing has made them very accessible. As illustrated in this volume, they can be effectively used to address many different levels of taxonomic questions, depending on the region sequenced and the use of various correction factors for types and positions of substitutions. Silent sites of mitochondrial protein-coding genes and the nontranscribed control region are shown to be particularly useful for analyzing relationships of recently diverged taxa, such as among populations, species, and genera. In the case of higher-level systematic questions, silent sites and rapidly evolving regions may have experienced multiple substitutions, obscuring phylogenetic signal. At higher taxonomic levels, more slowly evolving regions, such as the 12S and 16S ribosomal RNA genes, may be useful. Alternatively, because substitutions in nonsynonymous nucleotide sites (which alter the encoded amino acids) occur more rarely, these changes may provide a higher signal/noise ratio for deep comparisons.

The sequence evolution of mtDNA has been relatively well studied in fishes. Base substitution events occur relatively rapidly. MtDNA structure, gene order, and secondary structure are largely conserved in fishes, as well as in other vertebrates. It is inherited as a single unit and thus has been characterized as sampling a single gene, which is a possible disadvantage that may particularly affect population genetic studies. Because the evolutionary history of a single gene can be different from the average history of an entire genome (discussed by Avise, 1994), caution must be used in interpreting mitochondrial gene trees as reflecting the history of populations.
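The way multiple substitutions at the same site can obscure signal, noted above for silent sites and rapidly evolving regions, can be illustrated with the simplest standard correction, the Jukes-Cantor (1969) model. The Python sketch below is an illustrative assumption, not a method attributed to the authors: it treats all substitutions as equally likely, and the function names are hypothetical. It converts an observed proportion of differing sites p into an estimated number of substitutions per site, d = -(3/4) ln(1 - 4p/3).

```python
import math


def p_distance(seq1, seq2):
    """Observed proportion of aligned sites that differ (no correction)."""
    if len(seq1) != len(seq2) or not seq1:
        raise ValueError("sequences must be non-empty and of equal length")
    diffs = sum(a != b for a, b in zip(seq1.upper(), seq2.upper()))
    return diffs / len(seq1)


def jukes_cantor(p):
    """Jukes-Cantor corrected distance for an observed proportion p.

    The correction is undefined once p reaches 0.75, the level expected
    between two unrelated sequences with equal base frequencies.
    """
    if not 0.0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75) for the correction to exist")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


# The gap between observed and corrected distance widens as p grows.
for p in (0.05, 0.30, 0.60):
    print(p, round(jukes_cantor(p), 3))
```

For small p the observed and corrected values are nearly equal, whereas at high divergence the corrected distance is much larger than the observed one, reflecting substitutions that have overwritten earlier changes at the same sites.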
The cytochrome b gene is probably the best-studied mitochondrial gene in fishes (e.g., Kocher et al., 1989; Meyer et al., 1990; Carr and Marshall, 1991; Block et al., 1993; Zhu et al., 1994; Carr et al., 1995). Like most mitochondrially encoded proteins, it is a transmembrane protein important in the respiratory chain of cellular metabolism. Although it has been widely used, some have questioned the ability of this sequence (especially short subsets of the gene) to resolve phylogenies (Martin et al., 1990; Graybeal, 1993; Meyer, 1994).

In this volume, mtDNA sequences from the cytochrome b gene are used to analyze a variety of levels of relationships ranging from population genetics to higher-level systematics. For example, Bermingham et al. (Chapter 8) use cytochrome b data to assess population genetic and phylogeographic questions in tropical damselfishes of the Abudefduf saxatilis species group. Cytochrome b sequences are used to analyze relationships among species and groups of sand darters (family Percidae) (Wiley and Hagen, Chapter 6), among species of salmonids (Phillips and Oakley, Chapter 10), among members of the family Fundulidae (Cyprinodontiformes) (Bernardi, Chapter 12), and among lamniform sharks (Naylor et al., Chapter 13). At higher taxonomic levels, Lydeard and Roe (Chapter 17) test the use of cytochrome b to analyze relationships among actinopterygian fishes, revealing strong phylogenetic signal. By examining their data using different codon positions, Lydeard and Roe achieve greater utility at higher taxonomic levels than does Bernardi (Chapter 12).

Mitochondrial ribosomal genes (12S and 16S rDNA subunits) are often used to study more distantly related taxa. Substitutions in the small subunit (12S) accumulate relatively slowly, approximating the average for the entire mitochondrial genome, whereas those in the large subunit (16S) evolve even more slowly (Simon et al., 1994). The 12S rDNA gene is used by Stepien et al. (Chapter 15) to examine relationships among species, genera, tribes, families, and suborders of blenniiform fishes, showing strong utility at these different levels and congruence with morphology-based hypotheses.
Stepien et al. (12S; Chapter 15), Ortí (12S and 16S, characiform fishes; Chapter 14), and Parker (16S, Cyprinodontiformes; Chapter 11) evaluate differences in the amount of phylogenetic signal among stem and loop regions of the ribosomal genes, reporting a greater retention of the phylogenetic signal at higher taxonomic levels in the more slowly evolving stem regions and more useful characters at lower taxonomic levels in the more rapidly changing loop regions.

The mtDNA control region is involved in the control of mtDNA replication and RNA transcription. It is also called the displacement loop (D-loop) because one of the two strands of the helix is displaced by the synthesis of a new strand during replication. The highly variable left domain region has been believed to be largely selectively neutral, which may account for its very rapid rate of variation. In fishes, the control region is usually long (e.g., 888 to 1223 bp in percids; Faber and Stepien, Chapter 9) and often contains tandemly repeated segments. There is a set of conserved sequence blocks that are probably involved in controlling mtDNA replication and transcription, which may be useful for some systematic studies (see Attardi, 1985; Lee et al., 1995; Faber and Stepien, Chapter 9).

The highly variable control region has thus been a popular sequence for examining population structure and relationships among closely related species of fishes (e.g., Meyer et al., 1990; Arnason and Rand, 1992; Sturmbauer and Meyer, 1992, 1993; Brown et al., 1993; Stepien, 1995; Lee et al., 1995). In this volume, Sturmbauer et al. (Chapter 7) employ sequence data from the control region to address phylogenetic questions and models of adaptive radiation and biogeography of cichlid fishes in Lake Tanganyika, Africa. Nielsen et al. (Chapter 5) utilize control region variation to discern patterns of geographic structure in the Pacific trout Oncorhynchus mykiss. The utility of control region sequences for discerning higher-level relationships is critically evaluated by Phillips and Oakley (Chapter 10) and by Faber and Stepien (Chapter 9). Although some areas of this rapidly evolving sequence are alignable even among distantly related fishes (see Lee et al., 1995), the high rate of evolution of this sequence appears to preclude analyses beyond the level of closely related species and perhaps genera.

E. Nuclear DNA Sequences
Several nuclear DNA regions have been used to address systematic questions among fishes. One of these is the major histocompatibility complex (MHC) used by Klein et al. (Chapter 16) to examine evolutionary hypotheses of the haplochromine flock of cichlids in Lake Victoria, East Africa. MHC molecules are believed to play a central role in the vertebrate immune system by presenting peptides to T lymphocytes, thereby initiating immune response cascades. Because MHC molecules are well known due to their role in the immune system and are highly variable, they also offer a wealth of potential systematic information. There are two classes of MHC molecules (I and II), which each consist of two polypeptide chains (a and b), but differ in structure and function (Bjorkman and Parham, 1990). Klein et al. (Chapter 16) use examples from classes I and II to test phylogenetic utility among recently diverged fish species as well as at higher phylogenetic levels. They also address whether selection causes sequence and allele frequency convergence in MHC genes. Stepien et al. (Chapter 15) compare sequence-based trees of blennioid fishes derived from the nuclear internal transcribed spacer (ITS)-1 region of the ribosomal array (Stepien et al., 1993) with trees produced from mitochondrial 12S rDNA gene sequences. A much greater number of variable characters is obtained using mtDNA 12S gene than was found from the nuclear
ITS-1 region (Stepien et al., 1993), suggesting that nuclear ITS sequences are best used for studying deeper divergences. In contrast, Phillips and Oakley (Chapter 10) find nuclear rDNA spacers to be most useful at lower taxonomic levels (interspecific and subspecific levels). These results suggest that the ITS-1 region may evolve at different rates in different fish groups. Other chapters explore the utility of new genes for phylogenetic analysis. Parker (Chapter 11) tests the relative degree of phylogenetic signal among first, second, and third codon positions of the nuclear tyrosine kinase gene X-src sequences for resolving relationships among the cyprinodontid killifishes. Orti (Chapter 14) compares nuclear DNA sequences from the protein-coding gene ependymin (a major glycoprotein component of the extracellular fluid in the brain of fishes) with mitochondrial 12S and 16S rDNA sequences to test the evolution of characiform fishes at various hierarchical levels. Much work remains in identifying a standard set of nuclear genes for phylogenetic analysis of fishes.
F. Other Nuclear Techniques
The introduction of PCR opened other avenues for the analysis of genome sequences. We touch here on two popular methods: randomly amplified polymorphic DNAs (RAPDs) and microsatellite polymorphisms. The RAPD method primarily detects sequence changes within the annealing sites of PCR primers, resulting in the presence or absence of amplification products from a particular locus. RAPD polymorphisms usually have a pattern of dominant inheritance (Williams et al., 1990) and can be used to screen for differences among individuals, populations, and species. Sultmann and Mayer (Chapter 4) employ RAPDs to identify polymorphic loci in cichlid groups, followed by locus-specific DNA amplification and sequence determination of the fragments. In this way, they avoid problems with determining homology of fragments among species. They find a large number of insertions and deletions (some of which are species specific) that can be treated as characters along with nucleotide substitutions. Their phylogenies show considerable congruence with morphological hypotheses and other molecular studies. They conclude that RAPDs are able to detect polymorphisms among closely related taxonomic groups, ranging from populations to genera.
Microsatellite DNAs are highly variable, tandemly repeated DNA sequences with unit repeats one to six bases in length. Length polymorphisms arising from variation in the number of repeats are quantified by sizing PCR-amplified copies of the locus on a polyacrylamide gel. Microsatellites are abundantly distributed throughout the nuclear genome and are highly polymorphic. They follow a Mendelian codominant inheritance pattern. Microsatellites have been widely used to analyze mating systems and population genetic structure (Queller et al., 1993), despite the fact that their pattern of mutation is still poorly understood (Jarne and Lagoda, 1996). In Chapter 5, Nielsen et al. examine the biogeographic variation of nuclear microsatellite repeats in Pacific trout, O. mykiss, in comparison with mtDNA control region sequences. Although their mtDNA data show significant latitudinal and longitudinal correlations, microsatellite data are only weakly associated with longitude (and not at all with latitude). These differences suggest that the evolutionary processes resulting in phylogeographic patterns of genetic variation differentially affect the mitochondrial and nuclear genomes. Kornfield and Parker (Chapter 3) test the utility of microsatellite loci for examining relationships within a rapidly evolving species flock (the mbuna of Lake Malawi), in comparison with results from allozyme, mtDNA RFLP, mtDNA sequence, nuclear DNA sequence, and RAPD data sets. They conclude that microsatellites are the first class of molecular markers to possess sufficient power to elucidate that level of evolutionary history. Sultmann and Mayer (Chapter 4) compare microsatellite allele size frequencies among cichlid species from Lake Victoria. In total, these results suggest that microsatellite loci are applicable to species- and population-level work in rapidly evolving groups, as exemplified by the adaptive radiations of the Cichlidae.
G. A Look to the Future
Although new kinds of polymorphisms will be identified as we come to understand the structure of genomes, there is some hope that the techniques used to study these polymorphisms have stabilized. Most investigators are now directly examining DNA sequence polymorphisms, the most fundamental unit of molecular variation. PCR and DNA sequencing will likely be the primary tools of molecular systematics in the foreseeable future. We anticipate that the major differences will be increases in length of sequence examined and the number of genetic loci scored.
III. Controversy over Analytical Methods Systematic biology is well known for its vigorous and highly polarized methodological debates. Although much of the acrimony has subsided, strong proponents of distance and cladistic approaches remain. This polarization is strongly correlated with the type of
data sets studied by individual scientists. Morphologists have generally rejected distance approaches. Molecular systematists appear relatively flexible in the approaches taken to recover phylogenetic relationships from their data and have found that the evolution of sequences is often most easily modeled with distance methods. Still, character-state analyses of molecular data abound, and we should be careful not to equate molecular studies with distance analyses or morphological studies with cladistic analyses.
A. Cladistic Approaches The rise of cladistic methodology, as proposed by Hennig (1950, 1966) and popularized by Wiley (1981), has greatly contributed to the development of systematics from a collection of ad hoc procedures to a respectable science. Cladistics has markedly increased objectivity for interpreting the evolutionary history of characters and testing the relative strength of competing systematic hypotheses. This standard methodology has facilitated the comparison of hypotheses proposed by various investigators and support for different types of data sets. Examples of such comparisons occur in almost every chapter of this volume.
B. Distance Approaches Along with the development of molecular techniques, such as allozyme-isozyme electrophoresis, emerged the use of genetic distances and clustering algorithms which describe the degree of similarity or genetic relatedness among pairs of taxa and summarize this information in a "tree." Distance methods differ from cladistics in that they reduce the difference among each pair of taxa to a single number. Some workers argue that distance methods lose information inherent in the character-state matrix. Others argue that distance methods allow the evolution of the sequence to be more easily modeled. This allows accurate correction for unobserved multiple substitutions (homoplasy) in sequence data that is not possible with other methods. Like character-state methods, distance methods can be bootstrapped to evaluate the internal consistency of data. Recent theoretical work has focused on the calculation of standard errors of distances and branch lengths. Most types of distance trees are constructed with branch lengths that are proportional to the amount of divergence, making it possible to estimate relative times of separation.
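The reduction of each pair of taxa to a single number, as described above, can be sketched in a few lines. The example below is a minimal illustration only, assuming an already aligned set of sequences and using the uncorrected proportion of differing sites (p-distance) as the distance; the function names and the toy alignment are hypothetical and do not come from any chapter of this volume.

```python
from itertools import combinations


def p_distance(seq_a, seq_b):
    """Proportion of compared sites that differ between two aligned sequences.

    Sites where either sequence has a gap or an ambiguity code are skipped.
    """
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "ACGT" and b in "ACGT":
            compared += 1
            if a != b:
                differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


def distance_matrix(alignment):
    """Collapse a character matrix (name -> aligned sequence) to pairwise distances."""
    names = sorted(alignment)
    matrix = {name: {name: 0.0} for name in names}
    for x, y in combinations(names, 2):
        d = p_distance(alignment[x], alignment[y])
        matrix[x][y] = d
        matrix[y][x] = d
    return matrix


# Hypothetical toy alignment (10 sites, 3 taxa), for illustration only.
toy_alignment = {
    "taxon_A": "ACGTACGTAC",
    "taxon_B": "ACGTACGTAT",
    "taxon_C": "ACGAACGTTT",
}
toy_matrix = distance_matrix(toy_alignment)
for name, row in toy_matrix.items():
    print(name, {other: round(d, 2) for other, d in sorted(row.items())})
```

The matrix no longer records which sites differ, only how many, which is the loss of character-state information that critics of distance methods point to.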
C. Distance Corrections, Weighting, and Clustering
Genetic distances may be corrected for the effects of multiple substitutions per site. Methods for making this correction include the Jukes-Cantor equation (Jukes and Cantor, 1969), which uses a Poisson model to calculate the probabilities of multiple substitutions, assuming equal probability of the type of substitution, no nucleotide bias (same proportions of G, A, T, and C), and that all sites along a sequence have an equal probability of change. Because some or all of these assumptions are violated by most DNA sequence data sets, additional correction factors are often used. The Kimura two-parameter method (Kimura, 1980) allows differential weighting of transition and transversion probabilities. Tamura and Nei's (1993) distance correction is based on the gamma distribution and corrects for nucleotide frequency differences, transition:transversion biases, and variation of substitution rate among different sites. Gamma distances are discussed at length by Kocher and Carleton (Chapter 2). Kumar et al. (1993) suggest that if various distance correction methods give similar results, then the simplest possible model should be used in order to minimize variance of the estimates. They suggest using the Jukes-Cantor distance or simple pairwise distances in cases when genetic distances are low, as long as substitution rates do not vary among lineages.
Differential weighting of characters has been widely discussed (Wheeler, 1986; Swofford et al., 1996). It is clear that data for different nucleotide positions in coding regions, i.e., first, second, and third codon positions, should be analyzed separately because of their distinct patterns of selective constraint. Weighting is a relatively crude way to correct for the variation in rate among sites in noncoding sequences, especially as the pattern of selective constraint for these sequences is poorly understood. Weighting has also been used to model the relative frequency of different types of nucleotide substitution in parsimony analyses (Fitch and Ye, 1991). 
The advantage of this approach relative to the use of an appropriate distance method is not clear. Clustering algorithms have greatly improved in recent years. Neighbor joining (Saitou and Nei, 1987) is a widely used distance clustering algorithm that allows unequal rates of divergences among lineages. It is no longer necessary (or desirable) to assume that rates of sequence change are constant throughout a phylogeny.
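The two simplest corrections mentioned in this section can be written out directly. The sketch below is illustrative only: it uses the commonly cited closed forms of the Jukes-Cantor distance, d = -(3/4) ln(1 - 4p/3), and the Kimura two-parameter distance, d = -(1/2) ln(1 - 2P - Q) - (1/4) ln(1 - 2Q), where p is the proportion of differing sites and P and Q are the proportions of transitional and transversional differences. The function names and the toy sequences are hypothetical, and gamma-based corrections such as that of Tamura and Nei (1993) are not attempted here.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def count_changes(seq_a, seq_b):
    """Return (sites compared, transitions, transversions) for two aligned sequences."""
    sites = transitions = transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguity codes
        sites += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1  # purine <-> purine or pyrimidine <-> pyrimidine
        else:
            transversions += 1  # purine <-> pyrimidine
    return sites, transitions, transversions


def jukes_cantor(p):
    """Jukes-Cantor distance from the proportion of differing sites p."""
    if p >= 0.75:
        raise ValueError("p >= 0.75: sequences are saturated under this model")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def kimura_2p(p_transitions, q_transversions):
    """Kimura two-parameter distance from transition (P) and transversion (Q) proportions."""
    a = 1.0 - 2.0 * p_transitions - q_transversions
    b = 1.0 - 2.0 * q_transversions
    if a <= 0.0 or b <= 0.0:
        raise ValueError("distance undefined: too many differences")
    return -0.5 * math.log(a) - 0.25 * math.log(b)


# Hypothetical 20-site toy sequences: 2 transitions and 1 transversion.
seq_1 = "ACGTACGTACGTACGTACGT"
seq_2 = "GTCTACGTACGTACGTACGT"
n_sites, n_ts, n_tv = count_changes(seq_1, seq_2)
p_obs = (n_ts + n_tv) / n_sites
d_jc = jukes_cantor(p_obs)
d_k2p = kimura_2p(n_ts / n_sites, n_tv / n_sites)
print(f"p = {p_obs:.3f}, JC = {d_jc:.4f}, K2P = {d_k2p:.4f}")
```

Both corrected values exceed the observed proportion of differences, and the two corrections coincide when transversional differences are exactly twice as frequent as transitional ones, which is the situation in which the Jukes-Cantor assumption of equal substitution probabilities holds.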
D. Molecular Clocks Use of molecular characters has also been associated with the assumption of a "molecular clock," i.e., that mutations arise at relatively regular, predictable rates (Zuckerkandl and Pauling, 1962, 1965). Today, it is unlikely that any proponents of a universal clock that ticks at a regular rate across all taxa remain. Still, most workers accept the idea of local clocks, i.e., that rates of evolution within a particular group are relatively
similar. Clocks may be calibrated based on comparisons with taxa having known divergences, using well-corroborated geological events (such as the linkage of the Isthmus of Panama as a barrier between the Atlantic and Pacific aquatic fauna; see Vawter et al., 1980; Grant, 1987; Stepien and Rosenblatt, 1996; Chapter 8 by Bermingham et al.), or with the fossil record. Dating divergences from the fossil record is complicated by the fact that the actual divergence usually predates its first fossil appearance by an unknown amount of time. Problems with clock calibration are discussed by Bermingham et al. (Chapter 8) and by Stepien et al. (Chapter 15).
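The arithmetic behind a local clock calibration is simple enough to sketch. The example below assumes a strict local clock, so that the distance between two taxa accumulates along both lineages and equals twice the per-lineage rate multiplied by the time since divergence. The function names, the calibration date, and the distances are all hypothetical placeholders chosen for round numbers; they are not estimates taken from this volume.

```python
def calibrate_rate(distance, divergence_time_myr):
    """Per-lineage rate (substitutions per site per Myr) from a dated divergence.

    The pairwise distance is divided by two because change accumulates
    independently along both lineages since their separation.
    """
    if divergence_time_myr <= 0:
        raise ValueError("divergence time must be positive")
    return distance / (2.0 * divergence_time_myr)


def estimate_divergence_time(distance, rate_per_lineage):
    """Divergence time (Myr) implied by a distance under a calibrated local clock."""
    if rate_per_lineage <= 0:
        raise ValueError("rate must be positive")
    return distance / (2.0 * rate_per_lineage)


# Hypothetical calibration: a pair of taxa separated by a dated barrier.
calibration_distance = 0.06  # corrected distance, substitutions per site (made up)
calibration_time_myr = 3.0   # assumed age of the barrier, illustrative only
rate = calibrate_rate(calibration_distance, calibration_time_myr)

# Apply the calibrated rate to another pair within the same group.
query_distance = 0.10        # made-up corrected distance
query_time_myr = estimate_divergence_time(query_distance, rate)
print(f"rate = {rate:.3f} per site per Myr per lineage; "
      f"estimated divergence = {query_time_myr:.1f} Myr")
```

Any error in the calibration date propagates directly into every estimated time, which is one reason the calibration problems noted above matter.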
E. Combining Data and Testing for Congruence There are two primary schools of thought among systematic biologists regarding combining morphological and molecular data. The first is the "total evidence" approach (Mickevich and Johnson, 1976; Kluge and Wolf, 1993), which states that phylogenetic analysis should be performed on a combined data set using all possible evidence. The null hypothesis for this approach is that there are no significant differences or partitions within the data set, i.e., that there is only one evolutionary history for the clade in question. Huelsenbeck et al. (1996) raise the point that estimates from total evidence have less sampling error, because separate analyses of data partitions are based on fewer characters. It is advocated that total evidence tests should examine whether different sets of data have significantly different signals and that these possible partitions should be tested against the combined data set (de Queiroz, 1993; Bull et al., 1993; Ballard, 1996). The other school of thought states that data sets should be analyzed separately (see Bull et al., 1993; Miyamoto and Fitch, 1995). Relationships among taxa that are congruent in separate analyses are regarded as strongly supported. In other words, the congruence of data from separate sources (such as separate analyses using different genes, or between morphological and molecular data sets) indicates increased support that the relationships are likely to be true. Miyamoto and Fitch (1995) suggest that relationships among taxa that are supported by different independent data sets are particularly robust, equivalent to obtaining independent verification of an experimental hypothesis from a different experimental source. This independent type of verification may be lost in combining data sets. An explicit assessment of congruence versus total evidence approaches is discussed in Chapter 11 by Parker. 
Parker analyzes problems in systematics of the Cyprinodontiformes by combining morphological characters from Parenti (1981, 1984) with molecular data, including the nuclear tyrosine kinase gene X-src (Meyer and Lydeard, 1993) and mitochondrial 16S rDNA sequences (Parker and Kornfield, 1995). He evaluates the methodology for combining data sets and comparing trees, including T-PTP (Faith, 1991) and bootstrap tests (Rodrigo et al., 1993). His conclusions argue for the utility of both combination and congruence approaches. Many of the authors in this volume compare taxonomic congruence between molecular-based and morphological-based hypotheses (e.g., Chapter 9 by Faber and Stepien, Chapter 10 by Phillips and Oakley, Chapter 12 by Bernardi, and Chapter 17 by Lydeard and Roe). Phillips and Oakley (Chapter 10) compare results from morphological and molecular studies of salmonid relationships and conclude that morphological traits suggesting one clade are unreliable. Bernardi (Chapter 12) discerns considerable concordance between molecular data and the definition of subgenera, but is unable to resolve higher-level relationships within the family. Lydeard and Roe (Chapter 17) also find greatest concordance of the two types of data at the lowest levels of the taxonomic hierarchy.
IV. Achievements and Failures of Molecular Systematics The greatest achievement of molecular systematics is the consistent and large set of characters generated for the analysis of phylogenies. The availability of these data has allowed the resolution of many intrageneric phylogenies that had not been previously addressed. Molecular studies have been spectacularly successful at the lowest taxonomic levels, particularly the analysis of relationships among populations or intraspecific phylogeography (Avise et al., 1987; see Chapters 3 through 9 of this volume). Molecular data offer an abundance of characters for studies at this level.
Molecular studies have not yet fulfilled their promise for resolving deep relationships. There are two problems holding up progress in this area. First, it can become difficult to identify homology in highly diverged sequences. Alignment of characters becomes more difficult as the sequences diverge, particularly for hypervariable regions of rDNA genes. Hillis and Dixon (1991) have suggested that rDNA sequences beyond about 30% sequence difference should be discarded as unalignable. A better understanding of the relationship between rRNA structure and function would help in the identification of homologous sites.
The second problem is "saturation": the equilibrium value of sequence difference that is reached when multiple substitutions erase the record of previous substitutions at a site. For DNA sequence data there are only four nucleotide character states, G, A, T, and C; thus, base substitutions at single nucleotide sites are often obscured by multiple substitutions at sites (multiple hits). As with morphological data sets, apparent synapomorphies may be the result of homoplastic convergence rather than shared common ancestry. Saturation is apparent in many molecular systematic studies. Claims that a group of taxa radiated rapidly at some time in the past should be scrutinized. It may be that molecular data are saturated and therefore uninformative as to the timing of particular branching events. This problem may be lessened either by examining more slowly evolving sites or by considering the codon as the character (rather than the individual nucleotides; Goldman and Yang, 1994; see Chapter 13 by Naylor et al. and Chapter 17 by Lydeard and Roe). Further studies of mutational processes, and the selective forces underlying variation in rate among sites, are needed. Alternatively, new kinds of data, such as the analysis of positional data, may be needed. Patterns of SINE insertion (Murata et al., 1993) or the order of homologous loci (Boore et al., 1995) provide another approach for resolving deep relationships.
Molecular studies have also failed to resolve the phylogeny of some rapidly speciating groups. Even an accurate phylogeny of a gene may not be informative as to the relationships of the species under study. If the gene pools are isolated more rapidly than polymorphisms can be fixed in a lineage, then the reconstructed gene trees may not parallel the evolution of the species (Moran and Kornfield, 1993; Parker and Kornfield, 1997; Chapter 3 by Kornfield and Parker; Chapter 7 by Sturmbauer et al.). Instead, the polymorphisms may be carried through the speciation event and be randomly fixed in the descendant populations (see discussion by Avise, 1994). 
The solution of this problem may require brute force; the construction of many independent gene trees may uncover the relationships among populations.
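The saturation problem described earlier in this section can be made concrete with a small calculation. The sketch below assumes the Jukes-Cantor model (equal base frequencies and equal substitution probabilities), under which the expected proportion of differing sites after d substitutions per site is p = (3/4)(1 - exp(-4d/3)). The function name and the chosen values of d are illustrative assumptions, and real sequences with rate variation among sites may saturate at a lower observed difference.

```python
import math


def expected_p_distance(d):
    """Expected proportion of differing sites after d substitutions per site.

    Assumes the Jukes-Cantor model; the value approaches 0.75 as d grows.
    """
    if d < 0:
        raise ValueError("d must be non-negative")
    return 0.75 * (1.0 - math.exp(-4.0 * d / 3.0))


# Observed difference grows almost linearly at first, then levels off.
curve = [(d, expected_p_distance(d)) for d in (0.05, 0.25, 0.5, 1.0, 2.0, 5.0)]
for d, p in curve:
    print(f"actual substitutions per site = {d:4.2f}   expected observed difference = {p:.3f}")
```

Once the observed difference sits near the plateau, very different amounts of actual change produce nearly the same observed value, which is why saturated data say little about the timing of branching events.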
V. Eight Promising Directions for Future Research Molecular systematists have been working with DNA sequences for most of the last decade. The basic techniques of PCR and DNA sequencing are firmly established, but how will they be applied in the future? The following areas of molecular systematics may prove especially rewarding in the future.
1. Integration of Intraspecific Biogeographic Patterns with Studies of Speciation The study of the phylogenetic histories of populations in relation to biogeography has been termed "intraspecific phylogeography" (Avise et al., 1987). Several chapters in this volume specifically address testing these types of phylogeographic questions using fishes. Specifically, Wiley and Hagen (Chapter 6) test geographic distribution and likely histories of vicariance in a southeastern United States percid group, the sand darters. Faber and Stepien (Chapter 9) test for geographic relationship among spawning populations of walleye, Stizostedion vitreum, addressing whether gene flow is decreased due to natal homing. The evolution of species flocks, models of adaptive radiation, and biogeographic barriers are tested by Sturmbauer et al. (Chapter 7) for the cichlids of Lake Tanganyika, Africa. In studies of Panamanian freshwater fishes, Bermingham et al. (Chapter 8) describe very high levels of genetic divergence among populations, postulating that very high levels of phylogeographic structuring may be common in species exhibiting distributions that span large distances across physically isolated drainages. These studies are beginning to shed light on the role of geographic processes in speciation.
2. Reconstruction of Phylogenies among Congeners The now standard methodology of sequencing short stretches of the mitochondrial genome will continue to bear fruit in the analysis of relationships within genera. As outlined by Kocher and Carleton in Chapter 2, these efforts will be most successful for divergences within the last 5 million years. The steady accumulation of these sequences will allow the construction of intrageneric phylogenies for many groups of fishes and will lay the groundwork for studies attempting to understand relationships further back in time.
3. Reconstruction of Higher-Level Relationships Using Longer Sequences Continuing advances in DNA sequencing technology suggest that it will be practical to analyze increasingly longer segments of DNA. Up to a point, longer sequences will allow the resolution of more ancient divergences. Hillis (1996) has suggested that sequences only 5000 bp long may be sufficient to accurately reconstruct even complex phylogenies. This seems a good intermediate goal, although additional complete mitochondrial sequences and many more nuclear sequences would be useful for some questions.
4. Analysis of Developmental Homologies at the Molecular Level Developmental biologists are beginning to focus on the analysis of fish development. A recent mutant hunt resulted in the isolation of more than 1500 mutations affecting development of the zebrafish (Haffter et al., 1996; Driever et al., 1996). We suspect that the genetic basis for many morphological differences will be revealed in the near future. Although the impact on the systematics of fishes is difficult to predict, the elucidation of molecular mechanisms generating morphological differences is sure to have an impact on the analysis of such characters. Where it is possible to cross species, it may be possible to identify the number of genes responsible for morphological differences (e.g., Doebley, 1992), quantifying for the first time the number of characters scored in morphological analyses.
5. Interpretation of Hybridization and Species Boundaries Using Abundant Nuclear Markers Habitat disturbance and continued introductions of exotic species will create new opportunities for the hybridization of species. The analysis of introgression in such hybrid swarms will be facilitated by the abundance of new genetic markers now available. Where the taxonomy of natural species has been in debate, these markers will provide new data on the extent of differentiation across the whole genome. The analysis of hybrids may also shed light on selective constraints and the interaction of genes (Kilpatrick and Rand, 1995; Rieseberg et al., 1996).
6. Analysis of the Evolution of Repetitive DNA Families Although most systematic analyses have focused on sequence variation in single-copy genes, there is some indication that repetitive DNA families offer new and useful tools for identifying relationships (Franck et al., 1994; Elder and Turner, 1994). Sequence variation in tandem and dispersed repetitive DNA may provide new insights in some groups.
7. Studies of the Molecular Clock in Fishes The mechanisms governing the speed and regularity of molecular clocks are poorly understood. The great diversity of habitat and life history among fishes, coupled with their excellent fossil record, makes this an excellent group with which to study molecular clocks. New insights will arise as rigorous accountings of substitution rate are made in groups of fishes varying in population size, environment, and life history.
8. Genomic Organization The increasing availability of genome maps, and even complete DNA sequences, is creating opportunities for the analysis of new characters. For example, Boore et al. (1995) used the pattern of gene arrangements in arthropod mtDNA to study arthropod relationships. O'Brien et al. (1993) proposed the use of a standard set of reference loci in the analysis of genomes, which would make it easy to identify such rearrangements in the nuclear genome. These types of characters may offer the best hope for resolving relationships among ancient lineages and need to be comprehensively addressed in fishes.
VI. A New Age of Synthesis Although morphological and molecular traditions have frequently collided in the past, we argue for a more synergistic approach that recognizes the peculiarities and limitations of each kind of data and in which there is an interplay between morphological and molecular studies. All inherited morphological characters have their origin in molecular characters. A record of the history of evolutionary change can be found in both the structure and the genes of organisms. At this point, analytical methods are rapidly increasing in sophistication, enabling us to better quantify rates of evolution and constraints on molecular changes through time. This understanding will lead to more accurate and consistent phylogenetic analyses. When combined with traditional approaches, these data promise to reveal much about the evolutionary forces that have produced the great diversity of modern fishes. This volume illustrates the beginning stages of this process, which is sweeping the field of fish systematics and paving the way to a new understanding of the interplay of genes, development, and selection. This new age of synthesis promises to continue to revolutionize systematics in the 21st century.
References
Arnason, E., and Rand, D. M. 1992. Heteroplasmy of short tandem repeats in mitochondrial DNA of Atlantic cod, Gadus morhua. Genetics 132:211-220.
Attardi, G. 1985. Animal mitochondrial DNA: An extreme example of genetic economy. Int. Rev. Cytol. 93:93-145.
Avise, J. C. 1994. "Molecular Markers, Natural History, and Evolution." Chapman and Hall, New York.
Avise, J. C., Arnold, J., Ball, R. M., Bermingham, E., Lamb, T., Neigel, J. E., Reeb, C. A., and Saunders, N. C. 1987. Intraspecific phylogeography: The mitochondrial DNA bridge between population genetics and systematics. Annu. Rev. Ecol. Syst. 18:489-522.
Ballard, J. W. O., and Kreitman, M. 1994. Unraveling selection in the mitochondrial genome of Drosophila. Genetics 138:757-772.
Ballard, J. W. O. 1996. Combining data in phylogenetic analysis. Trends Ecol. Evol. 11:334.
Bjorkman, P. J., and Parham, P. 1990. Structure, function, and diversity of class I major histocompatibility complex molecules. Annu. Rev. Biochem. 59:253-288.
Block, B. B., Finnerty, J. R., Stewart, A. F. R., and Kidd, J. 1993. Evolution of endothermy in fish: Mapping physiological traits on a molecular phylogeny. Science 260:210-214.
Boore, J. L., Collins, T. M., Stanton, D., Daehler, L. L., and Brown, W. M. 1995. Deducing the pattern of arthropod phylogeny from mitochondrial DNA rearrangements. Nature 376:163-165.
Brown, J. R., Beckenbach, A. T., and Smith, M. J. 1993. Intraspecific DNA sequence variation of the mitochondrial control region of white sturgeon (Acipenser transmontanus). Mol. Biol. Evol. 10:326-341.
Bull, J. J., Huelsenbeck, J. P., Cunningham, C. W., Swofford, D. L., and Waddell, P. J. 1993. Partitioning and combining data in phylogenetic analysis. Syst. Biol. 42:384-397.
Carr, S. M., and Marshall, H. D. 1991. Detection of intraspecific DNA sequence variation in the mitochondrial cytochrome b gene of Atlantic cod (Gadus morhua) by the polymerase chain reaction. Can. J. Fish. Aquat. Sci. 48:48-52.
Carr, S. M., Snellen, A. J., Howse, K. A., and Wroblewski, J. S. 1995. Mitochondrial DNA sequence variation and genetic stock structure of Atlantic cod (Gadus morhua) from bay and offshore locations on the Newfoundland continental shelf. Mol. Ecol. 4:79-88.
Carvalho, G. R., and Pitcher, T. J. (eds.) 1995. "Molecular Genetics in Fisheries." Chapman and Hall, New York.
de Queiroz, A. 1993. For consensus (sometimes). Syst. Biol. 42:368-372.
Doebley, J. 1992. Mapping the genes that made maize. Trends Genet. 8:302-307.
Dowling, T. E., Moritz, C., and Palmer, J. D. 1990. Nucleic acids. II. Restriction site analysis. In "Molecular Systematics" (D. M. Hillis and C. Moritz, eds.), pp. 250-317. Sinauer Associates, Sunderland, MA.
Driever, W., Solnica-Krezel, L., Schier, A. F., Neuhauss, S. C. F., Malicki, J., Stemple, D. L., Stainier, D. Y. R., Zwartkruis, F., Abdelilah, S., Rangini, Z., Belak, J., and Boggs, C. 1996. A genetic screen for mutations affecting embryogenesis in zebrafish. Development 123:37-46.
Elder, J. F., Jr., and Turner, B. J. 1994. Concerted evolution at the population level: Pupfish HindIII satellite DNA sequences. Proc. Natl. Acad. Sci. USA 91:994-998.
Faith, D. P. 1991. Cladistic permutation tests for monophyly and nonmonophyly. Syst. Zool. 40:366-375.
Ferraris, J. D., and Palumbi, S. R. (eds.) 1996. "Molecular Zoology." Wiley-Liss, New York.
Fitch, W. M., and Ye, J. 1991. Weighted parsimony: Does it work? In "Phylogenetic Analysis of DNA Sequences" (M. M. Miyamoto and J. Cracraft, eds.), pp. 147-154. Oxford University Press, New York.
Franck, J. P. C., Kornfield, I., and Wright, J. M. 1994. The utility of SATA Satellite DNA sequences for inferring phylogenetic relationships among the three major genera of tilapiine cichlid fishes. Mol. Phylogenet. Evol. 3:10-16.
Goldman, N., and Yang, Z. 1994. A codon-based model of nucleotide substitution for protein-coding DNA sequences. Mol. Biol. Evol. 11:725-736.
Grant, W. S. 1987. Genetic divergence between congeneric Atlantic and Pacific Ocean fishes. In "Population Genetics and Fishery Management" (N. Ryman and F. Utter, eds.), pp. 225-246. Univ. Washington Press, Seattle, WA.
Graybeal, A. 1993. The phylogenetic utility of cytochrome b: Lessons from bufonid frogs. Mol. Phylogenet. Evol. 2:256-269.
Haffter, P., Granato, M., Brand, M., Mullins, M. C., Hammerschmidt, M., Kane, D. A., Odenthal, J., Van Eeden, F. J. M., Jiang, Y.-J., Heisenberg, C.-P., Kelsh, R. N., Furutani-Seiki, M., Vogelsang, E., Beuchle, D., Schach, U., Fabian, C., and Nüsslein-Volhard, C. 1996. The identification of genes with unique and essential function in the development of the zebrafish, Danio rerio. Development 123:1-36.
Hennig, W. 1950. "Grundzuege einer Theorie der phylogenetischen Systematik." Deutscher Zentralverlag, Berlin.
Hennig, W. 1966. "Phylogenetic Systematics." University of Illinois Press, Urbana, IL.
Hillis, D. M., and Dixon, M. T. 1991. Ribosomal DNA: Molecular evolution and phylogenetic inference. Quart. Rev. Biol. 66:411-453.
Hillis, D. M. 1996. Inferring complex phylogenies. Nature 383:130-131.
Hillis, D. M., Moritz, C., and Mable, B. K. (eds.) "Molecular Systematics," 2nd ed. Sinauer Assoc., Sunderland, MA.
Huelsenbeck, J. P., Bull, J. J., and Cunningham, C. W. 1996. Combining data in phylogenetic analysis. Trends Ecol. Evol. 11(4):152-158.
Hutter, C. M., and Rand, D. M. 1995. Competition between mitochondrial haplotypes in distinct nuclear genetic environments: Drosophila pseudoobscura vs. D. persimilis. Genetics 140(2):537-548.
Jarne, P., and Lagoda, P. J. L. 1996. Microsatellites, from molecules to populations and back. Trends Ecol. Evol. 11(10):424-429.
Jukes, T. H., and Cantor, C. R. 1969. Evolution of protein molecules. In "Mammalian Protein Metabolism" (H. N. Munro, ed.), pp. 21-132. Academic Press, New York.
Kilpatrick, S. T., and Rand, D. M. 1995. Conditional hitchhiking of mitochondrial DNA: Frequency shifts of Drosophila melanogaster mtDNA variants depend on nuclear genetic background. Genetics 141:1113-1124.
Kimura, M. 1980. A simple method for estimating evolutionary rate of base substitutions through comparative studies of nucleotide sequences. J. Mol. Evol. 16:111-120.
Kluge, A. G., and Wolf, A. J. 1993. Cladistics: What's in a word? Cladistics 9:183-199.
Kocher, T. D., Thomas, W. K., Meyer, A., Edwards, S. V., Paabo, S., Villablanca, F. X., and Wilson, A. C. 1989. Dynamics of mtDNA evolution in animals: Amplification and sequencing with conserved primers. Proc. Natl. Acad. Sci. USA 86:6196-6200.
Kumar, S., Tamura, K., and Nei, M. 1993. "MEGA: Molecular Evolutionary Genetics Analysis, Version 1.0." Pennsylvania State University, University Park, PA.
Lee, W., Conroy, J., Howell, W. H., and Kocher, T. D. 1995. Structure and evolution of teleost mitochondrial control regions. J. Mol. Evol. 41:54-66.
Magoulas, A., and Zouros, E. 1993. Restriction-site heteroplasmy in anchovy (Engraulis encrasiocholus) indicates incidental biparental inheritance of mitochondrial DNA. Mol. Biol. Evol. 10(2):319-325.
Martin, A. P., Kessing, B. D., and Palumbi, S. R. 1990. Accuracy of estimating genetic distance between species from short sequences of mitochondrial DNA. Mol. Biol. Evol. 7:485-488.
Meyer, A. 1994. Shortcomings of the cytochrome b gene as a molecular marker. Trends Ecol. Evol. 9:278-280.
Meyer, A., Kocher, T. D., Basasibwaki, P., and Wilson, A. C. 1990. Monophyletic origin of Lake Victoria cichlid fishes suggested by mitochondrial DNA sequences. Nature 347:550-553.
Meyer, A., and Lydeard, C. 1993. The evolution of copulatory organs, internal fertilization, placentas, and viviparity in killifishes (Cyprinodontiformes), as inferred from a DNA phylogeny of the tyrosine kinase gene X-src. Proc. R. Soc. Lond. B 254:153-162.
Mickevich, M. F., and Johnson, M. S. 1976. Congruence between morphological and allozyme data in evolutionary inference and character evolution. Syst. Zool. 25:260-270. Miyamoto, M. M., and Fitch, W. M. 1995. Testing species phylogenies and phylogenetic methods with congruence. Syst. Biol. 44: 64-76. Moran, P., and Kornfield, I. 1993. Were population bottlenecks associated with the radiation of the Mbuna species flock (Teleostei: Cichlidae) of Lake Malawi. Mol. Biol. Evol. 10:1015-1029. Murata, S., Takasaki, N., Saitoh, M., and Okada, N. 1993. Determination of the phylogenetic relationships among Pacific salmonids by using short interspersed elements (SINEs) as temporal landmarks of evolution. Proc. Natl. Acad. Sci. USA 90:6995-6999. Nelson, J. S. 1994. "Fishes of the World," 3rd. ed. Wiley, New York. O'Brien, S. J., Womack, J. E., Lyons, L. A., Moore, K. J., Jenkins, N. A., and Copeland, N. G. 1993. Anchored reference loci for comparative genome mapping in mammals. Nat. Genet. 3:103-112. Palumbi, S. R. 1996. Nucleic acids II. The polymerase chain reaction. In "Molecular Systematics" (D. M. Hillis, C. Moritz, and B. K. Mable, eds.), pp. 205-221. Sinauer Assoc., Sunderland, MA. Parenti, L. R. 1981. A phylogenetic and biogeographic analysis of cyprinodontiform fishes. Bull. Am. Mus. Nat. Hist. 1658:341-557. Parenti, L. R. 1984. A taxonomic revision of the Andean killifish genus Orestias. Bull. Am. Mus. Nat. Hist. 178:110-214. Parker, A., and Kornfield, I. 1995. A molecular perspective on evolution and zoogeography of cyprinodontid killifishes. Copeia 1995:8-21. Parker, A. and Kornfield, I. 1997. Evolution of the mitochondrial DNA control region in the mbuna (Cichlidae) species flock of Lake Malawi, East Africa. J. Mol. Evol. in press. Pogson, G. H., Mesa, K. A., and Boutilier, R. G. 1995. Genetic population structure and gene flow in the Atlantic cod Gadus morhua: A comparison of allozyme and nuclear RFLP loci. Genetics 139: 375-385. Powers, D. A., and Schulte, P. M. 
1996. A molecular approach to the selectionist/neutralist controversy. In "Molecular Zoology" (J. D. Ferraris and S. R. Palumbi eds.), pp. 327-352. Wiley-Liss, New York. Queller, D. C., Strassmann, J. E., and Hughes, C. R. 1993. Microsatellites and kinship. Trends Ecol. Evol. 8:285-288. Rieseberg, L. H., Sinervo, B., Linder, C. R., Ungerer, M. C., and Arias, D. M. 1996. Role of gene interactions in hybrid speciation: Evidence from ancient and experimental hybrids. Science 272: 741-745. Rodrigo, A. G., Kelly-Borges, M., Bergquist, P. R., and Bergquist, P. L. 1993. A randomisation test of the null hypothesis that two cladograms are sample estimates of a parametric phylogenetic tree. New Zeal. J. Bot. 31:257-268. Saiki, R. K., Gelfand, D. H., Stoffel, S., Scharf, S. J., Higuchi, R., Horn, G. T., Mullis, K. B., and Erlich, H. A. 1988. Primer-directed enzymatic amplification of DNA with a thermostabile DNA polymerase. Science 239: 487-491. Saitou, N., and Nei, M. 1987. The neighbor-joining method: A new method reconstructing phylogenetic trees. Mol. Biol. Evol. 4: 406-425. Simon, C., Frati, F., Beckenbach, A., Crespi, B., Liu, H., and Flook, P. 1994. Evolution, weighting, and phylogenetic utility of mitochondrial gene sequences and a compilation of conserved polymerase chain reaction primers. Ann. Entomol. Soc. Am. 87(6): 651-701.
Stepien, C. A. 1995. Population genetic divergence and geographic patterns from DNA sequences: Examples from marine and freshwater fishes. In "Evolution and the Aquatic Ecosystem: Defining Unique Units in Population Conservation" (J. Nielsen, ed.), pp. 263-287. American Fisheries Soc. Symposium, Bethesda, MD. Stepien, C. A. and Rosenblah, R. H. 1991. Patterns of gene flow and genetic divergence in the Northeastern Pacific Clinidae (Teleosteii Blennioidei), based on allozyme and morphological data. Copeia. 1991(4): 873-896. Stepien, C. A., Dixon, M. T., and Hillis, D. M. 1993. Evolutionary relationships of the blennioid fish families Clinidae, Labrisomidae, and Chaenopsidae: Congruence between DNA sequence and allozyme data. Bull. Mar. Sci. 52(1): 873-513. Stepien, C. A., and Rosenblatt, R. H. 1996. Genetic divergence in antitropical pelagic marine fishes (Trachurus, Merluccius, and Scomber) between North and South America. Copeia 1996(3): 586-598. Stiassny, M. L. J., Parenti, L. R., and Johnson, G. D. (eds.) 1996. "Interrelationships of Fishes." Academic Press, San Diego. Sturmbauer, C., and Meyer, A. 1992. Genetic divergence, speciation and morphological stasis in a lineage of African cichlid fishes. Nature 358:578-581. Sturmbauer, C., and Meyer, A. 1993. Mitochondrial phylogeny of the endemic mouthbrooding lineages of cichlid fishes of Lake Tanganyika, East Africa. Mol. Biol. Evol. 10: 751-768. Swofford, D. L., Olsen, G. J., Waddell, P. J., and Hillis, D. M. 1996. In "Molecular Systematics" (D. M. Hillis, C. Moritz, and B. K. Mable, eds.). 2nd Ed., pp. 407-514. Sinauer Assoc., Sunderland, MA. Tamura, K., and Nei, M. 1993. Estimation of the number of nucleotide substitutions in the control region of mitochondrial DNA in humans and chimpanzees. Mol. Biol. Evol. 10:512-526. Vawter, A. T., Rosenblatt, R. H., and Gorman, G. C. 1980. Genetic divergence among fishes of the Eastern Pacific and the Caribbean: Support for the molecular clock. Evolution 34: 705m711. 
Wheeler, W. C. 1986. Character weighting and cladistic analysis. Syst. Zool. 35:102-109. Wiley, E. O. 1981. "Phylogenetics: The Theory and Practice of Phylogenetic Systematics." Wiley Interscience, New York. Williams, J. G. K., Kubelik, A. R., Livak, K. J., Rafalski, J. A., and Tingey, S. V. 1990. DNA polymorphisms amplified by arbitrary primers are useful as genetic markers. Nucleic Acids Res. 18: 6531-6535. Wilson, A. C., Cann, R. L., Carr, S. M., George, M., Jr., Gyllensten, B., Helm-Bychowski, K., Higuchi, R. C., Palumbi, S. R., Prager, E. M., Sage, R. D., and Stoneking, M. 1985. Mitochondrial DNA and two perspectives on evolutionary genetics. Biol. J. Linnean Soc. 26: 375-400. Zhu, D., Jamieson, B. G. M., Hugall, A., and Moritz, C. 1994. Sequence evolution and phylogenetic signal in control region and cytochrome b sequences of rainbowfishes (Melanotaeniidae). Mol. Biol. Evol. 11:672-683. Zuckerkandl, E. and Pauling, L. 1962. Molecular disease, evolution and genic heterogeneity. In "Horizons in Biochemistry" (M. Kasha and B. Pullman, eds.), pp. 189-225. Academic Press, New York. Zuckerkandl, E. and Pauling, L. 1965. Evolutionary divergence and convergence in proteins. In "Evolving Genes and Proteins" (V. Bryson and H. J. Vogel, eds.), pp. 97-166. Academic Press, New York.
CHAPTER 2

Base Substitution in Fish Mitochondrial DNA: Patterns and Rates

THOMAS D. KOCHER and KAREN L. CARLETON
Department of Zoology
University of New Hampshire
Durham, New Hampshire 03824
I. Introduction
Many of the authors in this volume use mitochondrial DNA (mtDNA) sequences because they are easily accessible, have high rates of evolution, and generally follow a clonal pattern of inheritance well suited to phylogenetic reconstruction (Wilson et al., 1985). This chapter is about the natural history of these sequences. Just as morphological systematists strive to analyze characters for which the pattern of development and effects of the environment are well known, so molecular systematists should begin by understanding the biology underlying the characters they use for inferring phylogenies. By understanding how changes accumulate in sequences, accurate models of substitution can be developed for use in phylogenetic inference.

Molecular sequences are deceptively simple in structure. There are just four bases common in DNA. These bases appear to be freely interchangeable, but in fact, mutation interconverts some nucleotides more frequently than others. Selection and drift then act on this spectrum of mutations in such a way as to prevent most substitutions from becoming fixed in the population. Neither mutation nor selection is homogeneous along a sequence of nucleotides; close examination reveals important differences in the pattern of mutation and selective constraint among nucleotide sites. Additional differences can be observed in comparisons among species.

Probably more is known about evolutionary patterns in animal mitochondrial genomes than for any other DNA sequence. Although some aspects of the substitutional pattern (e.g., the high proportion of transitions) are unique to animal mtDNA, this molecule is still an excellent model system to illustrate the analytic methods needed to reconstruct phylogenies from DNA sequence data.

This chapter focuses on patterns of mtDNA evolution in cichlid fishes. Examples are drawn from continuing studies of the gene encoding NADH dehydrogenase subunit 2 (ND2) in East African cichlids (Kocher et al., 1995). This data set is particularly useful because it includes a large number of closely related molecules which provide insights into the pattern of substitution usually obscured in comparisons among more highly diverged sequences.
II. Simple Models of Substitution

A. Mutational Models

At the core of most phylogenetic reconstruction algorithms is a simplified mutational model of the substitution process. The simplest model (Jukes and Cantor, 1969) assumes an equal probability of interconversion among all four nucleotides. A consequence of this model is a twofold excess of transversional change (purine ↔ pyrimidine) because there are twice as many paths for transversions as for transitions. This model is not adequate for animal mtDNA because a much larger excess of transitions, relative to transversions, is typically observed. Kimura (1980) introduced a two-parameter model to accommodate the higher rate of transitions (Fig. 1). This model is also inadequate, as it predicts that sequences at equilibrium will contain equal frequencies of all four nucleotides. A modified Kimura model (Felsenstein, 1986) adjusts the relative rates of transitions or transversions to accommodate the unequal frequency of bases seen in real sequences. More complex models are possible, but the need for a fully elaborated model, with a separate rate parameter for each of the 12 possible kinds of substitution, has not yet been demonstrated (but see Rzhetsky and Nei, 1995).

FIGURE 1 The substitution model of Kimura (1980) in which the rate of transitions (α) is usually higher than the rate of transversions (β). [Diagram of the four bases: A and G (purines), C and T (pyrimidines).]
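The "twice as many paths" statement can be checked by enumerating the 12 possible substitutions among the four bases. The following is a minimal sketch; the function names (substitution_type, count_paths) are hypothetical and chosen only for illustration.

```python
from itertools import permutations

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def substitution_type(x: str, y: str) -> str:
    """Classify a change between two different bases.

    A transition stays within a class (purine to purine, or pyrimidine
    to pyrimidine); a transversion crosses between the two classes.
    """
    if x == y:
        raise ValueError("bases are identical; no substitution to classify")
    same_class = {x, y} <= PURINES or {x, y} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


def count_paths() -> dict:
    """Count the ordered substitution paths of each type among A, C, G, T."""
    counts = {"transition": 0, "transversion": 0}
    for x, y in permutations("ACGT", 2):  # the 12 ordered pairs of distinct bases
        counts[substitution_type(x, y)] += 1
    return counts


paths = count_paths()
print(paths)
```

With equal rates on every path, as the Jukes and Cantor model assumes, transversions are therefore expected to be twice as frequent as transitions, which is the opposite of what is observed in animal mtDNA.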
B. Multiple Hits and Saturation

As mutations occur over time, a pair of homologous sequences will become increasingly different. The observed number of differences between these sequences accumulates almost linearly at first. Gradually, however, as some nucleotide sites experience more than one substitution, the observed sequence difference becomes a poor indicator of the actual divergence which has occurred. Eventually, the rate at which new differences arise is equal to the rate at which identical nucleotides arise by multiple substitution. At this point the sequences cannot display greater sequence difference (the sequences have reached "saturation"), even though additional substitutions continue to occur. The true evolutionary rate is hidden by the occurrence of multiple substitutions at a site.

Appropriate statistical corrections can be applied to transform the observed differences into a measure of the total number of changes that have occurred (total divergence, or evolutionary distance). These corrections can be derived for any of the mutational models, but are accurate only in the early stages of differentiation, before saturation has been closely approached. Furthermore, these corrections are accurate only if all of the nucleotide sites are evolving according to the same substitutional model.
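As an illustration of such corrections, the sketch below implements the standard distance formulas for the Jukes and Cantor (1969) and Kimura (1980) models, together with the expected observed difference under the Jukes and Cantor model to show the approach to saturation. The function names are hypothetical, and the code assumes that every site follows the same model, which is exactly the condition stated above.

```python
import math


def jc69_distance(p: float) -> float:
    """Jukes-Cantor distance from the observed proportion of differences p."""
    if not 0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75); beyond that the sequences are saturated")
    return -0.75 * math.log(1 - 4 * p / 3)


def k2p_distance(P: float, Q: float) -> float:
    """Kimura two-parameter distance.

    P is the observed proportion of transition differences and
    Q the observed proportion of transversion differences.
    """
    a = 1 - 2 * P - Q
    b = 1 - 2 * Q
    if a <= 0 or b <= 0:
        raise ValueError("observed differences are too large to correct")
    return -0.5 * math.log(a) - 0.25 * math.log(b)


def jc69_expected_difference(d: float) -> float:
    """Expected observed proportion of differences after a true divergence d."""
    return 0.75 * (1 - math.exp(-4 * d / 3))


# Observed difference rises almost linearly at first, then levels off below 0.75.
for d in (0.05, 0.5, 2.0, 10.0):
    print(d, round(jc69_expected_difference(d), 4))
```

The correction becomes unstable as the observed difference approaches the plateau, because small sampling errors in p translate into very large changes in the corrected distance.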
C. Selectional Filter

Although mutational models have been widely used to describe the process of substitution, they ignore the influence of selection, which may be the dominant force regulating change in real sequences. It is easy to show, by comparison of nucleotide substitution rates at silent and amino acid replacement sites, that selection filters out more than 90% of all mutations which occur in mtDNA. Any concordance between the predictions of mutational models and the evolution of real sequences is therefore fortuitous.

Most simple models assume that substitutions occur randomly among sites following the Poisson distribution. Numerous demonstrations of the inadequacy of this model have been published (Fitch and Markowitz, 1970; Uzzell and Corbin, 1971; Kocher and Wilson, 1991). Substitutions do not occur with equal probability at each site. Instead, selection resists substitution at some sites, while allowing mutations at other sites to become fixed. A better model of this process uses a gamma distribution (Bliss and Fisher, 1953; Tamura and Nei, 1993), or a covarion model (Fitch and Markowitz, 1970; Miyamoto and Fitch, 1995), to allow rates of substitution to vary among nucleotide sites. The gamma distribution models have been mathematically formulated so that it is straightforward to correct distances for multiple hits (Tamura and Nei, 1993), but this is not yet possible for the covarion model. Few studies have attempted to estimate either the gamma parameter or the size and exchange rate of the covarion. It is important to remember that estimates of these parameters must be made from close relatives, as they provide the best information to quantify the process of substitution, free from the effects of multiple substitution.

For protein-coding sequences, it is possible to classify sites a priori according to the known selective constraints of the coding function. At the very least, it is recognized that first, second, and third positions of codons evolve at different rates, because of the redundant structure of the genetic code, and the grouping of functionally similar amino acids according to the second base of the codon. Because the functional constraints on rRNA sequences are poorly understood, it is more difficult to assign sites to particular rate classes a priori. Models of evolution for these genes typically resort to a purely statistical representation of the sites.
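To illustrate how rate variation among sites changes a distance correction, the sketch below uses the gamma-corrected form of the Jukes and Cantor distance. This is a deliberately simpler stand-in for the Tamura and Nei (1993) gamma distance named in the text, not that method itself; the shape parameter is written a, and the function names are hypothetical.

```python
import math


def jc69_distance(p: float) -> float:
    """Jukes-Cantor distance assuming every site evolves at the same rate."""
    return -0.75 * math.log(1 - 4 * p / 3)


def gamma_jc69_distance(p: float, a: float) -> float:
    """Jukes-Cantor distance when rates vary among sites.

    Rates are assumed to follow a gamma distribution with mean 1 and
    shape parameter a; a small a means strong rate heterogeneity.
    """
    if not 0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75)")
    if a <= 0:
        raise ValueError("the gamma shape parameter must be positive")
    return 0.75 * a * ((1 - 4 * p / 3) ** (-1 / a) - 1)


p_observed = 0.3
for shape in (0.5, 1.0, 5.0, 1e6):
    print(shape, round(gamma_jc69_distance(p_observed, shape), 4))
print("equal rates:", round(jc69_distance(p_observed), 4))
```

For the same observed difference, stronger rate heterogeneity (smaller a) implies a larger true divergence, because the changes are concentrated at a few fast sites that saturate early. As a grows large the gamma distance converges on the equal-rates value.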
III. Evolution of Real Sequences

To evaluate which theoretical models provide the most appropriate basis for phylogenetic reconstruction, the evolution of real sequences must be quantified. Here we examine a set of 56 mitochondrially encoded NADH subunit 2 (ND2) sequences (348 codons) obtained from 45 species of cichlid fish, mostly from East Africa. The most divergent comparisons involve New World species which presumably diverged from the African lineages more than 60 million years (MY) ago. The most closely related sequences are intraspecific polymorphisms differing by just a few nucleotides. Those sequences not already reported in Kocher et al. (1995) are deposited in GenBank.

Ideally, we would plot the divergence of molecules with respect to geologic times of divergence. For these fishes, however, few reliable divergence times are available. Instead we will use the proportion of third position sites which have experienced a transversion as a measure of divergence. Transversions occur relatively rarely and in a nearly Poisson fashion (Irwin et al., 1991). These divergences are corrected for multiple substitution using a two-state model [d = -0.5 ln(1 - 2Q), where Q is the observed proportion of transversions].
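The divergence measure described above can be sketched as follows. The two short sequences are invented for illustration and are not ND2 data; the function names are hypothetical, and the code assumes aligned, in-frame sequences containing only A, C, G, and T.

```python
import math

PURINES = frozenset("AG")


def third_position_transversion_proportion(seq1: str, seq2: str) -> float:
    """Observed proportion Q of third codon positions differing by a transversion."""
    if not seq1 or len(seq1) != len(seq2):
        raise ValueError("sequences must be non-empty and of equal length")
    if len(seq1) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    pairs = list(zip(seq1[2::3], seq2[2::3]))  # every third base, starting at position 3
    # A transversion pairs a purine with a pyrimidine.
    n_tv = sum((x in PURINES) != (y in PURINES) for x, y in pairs)
    return n_tv / len(pairs)


def corrected_transversions(Q: float) -> float:
    """Two-state correction for multiple hits: d = -0.5 ln(1 - 2Q)."""
    if not 0 <= Q < 0.5:
        raise ValueError("Q must lie in [0, 0.5)")
    return -0.5 * math.log(1 - 2 * Q)


# Hypothetical three-codon alignment: third positions are G/G, C/T, A/C.
seq_a = "ATGGCCTTA"
seq_b = "ATGGCTTTC"
Q = third_position_transversion_proportion(seq_a, seq_b)
d = corrected_transversions(Q)
print(Q, d)
```

Only the A/C pair counts here; the C/T difference is a transition and is ignored by this measure.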
A. Changes in the Third Position of Codons

Many substitutions at the third positions of codons are synonymous (i.e., do not change the amino acid sequence of the encoded protein) and thus escape selection on protein structure. These sites therefore provide the most direct view of the mutational process. Although these sites are often thought to evolve according to a purely mutational model, some selective constraint does exist (Perna, 1996; Xia et al., 1996). While it would be inappropriate to equate substitutions at third positions with mutation, these sites approximate the underlying mutational spectrum more closely than the first or second positions.

The dominant feature of mtDNA evolution is the high rate of transition substitutions relative to transversions. At third positions the ratio of transition:transversion differences is at least 5:1 initially (Fig. 2), consistent with a strong transition bias in the underlying mutation process. As transitions begin to occur repeatedly at the same sites, the ratio of transitions:transversions observed in pairwise comparisons drops. At a 10% transversion difference, the ratio is only 2.5:1, and in the deepest comparisons it drops to 1:1. At a 10% transversion difference, the actual number of transition substitutions that have occurred is at least twice as great as the observed number of differences. The transition:transversion ratio is thus one way to quantify the degree of multiple substitution that has occurred since the common ancestor of two sequences.

Base composition influences the maximum observed difference. Figure 3 shows the accumulation of the two kinds of transitions possible: those involving the purines (A and G) and those involving the pyrimidines (C and T). It is interesting to note that the initial rate of transitions is the same for the two types of nucleotides. The purines, however, show saturation at a lower level of divergence than the pyrimidines. This pattern arises because the frequencies of A and G are much more unequal than the frequencies of C and T. At third positions the proportions are A, G, C, T: 0.32, 0.05, 0.38, 0.26. The maximum divergence of two sequences is calculated as 1 - probability of chance identity. For the purines described earlier, where only two states are possible (e.g., A or G), this is calculated as
$$d_{\max} = 1 - \left(\frac{f_G}{f_G + f_A}\right)^{2} - \left(\frac{f_A}{f_G + f_A}\right)^{2} \qquad (1)$$
The very unequal frequencies of A and G allow a maximum difference of just 23% instead of the 50% that would be expected given equal frequencies of the two nucleotides. For C and T, the maximum difference is higher, about 48% (Kocher et al., 1995). These differences explain why the purine transitions reach saturation before the pyrimidine transitions.

These mitochondrial sequences approach saturation rapidly. Evidence of multiple substitutions is quite apparent at only 2% transversion difference. The mammalian fossil record suggests that this corresponds to about 2 MY of divergence (Irwin et al., 1991). The fossil record of cichlids is more difficult to interpret, but a similar rate does correlate well with the geologic history of East Africa (Kocher et al., 1995). The fact that saturation effects begin to arise after just 2 MY of divergence underscores the importance of corrections for multiple substitution when constructing phylogenies of more distantly related taxa.
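The saturation levels quoted above follow directly from Eq. (1) and the third-position base frequencies given in the text. A minimal sketch of the arithmetic, with a hypothetical function name:

```python
def max_two_state_difference(f_x: float, f_y: float) -> float:
    """Maximum expected difference between two sequences for a two-state site.

    Computed as 1 minus the probability of chance identity, after
    rescaling the two base frequencies so that they sum to one.
    """
    total = f_x + f_y
    p_x = f_x / total
    p_y = f_y / total
    return 1 - p_x ** 2 - p_y ** 2


# Third-position base frequencies reported in the text.
FREQ = {"A": 0.32, "G": 0.05, "C": 0.38, "T": 0.26}

purine_max = max_two_state_difference(FREQ["A"], FREQ["G"])
pyrimidine_max = max_two_state_difference(FREQ["C"], FREQ["T"])
equal_max = max_two_state_difference(0.25, 0.25)

print(round(purine_max, 3), round(pyrimidine_max, 3), equal_max)
```

The purine value is close to 0.23 and the pyrimidine value close to 0.48, whereas equal frequencies give exactly 0.5.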
FIGURE 2 The accumulation of transition differences at the third position of codons in the ND2 gene. The pairwise differences among 56 sequences representing 45 species of cichlid fish are plotted. The x axis is the observed proportion of third position differences corrected for multiple hits according to a Poisson model [x = -0.5 ln(1 - 2(proportion of differences))]. [Scatter plot; x axis: corrected 3rd position transversions.]
B. Changes in First and Second Positions and Amino Acid Substitution

At the first and second position of codons, selection dominates the substitution process. This is apparent from the rate of transition substitution, which is 6- and 15-fold slower at first and second positions, respectively, than the rate at third positions (Fig. 4). Because there is no reason to suspect a slower mutation rate at these sites, the difference must arise because selection prevents fixation of most mutations. Selection also constrains the maximum amount of difference that is observed between two sequences. Second positions plateau at approximately 3% transition difference, while first positions plateau at about 8%. The comparable value at third positions is 25%.

Selective constraint has a strong effect on base composition, which differs among the three codon positions. First positions are relatively rich in GC because of the high leucine and alanine content of the ND2 protein. Second positions show a high proportion of T and C (37.9 and 30.4%, respectively), probably because hydrophobic amino acids required for this membrane-spanning protein are encoded by either C or T at the second position (Naylor et al., 1995).

Probably the most important characteristic of selective constraint is that it varies from site to site along the molecule according to the structural function of the encoded protein. Some positions in the protein can accept substitution relatively easily, whereas selection acts to prevent substitution at others. In ND2, this pattern of constraint can be visualized by plotting the proportion of variable sites along the molecule (Fig. 5). For many mitochondrially encoded proteins, the membrane-spanning helices of the protein experience high levels of substitution, whereas the turns between these helices appear relatively conserved (e.g., Irwin et al., 1991). The structure of ND2 is less well known, but it appears that regions of the protein outside the membrane may be constrained by selection for appropriate contacts with other proteins.

If selection did not alter substitutional probabilities among sites we might expect to model the occurrence of multiple hits using a Poisson distribution. In fact, real molecules rarely fit a Poisson. Table I shows the number of changes per site estimated by parsimony over a tree for 10 closely related cichlids. The 72 inferred substitutions are distributed among just 53 of the 348 positions in the sequence. These data do not fit a model in which substitutions occur randomly along the sequence. Rather, the significant deviation from Poisson expectation demonstrates that the substitutions are clustered at a relatively small number of sites. It appears that selection is preventing substitutions at a large fraction of sites. Because the magnitude of selective constraint is likely to be a continuously distributed variable, it is most appropriate to model this variation as a gamma distribution. The gamma distribution uses an additional parameter to quantify the variance in rate among sites. Most real sequences can be modeled rather well with this distribution.

FIGURE 3 Differences in the accumulation of transition differences involving purines (●, AG) and pyrimidines (○, TC). Because the frequencies of A and G are very unequal, saturation for sequence difference occurs earlier, and at a lower value, than for C and T. [Scatter plot; x axis: corrected Tv.]

FIGURE 4 Accumulation of sequence difference at the first and second codon positions. The rate of accumulation of transition substitutions is 6- and 15-fold slower, respectively, than at the third positions. [Four scatter plots; x axis of each panel: 3rd position transversions.]

FIGURE 5 Proportion of variable sites in each segment of the ND2 protein. An index of probable helical structure is presented above the graph. Variable sites are clustered in the membrane-spanning helices of the protein. [Bar graph; x axis: position in ND2.]
Fitch and Markowitz (1970) coined the term covarion to describe the set of "concomitantly variable codon positions." In their view, only a small proportion of the codons for a protein can experience an amino acid substitution at any given instant (the covarion). Substitution is thought to slowly alter the selective constraints so that the sites making up the covarion change over time (Fitch, 1971). The data in Table I can be made to fit a Poisson model, if it is assumed that the covarion consists of only 100 of the amino acid sites. Selection may be acting to prevent substitutions at the other 248 sites in the molecule. Miyamoto and Fitch (1995) demonstrate that the covarion model is more biologically correct than models based on the gamma distribution. It remains to be seen whether the difference between the covarion and the gamma models is important to phylogenetic analysis and whether the covarion model can be mathematically formulated for convenient use.

TABLE I Estimated Distribution of the Number of Amino Acid Substitutions per Site Estimated over a Parsimony Tree of 10 East African Cichlid Taxa (a, b)

                                   Number of changes per site over tree
                                   0           1           2+
Assuming 348 variable sites
  Observed                         295         37          16
  Poisson expectation              282.96      58.54       6.48
  χ² = 22.32, p << 0.001
Assuming 100 variable sites
  Observed                         47          37          16
  Poisson expectation              48.68       35.05       16.28
  χ² = 0.1713, p > 0.9

(a) The 10 species and their assumed relationships are (((((Rhamphochromis sp., (Lethrinops aurita, Pseudotropheus zebra)), Gnathochromis pfefferi), (((Callochromis macrops, Opthalmotilapia ventralis), Xenotilapia flavipinnus), Perissodus microlepis)), Limnochromis auritus), and Paracyprichromis brieni).
(b) The total length of the tree was 72 substitutions, including three sites that experienced at least 3 substitutions. Data fit a Poisson distribution of substitutions among sites, if it is assumed that only 100 of the 348 sites can accept a substitution.
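The Poisson expectations and test statistics in Table I can be approximately reproduced from the observed counts and the tree length of 72 substitutions. The sketch below assumes a simple goodness-of-fit calculation over the three classes (0, 1, and 2 or more changes per site); the function names are hypothetical, and small differences from the tabulated values are expected from rounding.

```python
import math


def poisson_expected(n_sites: int, n_substitutions: int) -> list:
    """Expected numbers of sites with 0, 1, and 2+ changes under a Poisson model."""
    mean = n_substitutions / n_sites
    p0 = math.exp(-mean)
    p1 = mean * p0
    p2_plus = 1 - p0 - p1  # everything not in the first two classes
    return [n_sites * p0, n_sites * p1, n_sites * p2_plus]


def chi_square(observed: list, expected: list) -> float:
    """Pearson goodness-of-fit statistic summed over classes."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# All 348 sites free to vary: 295 + 37 + 16 = 348 sites.
expected_348 = poisson_expected(348, 72)
chi_348 = chi_square([295, 37, 16], expected_348)

# Only 100 sites free to vary (the assumed covarion): 47 + 37 + 16 = 100 sites.
expected_100 = poisson_expected(100, 72)
chi_100 = chi_square([47, 37, 16], expected_100)

print([round(e, 2) for e in expected_348], round(chi_348, 2))
print([round(e, 2) for e in expected_100], round(chi_100, 4))
```

With 348 variable sites there are far too many unchanged sites and too many sites with multiple changes for a Poisson process; restricting the calculation to 100 variable sites removes the discrepancy.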
IV. Implications for Phylogenetic Reconstruction

A. Choosing a Model

There are two important decisions to be made when attempting to recover phylogenetic information from an analysis of molecular sequences. First, a substitution model must be chosen which accurately describes the evolution of the molecule under study, including whether rates vary among sites. Second, a particular tree-building algorithm must be selected. Although much has been written about the accuracy of various tree-construction algorithms, the choice of a substitution model probably has a far greater impact on the phylogenetic conclusions. Since the pattern of evolution varies among genes (e.g., nuclear versus mitochondrial), it is important to adjust the substitution model for each analysis. Some tree-building algorithms allow this modeling more easily than others because the assumptions are more explicit or the model more easily adjusted.

In general, it is easier to adjust the evolutionary model using methods which first calculate a distance statistic. Probably the best is the gamma distance developed by Tamura and Nei (1993), which incorporates transition/transversion bias, compositional inequalities, and variance in rate among sites. Once accurate distances are calculated, almost all tree algorithms will yield the correct topology (Huelsenbeck and Hillis, 1993). The neighbor-joining algorithm (Saitou and Nei, 1987), for example, can construct trees from distance matrices calculated according to many different models.

There are several advantages to focusing on the development of a distance matrix prior to initiating tree construction. First, calculation of the distance matrix focuses attention on the model of sequence evolution to be used. Second, it is usually more efficient to estimate topologies from a distance matrix than by searching for minimum evolution trees among a universe of possible topologies. Third, it is easier to calculate standard errors for each divergence than to evaluate the support for nodes using a bootstrap approach. This is particularly true for sequences that are near saturation.
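A distance-first analysis of the kind described in this section starts from the observed differences between each pair of aligned sequences. As a minimal sketch (the function names and the toy sequences are illustrative assumptions, not part of the original analysis), the following Python code counts the proportions of purine transitions, pyrimidine transitions, and transversions that a Tamura-Nei style distance takes as input, optionally restricted to a single codon position:

```python
PURINES = frozenset("AG")
PYRIMIDINES = frozenset("CT")


def observed_differences(seq1, seq2):
    """Return (P1, P2, Q) for two aligned sequences.

    P1: proportion of sites differing by a purine transition (A<->G)
    P2: proportion of sites differing by a pyrimidine transition (C<->T)
    Q:  proportion of sites differing by a transversion
    Sites containing gaps or ambiguity codes are skipped.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    n = p1 = p2 = q = 0
    for x, y in zip(seq1.upper(), seq2.upper()):
        if x not in "ACGT" or y not in "ACGT":
            continue
        n += 1
        if x == y:
            continue
        if x in PURINES and y in PURINES:
            p1 += 1
        elif x in PYRIMIDINES and y in PYRIMIDINES:
            p2 += 1
        else:
            q += 1
    if n == 0:
        raise ValueError("no comparable sites")
    return p1 / n, p2 / n, q / n


def codon_position(seq, position):
    """Extract every third base; position is 1, 2, or 3 (frame starts at base 1)."""
    return seq[position - 1::3]


# Toy example (hypothetical sequences): all sites, then third positions only.
seq_a = "ATGGCAACC"
seq_b = "ATGGCGATC"
print(observed_differences(seq_a, seq_b))
print(observed_differences(codon_position(seq_a, 3), codon_position(seq_b, 3)))
```

Keeping the three classes of difference separate, and separating codon positions, is what allows the substitution model to be adjusted before any tree is built.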
B. Power Analysis for mtDNA Phylogenetics

Because of the heterogeneous nature of evolution along a DNA sequence, sites differ in the amount of information they carry about a particular phylogenetic relationship. It is important to determine which sites and classes of change give the greatest signal/noise ratio for testing a particular phylogenetic hypothesis. Characters must be chosen to match the time scale of the divergences being studied. An analysis using third position transitions to evaluate basal (60-80 MYBP) relationships in the Cichlidae would be fundamentally flawed, since these substitutions are completely saturated after about 10 MY of divergence (Fig. 2). Neither would it be appropriate to focus exclusively on third position transversions to recover relationships among close relatives because of the large amount of information that would be lost by not including transitions.

We wish to know which characters from the ND2 sequence give the greatest power for resolving relationships among cichlids at various divergences. First, we must specify the model for the evolution of bases in the sequence. Because of the importance of both base composition and variance in rate among sites, the Tamura-Nei gamma-distance model is the most appropriate for these data. This model describes how sequences diverge through time by calculating the expected proportion of transitional differences for purines ($P_1$) and pyrimidines ($P_2$) and the expected proportion of transversional differences ($Q$). The mean values of these expected differences are used, assuming that rates vary among sites according to a gamma distribution [Tamura and Nei (1993) Eqs. (12-14)]:

$$P_1 = \frac{2 g_A g_G}{g_R} \left\{ g_R + g_Y \left[ \frac{a}{a + 2\beta t} \right]^{a} - \left[ \frac{a}{a + 2(g_R \alpha_1 + g_Y \beta) t} \right]^{a} \right\} \tag{2}$$

$$P_2 = \frac{2 g_T g_C}{g_Y} \left\{ g_Y + g_R \left[ \frac{a}{a + 2\beta t} \right]^{a} - \left[ \frac{a}{a + 2(g_Y \alpha_2 + g_R \beta) t} \right]^{a} \right\} \tag{3}$$

$$Q = 2 g_R g_Y \left\{ 1 - \left[ \frac{a}{a + 2\beta t} \right]^{a} \right\} \tag{4}$$

where $g_A$, $g_G$, $g_C$, and $g_T$ are the nucleotide frequencies ($g_R = g_A + g_G$ and $g_Y = g_T + g_C$ are the purine and pyrimidine frequencies), $a$ is the gamma parameter, $\alpha_1$ and $\alpha_2$ are the rates for transitional changes for purines or pyrimidines, and $\beta$ is the rate for transversional change. These equations give the time-dependent differences between two sequences. We use the equations for $P$ and $Q$ rather than $s$ and $v$ as they clearly represent data regarding the saturation of sequence differences resulting from multiple substitutions.

We consider the power of the first, second, or third position transitions or transversions to identify a 1 MY period of shared ancestry between sequences B and C, after their divergence from sequence A (see Fig. 6), at various times in the past. This is done by first calculating the number of sequence differences accumulating in that 1 MY interval, at various times in the past.

FIGURE 6 Hypothetical phylogeny used in the power analysis. Taxa B and C share a period of common ancestry (1, 2, 5, or 10 MY) after their divergence from taxon A. Subsequent analyses examine the power of different classes of mtDNA character to resolve this relationship after different periods of terminal divergence (x).

In order to calculate the number of differences, $\Delta P$ or $\Delta Q$, which accumulate within some time interval we need to calculate the derivatives of the previously described equations:

$$\frac{dP_1}{dt} = \frac{4 g_A g_G}{g_R} \left\{ (g_R \alpha_1 + g_Y \beta) \left[ \frac{a}{a + 2(g_R \alpha_1 + g_Y \beta) t} \right]^{a+1} - g_Y \beta \left[ \frac{a}{a + 2\beta t} \right]^{a+1} \right\} \tag{5}$$

$$\frac{dP_2}{dt} = \frac{4 g_T g_C}{g_Y} \left\{ (g_Y \alpha_2 + g_R \beta) \left[ \frac{a}{a + 2(g_Y \alpha_2 + g_R \beta) t} \right]^{a+1} - g_R \beta \left[ \frac{a}{a + 2\beta t} \right]^{a+1} \right\} \tag{6}$$

and

$$\frac{dQ}{dt} = 4 g_R g_Y \beta \left[ \frac{a}{a + 2\beta t} \right]^{a+1} \tag{7}$$

Based on these derivatives, we can calculate the number of transitional or transversional differences accumulating during a time interval, $\Delta t$, from

$$\Delta P = \left( \frac{dP_1}{dt} + \frac{dP_2}{dt} \right) \Delta t \tag{8}$$

$$\Delta Q = \frac{dQ}{dt} \Delta t \tag{9}$$

Typical parameters for vertebrate evolution can be obtained from the cichlid ND2 sequences as follows. Nucleotide frequencies are determined from the sequence data. If we assume $\alpha_1 = \alpha_2 = \alpha$, we can solve for $\alpha$ and $\beta$ using Eq. (5) plus Eq. (6) and Eq. (7) to get the following equations:

$$\alpha = \frac{\dfrac{dP}{dt}(t = 0)}{4 (g_A g_G + g_T g_C)}$$

$$\beta = \frac{\dfrac{dQ}{dt}(t = 0)}{4 g_R g_Y}$$

Using Figs. 2 through 4, we can estimate $dP/dt$ and $dQ/dt$ at time $t = 0$. The parameters determined from the ND2 data are given in Table II. Gamma values are taken from an analysis of mammalian cytochrome b genes (Irwin et al., 1991).

TABLE II Parameters for the Estimation of Expected Sequence Divergence and Associated Errors^a

Parameter           First position   Second position   Third position
freq A, g_A         0.295            0.160             0.318
freq G, g_G         0.196            0.113             0.048
freq C, g_C         0.331            0.354             0.377
freq T, g_T         0.178            0.373             0.257
gamma, a            0.4              0.25              5
dP/dt (t = 0)       1.7              0.7               10
dQ/dt (t = 0)       0.5              0.2               1
alpha               3.64             1.17              22.3
beta                0.50             0.25              1.08

^a In accordance with the Tamura-Nei gamma distance model discussed in the text.

Figure 7 plots the number of sequence differences accumulating in the 1 MY interval of interest at various times in the past. We have calculated the average value of $\Delta P$ or $\Delta Q$ over the 1 MY time interval. Clearly, third position transitions contribute the greatest number of changes in the first few million years, more than all the other characters combined. With time and the effects of saturation, the accumulation of third position transition differences decreases. After only 5 MY, the combination of changes at first and second positions with third position transversions contributes a greater number of differences in a 1 MY interval than do third position transitions.

FIGURE 7 The number of sequence differences accumulating in a 1 MY interval at various times in the past and observed at time zero. Third position transitions contribute the greatest number of differences over the recent past, but the sum of transitions and transversions at first and second positions soon dominates. [Plotted series: Total, 1ΔP, 1ΔQ, 2ΔP, 2ΔQ, 3ΔP, 3ΔQ; x axis: Time (MYA).]

This plot represents only the signal, however. There will also be variability in the differences between the sequences resulting from the variance contributed by the difference estimates from the long terminal branches of the tree. This variability will contribute noise which, if large enough, will obscure the signal we hope to observe. The noise is determined from the standard deviation, which is simply related to the variance. The binomial variance and standard deviations for $P$ and $Q$ are given by

$$\mathrm{var}(P) = P(1 - P), \qquad \mathrm{var}(Q) = Q(1 - Q)$$

$$\sigma(P) = \sqrt{nP(1 - P)}, \qquad \sigma(Q) = \sqrt{nQ(1 - Q)}$$

Figure 8 plots both the number of changes contributed in a 1 MY interval at various times in the past and the standard deviation associated with differences accumulated on the terminal branches. The crossing points of the curves give an indication of when the phylogenetic signal uniting taxa B and C will be hidden by the variance associated with the long terminal branches.

FIGURE 8 The number of differences attributable to a 1 MY period of common ancestry compared with the standard error associated with the changes on the terminal branches. The noise accumulated on the terminal branches will swamp the signal arising from shared ancestry after only 2.5 MY. Plotted is 1.96 times the standard error for third position transition differences. [Plotted series: ΔP and σ(P); x axis: Time (MYA).]

These crossing points have been calculated for various periods of common ancestry (1, 2, 5, and 10 MY) for data sets of various sizes (102 codons representing typical cytochrome b fragment data, 348 codons representing the cichlid ND2 data set, 1000 codons representing approximately three mitochondrial genes, and 3754 codons representing the complete coding regions for human mitochondrial DNA) as a function of the terminal branch length. Figure 9 summarizes the calculations in a format useful for planning phylogenetic studies. To use the figure, first determine the approximate time of divergence for the taxa under study (either through fossil evidence or by estimating third position transversion difference from a pilot study). Next, identify the curve associated with the desired degree of resolution (1, 2, 5, or 10 MY between divergences). Finally, read the number of codons which must be sequenced from the y-axis.
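The signal and noise calculation described in this section can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' spreadsheet: the class and function names are hypothetical; the slopes in Table II are read as percent per MY and converted to proportions; $\alpha_1 = \alpha_2$ is assumed as in the text; and the signal is approximated by the slope at time t multiplied by the interval, rather than by averaging over the interval.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass(frozen=True)
class SiteClass:
    """Parameters for one codon position (values taken from Table II)."""
    gA: float
    gG: float
    gC: float
    gT: float
    a: float       # gamma parameter
    dPdt0: float   # initial slope of transition differences (proportion per MY)
    dQdt0: float   # initial slope of transversion differences (proportion per MY)

    @property
    def gR(self):
        return self.gA + self.gG

    @property
    def gY(self):
        return self.gC + self.gT

    def rates(self):
        """alpha and beta from the slopes at t = 0, assuming alpha1 == alpha2."""
        alpha = self.dPdt0 / (4 * (self.gA * self.gG + self.gT * self.gC))
        beta = self.dQdt0 / (4 * self.gR * self.gY)
        return alpha, beta


def _terms(s, t, power):
    """Shared bracketed terms [a / (a + 2*rate*t)]**power for the three rates."""
    alpha, beta = s.rates()
    k1 = s.gR * alpha + s.gY * beta
    k2 = s.gY * alpha + s.gR * beta

    def f(rate):
        return (s.a / (s.a + 2 * rate * t)) ** power

    return beta, k1, k2, f(beta), f(k1), f(k2)


def expected_P_Q(s, t):
    """Expected transition (P1 + P2) and transversion (Q) proportions, Eqs. (2)-(4)."""
    _, _, _, fb, f1, f2 = _terms(s, t, s.a)
    p1 = 2 * s.gA * s.gG / s.gR * (s.gR + s.gY * fb - f1)
    p2 = 2 * s.gT * s.gC / s.gY * (s.gY + s.gR * fb - f2)
    q = 2 * s.gR * s.gY * (1 - fb)
    return p1 + p2, q


def slopes(s, t):
    """dP/dt and dQ/dt at time t, Eqs. (5)-(7)."""
    beta, k1, k2, fb, f1, f2 = _terms(s, t, s.a + 1)
    dp1 = 4 * s.gA * s.gG / s.gR * (k1 * f1 - s.gY * beta * fb)
    dp2 = 4 * s.gT * s.gC / s.gY * (k2 * f2 - s.gR * beta * fb)
    dq = 4 * s.gR * s.gY * beta * fb
    return dp1 + dp2, dq


def signal_and_noise(s, t, n_sites, interval=1.0):
    """Signal: transition differences gained in `interval` MY (Eq. 8, slope approximation).
    Noise: binomial SD of the transition differences accumulated over t MY."""
    p, _ = expected_P_Q(s, t)
    dp, _ = slopes(s, t)
    return n_sites * dp * interval, sqrt(n_sites * p * (1 - p))


# Third codon positions of cichlid ND2 (Table II), slopes converted to proportions.
THIRD = SiteClass(gA=0.318, gG=0.048, gC=0.377, gT=0.257, a=5.0, dPdt0=0.10, dQdt0=0.01)

alpha, beta = THIRD.rates()
print(round(100 * alpha, 1), round(100 * beta, 2))  # reproduces the Table II entries 22.3 and 1.08
print(signal_and_noise(THIRD, t=2.5, n_sites=348))
```

Recovering the tabulated alpha and beta from the tabulated frequencies and initial slopes is a useful consistency check on the reconstructed equations; the final line compares signal with one standard deviation of noise for the 348 third positions of ND2 at a terminal divergence of 2.5 MY.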
FIGURE 9 Power analysis for mitochondrial sequence data. The situation considered is shown in Fig. 6. The number of codons required to detect 1, 2, 5, or 10 MY periods of common ancestry, at various times in the past. The base composition and rates of evolution of the cichlid ND2 gene (Table II) are assumed. The curves represent the number of codons required so that 1 SD of the estimate of terminal branch length is equal to the mean number of differences arising along the internal branch of the tree. Data are shown for transitions (ΔP) and transversions (ΔQ) in first (1), second (2), and third (3) positions. The x axis assumes a transversion divergence of 1% per million years. A spreadsheet is available from the authors for calculating crossing values for taxa with radically different base compositions or rate matrices. [Panels plot number of codons (y axis) against terminal branch length in MY (x axis).]
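The crossing criterion stated in the caption of Fig. 9 can be turned into a rough planning formula. If n sites are compared, the signal is approximately n ΔQ and one standard deviation of the noise is sqrt(n Q(1 - Q)), so the two are equal when n = Q(1 - Q) / ΔQ². The Python sketch below applies this to third position transversions only. It is a simplified illustration under stated assumptions, not the authors' spreadsheet: the function name is hypothetical, one third-position site per codon is assumed, the rate is read as a proportion per MY (1.08% per MY from Table II), and ΔQ is approximated by the slope at the terminal branch length multiplied by the period of shared ancestry.

```python
def codons_required(t_terminal, shared, g_r, g_y, a, beta, z=1.0):
    """Codons needed so that the transversion signal equals z SD of the noise.

    t_terminal: terminal branch length (MY)
    shared:     period of common ancestry to be detected (MY)
    g_r, g_y:   purine and pyrimidine frequencies at the sites used
    a:          gamma parameter
    beta:       transversion rate (proportion per MY)
    """
    ratio = a / (a + 2.0 * beta * t_terminal)
    q = 2.0 * g_r * g_y * (1.0 - ratio ** a)                         # Eq. (4)
    delta_q = 4.0 * g_r * g_y * beta * ratio ** (a + 1.0) * shared   # Eqs. (7) and (9)
    # n * delta_q = z * sqrt(n * q * (1 - q))  =>  n = z^2 q (1 - q) / delta_q^2
    return z * z * q * (1.0 - q) / delta_q ** 2


# Third positions of cichlid ND2 (Table II): g_R = 0.366, g_Y = 0.634, a = 5.
for t in (5, 10, 20, 40):
    for shared in (1, 2, 5, 10):
        n = codons_required(t, shared, 0.366, 0.634, 5.0, 0.0108)
        print(f"terminal {t:>2} MY, shared {shared:>2} MY: about {n:.0f} codons")
```

Because the required number of codons scales with the inverse square of ΔQ, halving the period of shared ancestry to be detected roughly quadruples the sequencing effort, which is one way to read the steep slopes of the curves.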
Several findings can be gleaned from these graphs. First, it is quite surprising that, even with the complete ND2 sequence, none of the characters have power to resolve a 1 MY difference beyond about 2.5 MY (only transitions at third positions are effective over this period). More slowly accumulating differences (e.g., changes at first and second positions), although not saturated (Fig. 4), are not effective in resolving the relationship because of the large variance associated with the small number of substitutions. Second, the greatest statistical power, even for deeply divergent lineages, is found in the third position transversion data. This is because third position transversions accumulate relatively rapidly and do so in a nearly Poisson fashion. The slower rates, and greater variation in rate among sites for other characters, sharply reduce their power. The third surprising finding is that mtDNA data have the potential to resolve rather deep divergences. Complete mitochondrial coding sequences should be able to resolve a 10% difference in divergence times, even 75 MY in the past. However, 100 codons give only coarse resolution for very recent events. We expect that this figure will be accurate for most vertebrate mtDNAs. Our preliminary examination of other taxa suggests that base composition is an important determinant of the power for resolving short time periods of shared ancestry. However, it is clear that a variation in rates among sites is the single most important factor affecting the performance of a given sequence in phylogenetic analysis. Differences in the assumed gamma parameter can have a dramatic effect on the accuracy of distance estimation.
V. Conclusions

A. Choice of mtDNA Sequence Characters

It is well known that third position transitions, although useful for recent divergences, saturate rapidly. It is somewhat surprising that transversions at the third position outperform changes (both transitions and transversions) at the first and second codon positions. The advantage seems to arise because third position transversions approximate a Poisson distribution, whereas strong selection at first and second positions creates a large variance in substitution probability among sites.
B. Power Analysis

Poor resolution of branching order deep in the tree is a common result in phylogenetic analyses. Investigators frequently attribute this result to rapid radiation of the taxa in question. The alternative, that data are simply insufficient to resolve even moderately spaced speciation events, is rarely considered objectively. The graphs in Fig. 9 are a good tool for gauging the power of a particular data set to resolve closely spaced bifurcations. The steep slopes of the curves suggest that many published studies may have failed to gather sufficient data to detect even a 10% difference in divergence times.
C. Adequacy of Mutational Models

Selection dominates the substitution process at almost all sites in the mitochondrial genome. It is therefore surprising that the models of sequence evolution most often used in phylogenetic analysis still focus on the mutation process. Use of mutation-based models is probably not appropriate for anything but third position transitions (Xia et al., 1996). Approaches that explicitly model the selection filter should be pursued, as they may allow modeling of sequence divergence further back in time.
D. Reality of Rate Variation

Suggestions have been made that the rate of mitochondrial DNA evolution varies among lineages (e.g., Martin et al., 1992; Bermingham et al., 1997). Attention has begun to shift toward identifying physiological correlates which might account for this rate variation (Martin and Palumbi, 1993; Rand, 1994). While rate variation may well exist, the search for the causes of such variation may be premature. The rate of substitution is the result of a delicate interplay of forces, including mutation, selective constraints arising at both molecular and organismal levels, and population-level events. We should be careful not to jump from poorly supported correlations to hypotheses of causation.

It is hoped that this chapter will promote the development of a consistent set of evolutionary rate estimates, incorporating corrections for both unequal base composition and patterns of selective constraint. When used in relative-rate tests, or well-substantiated absolute calibrations of evolution rate, these estimates may lead to new hypotheses on both the generality of mitochondrial clocks and the forces which regulate their ticking. A further benefit is that the refinement of substitution models will improve our ability to reconstruct phylogenetic relationships among species.
Acknowledgments

Many thanks to those who helped focus our thinking, including N. Perna and several classes of Zoology 715. This work was supported in part by NSF Grant BSR-9007015.
References

Bermingham, E., McCafferty, S. S., and Martin, A. P. 1997. Fish biogeography and molecular clocks: Perspectives from the Panamanian Isthmus. In "Molecular Systematics of Fishes" (T. D. Kocher and C. A. Stepien, eds.). Academic Press, San Diego.
Bliss, C. I., and Fisher, R. A. 1953. Fitting the negative binomial distribution to biological data. Biometrics 9:176-200.
Felsenstein, J. 1986. DNAML-DNA Maximum Likelihood Program. PHYLIP Manual, version 3.2. Department of Genetics, University of Washington, Seattle.
Fitch, W. M. 1971b. The nonidentity of invariable positions in the cytochromes c of different species. Biochem. Genet. 5:231-241.
Fitch, W. M., and Markowitz, E. 1970. An improved method for determining codon variability in a gene and its application to the rate of fixation of mutations in evolution. Biochem. Genet. 4:579-593.
Huelsenbeck, J. P., and Hillis, D. M. 1993. Success of phylogenetic methods in the four-taxon case. Syst. Biol. 42:247-264.
Irwin, D. M., Kocher, T. D., and Wilson, A. C. 1991. Evolution of the cytochrome b gene of mammals. J. Mol. Evol. 32:128-144.
Jukes, T. H., and Cantor, C. R. 1969. Evolution of protein molecules. In "Mammalian Protein Metabolism" (H. N. Munro, ed.), Vol. III, pp. 21-132. Academic Press, New York.
Kimura, M. 1980. A simple method for estimating evolutionary rate of base substitution through comparative studies of nucleotide sequences. J. Mol. Evol. 16:111-120.
Kocher, T. D., Conroy, J. A., McKaye, K. R., Stauffer, J. R., and Lockwood, S. F. 1995. Evolution of the ND2 gene in East African cichlids. Mol. Phylogenet. Evol. 4:420-432.
Kocher, T. D., and Wilson, A. C. 1991. Sequence evolution of mitochondrial DNA in humans and chimpanzees: Control region and a protein-coding region. In "Evolution of Life: Fossils, Molecules, and Culture" (S. Osawa and T. Honjo, eds.), pp. 391-413. Springer-Verlag, Tokyo.
Martin, A. P., Naylor, G. J. P., and Palumbi, S. R. 1992. Rates of mitochondrial DNA evolution in sharks are slow compared with mammals. Nature 357:153-155.
Martin, A. P., and Palumbi, S. R. 1993. Body size, metabolic rate, generation time, and the molecular clock. Proc. Natl. Acad. Sci. USA 90:4087-4091.
Miyamoto, M. M., and Fitch, W. M. 1995. Testing the covarion hypothesis of molecular evolution. Mol. Biol. Evol. 12:503-513.
Naylor, G. J., Collins, T. M., and Brown, W. M. 1995. Hydrophobicity and phylogeny. Nature 373:565-566.
Perna, N. T. 1996. Patterns of base composition within and between animal mitochondrial genomes. Ph.D. Thesis, University of New Hampshire.
Rand, D. M. 1994. Thermal habit, metabolic rate and the evolution of mitochondrial DNA. Trends Ecol. Evol. 9:125-131.
Rzhetsky, A., and Nei, M. 1995. Tests of applicability of several substitution models for DNA sequence data. Mol. Biol. Evol. 12:131-151.
Saitou, N., and Nei, M. 1987. The neighbor-joining method: A new method for reconstructing phylogenetic trees. Mol. Biol. Evol. 4:406-425.
Tamura, K., and Nei, M. 1993. Estimation of the number of nucleotide substitutions in the control region of mitochondrial DNA in humans and chimpanzees. Mol. Biol. Evol. 10:512-526.
Uzzell, T., and Corbin, K. W. 1971. Fitting discrete probability distributions to evolutionary events. Science 172:1089-1096.
Wilson, A. C., Cann, R. L., Carr, S. M., George, M., Jr., Gyllensten, U. B., Helm-Bychowski, K. M., Higuchi, R. G., Palumbi, S. R., Prager, E. M., Sage, R. D., and Stoneking, M. 1985. Mitochondrial DNA and two perspectives on evolutionary genetics. Biol. J. Linn. Soc. 26:375-400.
Xia, X., Hafner, M. S., and Sudman, P. D. 1996. On transition bias in mitochondrial genes of pocket gophers. J. Mol. Evol. 43:32-40.
CHAPTER 3

Molecular Systematics of a Rapidly Evolving Species Flock: The mbuna of Lake Malawi and the Search for Phylogenetic Signal

IRV KORNFIELD and ALEX PARKER
Department of Zoology and School of Marine Sciences
University of Maine
Orono, Maine 04469

I. Introduction
One of the most difficult problems in phylogenetic analysis is the reconstruction of rapid radiations involving large numbers of taxa. If sequential speciation events occur quickly, even assuming that divergence occurs solely via bifurcation (versus, e.g., hybridization), only minimal signals or synapomorphic traces will be present. Given sufficient time, of course, independent lineages so created will become recognizable due to the acquisition of divergent character states. However, although derived diagnostic characters may be of taxonomic utility, as autapomorphies they are phylogenetically uninformative. Most difficult are those systems in which rapid radiation has occurred very recently; in such cases, even autapomorphies may be absent. Reconstruction of the phylogenetic history of several of the species flocks (Mayr, 1984) of Old World haplochromine cichlids presents precisely such a challenge.

This chapter considers the mbuna¹ cichlid fauna of Lake Malawi, an ecologically diverse assemblage of approximately 300 species, most still to be formally described. Geological studies (Scholz and Rosendahl, 1988; Gasse et al., 1989; Owen et al., 1990) and comparative molecular analyses (Kornfield, 1978; see later) strongly suggest that the mbuna are of extremely recent vintage. This chapter briefly describes the natural history of this endemic fauna, reviews previous molecular studies designed to understand relationships within the group, considers microsatellite markers, and evaluates their utility for exploration of mbuna interrelationships.

Initial recognition of the special nature of the mbuna was made by Regan (1921) and Trewavas (1935), who defined numerous genera based on earlier collections made by Christy. The mbuna are a diverse assemblage of lithophilous, algae-grazing fishes, currently allocated among 13 genera (Moran and Kornfield, 1993). A general overview of the unique trophic attributes of some of these species was provided by Fryer (1959a,b) and later expanded by Fryer and Iles (1972) to include the first integrated evolutionary treatment of the fauna. Numerous workers have subsequently described new taxa, occasionally discussing variation within particular lineages (e.g., Marsh et al., 1981; Lewis, 1982; Ribbink et al., 1983a; Reinthal, 1990a). General understanding of evolutionary dynamics and ecology followed the detailed field studies of Ribbink et al. (1983b) who, for the first time, described the extent of lakewide variation present in putative conspecifics. Subsequent descriptive accounts by Konings (1990) and DeMason (1993) have suggested the presence of numerous additional taxa in previously unexplored regions of the lake.

Although it is clear that they form a monophyletic group (Moran et al., 1994), few detailed phylogenetic hypotheses regarding relationships among mbuna have been articulated (Fryer and Iles, 1972; Oliver, 1984; Reinthal, 1987). The principal reasons for the absence of such hypotheses are the paucity of morphological synapomorphies (Reinthal, 1987) and the general propensity for convergent and/or parallel evolution among cichlids (Eccles and Trewavas, 1989; Lazzaro, 1991). A statistically robust phylogeny is critical to meaningful discussion of patterns of species diversification and ecological radiation in the mbuna; without such a phylogeny, all evolutionary patterns, and the mechanisms suggested to have engendered them (Liem, 1980; Dominey, 1984; Greenwood, 1984; Mayr, 1984; Turner, 1994), remain hypothetical. For example, homogeneity of coloration is taken by some authors to imply conspecificity among allopatric populations (see Lewis, 1981; Ribbink et al., 1983b); coloration, however, may be a convergent characteristic (Eccles and Trewavas, 1989; McElroy et al., 1991) or may be constrained by species packing in isolated communities (D. McElroy and I. Kornfield, unpublished observations). Similarly, many morphological correlates of trophic specialization (Reinthal, 1990a,b) may be convergent.

¹ The Malawi rockfishes are known as mbuna (Chichewa) in the territorial waters of Malawi and Mozambique, and as vindongo (Swahili) in Tanzanian waters.
II. Molecular Investigations

Reconstructing the evolutionary history of the mbuna by molecular means (Kornfield, 1991) has proved extremely difficult, due principally to the recency of their radiation. Isozyme analyses have provided compelling evidence of reproductive isolation among extremely similar forms (Kornfield, 1978; McKaye et al., 1982, 1984), but for a variety of reasons, allozymes cannot be used to reconstruct evolutionary relationships. In particular, the degree of variation present at individual loci is limited; most exhibit only a few electromorphs and no taxon-specific (diagnostic) alleles are present. Importantly, the limited number of characters provided by allozymes (as either loci or alleles) severely constrains objective statistical evaluation of these data.
III. Mitochondrial DNA and Ancestral Polymorphisms

With the advent of mitochondrial DNA (mtDNA) analysis in evolutionary studies (reviewed in Avise, 1994), it was hoped that basic problems of cichlid phylogenetics would be resolved. This hope was founded on properties of mtDNA that were expected to allow greater resolution than previously achieved in molecular systematic studies; particularly important are high mutation rates (Brown et al., 1979) and selective neutrality or near-neutrality of many nucleotide positions (see Niki et al., 1989; Rand et al., 1994). Indeed, some systematic problems, intractable using allozyme analysis, yielded solutions with mtDNA (Avise et al., 1986; Seyoum and Kornfield, 1992a). Comparative studies of relationships within the Malawi cichlid fauna by restriction (Moran et al., 1994) and sequence analysis (Meyer et al., 1990) of mtDNA provided an unambiguous definition of portions of the endemic radiation. In particular, the basic dichotomy between mbuna and most of the remaining haplochromines was supported, as were the affinities of a few well-differentiated lineages (summarized in Meyer, 1993). Additionally, previously unsuspected relationships between the mbuna and some taxa of very divergent morphology and habitat (e.g., Alticorpus and Lethrinops) were revealed (Moran et al., 1994). Despite these successes, resolution of relationships within the mbuna remained problematic. Two major mtDNA lineages (α and β), differing by 1.5% sequence divergence, were identified within the mbuna, but the affinities of the taxa involved appeared aberrant; several species were more closely related to taxa in other genera than to congeners. The solution to this problem was obtained when sample sizes were increased: the mbuna retained an ancestral mtDNA polymorphism (Moran and Kornfield, 1993).
Comprehensive mtDNA restriction analysis demonstrated that both α and β mtDNA haplotypes were present in geographically defined populations of a number of common mbuna species. Thus, the relationships among these taxa based on mtDNA (Moran et al., 1994) represent a gene tree (Pamilo and Nei, 1988) in which lineage sorting (Avise et al., 1984) is incomplete; clearly these data should not be interpreted as a species tree. The extent of this polymorphism among the mbuna has been examined using sequence analysis of the D-loop region (Parker and Kornfield, 1997). This study confirms the existence of the polymorphism and defines several divergent gene lineages based on site-specific transitions (Table I). With a few interesting exceptions (see later), most mbuna species are polymorphic.

TABLE I. Polymorphic Nucleotide Positions in the mtDNA Control Region of Six mbuna Species^a

                                      Nucleotide position^b
Taxon                                 179   247   270   342
Pseudotropheus zebra "BB" 1           T     T     C     C
P. zebra "BB" 2                       C     C     T     T
P. zebra "black dorsal"               C     C     T     T
Labeotropheus trewavasae              C     C     T     T
L. fuelleborni 1                      T     C     C     C
L. fuelleborni 2                      C     C     T     T
Melanochromis auratus                 C     C     T     T
M. parallelus 1                       C     T     C     C
M. parallelus 2                       C     C     C     C

^a From Parker and Kornfield, 1997.
^b Numbering of nucleotide positions follows Bowers et al. (1994).

It is instructive to consider what the consequences of complete lineage sorting would be for phylogenetic analysis of this fauna. Two species of mbuna possess only one of the major haplotype lineages: all studied individuals (n = 30) of Pseudotropheus zebra "black dorsal," an undescribed species endemic to the Maleri Islands, exhibit the β mtDNA haplotype (Moran and Kornfield, 1995; Parker and Kornfield, 1997), as do all individuals (n = 99) of Melanochromis auratus (Bowers et al., 1994; Parker and Kornfield, 1997). Both α and β mtDNA lineages are present in populations of Pseudotropheus tropheops "black" (Moran and Kornfield, 1993). If P. tropheops "black" were to lose the β lineage, perhaps by genetic drift (Moran and Kornfield, 1995), subsequent mtDNA comparisons would unite M. auratus and P. zebra "black dorsal" to the exclusion of P. tropheops "black," resulting in a phylogenetic hypothesis entirely at odds with available morphological and ecological data. Thus, if lineage sorting were complete in the mbuna, the reconstructed tree would provide completely misleading results. McMillan and Palumbi (1995) reached a similar conclusion regarding Pacific butterflyfishes, suggesting that the random assortment of shared ancestral lineages may blur the evolutionary history of newly formed species. Additionally, variation in rates of population growth can generate significant differences in lineage diversity among taxa (Penny et al., 1995). For mtDNA to be phylogenetically informative in such cases, lineages must independently acquire new mutations in addition to simple reduction, via genetic drift, of the number of ancestral haplotypes present. Given the recent derivation of this fauna, and presumed rates of mtDNA mutation, such differentiation is unlikely within the mbuna.
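The loss of one ancestral haplotype lineage by genetic drift, as considered above for P. tropheops "black," can be illustrated with a simple Wright-Fisher simulation of a maternally inherited marker. The sketch below is purely illustrative: the function name, population sizes, starting frequency, and random seed are hypothetical choices, not estimates for any mbuna population, and mutation, selection, and population structure are all ignored.

```python
import random


def generations_to_sorting(n_females, freq_alpha, rng, max_generations=1_000_000):
    """Wright-Fisher drift of two mtDNA lineages (alpha and beta) among n_females.

    Returns (generations elapsed, final frequency of alpha). Sorting is complete
    when the final frequency is 0.0 (alpha lost) or 1.0 (beta lost).
    """
    count = round(n_females * freq_alpha)
    generations = 0
    while 0 < count < n_females and generations < max_generations:
        p = count / n_females
        # Each female in the next generation inherits alpha with probability p.
        count = sum(rng.random() < p for _ in range(n_females))
        generations += 1
    return generations, count / n_females


rng = random.Random(1997)  # fixed seed so that repeated runs give the same draws
for n_females in (50, 200):
    times = [generations_to_sorting(n_females, 0.5, rng)[0] for _ in range(20)]
    print(n_females, "females: mean generations to sorting =", sum(times) / len(times))
```

In runs of this kind the waiting time until one lineage is lost grows with the number of breeding females, which is consistent with the argument that large, recently derived populations are expected to retain both ancestral lineages.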
IV. Alternate Molecular Approaches
If mtDNA cannot provide sufficient information to resolve relationships within the mbuna, are there other molecular systems that can be exploited? It is important, of course, that such alternate approaches provide information useful beyond simple taxonomic discrimination. At least three classes of molecular markers might potentially be so employed. Variation in randomly amplified polymorphic DNAs (RAPDs) (Williams et al., 1990) has been exploited to identify subspecies of Oreochromis niloticus by Bardakci and Skibinski (1994). These taxa cannot be differentiated by allozyme analysis, but do possess diagnostic mtDNA restriction profiles (Seyoum and Kornfield, 1992a,b). Although both RAPDs and mtDNA provide diagnostic markers for subspecies, neither data set provides unambiguous statistical support for relationships among the taxa. The extent of sequence divergence between subspecies (Seyoum and Kornfield, 1992a) is about half of that observed between α and β lineages in the mbuna; it is thus possible that RAPDs could be used to infer relationships among mbuna species. This approach is compelling from a statistical perspective because the products of a large number of random primers can be characterized. However, the extent of RAPD variation within natural populations of cichlids has not yet been examined and may be considerably greater than that seen in the inbred aquaculture strains examined to date. Further, the technical problems associated with RAPD analysis (e.g., Ellsworth et al., 1993), as well as ambiguity of locus homology between more distantly related taxa, suggest a cautious approach to the use of such data. As an alternative, DNA sequences associated with individual RAPD primers can be characterized. However, an initial examination of such data suggests that the phylogenetic signal thus obtained may be limited in very closely related taxa such as mbuna (Sultmann et al., 1995).
Extensive sequence variation has been demonstrated in mbuna at major histocompatibility complex (Mhc) loci (Klein et al., 1993; Ono et al., 1993). This extremely variable system has been useful in the population genetic analysis of mammalian taxa (Klein, 1986). Significant impediments exist, however, in applying this system to mbuna phylogenetics. As in mtDNA, some portion of Mhc variation can be thought of as a retained ancestral polymorphism; allelic variants are shared by multiple taxa and predate speciation. More importantly, because much of this variation may be maintained by selection (Hughes and Nei, 1989), polymorphism will be retained. Finally, sequence variation will be evolutionarily interpretable only if Mhc loci can be distinguished individually; previous studies have identified up to eight alleles per individual (Klein et al., 1993; Ono et al., 1993), indicating coamplification of a minimum of four loci. Assuming identification of individual loci and examination of only selectively neutral (fourfold degenerate) codon positions, rates of substitution must still have been sufficiently high to outpace speciation, if they are to allow discrimination of mbuna interrelationships. At this time it is unknown whether the Mhc system will fulfill all of the necessary criteria.

More generally, sequence variation in other portions of the nuclear genome could potentially provide information sufficient for phylogenetic analysis; mutational mechanisms and methods for analysis of DNA sequences are well understood for nuclear genes. Variation at protein coding genes would, in general, be expected to provide insufficient sensitivity for the examination of closely related species, given selective constraints on molecular function and the paucity of variation in structural gene products revealed by electrophoretic studies. Although exon sequences are thus expected to be conservative, intron sequences may potentially exhibit more informative variation. Phylogenetic examinations of intron sequences are few in number to date, but interspecific differentiation in introns may be limited among closely related taxa. For example, comparisons of intron variation among humpback whale populations, which have very distinctive mtDNAs, were noninformative (Palumbi and Baker, 1994). In Old World cichlids, sequence comparison of the second intron of the growth hormone gene enabled discrimination of the three major tilapiine genera, but provided no information relevant to relationships among congeners (I. Kornfield and R. Mulrenin, unpublished observations).
Use of anonymous single-copy DNA markers constitutes an additional approach to capturing genetic variation (Karl et al., 1992). However, observed allelic variation has been limited, even in extensively studied populations (Karl and Avise, 1993), and substantial methodological difficulties (Hare et al., 1996) may compromise their routine application. More generally, it is not clear whether introns (or exons) of other nuclear genes possess the high rates of mutation required for successful phylogenetic analysis of closely related taxa like the mbuna. Indeed, because mbuna mtDNA retains ancestral polymorphisms as described earlier, it seems unlikely that autosomal nuclear loci, with fourfold greater effective population sizes (Avise, 1994), would have undergone allelic fixation within lineages. Various classes of repetitive DNA (Charlesworth et al., 1994) have been examined in cichlids to assess
their phylogenetic potential. In tilapiines, several families of minisatellite repeats make up a considerable portion of the nuclear genome (Wright, 1989) and exhibit sequence variation both within and among lineages (Franck et al., 1992). Again, however, the phylogenetic signal appears limited to lineages that are demonstrably divergent by conventional molecular methods (Franck et al., 1994). It would therefore seem improbable that this class of markers will offer sufficient sensitivity for phylogenetic analyses within the mbuna.
V. Microsatellite Loci
More promising is variation at microsatellite loci. Such loci can exhibit extremely large numbers of alleles, differing in the number of copies of a simple nucleotide repeat unit (Queller et al., 1993). Allelic variation is thought to be generated by intraallelic polymerase slippage during DNA replication (Bruford and Wayne, 1993), but details of the exact mechanism are not well understood. The fundamental appeal of microsatellite loci for systematic studies of the mbuna is that mutation rates are extremely high, in the range of 10⁻³ to 10⁻⁴ per locus per generation (Dallas, 1992; Edwards et al., 1992). Consequently, variation in microsatellite allele frequencies is strongly influenced by mutation as well as by genetic drift, and mutation-driven variation in population allele frequencies can greatly augment that engendered by drift alone. In contrast, mutations in other classes of polymorphic nuclear markers, such as allozymes or Mhc alleles, occur by the relatively slow process of nucleotide substitution, so that interpopulation variation in allele or electromorph frequencies arises principally through genetic drift. Because insufficient time has passed for genetic drift to produce complete lineage sorting in mtDNA, and because diploid, biparentally inherited nuclear genes have a fourfold larger effective population size than haploid, maternally inherited mtDNA, the potential for informative variation in these slower-evolving markers may be limited.
Mutation of microsatellite alleles has been modeled as a stepwise process, in which an allele consisting of a given number of repeat units may change by either loss or gain of a single repeat (Valdes et al., 1993). This model implies that pairs of alleles differing by a small number of repeats share a more recent common ancestor than those which differ greatly in repeat number; this potential information on relatedness of alleles is utilized by a novel, microsatellite-specific metric of genetic distance (see later). The stepwise mutation model (SMM) also illustrates the potential for convergent evolution of alleles, as there is no obvious way to determine whether an allele of a given size is descended from an ancestor possessing a greater or lesser number of repeat units.
More recent studies, however, suggest that the SMM incompletely describes the mutational process at microsatellite loci. Using data on the human population of Sardinia, DiRienzo et al. (1994) demonstrated that observed allele frequency distributions are better explained by a two-phase model, incorporating infrequent mutational changes of two or more repeat units, than by a model including only single-step mutation. If multistep mutations are common, and microsatellite alleles are therefore often dissimilar in size to their immediate ancestors, then the infinite alleles model (IAM) may be more appropriate. A result consistent with this possibility was obtained by Estoup et al. (1995), who found that observed heterozygosities and allelic diversities in honeybee populations agreed more closely with IAM expectations than with the stepwise model. In contrast to the conclusions of DiRienzo et al. (1994), however, Estoup et al. (1995) suggested that their data deviated from stepwise expectations because most of the loci they studied consisted of complex repeats, whereas the SMM assumes that microsatellite loci are unbroken tracts of a single sequence motif.
Determination of which model better describes variation at microsatellite loci has important implications for analysis. If the IAM is superior, then use of a classical measure of genetic distance, such as Nei's distance (Nei, 1987), would be appropriate. If the SMM realistically approximates microsatellite evolution, however, the implication is that there may be more information available in microsatellite data than is embodied in simple allele frequencies.
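The practical difference between the two mutation models can be illustrated with their standard equilibrium expectations for heterozygosity, the kind of quantity that is compared with observed values in tests such as that of Estoup et al. (1995). The following Python sketch is not part of the original analysis. It assumes a single randomly mating population at mutation-drift equilibrium and uses the textbook results H = θ/(1 + θ) for the IAM and H = 1 − 1/√(1 + 2θ) for the strict single-step SMM, with θ = 4Neμ. The function names, the effective population size, and the mutation rate in the example are illustrative assumptions only.

```python
import math


def expected_heterozygosity_iam(effective_size, mutation_rate):
    """Equilibrium heterozygosity under the infinite alleles model."""
    theta = 4.0 * effective_size * mutation_rate
    return theta / (1.0 + theta)


def expected_heterozygosity_smm(effective_size, mutation_rate):
    """Equilibrium heterozygosity under the strict single-step stepwise model."""
    theta = 4.0 * effective_size * mutation_rate
    return 1.0 - 1.0 / math.sqrt(1.0 + 2.0 * theta)


if __name__ == "__main__":
    # Illustrative values only (assumed, not taken from the chapter):
    # Ne = 10,000 and a mutation rate of 1e-3 per locus per generation.
    ne, mu = 10_000, 1e-3
    print(round(expected_heterozygosity_iam(ne, mu), 4))  # 0.9756
    print(round(expected_heterozygosity_smm(ne, mu), 4))  # 0.8889
```

For the same θ, the SMM predicts lower heterozygosity than the IAM, because stepwise mutation can regenerate allele sizes that are already present in the population.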
Because closely related alleles are similar in repeat number under the SMM, it follows that the mean number of repeats at a locus should be more similar in recently isolated species or populations than in more distantly related ones. This observation motivated development of the delta-μ distance statistic (Goldstein et al., 1995), which incorporates repeat number as well as frequency of alleles. Goldstein et al. (1995) showed analytically that an increase in delta-μ is linear with time and independent of population size, as long as populations are in mutation-drift equilibrium. They also empirically evaluated the behavior of delta-μ, Nei's distance, and allele sharing in analysis of both human and primate relationships. Their results indicated that delta-μ is superior for reconstruction of more distant evolutionary patterns (e.g., human-chimpanzee-gorilla relationships), probably because it
remains linear over a greater range of divergence, but that the classical distance metrics better reconstructed more recent relationships [e.g., among human populations; see e.g., Bowcock et al. (1994)]. Regardless of the method employed, several characteristics of microsatellite loci may complicate analysis. Most importantly, because microsatellite loci appear to be bounded in size, exhibiting an upper limit in repeat number (Bowcock et al., 1994), any genetic distance metric will eventually reach an asymptote. For the human data considered by Goldstein et al. (1995), the average time to asymptotic distance was estimated to be approximately 20,000 generations; for any particular locus, the duration of linearity prior to the asymptote is proportional to the range of repeat numbers present. Additionally, Rubensztein et al. (1995) suggest that comparison of human microsatellite data to that for other primates reveals the presence of directional mutation pressure at some loci. Ellegren et al. (1995), however, have argued convincingly that this result instead reflects biases in sampling of loci. Finally, the potential presence of null (nonamplifying) alleles (Paetkau et al., 1995; Pemberton et al., 1995) may be problematic, although investigators who are aware of this possibility can take steps to minimize its influence (see later). Two studies have demonstrated the enormous utility of microsatellite variation for analysis of population biology of individual mbuna species (Kellogg et al., 1995; Parker and Kornfield, 1996). In these studies, genotypes at hypervariable microsatellite loci were used to examine the magnitude of polyandry in several species. Although allelic distributions can be used to discriminate among mbuna species (A. Parker and I. Kornfield, unpublished observations), it has not yet been experimentally demonstrated that microsatellite variation can be used for phylogenetic reconstruction in cichlid fishes. This idea is evaluated next.
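The distance measures discussed in this section can be made concrete with a short sketch. The Python code below is an illustration under stated assumptions rather than any published implementation. Allele frequencies are supplied per locus as dictionaries keyed by repeat number. Delta-μ is taken as the squared difference in mean repeat number averaged over loci, following the description of Goldstein et al. (1995). Nei's standard distance is computed without a small-sample correction. All function names and example frequencies are hypothetical.

```python
import math


def mean_repeat_number(freqs):
    """Mean repeat number at one locus; freqs maps repeat count -> frequency."""
    return sum(size * f for size, f in freqs.items())


def delta_mu_squared(loci_a, loci_b):
    """Squared difference in mean repeat number, averaged over loci."""
    diffs = [
        (mean_repeat_number(a) - mean_repeat_number(b)) ** 2
        for a, b in zip(loci_a, loci_b)
    ]
    return sum(diffs) / len(diffs)


def nei_standard_distance(loci_a, loci_b):
    """Nei's standard distance D = -ln(Jxy / sqrt(Jx * Jy)), uncorrected."""
    jx = jy = jxy = 0.0
    for a, b in zip(loci_a, loci_b):
        jx += sum(f * f for f in a.values())
        jy += sum(f * f for f in b.values())
        jxy += sum(f * b.get(size, 0.0) for size, f in a.items())
    if jxy == 0.0:
        return math.inf  # no shared alleles: distance is undefined (infinite)
    return -math.log(jxy / math.sqrt(jx * jy))


if __name__ == "__main__":
    # Hypothetical single-locus frequencies for two populations.
    pop_a = [{10: 0.5, 11: 0.5}]
    pop_b = [{12: 0.5, 13: 0.5}]
    print(delta_mu_squared(pop_a, pop_b))       # 4.0
    print(nei_standard_distance(pop_a, pop_b))  # inf (no alleles shared)
```

The example shows why the two measures can disagree: populations that share no alleles are maximally distant under Nei's measure regardless of allele size, whereas delta-μ still registers that their alleles are only two repeats apart on average.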
VI. A Test of the Phylogenetic Potential of Microsatellites
Before initiating investigations, it is critical to provide, a priori, a fair test of the ability of microsatellites to provide phylogenetic data relevant to the mbuna. Evolutionary trees can be generated using virtually any data set, so unless relationships among the taxa examined have previously been defined, any result might be viewed as a positive outcome. However, given the rapid mutation rate of this class of markers, it is improbable that their performance could be validated using comparisons among taxonomic groups that have been successfully discriminated using other molecular methods. For example, significant divergence between the mbuna and the other major haplochromine clade of Lake Malawi is recognizable with mtDNA (Moran et al., 1994); these groups would be expected to also differ in microsatellite allele frequencies. However, a simple demonstration of differences does not constitute evidence of phylogenetic information, as microsatellite primers often amplify across very wide taxonomic spectra (Schlötterer et al., 1991; Levin et al., 1995; Pépin et al., 1995; see later); the major Malawian haplochromine clades are likely far too divergent to retain signal in the form of differences in microsatellite allele frequency or composition. Similarly, comparisons between the major tilapiine lineages, or among species of Oreochromis with previously defined relationships, would probably provide diagnostic allele distributions but not phylogenetic information.
Hypothesis Formulation
A conservative test is used to assess the potential significance of microsatellite variation for phylogenetic reconstruction in the mbuna. Although no explicit phylogeny is available, a number of lineages are unambiguously distinctive. Such lineages were originally recognized on the basis of substantive morphological differences and define the endemic genera (Regan, 1921; Trewavas, 1935). Three distinctive genera which have received considerable morphological and ecological study have been selected (Trewavas, 1935; Ribbink
et al., 1983a,b; Reinthal, 1990a,b): Labeotropheus Ahl, 1926; Melanochromis Trewavas, 1935; and Pseudotropheus Regan, 1921. The major features of these genera are summarized in Table II; representative species are illustrated in Fig. 1 (see color plate). In addition to dentition and osteological characters, note the well-defined differences in color patterns among these taxa; such characters are viewed as central in defining haplochromine lineages (Eccles and Trewavas, 1989). Given that these genera of mbuna are well defined, all contained species should be more similar to congeners than to species of any other mbuna genus. This forms the basis for our test of phylogenetic information content. From each of the three genera, two species were chosen for study: Labeotropheus fuelleborni Ahl, 1926; L. trewavasae Fryer, 1956; Melanochromis auratus (Boulenger, 1899); M. parallelus Burgess and Axelrod, 1976; Pseudotropheus zebra "BB" (Boulenger, 1899), and P. zebra "black dorsal." The two species of Melanochromis belong to the same infrageneric complex (Ribbink et al., 1983b). The genus Pseudotropheus contains a number of well-defined infrageneric complexes, some of them certainly worthy of generic rank (Ribbink et al., 1983b); the species of Pseudotropheus examined here are both members of the P. zebra species complex. P. zebra "BB" is the blue-barred form collected from Thumbi Island West, whereas P. zebra "black dorsal" is an undescribed taxon endemic to the Maleri Islands in southern Lake Malawi (Ribbink et al., 1983b; Konings, 1990). The null hypothesis tested is that all six taxa are equidistant from each other and form a "star" phylogeny (Fig. 2A). This situation would obtain if microsatellite
mutation rates are either excessive or insufficient to provide phylogenetic information. Alternatively, if a phylogenetic signal is present in microsatellite data, each pair of congeneric species should cluster to the exclusion of other tested taxa (Fig. 2B). This formulation of the alternative hypothesis is consistent with maximum information content; clearly weaker alternatives could also be constructed.

TABLE II Morphological and Ecological Characteristics of Three mbuna Genera [a,b]

Character                       Pseudotropheus      Melanochromis        Labeotropheus
                                Regan, 1921         Trewavas, 1935       Ahl, 1927
Oral dentition                  Bicuspid            Bicuspid             Tricuspid
Ethmovomer
  Orientation                   Horizontal          Horizontal           Vertical
  Relative width                Medium              Small                Medium
Gut length (relative to SL)     4.5 ± 0.8           3.5 ± 0.6            5.2 ± 0.8
Mouth position                  Terminal            Terminal             Subterminal
Color pattern                   Vertical bars       Horizontal stripes   Vertical bars
Feeding posture [c]             Perpendicular       Acute                Parallel
Mean niche breadth (algae)      4.21                6.65                 6.05
Depth distribution              3-25 m              4-35 m               0-18 m
Territoriality                  Continuous          Weak/intermittent    Variable

[a] The genus Pseudotropheus as currently defined includes a variety of morphologically distinct clades; several are worthy of generic status. Characters typical of the clade are presented, including P. zebra [see Reinthal (1987) for discussion].
[b] Data compiled from Trewavas (1935), Fryer (1959b), Ribbink et al. (1983b), Reinthal (1987), I. Kornfield (personal observations).
[c] Body position relative to rock surface when grazing on epilithic algae.

FIGURE 2 Hypothetical and actual relationships among three congeneric pairs of mbuna. (A) The null hypothesis (H0): no definable relationships exist among the taxa. (B) The stringent alternative hypothesis (H1): all congeners unite in lineages distinct from other genera. (C) The relationships among six mbuna species based on neighbor-joining analysis of genetic distances inferred from two hypervariable microsatellite loci. Numbers indicate the percentage of bootstrap replicates (out of 100) in which associations were supported.

VII. Materials and Methods
Microsatellite Characterization
All specimens used were wild-caught fishes. Tissue samples of P. zebra "black dorsal" and P. zebra "BB" were collected in August 1988 and December 1990 and field frozen in liquid nitrogen (Moran and Kornfield, 1995); nuclear and mitochondrial DNAs were prepared by density gradient ultracentrifugation (Dowling et al., 1990). All other species were collected in September (L. fuelleborni, M. auratus; Chipoka) or November 1994 (L. trewavasae, M. parallelus; Nkhata Bay) and preserved in ethanol. DNA was prepared from muscle tissue of these specimens using proteinase K digestion followed by phenol/chloroform extraction and ethanol precipitation. Two microsatellite loci were examined in all individuals: UME002 and UME003 (GenBank accessions U14396 and U14397); these loci had been isolated previously from a P. zebra "gold" genomic library using conventional methods (Parker and Kornfield, 1996). These two loci are composed of complex repeats [allele 280 at locus UME002: (GT)12(93 bp)(TCTA)19; allele 195 at locus UME003: (AT)6AAA(AT)2ACA(TG)6(GCGT)13] and are hypervariable, each exhibiting greater than 23 alleles even in small population samples (see later discussion). Microsatellite genotypes for all individuals were determined from 32P-labeled polymerase chain reaction products separated on 6% polyacrylamide sequencing gels; a 35S-labeled M13 sequencing ladder was used as a size standard on all gels. Allelic compositions for all species were examined for fit to Hardy-Weinberg expectations using a modified version of BIOSYS (Swofford and Selander, 1981; P. Moran, personal communication). Tests of genotypic homogeneity were conducted using the resampling programs of Zaykin and Pudovkin (1993). Genetic distances were calculated using the delta-μ distance of Goldstein et al. (1995), allele sharing, and Nei's distance; all distance calculations were performed using MICROSAT v1.4 (Minch, 1995).
Phylogenetic relationships were inferred from these distances using the neighbor-joining method (Saitou and Nei, 1987), implemented by PHYLIP v3.5 (Felsenstein, 1993); statistical support for elements of the resultant phylogenetic tree was estimated by resampling across individuals (Van Dongen, 1995).
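The idea of estimating support by resampling across individuals can be sketched in a few lines. The Python code below is a simplified illustration and not the procedure of Van Dongen (1995) or the PHYLIP workflow used here: it handles a single locus, scores alleles as sizes, uses the squared difference in mean allele size as the distance, and replaces neighbor-joining with a plain mutual-nearest-neighbor criterion for one pair of taxa. The function names and all genotype data are hypothetical.

```python
import random


def mean_allele_size(genotypes):
    """Mean allele size over a list of diploid genotypes (allele1, allele2)."""
    alleles = [allele for genotype in genotypes for allele in genotype]
    return sum(alleles) / len(alleles)


def distance(genotypes_a, genotypes_b):
    """Squared difference in mean allele size between two samples."""
    return (mean_allele_size(genotypes_a) - mean_allele_size(genotypes_b)) ** 2


def nearest_neighbor(taxon, samples):
    """Name of the taxon at the smallest distance from `taxon`."""
    others = [name for name in samples if name != taxon]
    return min(others, key=lambda name: distance(samples[taxon], samples[name]))


def bootstrap_pair_support(samples, pair, replicates=100, seed=0):
    """Fraction of replicates in which the two taxa are mutual nearest neighbors.

    Individuals are resampled with replacement within each taxon.
    """
    rng = random.Random(seed)
    first, second = pair
    hits = 0
    for _ in range(replicates):
        resampled = {
            name: [rng.choice(genotypes) for _ in range(len(genotypes))]
            for name, genotypes in samples.items()
        }
        if (nearest_neighbor(first, resampled) == second
                and nearest_neighbor(second, resampled) == first):
            hits += 1
    return hits / replicates


if __name__ == "__main__":
    # Hypothetical genotypes (allele sizes in bp) for four taxa.
    samples = {
        "A1": [(150, 154), (152, 156), (150, 158)],
        "A2": [(156, 160), (158, 162), (154, 160)],
        "B1": [(200, 204), (202, 208), (198, 206)],
        "B2": [(206, 210), (208, 212), (204, 214)],
    }
    print(bootstrap_pair_support(samples, ("A1", "A2")))
```

Resampling individuals rather than loci is the natural choice when only two loci are available, since a bootstrap over two loci would have almost nothing to resample.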
VIII. Results
Extensive variation was uncovered for all species at both loci. At UME002, samples of individual taxa possessed approximately 25 alleles, with 64 alleles detected in total (Fig. 3A); UME003 was almost as variable, with each species exhibiting approximately 21 of the total of 48 alleles observed (Fig. 3B). Frequency distributions were multimodal, with extensive gaps separating some size classes of alleles. Heterogeneity of genotypic frequencies among taxa was highly significant for both loci. After adjustment of significance criteria for multiple tests (Rice, 1989), two species deviated significantly from Hardy-Weinberg expectations at UME002 (M. parallelus and L. trewavasae), as did one at UME003 (M. parallelus). In M. parallelus, a null allele(s) is suspected to be present at locus UME002; a number of individuals showed no amplification at UME002, but simultaneously expressed genotypes at UME003. Assuming Hardy-Weinberg equilibrium, the overall null allele frequency at UME002 is approximately 0.398 in M. parallelus. No presumptive homozygous null genotypes were observed at this locus for L. trewavasae, although the sampled population exhibited a large heterozygote deficiency. Although it is possible that this taxon also possesses a null allele, apparent heterozygote deficiencies are not implausible given the sample sizes employed (see later).
The topology of the neighbor-joining tree obtained using the delta-μ statistic (Fig. 2C) allows the null hypothesis to be rejected and suggests that phylogenetic signal is present in the microsatellite data. The pattern of relationship among species was neither a star nor a random combination of taxa. The Labeotropheus lineage was the most supported; the Pseudotropheus lineage had significant definition, but the two species of Melanochromis were not consistently united to the exclusion of other taxa. The same topology was recovered using allele sharing, but with reduced bootstrap support for both the Pseudotropheus and the Labeotropheus clades. Using Nei's distance, only the Labeotropheus clade received bootstrap support in excess of 50% of replications. In light of the results of Goldstein et al. (1995; see earlier), this suggests that the intergeneric comparison may be at the outer limit of phylogenetic utility for microsatellite data.

FIGURE 3 Allele frequency distributions at two hypervariable microsatellite loci in six mbuna species. Dashed lines enclose saltines separated by large gaps (see text). (A) Distribution of alleles at locus UME002. (B) Distribution of alleles at locus UME003. Horizontal axis: allele size (base pairs). Legible panel labels include Melanochromis parallelus (N = 36), Pseudotropheus zebra "BB" (N = 26), and Pseudotropheus zebra "black dorsal" (N = 25).
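The null allele estimate reported in the Results for M. parallelus rests on a Hardy-Weinberg argument that is easy to state explicitly: if a null allele has frequency r, non-amplifying (null homozygous) individuals are expected at frequency r², so r can be approximated by the square root of the observed proportion of such individuals. The Python sketch below shows only this simple estimator. It is an assumption that this matches the calculation actually performed, the counts in the example are hypothetical, and the reported value of 0.398 cannot be reproduced here because the underlying counts are not given.

```python
import math


def null_allele_frequency(n_null_homozygotes, n_individuals):
    """Estimate null allele frequency as sqrt(observed null homozygote proportion).

    Assumes Hardy-Weinberg proportions and that every non-amplifying
    individual is a true null homozygote (not a failed reaction).
    """
    if n_individuals <= 0:
        raise ValueError("n_individuals must be positive")
    if not 0 <= n_null_homozygotes <= n_individuals:
        raise ValueError("n_null_homozygotes must lie between 0 and n_individuals")
    return math.sqrt(n_null_homozygotes / n_individuals)


if __name__ == "__main__":
    # Hypothetical counts: 4 non-amplifying individuals out of 25 assayed.
    print(round(null_allele_frequency(4, 25), 3))  # 0.4
```

Requiring a successful amplification at a second locus in the same individuals, as was done with UME003, is what justifies treating non-amplification as a genotype rather than a technical failure.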
IX. Discussion
This chapter has provided evidence that hypervariable microsatellite loci contain phylogenetic information potentially capable of resolving relationships within the mbuna. This is the first class of molecular markers shown to possess sufficient power to elucidate the evolutionary history of that fauna. The successful definition of relationships clearly depends on the number of loci studied; neither UME002 nor UME003 alone provided resolution equal to that obtained from the two loci considered together. This is consistent with the recommendations of Pamilo and Nei (1988) and with the results obtained by Bowcock et al. (1994), who produced a tree of human racial groups using 30 microsatellite loci; reanalysis using small subsets of these data does not result in consistent resolution (A. Parker and I. Kornfield, unpublished observations). Because the number of loci characterized was small, and the number of mbuna species examined was limited, it is not yet clear at what taxonomic level microsatellites will prove most useful in this assemblage, i.e., microsatellite loci may be of value for defining relationships among congeners or allopatric populations of conspecifics rather than relationships among genera. The current debate over appropriate models of microsatellite evolution (Shriver et al., 1993; Valdes et al., 1993) should not be viewed as a significant impediment to the phylogenetic application of these loci. For example, even if rate heterogeneity is present in the production of different classes of alleles, the relative relationships among taxa should be resolvable, even if absolute distances are biased. Further, the lack of consensus regarding appropriate methods to quantify divergence (Queller and Goodnight, 1989; Goldstein et al., 1995; Shriver et al., 1995; Van Dongen, 1995), and construct evolutionary trees (Swofford and Berlocher, 1987; Crother, 1990), should not proximately inhibit the use of microsatellite variation for phylogenetic study of the mbuna.
The presence of null alleles at some loci should not compromise the validity of evolutionary inferences if data analysis is based on genotypic rather than allelic frequencies. When presumptive null homozygotes are observed, null allele frequencies may be estimated by maximum likelihood methods (Raymond and Rousset, 1995); it is at present unclear how allele frequencies should be estimated when heterozygote deficiencies are observed in the absence of null homozygotes.

A. Sampling Considerations
Of great practical importance is the number of microsatellite loci and individuals that must be sampled to provide reliable resolution of phylogenetic relationships. It is obvious from theoretical considerations (Pamilo and Nei, 1988; Weir, 1990) that resolution is improved as greater numbers of loci are examined; this is particularly true if loci exhibit only moderate levels of allelic diversity (and thus lower information content). The number of individuals that must be sampled depends principally on the number of alleles present at each locus. In addition, large sample sizes may be essential for the detection of outliers (see later). When allelic diversity is low, relatively few individuals need be examined. However, informative variation may exist principally at hypervariable loci, and it is here that large sample sizes are essential to the determination of allele frequencies with statistical confidence (Fig. 4). For example, with an average of 25 alleles per population at UME002 (Fig. 3A), simulations suggest that approximately 115 individuals should be examined from each population (A. Parker, unpublished data; see Fig. 4). These simulations involved iterative sampling of alleles with replacement from randomly generated frequency distributions, with sample-population identity (Nei's unbiased genetic identity; Nei, 1987) calculated after each additional allele was sampled. As individuals were added, sample-population identity was observed to increase rapidly to a value of approximately I_N = 0.97, beyond which point improvement in identity was extremely gradual; thus this value (I_N = 0.97) was interpreted to represent the point beyond which additional sampling effort is no longer worthwhile. In the present case, however, only 25 to 47 individuals were sampled per taxon. Thus, the resolution of relationships based on these two loci may be incomplete.

FIGURE 4 Simulation of sample size required to produce allele frequency estimates of similarity I_N ≥ 0.97 [Nei's unbiased genetic identity; Nei (1987)] to actual population allele frequencies (A. Parker, unpublished data). The empirical relationship derived from these simulations is: Sample Size = 1.5 (N alleles)^1.35. Horizontal axis: number of alleles; vertical axis: sample size.
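The sampling argument above can be explored numerically. The Python sketch below is a hedged illustration and not the authors' unpublished simulation. It reads the exponent in the Fig. 4 relationship as 1.35, which gives roughly the 115 individuals quoted for 25 alleles, and it uses Nei's normalized identity without the small-sample correction. The way the random population frequencies are generated, and all function names, are assumptions.

```python
import math
import random


def required_sample_size(n_alleles):
    """Empirical relationship from Fig. 4, with the exponent read as 1.35."""
    return 1.5 * n_alleles ** 1.35


def normalized_identity(p, q):
    """Nei's normalized identity between two frequency vectors (uncorrected)."""
    jxy = sum(a * b for a, b in zip(p, q))
    jx = sum(a * a for a in p)
    jy = sum(b * b for b in q)
    return jxy / math.sqrt(jx * jy)


def simulate_identity(n_alleles, n_sampled_alleles, seed=0):
    """Identity between a random population distribution and one sample from it.

    Alleles are drawn with replacement; population frequencies are uniform
    random numbers scaled to sum to one (an assumed generating scheme).
    """
    rng = random.Random(seed)
    weights = [rng.random() for _ in range(n_alleles)]
    total = sum(weights)
    population = [w / total for w in weights]
    draws = rng.choices(range(n_alleles), weights=population, k=n_sampled_alleles)
    sample = [draws.count(i) / n_sampled_alleles for i in range(n_alleles)]
    return normalized_identity(population, sample)


if __name__ == "__main__":
    n_individuals = required_sample_size(25)
    print(n_individuals)  # between 115 and 116 individuals
    # Each diploid individual contributes two alleles to the sample.
    print(simulate_identity(25, 2 * round(n_individuals)))
```

Running `simulate_identity` over a range of sample sizes and seeds gives a rough picture of how quickly sample frequencies converge on population frequencies, which is the behavior the 0.97 threshold is meant to capture.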
B. Evolutionary Signals in Microsatellites
There are three independent classes of phylogenetic information that can potentially be gleaned from microsatellite loci. The authors anticipate that these classes will be appropriate for examining relationships at different taxonomic levels. First, allele frequency distributions may be compared using genetic distance metrics based on the stepwise mutation model of microsatellite evolution (Slatkin, 1995; Goldstein et al., 1995). A second class of phylogenetic information, however, may be present in microsatellite allele frequency distributions. Major gaps in allele size distributions may signify unique mutational events. For example, in M. parallelus, alleles of size 150-180 bp are recognized at locus UME002 as representing expansion, via stepwise mutation, from a single allele produced by a distinct, saltatory mutational process (Fig. 3A). This class of alleles is separated from the next smallest allele by 75 bp; it thus conforms to the two-phase mutation model presented by DiRienzo et al. (1994), wherein divergent repeat classes are generated by infrequent large jumps. Machado-Joseph disease also conforms to this model; this pathology appears when a trinucleotide repeat increases by at least 75 bp to form a new allelic class (Maciel et al., 1995). As Maciel et al. (1995) noted, "clustering of expanded repeat sizes is also suggestive of a unique ancient founder mutation." A cladistic perspective, recognizing such novel classes of alleles as discrete characters, is adopted here; in light of the saltatory nature of the mutational events hypothesized to generate them, such characters are called saltines to distinguish them from standard patterns of microsatellite allele variation. Thus, the allelic class centered at 178 bp for UME002 in M. parallelus constitutes an autapomorphic saltine; if shared among independent lineages, saltines may be treated as synapomorphies. 
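Recognizing candidate saltines amounts to finding unusually large gaps in the ordered list of allele sizes at a locus. The following Python sketch shows one simple way to do this. It is an illustration only: the gap threshold is an arbitrary assumption that would need to be justified for real data, the function name is hypothetical, and the allele sizes in the example are invented rather than taken from Fig. 3.

```python
def allelic_size_classes(allele_sizes, min_gap):
    """Group allele sizes into classes separated by gaps of at least min_gap bp.

    Returns a list of classes, each a sorted list of distinct allele sizes.
    """
    sizes = sorted(set(allele_sizes))
    if not sizes:
        return []
    classes = [[sizes[0]]]
    for previous, current in zip(sizes, sizes[1:]):
        if current - previous >= min_gap:
            classes.append([current])      # large gap: start a new size class
        else:
            classes[-1].append(current)    # small step: same size class
    return classes


if __name__ == "__main__":
    # Hypothetical allele sizes (bp) with one 75 bp gap.
    observed = [150, 154, 158, 178, 180, 255, 259, 263]
    print(allelic_size_classes(observed, min_gap=50))
    # [[150, 154, 158, 178, 180], [255, 259, 263]]
```

A size class found this way is only a candidate saltine; whether the gap reflects a single saltatory mutation or simply unsampled intermediate alleles depends on sample size, as the discussion of sampling makes clear.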
In this manner, some aspects of microsatellite allele distributions can be analyzed by standard cladistic methods (Swofford, 1990) rather than by distance approaches. Indeed, if a large number of loci were examined, it would be anticipated that saltines would permit construction of robust phylogenetic trees. However, like ancestral mtDNA polymorphisms, saltines can be retained or lost in multiple lineages. For example, in the human microsatellite data analyzed by Bowcock et al. (1994), locus ms164 has two allelic classes separated by 16 bp which are present in diverse human lineages (Fig. 5). Inspection of allele distributions at UME003 (Fig. 3B) reveals the presence of two major expansion classes centered around 149 and 201 bp; the smaller expansion class has probably been lost from both M. auratus and P. zebra.
Genetic drift may play a major role in molding the distribution of rare expansion classes. If drift were to eliminate relatively infrequent alleles associated with major allelic clusters, e.g., the class centered around 300 bp at locus UME002 (Fig. 3A), such alleles could be regenerated rapidly by mutation. In contrast, if eliminated by drift, variation embodied in saltines would not be regenerated. For example, the absence of the saltatory class centered at 178 bp at UME002 from M. auratus could be due to drift. Indeed, mtDNA diversity is observed to be relatively low in this taxon (Bowers et al., 1994), consistent with the possibility of a recent population bottleneck. Note that it is critical that sample sizes be large enough to reliably detect the presence of saltines that occur at low absolute frequencies in some populations. To date, no one has exploited this class of information to construct phylogenetic trees.
Finally, similar to saltines, the ability (or inability) of a given microsatellite primer pair to amplify DNAs from certain taxa can be treated as a cladistically informative binary character and forms a third potential information class. Again, such characters may constitute synapomorphies and can thus be used to infer relationships from a cladistic perspective, although no empirical information about the prevalence of these characters in cichlid fishes can be found. If null alleles are to be employed in this fashion, it is imperative that new flanking primers be designed and used to demonstrate, by sequencing, that all observed null alleles are due to homologous changes in the original priming sites.

FIGURE 5 Distribution of alleles at human microsatellite locus ms164 (E. Minch, personal communication); this locus was included in the study of Bowcock et al. (1994). The two divergent allelic classes depicted are shared by a number of diverse human lineages. Aggregate sample size is 250. Horizontal axis: allele size (bp).
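Treating saltines and primer amplification as binary characters leads naturally to a taxon-by-character matrix of the kind used in cladistic analysis. The Python sketch below shows how such a matrix might be screened for potentially informative characters, using the usual criterion that a binary character can only group taxa if each state occurs in at least two of them. The taxon and character names and all scores are hypothetical, missing scores are treated as absence (an assumption), and the code does not determine which state is derived.

```python
def informative_characters(matrix):
    """Return binary characters present in >= 2 taxa and absent in >= 2 taxa.

    matrix maps taxon name -> {character name: 0 or 1}; a character missing
    from a taxon's dictionary is counted as absent (0).
    """
    taxa = list(matrix)
    characters = sorted({name for states in matrix.values() for name in states})
    informative = []
    for character in characters:
        present = sum(matrix[taxon].get(character, 0) for taxon in taxa)
        absent = len(taxa) - present
        if present >= 2 and absent >= 2:
            informative.append(character)
    return informative


if __name__ == "__main__":
    # Hypothetical scores: 1 = saltine present / primer pair amplifies.
    scores = {
        "taxon_A": {"saltine_1": 1, "primer_X": 1, "saltine_2": 1},
        "taxon_B": {"saltine_1": 1, "primer_X": 1, "saltine_2": 0},
        "taxon_C": {"saltine_1": 0, "primer_X": 1, "saltine_2": 0},
        "taxon_D": {"saltine_1": 0, "primer_X": 1, "saltine_2": 0},
    }
    print(informative_characters(scores))  # ['saltine_1']
```

In this example the character found in a single taxon corresponds to an autapomorphy and the character found in all taxa carries no grouping information, so only the character shared by two of the four taxa is retained.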
X. Summary
The classical methods of molecular phylogenetic investigation, allozyme electrophoresis and mtDNA restriction or sequence analysis, have failed to resolve relationships among members of rapidly evolving species flocks such as the mbuna (Cichlidae) of Lake Malawi. Several classes of nuclear DNA markers may, however, provide greater resolution; most promising are microsatellite markers. The extremely high mutation rates at these loci render them fundamentally different from other nuclear DNA polymorphisms, as changes in allele frequency are influenced by mutation as well as genetic drift. Analysis of two microsatellite loci in three congeneric pairs of mbuna species strongly suggests that these markers can provide phylogenetic information relevant to these recently diverged taxa.
Acknowledgments
We are exceedingly grateful to S. Grant, Salima, Malawi, for providing specimens and supporting our research. L. DeMason also supplied critical logistical support. E. Minch kindly shared unpublished human microsatellite data and provided a copy of his program to calculate delta-μ. A. Konings generously permitted reproduction of his mbuna photographs. M. Stiassny is inspirational. We are grateful to the editors and two anonymous referees who provided comments which helped improve this manuscript. This work was supported by NSF EHR91-08766 and NOAA Sea Grant NA36RG0110.
References Avise, J. C. 1994. "Molecular Markers, Natural History and Evolution." Chapman and Hall, New York. Avise, J. C., Helfman, G. S., Saunders, N. C., and Hales, L. S. 1986. Mitochondrial DNA differentiation in North Atlantic eels: Population genetic consequences of an unusual life history pattern. Proc. Natl. Acad. Sci. USA 83:4350-4354. Avise, J. C., Neigel, J. E., and Arnold, J. 1984. Demographic influences on mitochondrial DNA lineage survivorship in animal populations. J. Mol. Evol. 20:99-105. Bardakci, F., and Skibinski, D. O. F. 1994. Application of the RAPD technique in tilapia fish: Species and subspecies identification. Heredity 73:117-123. Bowcock, A. M., Ruiz-Linares, A., Tomfohrde, J., Minch, E., Kidd, J. R., and Cavalli-Sforza, L. L. 1994. High resolution of human evolutionary trees with polymorphic microsatellites. Nature 368:455-458. Bowers, N., Stauffer, J. R., Jr., and Kocher, T. D. 1994. Intra- and interspecific mitochondrial DNA sequence variation within two species of rock-dwelling cichlids (Teleostei: Cichlidae) from Lake Malawi, Africa. Mol. Phylogenet. Evol. 3:75-82. Brown, W. M., George, M. Jr., and Wilson, A. C. 1979. Rapid evolution of mitochondrial DNA. Proc. Natl. Acad. Sci. USA 76:1967-1971. Bruford, M. W., and Wayne, R. K. 1993. Microsatellites and their application to population genetic studies. Curr. Opin. Genet. Dev. 3:939-943. Charlesworth, B., Sniegowski, P., and Stephan, W. 1994. The evolutionary dynamics of repetitive DNA in eukaryotes. Nature 371:215-220. Crother, B. I. 1990. Is "some better than none" or do allele frequencies contain phylogenetically useful information? Cladistics 6:277-281. Dallas, J. F. 1992. Estimation of microsatellite mutation rates in recombinant inbred strains of mouse. Mamm. Genome 5:32-38. DeMason, L. 1993. Into Africa: Exporting the Tanzanian coast of Lake Malawi. Cichlid News 2:22-23. DiRienzo, A., Peterson, A. C., Garza, J. C., Valdes, A. M., Slatkin, M., and Freimer, N. B. 1994. 
Mutational processes of simple-sequence repeat loci in human populations. Proc. Natl. Acad. Sci. USA 91:3166-3170. Dominey, W. J. 1984. Effects of sexual selection and life history on speciation: Species flocks in African cichlids and Hawaiian Drosophila. In "Evolution of Fish Species Flocks," (A. A. Echelle and I. L. Kornfield, eds.), pp. 231-249. University of Maine Press, Orono, ME. Dowling, T. E., Moritz, C., and Palmer, J. D. 1990. Nucleic acids. II. Restriction site analysis. In "Molecular Systematics" (D. M. Hillis and C. Moritz, eds.), pp. 250-317. Sinauer, Sunderland, MA. Eccles, D. H., and Trewavas, E. 1989. "Malawian Cichlid Fishes: The Classification of Some Haplochromine Genera." Lake Fish Movies, Herten, West Germany. Edwards, A., Hammond, H. A., Jin, L., Caskey, C. 
T., and Chakraborty, R. 1992. Genetic variation at five trimeric and tetrameric tandem repeat loci in four human population groups. Genomics 12:241-253. Ellegren, H., Primmer, C. R., and Sheldon, B. C. 1995. Microsatellite "evolution": Directionality or bias? Nat. Genet. 11:360-362. Ellsworth, D. L., Rittenhouse, K. D., and Honeycutt, R. L. 1993. Artifactual variation in randomly amplified polymorphic DNA banding patterns. Biotech. 14:214-217. Estoup, A., Garnery, L., Solignac, M., and Cornuet, J. M. 1995. Microsatellite variation in honeybee (Apis mellifera L.) populations: Hierarchical genetic structure and test of the infinite allele and stepwise mutation models. Genetics 140:679-695. Felsenstein, J. 1993. "PHYLIP v3.5 (Phylogenetic Inference Package, computer software) Ver. 3.2." University of Washington, Seattle, WA. Franck, J. P. C., Wright, J. M., and McAndrew, B. 1992. Genetic variability of a family of satellite DNAs from tilapia (Pisces: Cichlidae). Genome 35:719-725. Franck, J. P. C., Kornfield, I., and Wright, J. M. 1994. The utility of SATA satellite DNA sequences for inferring phylogenetic relationships among the three major genera of tilapinne cichlid fishes. Mol. Phylogenet. Evol. 3:10-16. Fryer, G. 1959a. Some aspects of evolution in Lake Nyasa. Evolution 13: 440-451. Fryer, G. 1959b. The trophic interrelationships and ecology of some littoral communities in Lake Nyasa with special references to
the fishes, and a discussion of the evolution of a group of rock-frequenting Cichlidae. Proc. Zool. Soc. Lond. 132:153-281. Fryer, G., and Iles, T. D. 1972. "The Cichlid Fishes of the Great Lakes of Africa." Oliver & Boyd, Edinburgh. Gasse, F., Ledee, V., Massault, M., and Fontes, J.-C. 1989. Water level fluctuations of Lake Tanganyika in phase with oceanic changes during the last glaciation and deglaciation. Nature 342:57-59. Goldstein, D. B., Linares, A. R., Cavalli-Sforza, L. L., and Feldman, M. W. 1995. Genetic absolute dating based on microsatellites and the origin of modern humans. Proc. Natl. Acad. Sci. USA 92:6723-6727. Greenwood, P. H. 1984. African cichlids and evolutionary theories. In "Evolution of Fish Species Flocks." (A. A. Echelle and I. L. Kornfield, eds.), pp. 141-154. University of Maine Press, Orono, ME. Hare, M. P., Karl, S. A., and Avise, J. C. 1996. Anonymous nuclear DNA markers in the American oyster and their implications for the heterozygote deficiency phenomenon in marine bivalves. Mol. Biol. Evol. 13:334-345. Hughes, A. L., and Nei, M. 1989. Nucleotide substitution at major histocompatibility complex class II loci: Evidence for overdominant selection. Proc. Natl. Acad. Sci. USA 86:958-962. Karl, S. A., Bowen, B. W., and Avise, J. C. 1992. Global population structure and male-mediated gene flow in the green turtle (Chelonia mydas): RFLP analyses of anonymous nuclear loci. Genetics 131:163-173. Karl, S. A., and Avise, J. C. 1993. PCR-based assays of mendelian polymorphisms from anonymous single-copy nuclear DNA: Techniques and applications for population genetics. Mol. Biol. Evol. 10:342-361. Kellogg, K. A., Markert, J. A., Stauffer, J. R., Jr., and Kocher, T. D. 1995. Microsatellite variation demonstrates multiple paternity in lekking cichlid fishes from Lake Malawi, Africa. Proc. R. Soc. Lond. B 260:79-84. Klein, J. 1986. "Natural History of the Major Histocompatibility Complex." Wiley, New York. Klein, D. 
H., Ono, H., O'Huigin, C., Vincek, V., Goldschmidt, T., and Klein, J. 1993. Extensive Mhc variability in cichlid fishes of Lake Malawi. Nature 364: 330-332. Konings, A. 1990. "Koning's Book of Cichlids and All the Other Fishes of Lake Malawi." TFH Publications, Inc., Neptune City, NJ. Kornfield, I. 1978. Evidence for rapid speciation in African cichlid fishes. Experientia 34: 335-336. Kornfield, I. 1991. Genetics. In "Cichlid Fishes: Behavior, Ecology and Evolution." (M. Keenleyside, ed.), pp. 103-128. Chapman and Hall, London. Lazzaro, X. 1991. Feeding convergence in South American and African zooplanktivorous cichlids Geophagus brasilensis and Tilapia rendalli. Environ. Biol. Fishes 31:283-293. Levin, I., Cheng, H. H., Baxter-Jones, C., and Hillel, J. 1995. Turkey microsatellite DNA loci amplified by chicken-specific primers. Anim. Genet. 26:107-110. Lewis, D. S. C. 1981. "Problems of Species Definition in Lake Malawi Cichlid Fishes (Pisces, Cichlidae)." J. L. B. Smith Inst. Ichthy. Spec. Publ. 23:1-5. Lewis, D. S. C. 1982. A revision of the genus Labidochromis (Teleostei: Cichlidae) from Lake Malawi. Zool J. Linn. Soc. 75:189-265. Liem, K. F. 1980. Adaptive significance of intra- and interspecific differences in the feeding repertoires of cichlid fishes. Am. Zool. 20: 295-314. Maciel, P., et al. 1995. Correlation between CAG repeat length and clinical features in Machado-Joseph disease. Am. J. Hum. Genet. 57:54-61. Marsh, A. C., Ribbink, A. J., and Marsh, B. A. 1981. Sibling species complexes in sympatric populations of Petrotilapia Trewavas (Cichlidae, Lake Malawi). Zool. J. Linn Soc. 71:253-264.
Mayr, E. 1984. Evolution of fish species flocks: A commentary. In "Evolution of Fish Species Flocks." (A. A. Echelle and I. Kornfield, eds.), pp. 3-11. University of Maine Press, Orono, ME. McElroy, D. M., Kornfield, I., and Everett, J. 1991. Coloration in African cichlids: Diversity and constraints in Lake Malawi endemics. Neth. J. Zool. 41:250-268. McKaye, K. R., Kocher, T., Reinthal, P., Harrison, R., and Kornfield, I. 1984. Genetic evidence for allopatric and sympatric differentiation among morphs of a Lake Malawi cichlid fish. Evolution 36:658-664. McKaye, K. R., Kocher, T., Reinthal, P., and Kornfield, I. 1982. Sympatric sibling species complex of Petrotilapia Trewavas analyzed by enzyme electrophoresis (Pisces: Cichlidae). J. Linn. Soc. 76:91-96. McMillan, W. O., and Palumbi, S. R. 1995. Concordant evolutionary patterns among Indo-West Pacific butterflyfishes. Proc. R. Soc. Lond. B 260:229-239. Meyer, A. 1993. Phylogenetic relationships and evolutionary processes in East African cichlid fishes. Trends Ecol. Evol. 8:279-284. Meyer, A., Kocher, T. D., Basasibwaki, P., and Wilson, A. C. 1990. Monophyletic origin of Lake Victoria cichlid fishes suggested by mitochondrial DNA sequences. Nature 347:550-553. Minch, E. 1995. "MICROSAT v1.4 (computer software)." Stanford University, Stanford, CA. Moran, P., and Kornfield, I. 1993. Retention of an ancestral polymorphism in the mbuna species flock (Pisces: Cichlidae) of Lake Malawi. Mol. Biol. Evol. 10:1015-1029. Moran, P., and Kornfield, I. 1995. Were population bottlenecks associated with radiation of the mbuna species flock (Teleostei: Cichlidae) of Lake Malawi? Mol. Biol. Evol. 12:1085-1093. Moran, P., Kornfield, I., and Reinthal, P. 1994. Molecular systematics and radiation of the haplochromine cichlids (Teleostei: Perciformes) of Lake Malawi. Copeia 1994:274-288. Nei, M. 1978. Estimation of average heterozygosity and genetic distance from a small number of individuals. Genetics 89:583-590. Nei, M. 1987. 
"Molecular Evolutionary Genetics." Columbia University Press, New York. Niki, Y., Chigusa, S. I., and Matsuura, E. T. 1989. Complete replacement of mitochondrial DNA in Drosophila. Nature 341:551-552. Oliver, M. K. 1984. "Systematics of African Cichlid Fishes: Determination of the Most Primitive Taxon, and Studies on the Haplochromines of Lake Malawi (Teleostei: Cichlidae)." Unpublished Ph.D. dissertation, Yale University, New Haven, CT. Ono, H., O'Huigin, C., Tichy, H., and Klein, J. 1993. Major histocompatibility complex variation in two species of cichlid fishes from Lake Malawi. Mol. Biol. Evol. 10:1060-1072. Owen, R. B., Crossley, R., Johnson, T. C., Tweddle, D., Kornfield, I., Davison, S., Eccles, D. H., and Engstrom, D. E. 1990. Major low levels of Lake Malawi and implication for speciation rates in cichlid fishes. Proc. R. Soc. Lond. B 240:519-553. Paetkau, D., Calvert, W., Stirling, I., and Strobeck, C. 1995. Microsatellite analysis of population structure in Canadian polar bears. Mol. Ecol. 4:347-354. Palumbi, S. R., and Baker, C. S. 1994. Contrasting population structures for nuclear intron sequences and mtDNA of humpback whales. Mol. Biol. Evol. 11:426-435. Pamilo, P., and Nei, M. 1988. Relationships between gene trees and species trees. Mol. Biol. Evol. 5:568-583. Parker, A., and Kornfield, I. 1996. Polygynandry in Pseudotropheus zebra, a cichlid fish from Lake Malawi. Environ. Biol. Fish. 47:345-352. Parker, A., and Kornfield, I. 1997. Evolution of the mitochondrial DNA control region in the mbuna (Cichlidae) species flock of Lake Malawi, East Africa. J. Mol. Evol., in press. Pemberton, J. M., Slate, J., Bancroft, D. R., and Barrett, J. A. 1995. Nonamplifying alleles at microsatellite loci: A caution for parentage and population studies. Mol. Ecol. 4:249-252. Penny, D., Steel, M., Waddell, P. J., and Hendy, M. D. 1995. Improved analyses of human mtDNA sequences support a recent African origin for Homo sapiens. Mol. Biol. Evol. 12:863-882. 
Pépin, L., Amigues, Y., Le'Pringle, A., Berthier, J.-L., Bensaid, A., and Vaiman, D. 1995. Sequence conservation of microsatellites between Bos taurus (cattle), Capra hircus (goat) and related species: Examples of use in parentage testing and phylogenetic analysis. Heredity 74:53-61. Queller, D. C., and Goodnight, K. F. 1989. Estimating relatedness using genetic markers. Evolution 43:258-275. Queller, D. C., Strassmann, J. E., and Hughes, C. R. 1993. Microsatellites and kinship. Trends Ecol. Evol. 8:285-288. Rand, D. M., Dorfsman, M., and Kan, L. M. 1994. Neutral and nonneutral evolution of Drosophila mitochondrial DNA. Genetics 138:741-756. Raymond, M., and Rousset, F. 1995. GENEPOP ver. 1.2 a population genetics software for exact tests and ecumenicism. J. Hered. 86:248-249. Regan, C. T. 1921. The cichlid fishes of Lake Nyasa. Proc. Zool. Soc. Lond. 1921:675-727. Reinthal, P. N. 1987. "Morphology, Ecology, and Behavior of a Group of the Rock-Dwelling Fishes (Cichlidae: Perciformes) from Lake Malawi, Africa." Unpublished Ph.D. dissertation, Duke University, Durham, NC. Reinthal, P. N. 1990a. Morphological analysis of the neurocranium of a group of rock-dwelling cichlid fishes (Cichlidae: Perciformes) from Lake Malawi, Africa. Zool. J. Linn. Soc. 98:123-139. Reinthal, P. N. 1990b. The feeding habits of a group of herbivorous rock-dwelling cichlid fishes (Cichlidae: Perciformes) from Lake Malawi, Africa. Environ. Biol. Fishes 27:215-233. Ribbink, A. J., Marsh, A. C., Marsh, B. A., and Sharp, B. J. 1983a. The zoogeography, ecology and taxonomy of the genus Labeotropheus Ahl, 1927, of Lake Malawi (Pisces: Cichlidae). Zool. J. Linn. Soc. 
79:223-243. Ribbink, A. J., Marsh, A. C., Ribbink, C. C., and Sharp, B. J. 1983b. A preliminary survey of the cichlid fishes of rocky habitats in Lake Malawi. S. Afr. J. Zool. 18:149-310. Rice, W. R. 1989. Analyzing tables of statistical tests. Evolution 43:223-225. Rubinsztein, D. C., Amos, W., Leggo, J., Goodburn, S., Jain, S., Li, S. H., Margolis, R. L., Ross, C. A., and Ferguson-Smith, M. 1995. Microsatellite evolution: Evidence for directionality and variation in rate between species. Nat. Genet. 10:337-343. Saitou, N., and Nei, M. 1987. The neighbor-joining method: A new method for reconstructing phylogenetic trees. Mol. Biol. Evol. 4:406-425. Schlötterer, C., Amos, B., and Tautz, D. 1991. Conservation of polymorphic simple sequence loci in cetacean species. Nature 354:63-65.
Scholz, C. A., and Rosendahl, B. R. 1988. Low lake stands in Lakes Malawi and Tanganyika, East Africa, delineated with multifold seismic data. Science 240:1645-1648. Seyoum, S., and Kornfield, I. 1992a. Taxonomic notes on the Oreochromis niloticus subspecies complex (Pisces: Cichlidae), with a description of a new subspecies. Can. J. Zool. 70:2161-2165. Seyoum, S., and Kornfield, I. 1992b. Identification of the subspecies of Oreochromis niloticus (Pisces: Cichlidae) using restriction endonuclease analysis of mitochondrial DNA. Aquaculture 102:29-42. Shriver, M. D., Jin, L., Boerwinkle, E., Deka, R., Ferrell, R. E., and Chakraborty, R. 1995. A novel measure of genetic distance for highly polymorphic tandem repeat loci. Mol. Biol. Evol. 12:914-920. Shriver, M. D., Jin, L., Chakraborty, R., and Boerwinkle, E. 1993. VNTR allele frequency distributions under the stepwise mutation model: A computer simulations approach. Genetics 134:983-993. Slatkin, M. 1995. A measure of population subdivision based on microsatellite alleles. Genetics 139:457-462. Sültmann, H., Mayer, W. E., Figueroa, F., Tichy, H., and Klein, J. 1995. Phylogenetic analysis of cichlid fishes using nuclear DNA markers. Mol. Phylogenet. Evol. 12:1033-1047. Swofford, D. L. 1990. "PAUP: Phylogenetic Analysis Using Parsimony, ver. 3.1.1." Computer program distributed by the Illinois Natural History Survey, Champaign, IL. Swofford, D. L., and Berlocher, S. H. 1987. Inferring evolutionary trees from gene frequency data under the principle of maximum parsimony. Syst. Zool. 36:293-325. Swofford, D. L., and Selander, R. B. 1981. BIOSYS-1: a FORTRAN program for the comprehensive analysis of electrophoretic data in population genetics and systematics. J. Hered. 72:281-283. Trewavas, E. 1935. A synopsis of the cichlid fishes of Lake Nyasa. Ann. Mag. Nat. Hist. 10:65-118. Turner, G. F. 1994. Speciation mechanisms in Lake Malawi cichlids: A critical review. Arch. Hydrobiol. 44:139-160. Valdes, A. 
M., Slatkin, M., and Freimer, N. B. 1993. Allele frequencies at microsatellite loci: The stepwise mutation model revisited. Genetics 133:737-749. Van Dongen, S. 1995. How should we bootstrap allozyme data? Heredity 74:445-447. Weir, B. S. 1990. Sampling strategies for distances between DNA sequences. Biometrics 46:551-560. Williams, J. G. K., Kubelik, A. R., Livak, K. J., Rafalski, J. A., and Tingey, S. V. 1990. DNA polymorphisms amplified by arbitrary primers are useful as genetic markers. Nucleic Acids Res. 18:6531-6535. Wright, J. M. 1989. Nucleotide sequence, genomic organization and evolution of a major repetitive DNA family in tilapia Oreochromis mossambicus/hornorum. Nucleic Acids Res. 17:5071-5079. Zaykin, D. V., and Pudovkin, A. I. 1993. Two programs to estimate χ² values using pseudo-probability tests. J. Hered. 84:152-153.
CHAPTER
4 Reconstruction of Cichlid Fish Phylogeny Using Nuclear DNA Markers
HOLGER SÜLTMANN and WERNER E. MAYER
Max-Planck-Institut für Biologie Abteilung Immungenetik D-72076 Tübingen, Germany
I. Introduction
The family Cichlidae constitutes a monophyletic group in the order Perciformes (Kaufman and Liem, 1982). Monophyly of the cichlid family is indicated by the presence of at least nine synapomorphic morphological characters (Stiassny, 1991; Zihler, 1982; Gaemers, 1984). Since the distribution of cichlids ranges from South and Central America and Mexico to tropical Africa, Madagascar, southern India, and Sri Lanka (Ribbink, 1991), the cichlid family must have arisen before the separation of Africa, South America, and India by continental drift more than 100 million years (MY) ago. The morphology of cichlid species has been studied for almost 100 years and various classification schemes have been proposed (e.g., Pellegrin, 1904; Regan, 1906; Vandewalle, 1971; Trewavas, 1973, 1983; Poll, 1986; Greenwood, 1987; Cichocki, 1976; Stiassny, 1987, 1991; Oliver, 1984).
The cichlid taxa in the Great Lakes of East Africa (Lakes Victoria, Malawi, and Tanganyika) are of special interest, having undergone recent explosive adaptive radiations leading to hundreds of different species. Lake Tanganyika, which is approximately 12 MY old (Cohen et al., 1993), provides an ancient reservoir of polyphyletic taxa (Nishida, 1991), some of which gave rise to more recent groups in Lakes Malawi (Kocher et al., 1993) and Victoria. A comparatively large genetic divergence between species of the genus Tropheus from Lake Tanganyika was found to be accompanied by small morphological changes (Sturmbauer and Meyer, 1992). However, high morphological plasticity was found within the single New World species Cichlasoma managuense (Meyer, 1987). In addition, although some cichlid species from different lakes resemble each other morphologically, molecular data indicate that this similarity is due to convergent evolution (Kocher et al., 1993).
Most of the cichlid species of Lakes Malawi and Victoria are endemic (Kornfield, 1978; Meyer et al., 1990; Greenwood, 1991) and monophyletic (Meyer et al., 1990; Meyer, 1993). Taking into account the estimated ages of 2 MY for Lake Malawi and less than 1 MY for Lake Victoria (Fryer and Iles, 1972), questions arise as to the speed and mode of speciation leading to hundreds of different species. It has been shown by allozyme variation that speciation in Lake Malawi occurred rapidly (Kornfield, 1978). Allopatric speciation might have been promoted by considerable fluctuation in the water levels of Lakes Malawi and Victoria (Livingstone, 1980; Owen et al., 1990). However, microallopatric or even sympatric speciation cannot be ruled out, particularly because habitats and niches are quite restricted for most of the species (Ribbink, 1991; Meyer, 1993).
Before hypotheses regarding the speciation process can be postulated, the phylogenetic relationships among the various cichlid taxa must be elucidated. Two main difficulties have, however, hampered the reconstruction of cichlid phylogenies from morphological characters: paucity of synapomorphic characters, which hinders the recognition of taxonomic groups, and abundant parallelism, which makes it difficult to ascertain whether shared characters are synapomorphies or homoplasies. To circumvent these problems, molecular analyses have been initiated and used for the construction of phylogenies for cichlid species and species flocks (i.e., monophyletic groups of closely related species coexisting in the same area; Greenwood, 1984; Nishida, 1991; Sage et al., 1984).
II. Methods Used for Reconstructing Cichlid Phylogeny
The present taxonomy of cichlids in the east African lakes is largely based on morphological characters, particularly the shape of the jaws and teeth as well as the trophic behavior (Greenwood, 1979, 1980). For a variety of allozymes, allelic frequencies have been estimated from the electrophoretic mobility patterns (Sage and Selander, 1975; Kornfield, 1978; Kornfield et al., 1979; McKaye et al., 1982; McAndrew and Majumdar, 1983, 1984; Sage et al., 1984; Nishida, 1991) and used to calculate genetic distances between sister groups of cichlids. These data allowed the subdivision of cichlids into genera and species flocks. For the Lake Victoria cichlids, however, despite their considerable morphological differences, genetic distances were too small (0.006 substitutions per locus; Sage et al., 1984) to evaluate interspecies relationships.
The substitution rate at the mitochondrial control region and adjacent loci (cytochrome b and tRNA genes) has been shown to be higher than that of most nuclear DNA loci (Brown et al., 1979). Sequence analyses of mitochondrial DNA (mtDNA) (Meyer et al., 1990; Sturmbauer and Meyer, 1992, 1993; Kocher et al., 1993, 1995; Sturmbauer et al., 1994; Moran and Kornfield, 1993; Schliewen et al., 1994; Bowers et al., 1994) have extended the phylogenetic trees of the Lake Tanganyika and Malawi lineages and confirmed the monophyly of the Lake Victoria species flock. Discrepancies between the restriction fragment length polymorphism (RFLP) pattern of mtDNA and the species tree based on morphological characters, however, led Moran and Kornfield (1993) to suggest an ancestral polymorphism in the founding populations of the Lake Malawi flocks, which hinders an accurate determination of their phylogenetic relationships. In addition to this problem of polymorphism predating species divergence, the low number of mtDNA markers available is a limiting factor.
In contrast to the low genetic diversity among cichlids as revealed by allozyme data is the finding of high polymorphism at the Mhc (major histocompatibility complex) loci (Klein et al., 1993; Ono et al., 1993). Although some of this polymorphism is ancient (predating species divergence), the high number of different Mhc groups (loci) and alleles in cichlids might make the Mhc a useful genetic tool for studying cichlid phylogeny (Klein et al., 1997). The detection of a family of tandemly repeated satellite DNA elements in tilapia (Wright, 1989; Franck et al., 1992) has enabled Franck and co-workers (1994) to provide evidence for a close relationship of the mouthbrooding tilapiine genera Oreochromis and Sarotherodon in contrast to the substrate spawning genus Tilapia. In this report, nucleotide differences between the satellite consensus sequences for each genus were used for the construction of a phylogenetic tree.
Using the molecular methods just described, the aim of elucidating the evolutionary history of cichlid species within the monophyletic groups of Lakes Malawi and Victoria has been achieved only partially, either because of the poor resolution achievable by the methods or because of the low number of polymorphic loci found. Thus, using more polymorphic nuclear DNA markers is the only means for making further progress in this field of research. The search for such new markers was greatly facilitated by the discovery of the polymerase chain reaction (PCR) (Saiki et al., 1988).
This chapter describes and discusses the application of two PCR-based methods. First, Sültmann et al. (1995) used the random amplification of polymorphic DNA (RAPD) technique (Williams et al., 1990; Welsh and McClelland, 1990) to identify polymorphic genomic loci, followed by locus-specific DNA amplification and sequence determination of the fragments. In a second (unpublished) approach, locus-specific PCR primers were used to amplify microsatellite repetitive elements to determine allele size frequencies among cichlid species from Lake Victoria. Nucleotide substitutions and allele frequency differences between species were then used to calculate genetic distance matrices and to construct phylogenetic trees.
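The step from allele frequencies to a distance matrix can be illustrated with a minimal Python sketch. It assumes Nei's standard genetic distance, D = -ln(Jxy / sqrt(Jx Jy)), which the chapter does not name as the measure actually used; the species labels, allele sizes, and frequencies below are invented for illustration and are not data from this study.

```python
import math


def nei_standard_distance(pop_x, pop_y):
    """Nei's standard genetic distance D = -ln(Jxy / sqrt(Jx * Jy)).

    pop_x, pop_y: lists with one dict per locus, mapping allele -> frequency.
    Both lists must cover the same loci in the same order.
    """
    jx = jy = jxy = 0.0
    for freq_x, freq_y in zip(pop_x, pop_y):
        alleles = set(freq_x) | set(freq_y)
        jx += sum(freq_x.get(a, 0.0) ** 2 for a in alleles)
        jy += sum(freq_y.get(a, 0.0) ** 2 for a in alleles)
        jxy += sum(freq_x.get(a, 0.0) * freq_y.get(a, 0.0) for a in alleles)
    if jxy == 0.0:
        return math.inf  # no allele shared at any locus
    return -math.log(jxy / math.sqrt(jx * jy))


def distance_matrix(populations):
    """Return (sorted names, square matrix of pairwise distances)."""
    names = sorted(populations)
    matrix = [
        [nei_standard_distance(populations[a], populations[b]) for b in names]
        for a in names
    ]
    return names, matrix


# Hypothetical single-locus microsatellite data (allele size in bp -> frequency).
example = {
    "species_A": [{140: 0.5, 142: 0.5}],
    "species_B": [{140: 0.5, 144: 0.5}],
}
names, matrix = distance_matrix(example)
```

A matrix of this kind is the usual input to a distance-based tree-building method such as neighbor joining; other distance measures designed for microsatellites could be substituted in `nei_standard_distance` without changing the rest of the sketch.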
III. Random Amplification of Polymorphic DNA (RAPD)
The RAPD procedure (Welsh and McClelland, 1990; Williams et al., 1990) was originally developed as a method for fingerprinting genomes. PCR amplification is performed using a single oligonucleotide, typically a 10-mer primer, at low annealing temperatures (35-40°C; Fig. 1A). Depending on its sequence, the primer randomly anneals to an unknown segment on one of the DNA strands. In some cases, another annealing site will be present on the complementary strand not too distant from the first site and amplification will occur. When two species, strains, or individuals are compared, polymorphism between them will be revealed on agarose or polyacrylamide electrophoresis gels by the presence or absence of an amplification product. This method has been applied to the discovery of genetic markers for mapping studies (Serikawa et al., 1992; Postlethwait et al., 1994) and to elucidate phylogenetic relationships between bacterial species and strains (Welsh and McClelland, 1990; Smith et al., 1994) and tilapiine cichlid species (Bardakci and Skibinski, 1994). In the latter case, three species of the genus Oreochromis and four subspecies of Oreochromis niloticus could be distinguished. However, analyses of the reaction conditions (Ellsworth et al., 1993; Muralidharan and Wakeland, 1993; Smith et al., 1994; Bowditch et al., 1994) have shown that RAPD is highly sensitive to a wide range of factors: the quality of the template DNA, minute contaminations of RNA, the primer/template ratio, and small changes of the magnesium concentration. In addition, it is prone to producing spurious fragment variation (as shown, for example, by comparison of F1 hybrid DNA with parental DNA; Ayliffe et al., 1994) and other artifacts. Therefore, the procedure has been supplemented by sequencing the differential RAPD fragment and designing primers for locus-specific amplification in standard PCR.
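The presence/absence logic of RAPD can be sketched in a few lines of Python. This is a deliberately simplified model, not the authors' procedure: it requires exact primer matches (real low-stringency annealing tolerates mismatches), it ignores the secondary-structure effects discussed in this section, and the two template sequences are toy constructs. Only the primer sequence is taken from the text (TU984, Fig. 2); the function names and the 2000-bp size limit are assumptions made for illustration.

```python
def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    pairs = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(pairs[base] for base in reversed(seq))


def find_sites(template, motif):
    """Return all start positions of motif in template (overlaps allowed)."""
    sites = []
    start = template.find(motif)
    while start != -1:
        sites.append(start)
        start = template.find(motif, start + 1)
    return sites


def rapd_products(template, primer, max_len=2000):
    """Sizes of fragments bounded by two inward-facing copies of one primer.

    A product requires the primer sequence on the given strand and, downstream
    of it, the primer's reverse complement (i.e. a site on the opposite strand).
    """
    k = len(primer)
    forward = find_sites(template, primer)
    reverse = find_sites(template, reverse_complement(primer))
    sizes = []
    for i in forward:
        for j in reverse:
            size = j + k - i
            if j >= i + k and size <= max_len:
                sizes.append(size)
    return sorted(sizes)


primer = "GTGTGCCCCA"  # TU984, as given in the legend of Fig. 2

# Toy templates: identical except for one substitution in the second site.
with_site = primer + "A" * 380 + reverse_complement(primer)
without_site = with_site[:-1] + "T"

band_present = rapd_products(with_site, primer)
band_absent = rapd_products(without_site, primer)
```

In this toy case a single substitution in one annealing site abolishes the 400-bp product, which mirrors the presumed basis of a RAPD polymorphism.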
Although the RAPD polymorphism is presumed to be located at the annealing site of the 10-mer primer, it has been shown that the primer-binding sites are often identical between two samples showing polymorphic bands (Bowditch et al., 1994). The most likely explanation for this is that the formation of different secondary structures of the DNA templates, due to nucleotide substitutions outside the annealing sites, affects the accessibility of the annealing sites. To examine variation at the RAPD primer annealing site, the "vectorette" technique described by Riley and co-workers (1990) was also applied. Genomic DNA was digested with restriction endonucleases, and so-called vectorette linkers were ligated to all fragments (Fig. 1B). The vectorette linkers consisted of two oligonucleotides that were complementary to each other at their 5' and 3' ends, but contained a central mismatched region. In the subsequent PCR, the first-strand DNA synthesis primed by a locus-specific oligonucleotide was essential for the generation of the binding site of the so-called vectorette primer that specifically annealed to the complementary strand of the mismatched region of the vectorette linker. Thus, specific exponential amplification of the flanking region occurred. The PCR products were then cloned and sequenced by standard methods.
Using two DNA samples (shown to be devoid of RNA by ethidium bromide staining) from Pseudotropheus zebra and Melanochromis auratus, the RAPD conditions that yielded the most reproducible results were determined and then these were kept constant in subsequent experiments. The conditions were as follows: 50-60% G+C content for the 10-mer primer (see Sültmann et al., 1995), which was used at a concentration of 4 μM in the PCR (a combination of two 10-mer primers can also be used for the amplification); 100 μM each of dATP, dCTP, dGTP, and dTTP; 2.5 units of Taq polymerase; and 100 ng of template DNA in a total reaction volume of 25 μl in 1× reaction buffer containing 1.5 mM magnesium chloride. The PCR program consisted of 45 sec at 93°C, 15 sec annealing at 35-42°C, and 10 min primer extension at 72°C, followed by 35 to 45 cycles, each 15 sec at 93°C, 15 sec annealing at 35-42°C, and 3 min primer extension at 72°C. The reaction was completed by a final primer extension step for 10 min at 72°C. Only those cases which gave concordant banding patterns for two individuals of each species were examined further. Figure 2 shows an example of a typical result of the RAPD reaction where a polymorphic band of about 400 bp in size is present in P. zebra but absent from M. auratus.
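The reaction conditions above translate into absolute amounts and a total programmed time by simple arithmetic. The following Python sketch shows the calculation; the function names are hypothetical, and the time estimate counts only the programmed hold times, ignoring temperature ramping, which depends on the thermocycler.

```python
def amount_pmol(concentration_uM, volume_ul):
    """Amount of a reagent in pmol: (umol/l) x (ul) = pmol."""
    return concentration_uM * volume_ul


def program_seconds(cycles):
    """Total programmed hold time for the RAPD program described in the text."""
    first_round = 45 + 15 + 10 * 60    # 93 C, annealing, 10 min extension
    per_cycle = 15 + 15 + 3 * 60       # 93 C, annealing, 3 min extension
    final_extension = 10 * 60          # 10 min at 72 C
    return first_round + cycles * per_cycle + final_extension


REACTION_VOLUME_UL = 25

primer_pmol = amount_pmol(4, REACTION_VOLUME_UL)          # 4 uM primer
each_dntp_pmol = amount_pmol(100, REACTION_VOLUME_UL)     # 100 uM per dNTP
magnesium_pmol = amount_pmol(1500, REACTION_VOLUME_UL)    # 1.5 mM = 1500 uM

shortest_run_min = program_seconds(35) / 60
longest_run_min = program_seconds(45) / 60
```

Under these assumptions each 25-μl reaction contains 100 pmol of primer and 2.5 nmol of each dNTP, and the program holds for roughly 2.4 to 3 hours, depending on the number of cycles.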
FIGURE 1 (A) Schematic outline of the RAPD method. See text for details. (B) Schematic outline of the vectorette approach. Genomic DNA is digested with a restriction enzyme (R). A vectorette linker, composed of two oligonucleotides that are complementary only at the ends and leave a central mismatched part, is ligated to the DNA fragments. In a PCR the synthesis of first-strand DNA is primed by an oligonucleotide specific for the target segment of a known sequence (shown as black box) and extended into the flanking parts and vectorette linker. This strand is used as a template in subsequent PCR cycles by the vectorette primer, which is located in the mismatched segment of the linker.
The fragments were subcloned in pUC18 or M13 vectors and sequenced (Sanger et al., 1977). From three of these sequences, specific primers for amplification were constructed. The corresponding loci were called DXTU1, DXTU2, and DXTU3 (for details, see Sültmann et al., 1995). In the specific PCR, the following observations were made:
1. Polymorphism of locus-specific PCR products was frequently observed. The proportion of polymorphic versus monomorphic loci obtained by this procedure was estimated to be higher than 50%.
2. The applicability of the specific primers varied depending on the locus examined. At DXTU1, products were obtained from neotropical as well as from west and east African cichlid species, whereas at DXTU2, no products were found in cichlid species outside the Lake Victoria and Lake Malawi regions. These results are most likely due to different extents of conservation at the primer-binding sites.
3. Another notable feature of the specific PCR was the appearance of several by-products in addition to the band of the expected size. Since the possibility of amplification from multiple related loci (e.g., diversified repeats) could not be excluded in some cases, a third primer was used to prove the singularity of the amplified region in the cichlid genome.
Sequence variability in the DNA fragments resulting from specific PCR can also be examined using single-stranded conformation polymorphism (SSCP;
Orita et al., 1989a,b) analysis. In this approach, distinct banding patterns of PCR products from different species indicate sequence differences between species at a single locus.
FIGURE 2 Products obtained by RAPD PCR with the 10-mer primer TU984 (5' GTGTGCCCCA 3'). Products from Pseudotropheus zebra (lanes 1 and 2) and Melanochromis auratus (lanes 3 and 4) were separated on a 2% agarose gel. The left arrow indicates a 400-bp band present only in lanes 1 and 2. Lane 5 contains DNA size marker. The arrows on the right denote marker sizes (1358, 1078, 872, and 603 bp).
The polymorphic locus DXTU1 was selected for a detailed sequence analysis using the GCG software package (Devereux and Haeberli, 1991) or the Clustal V program (Higgins et al., 1992). The analysis of representative sequences at the DXTU1 locus showed the following (contact the authors for raw data):
1. It is remarkable that insertions or deletions (indels) constitute about one-quarter of the total number of polymorphic sites found at the DXTU1 and the other genomic loci. Although nucleotide substitutions are commonly used in phylogenetic tree construction, indels are normally excluded, yet the number of possible ways by which nucleotides can be inserted, deleted, or rearranged is nearly unlimited, in contrast to the three possibilities by which nucleotides can be substituted at a single site. Thus, data could be analyzed by two different methods: the standard tree construction method based on genetic distances and the neighbor-joining algorithm of Saitou and Nei (1987), and the cladistic analysis with the PAUP program version 3.1.1 (Swofford, 1993), in which shared indels were treated as synapomorphies.
2. There is considerable agreement between the distance tree (Fig. 3) and the cladogram (Fig. 4) based on the DXTU1 sequences. Although the evolutionary forces acting on the single loci may vary, the topologies of the neighbor-joining trees constructed for loci other than DXTU1 were congruent with the DXTU1 tree (see Sültmann et al., 1995). However, low bootstrap values with respect to certain branching patterns in the neighbor-joining tree (e.g., the position of haplochromines in Lake Malawi, Fig. 3) suggest that longer sequences or more loci are required for a more precise
FIGURE 3 Neighbor-joining tree (Saitou and Nei, 1987) of the sequences at the DXTU1 locus. Genetic distances were calculated using Kimura's (1980) two-parameter method. The numbers at each node represent percentage recovery of the particular node in 1000 bootstrap replications. (Tree not reproduced; scale axis: genetic distance, 0 to 0.07. Terminal taxa are grouped as Lake Tanganyika Cyphotilapia, Lake Malawi Haplochromis, Lake Victoria Haplochromis, Lake Tanganyika genera, non-endemic species, tilapiines from rivers and Lake Natron, West African species, and Neotropical species.)
FIGURE 4 Cladogram of 20 representative taxa based on presence or absence of indels and substitutions at the DXTU1 locus. The tree resulted from 500 bootstrap replications using the heuristic search option of the PAUP program version 3.1.1 (Swofford, 1993). The numbers at each node represent percentage recovery of the particular branching order. Cichla No. 15 was used as an outgroup. (Tree not reproduced; terminal taxa are grouped as Lake Malawi Haplochromis, Lake Victoria Haplochromis, non-endemic species, Lake Tanganyika Cyphotilapia, Lake Tanganyika Julidochromis and Neolamprologus, tilapiines, West African species, and Neotropical species.)
determination of species relationships. The trees (Figs. 3 and 4) of the DXTU1 sequences led to the following conclusions. First, the neotropical cichlid species Cichla sp. and Thorichthys meeki form a sister group to the African cichlids. The position of neotropical cichlids indicated by the molecular analysis is consistent with the results of morphological analysis (Cichocki, 1976; Oliver, 1984; Stiassny, 1991), which has revealed a set of derived characters uniting African cichlids (with the exception of Heterochromis) into a monophyletic group. Second, in the phylogram, the west African species Tylochromis leonensis is in a sister-group relationship with the east African species (the tilapiines, represented here by the genus Oreochromis from east African rivers and Alcolapia alcalicus from Lake Natron; the Lake Tanganyika genera Neolamprologus, Julidochromis, and Cyphotilapia; and the Lake Malawi and Lake Victoria species). In contrast, in the cladogram, Tylochromis appears as a sister group to the tilapiines. Third, the monophyly of the considered east African cichlids (tilapiines, haplochromines, and the Neolamprologus and Cyphotilapia genera of Lake Tanganyika) is indicated both by nucleotide substitution and by indel patterns. This branching order is also supported by mitochondrial DNA data (Meyer, 1993). Fourth, the tilapiines form a monophyletic sister group to the remaining east African Great Lake species and genera (haplochromines, Cyphotilapia, Astatoreochromis, and lamprologines). This result is concordant with morphological analyses (Regan, 1920; Trewavas, 1983) and other molecular studies (Kornfield et al., 1979; McAndrew and Majumdar, 1984; Seyoum, 1989; Sodsuk and McAndrew, 1991; Franck et al., 1994). Fifth, the species Astatoreochromis alluaudi, which is not endemic to Lake Victoria but is also found in other east African lakes and rivers, is a sister group of the
included east African lake genera. This result, as well as the sister-group placement of Julidochromis and Neolamprologus with respect to the Lake Malawi and Lake Victoria flocks, is also supported by mitochondrial DNA sequence data (Meyer et al., 1990). Sixth, the sister-group relationship of the Tanganyikan species Cyphotilapia frontosa to Lake Malawi haplochromines suggested by the NJ tree (Fig. 3) supports allozyme data (Kornfield, 1991), according to which Cyphotilapia is more closely related to the haplochromines of Lake Malawi than to those of Lake Tanganyika. This result further supports the polyphyletic structure of the Lake Tanganyika flocks. However, the cladogram (Fig. 4) places Cyphotilapia as a sister group to the Lake Malawi and Lake Victoria species on the one hand and to the Lake Tanganyika genera Julidochromis and Neolamprologus on the other, thus supporting the sister-group relationship of Cyphotilapia with other Lake Tanganyika cichlid flocks that was inferred from mtDNA control region data (Kocher et al., 1993). Finally, the monophyly of the endemic Lake Victoria haplochromines, as suggested by both trees, is consistent with the results of morphological studies (see Greenwood, 1978; Trewavas, 1983) and mitochondrial DNA analyses (Meyer et al., 1990; Sturmbauer and Meyer, 1992; Meyer, 1993; Moran and Kornfield, 1993). The finding of several indels, which are probably species specific, suggests that it may be possible to elucidate the relationships within species flocks using RAPD markers.
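As an illustration of the distance measure underlying the neighbor-joining tree in Fig. 3, the following minimal Python sketch computes Kimura's (1980) two-parameter distance, d = -(1/2) ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of transitional and transversional differences between two aligned sequences. The function name and the handling of gapped columns (they are simply skipped, mirroring the usual exclusion of indels from distance analyses) are assumptions made for this example; it is not the software used to produce the published tree.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a, seq_b):
    """Kimura two-parameter distance between two aligned DNA sequences.

    Alignment columns containing a gap or any symbol other than A, C, G, T
    are ignored (an assumption of this sketch).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to equal length")

    compared = transitions = transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gapped or ambiguous columns
        compared += 1
        if a == b:
            continue
        # A<->G and C<->T are transitions; everything else is a transversion.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1

    if compared == 0:
        raise ValueError("no comparable sites")

    p = transitions / compared
    q = transversions / compared
    term1 = 1.0 - 2.0 * p - q
    term2 = 1.0 - 2.0 * q
    if term1 <= 0.0 or term2 <= 0.0:
        raise ValueError("distance undefined: sequences too divergent")
    return -0.5 * math.log(term1 * math.sqrt(term2))
```

A matrix of such pairwise distances is the kind of input a neighbor-joining program expects; shared indels, by contrast, would be coded separately as presence/absence characters for the cladistic analysis.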
IV. Allele Size Frequencies at Dinucleotide Microsatellite Loci
Microsatellites are tandemly repeated DNA sequences with repeat units of 1 to 6 bp in length and 10 to 100 units per locus (Charlesworth et al., 1994). They have been used for the construction of genetic maps in humans (Hearne et al., 1992; LeBlanc-Straceski et al., 1994) and other species. Variable repeat numbers have also been implicated in disease and cancer susceptibility (Wooster et al., 1994). The rate of mutations generating microsatellite repeat number variation is the highest among all nuclear DNA markers; estimates for dinucleotide repeats range from 10^-2 to 10^-4 per generation (Jeffreys et al., 1988; Weber and Wong, 1993). This high mutation rate makes microsatellites a promising tool for population genetic analyses. Consequently, a number of studies have made use of microsatellites for determining relationships among populations of humans (Bowcock et al., 1994; Deka
et al., 1995), wolves (Roy et al., 1994), sheep (Buchanan et al., 1994), and toads (Scribner et al., 1994). Variability is believed to occur by the stepwise addition or subtraction of single repeat units after mispairing of the two DNA strands during the replication process (stepwise mutation model, SMM; Levinson and Gutman, 1987; Schlötterer and Tautz, 1992). It has been shown, however (DiRienzo et al., 1994), that the SMM does not fully explain observed allele frequency distributions within populations: although allelic variation at dinucleotide repeat loci is predominantly due to single step mutations, rare changes of more than one repeat unit may occur as well. Furthermore, unequal crossing-over during meiosis may also contribute to the generation of polymorphism at the microsatellite loci.
In cichlids, microsatellites have been used to study the mating behavior of Lake Malawi species (Kellogg et al., 1995). Cichlid fish phylogenies based on microsatellite data have not yet been published. However, the determination of allele size frequency distributions in distinct species from Lake Malawi and Lake Victoria, followed by the calculation of distance matrices, may provide the most promising means for reconstructing their phylogenies.
In order to generate allele size data, subgenomic libraries with small insert sizes (200-1000 bp) from P. zebra (Lake Malawi) and Haplochromis nigricans (Lake Victoria) were constructed in the λgt10 phage vector. The libraries were screened with the CA dinucleotide repeat-specific oligonucleotide (CA)15C, and hybridizing clones were isolated and sequenced (Sambrook et al., 1989). The clones contained stretches of CA(GT)-repeated DNA with repeat numbers ranging from 8 to 90. Sequence-specific primers flanking the entire repetitive element at the particular locus were then used for PCR amplification with genomic DNA from Lake Victoria cichlids as templates. One of the primers was labeled with fluorescein at its 5' end.
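The stepwise mutation model discussed above, together with the rare multistep changes reported by DiRienzo et al. (1994), can be illustrated with a small simulation of the repeat number along a single lineage. The sketch below is hypothetical throughout: the function names, the multistep probability, the maximum step size, and the lower bound on repeat number are illustrative assumptions, and only the order of magnitude of the mutation rate (10^-2 to 10^-4 per generation) is taken from the text.

```python
import random


def mutate_repeat_count(repeats, rng, multi_step_prob=0.0, max_step=3,
                        min_repeats=1):
    """Apply one mutation event to a microsatellite repeat count.

    With probability 1 - multi_step_prob the change is a single repeat unit
    (strict stepwise model); otherwise it is 2..max_step units. Gains and
    losses are equally likely. All parameter values are illustrative.
    """
    if rng.random() < multi_step_prob:
        step = rng.randint(2, max_step)
    else:
        step = 1
    direction = rng.choice((-1, 1))
    # Never let the repeat tract shrink below the assumed minimum.
    return max(min_repeats, repeats + direction * step)


def simulate_lineage(start_repeats, generations, mutation_rate, rng, **kwargs):
    """Return the repeat count after each generation along one lineage."""
    repeats = start_repeats
    history = [repeats]
    for _ in range(generations):
        if rng.random() < mutation_rate:
            repeats = mutate_repeat_count(repeats, rng, **kwargs)
        history.append(repeats)
    return history


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so the run is repeatable
    strict = simulate_lineage(12, 10_000, 1e-3, rng)
    mixed = simulate_lineage(12, 10_000, 1e-3, rng, multi_step_prob=0.1)
    print("strict SMM, final repeat count:", strict[-1])
    print("with rare multistep changes:   ", mixed[-1])
```

Running many such lineages from a common ancestor gives a feel for how quickly allele size distributions drift apart under different assumptions about the mutation process.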
The PCR products obtained from polymorphic loci were separated on a denaturing polyacrylamide gel in an automated sequencing apparatus. Bands were detected as fluorescence intensities of the labeled DNA strands, and their sizes were automatically determined by comparison with a size standard. Of the several microsatellite loci typed, the DXTUCA15 locus is presented here as an example. This locus was amplified from haplochromine genomic DNA with the primers MS16 (5' GCTGTGTAATCCCAAACTCC 3') and MS17 (5' GTATTTAGCTTTCCTCTGTGCT 3') by PCR with one cycle of 45 sec at 93°C, 15 sec at 55°C, and 10 min at 72°C, followed by 35 cycles, each 15 sec at 93°C, 15 sec at 55°C, and 1 min at 72°C. The reaction was completed by a final primer extension
step for 10 min at 72°C. As templates, genomic DNA samples from the Lake Victoria (and minor adjacent lakes such as Lakes Nabugabo, Kayugi, and Kayania) Haplochromis species H. beadlei (number of individuals, n = 12), H. cinctus (n = 19), H. laparogramma (n = 12),
H. nigricans (n = 15), H. nyererei (n = 40), H. plagiodon (n = 19), H. pyrrhocephalus (n = 43), H. sauvagei (n = 17), H. velifer (n = 81), and H. xenognathus (n = 29) were used. Individuals from each species were captured at two to six different locations in the wild. In the PCR, the primers amplified DNA fragments with sizes ranging from 75 to 93 bp. Size differences were due to variation of the number of CA repeat units, as determined by subcloning and sequencing random clones (data not shown). Although data are still preliminary, some interesting results have already been obtained from the specific amplification of cichlid microsatellite loci. First, in most of the amplifications, one or two products were visible (corresponding to homo- or heterozygosity of the individual at the particular locus). However, additional artifactual bands often appeared
due to amplification at other loci. These by-products sometimes interfered with the precise determination of microsatellite size. Second, the size determination was also hampered by the occurrence of so-called "shadow bands" flanking the highest peak in a cluster of products. In the case of dinucleotide repeats, shadow bands usually differ by 2 bp in size. This observation suggests that they may have been generated by the insertion or deletion of repeat units during PCR amplification (Litt et al., 1993). Assuming that a similar mechanism generated shadow bands in different reactions, the peak with the largest area within each peak cluster was used to determine the allele size. Third, inhomogeneities within the polyacrylamide gel may lead to incorrect measurement of product sizes. In order to assess this possibility, the allele sizes at certain loci determined by gel electrophoresis were compared with those obtained by subcloning and sequencing of the same PCR products. From these data it was concluded that the error was no larger than one repeat unit. The summary of the allele size determination for each species is shown in Fig. 5. Allele frequencies (y
FIGURE 5 Allele frequency distributions for 10 Haplochromis species from Lakes Victoria, Kayugi, Kayania, and Nabugabo at the microsatellite locus DXTUCA15. Frequencies (y axis) are plotted against the number of repeat units (x axis) found in the fragment analysis (see text for details). The number of individuals included in each sample is given by n. (Plots not reproduced; one panel per species, each with an x axis of 8 to 17 repeat units. Panels: Haplochromis beadlei, H. cinctus, H. laparogramma, H. nigricans, H. nyererei, H. plagiodon, H. pyrrhocephalus, H. sauvagei, H. velifer, and H. xenognathus.)
axis) are tabulated against the number of repeat units (x axis) calculated from the PCR product size by subtraction of the number of unique nucleotides in the fragment. Differences in frequency distributions between species are indicated by shape variations between the individual plots. Frequency data were used as the input for the microsat 1.4 computer program (written by Eric Minch; Goldstein et al., 1995), which calculates various distance measurements on the basis of allele frequencies (e.g., average squared difference in repeat numbers, Nei's identity, proportion of shared alleles). The basic assumption of the program is the validity of the stepwise mutation model (see also Valdes et al., 1993; Slatkin, 1995). It is important to note, however, that the algorithm is not dependent on the distribution of allele sizes within the species. Nei's identity method (Nei, 1972) was used for the calculation of a distance matrix (Fig. 6), which was then used to construct a phylogenetic tree with the PHYLIP software package (Felsenstein, 1986-1993). The tree is shown in Fig. 7. It can be divided into two major branches, one of which is constituted by the Haplochromis nyererei, H. nigricans, H. plagiodon, H. pyrrhocephalus, and H. laparogramma species, whereas H. beadlei, H. cinctus, H. sauvagei, and H. xenognathus appear on the second major branch. Haplochromis velifer is located at an intermediate position.
Testing whether microsatellite allele frequencies recover the true phylogeny of the closely related cichlid species from Lake Victoria is hampered by the low abundance of synapomorphic morphological characters. Most of the available studies have focused on the feeding habits, jaw morphology, and dentition (Greenwood, 1974, 1979, 1980; Witte and van Oijen, 1990). On the basis of these data, the species included in the phylogenetic tree (Fig. 7) can be subdivided into two major trophic groups (Witte and van Oijen, 1990), one of which is the planktivorous/algae-eating group Haplochromis cinctus (phytoplankton), H. laparogramma, H. pyrrhocephalus, H. nyererei (zooplankton), and H. nigricans (epilithic algae grazer). The other trophic group consists of the oral shell/mollusc crushers H. plagiodon, H. sauvagei, and H. xenognathus as well as H. beadlei, which is considered to be a sister species of Haplochromis plagiodon (Greenwood, 1980). In the phylogenetic tree generated by microsatellite data, this grouping is roughly reflected in the major branching pattern. The exceptions to this are H. plagiodon and H. cinctus, which are unexpectedly located on the opposite branches. Thus, the results suggest that microsatellite data can be used to make a rough subdivision of some Lake Victoria cichlid species which corresponds to their feeding habits. Whether the congruence between phylogenetic position and trophic grouping is a rule for haplochromines in general remains to be examined. Certainly, multiple microsatellite loci will have to be analyzed in order to generate more reliable and independent data sets.

          Haci    Hala    Hani    Hany    Hapl    Hapy    Hasa    Havl    Haxe
Habe    -0.086  -0.030  -0.016   0.030  -0.050   0.088  -0.095  -0.008   0.072
Haci             0.051   0.088   0.158   0.024   0.117  -0.015   0.124   0.234
Hala                    -0.078  -0.013  -0.079  -0.023   0.017  -0.018   0.248
Hani                            -0.046  -0.078   0.040   0.009  -0.034   0.152
Hany                                    -0.029   0.157   0.023  -0.006   0.083
Hapl                                             0.036  -0.024  -0.022   0.141
Hapy                                                     0.168   0.141   0.547
Hasa                                                            -0.001   0.033
Havl                                                                     0.101

FIGURE 6 Distance matrix obtained with microsatellite allele frequency data for the 10 Haplochromis species shown in Fig. 5. Nei's identity method (Nei, 1972) was used for the generation of the matrix with the program microsat 1.4 (Goldstein et al., 1995).

FIGURE 7 Neighbor-joining tree (Saitou and Nei, 1987) constructed using the distance matrix from Fig. 6 as input data for the PHYLIP distance algorithm (Felsenstein, 1986-1993). (Tree not reproduced; taxa listed from top to bottom: Haplochromis laparogramma, H. pyrrhocephalus, H. plagiodon, H. nigricans, H. nyererei, H. velifer, H. beadlei, H. cinctus, H. sauvagei, and H. xenognathus. Scale axis: relative length, 0 to 0.15.)

V. Critical Evaluation Using RAPD and Microsatellite Allele Frequencies for the Reconstruction of Cichlid Fish Phylogeny
The recent adaptive radiation of cichlid fishes in Lake Malawi and Lake Victoria has produced closely related species flocks. The reconstruction of their phylogeny requires new methods capable of resolving genetic distances generated within short time spans. Because the available markers (mtDNA, allozymes, morphology) have clarified the Lake Victoria and Lake Malawi cichlid phylogeny only marginally, two different approaches, both based on nuclear DNA markers, were studied. The goal was to test the validity of current hypotheses on cichlid fish relationships.
The RAPD-based sequence comparison requires relatively few samples from the species under consideration, and data collection and analysis are comparatively easy to carry out. The chance of finding interspecies variation in the set of random sequences is high. A prerequisite of the method as described here, however, is complete lineage sorting of the particular RAPD marker. To distinguish young species, therefore, frequency data are necessary. Subcloning and sequencing can be performed by established methods, but are time-consuming. Sequence data provide two types of characters, substitutions and indels, both of which can be used in separate phylogenetic analyses. The results obtained thus far agree well with previously reported molecular data and support the use of this method for molecular evolutionary studies of cichlid fishes. Yet, because some conflicting hypotheses (e.g., the position of Cyphotilapia with respect to the other east African lake cichlids) could not be clearly resolved, the number of RAPD markers has to be increased in order to obtain phylogenetic trees with higher bootstrap support values. The likelihood of detecting synapomorphic characters between related species increases with the time that has passed since species separation. In this respect, the evolutionarily young (less than 2 MY) species flocks from Lake Malawi and Lake Victoria, which are the interesting ones regarding the speciation process, will require many more nuclear DNA loci and sequencing. For the Lake Victoria haplochromines, the feasibility of obtaining such markers has been shown with the RAPD approach.
The microsatellite approach makes use of the high mutation rate of short tandemly repeated sequences. Therefore, this method is more suitable for the determination of relationships between closely related species like the haplochromines of Lake Malawi and Lake Victoria. Once the polymorphic loci have been identified and the variation has been shown to be due to the repeat numbers, allele typing can be performed much more quickly than with sequencing approaches. However, because the method employs frequency data, the requirement for large sample sizes per species may be a major obstacle to generating reliable data sets. It has been shown for di- to hexanucleotide microsatellite loci that the variance of the stepwise-weighted genetic distance does not change significantly when more than 25 individuals per species are used (Shriver et al., 1995). This number thus defines the preferred sample size. Two limiting factors hinder the use of microsatellite
allele size typing: First, at oligomorphic loci the allele frequency distributions are similar in all species because the time for generation of variability has been too short or polymorphism has predated the speciation process. Second, in the case of convergence the allele frequency distributions of the species are similar because the process of generation of variability has reached an equilibrium state in all the species.
In addition, to use the microsatellite approach effectively, several theoretical considerations have to be resolved: First, there is still uncertainty concerning the mechanism generating the variability (Schlötterer and Tautz, 1992; DiRienzo et al., 1994). Thus, the available models might have to be refined once the mechanism of generation of repeat number variation has been elucidated. Second, the possibility of selection acting on microsatellite repeat number cannot be excluded and may lead to inconsistent results when loci are compared. Third, ignorance of interspecific hybridization events introduces a high degree of uncertainty concerning the topology of the phylogenetic tree. Fourth, errors due to inadequate sample size and possible kinships between cichlid fish taxa are difficult to evaluate and may bias the observed genetic distances between species.
In general, RAPD and the microsatellite approach are both able to detect polymorphism between closely related taxonomic groups. With respect to cichlid phylogeny, RAPD can be primarily applied to genera under comparison. In contrast, the microsatellite method should be applied to the species and population level. Despite the unresolved problems with microsatellites, it is the authors' opinion that they are the best tool so far among all the available methods for studying cichlid phylogeny. Nonetheless, the search for additional polymorphic nuclear DNA markers should be continued because these will provide excellent markers for testing the validity of phylogenetic hypotheses.
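The microsatellite workflow evaluated in this chapter (fragment sizes converted to repeat numbers, repeat numbers to allele frequencies, and frequencies to a genetic distance) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reimplementation of microsat 1.4, and it does not attempt to reproduce the values in Fig. 6. The function names are hypothetical, the flanking length of 59 bp is inferred here from the reported fragment sizes (75 to 93 bp) and repeat numbers (8 to 17) rather than stated by the authors, and the distance shown is the plain form of Nei's (1972) standard distance, D = -ln I, without any sampling correction. The sample sizes are those given in the text.

```python
import math
from collections import Counter

REPEAT_UNIT_BP = 2      # CA dinucleotide repeat
ASSUMED_UNIQUE_BP = 59  # assumption: inferred from 75 bp = 8 repeats, 93 bp = 17

# Numbers of individuals per species, as reported in the text.
SAMPLE_SIZES = {
    "H. beadlei": 12, "H. cinctus": 19, "H. laparogramma": 12,
    "H. nigricans": 15, "H. nyererei": 40, "H. plagiodon": 19,
    "H. pyrrhocephalus": 43, "H. sauvagei": 17, "H. velifer": 81,
    "H. xenognathus": 29,
}


def repeat_units(fragment_bp, unique_bp=ASSUMED_UNIQUE_BP):
    """Repeat number = (product size - unique nucleotides) / repeat length."""
    repeat_bp = fragment_bp - unique_bp
    if repeat_bp < 0 or repeat_bp % REPEAT_UNIT_BP:
        raise ValueError(f"fragment of {fragment_bp} bp does not fit the locus")
    return repeat_bp // REPEAT_UNIT_BP


def allele_frequencies(alleles):
    """Relative frequency of each allele (repeat number) in a sample."""
    counts = Counter(alleles)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}


def nei_standard_distance(freq_x, freq_y):
    """Nei's (1972) standard genetic distance, D = -ln(Jxy / sqrt(Jx * Jy))."""
    alleles = set(freq_x) | set(freq_y)
    jxy = sum(freq_x.get(a, 0.0) * freq_y.get(a, 0.0) for a in alleles)
    jx = sum(f * f for f in freq_x.values())
    jy = sum(f * f for f in freq_y.values())
    if jxy == 0.0:
        return math.inf  # no shared alleles
    return -math.log(jxy / math.sqrt(jx * jy))


def adequately_sampled(sample_sizes, minimum=25):
    """Species whose sample reaches the preferred size (Shriver et al., 1995)."""
    return sorted(name for name, n in sample_sizes.items() if n >= minimum)


if __name__ == "__main__":
    # Made-up fragment sizes (bp) for two hypothetical samples.
    sample_a = [repeat_units(bp) for bp in (79, 79, 81, 83)]
    sample_b = [repeat_units(bp) for bp in (79, 81, 81, 85)]
    d = nei_standard_distance(allele_frequencies(sample_a),
                              allele_frequencies(sample_b))
    print(f"Nei's D between the two samples: {d:.3f}")
    print("Species with n >= 25:", ", ".join(adequately_sampled(SAMPLE_SIZES)))
```

Applied to the sample sizes listed for the DXTUCA15 locus, the last function shows that only four of the ten species reach the preferred sample size, which illustrates the practical weight of the sampling requirement discussed above.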
Acknowledgments
We thank Jan Klein for critical reading of the manuscript and helpful suggestions, Herbert Tichy for discussions on the cichlid species and providing the samples, Eric Minch from the Department of Genetics, Stanford University, for the microsat 1.4 program and help in getting it started, and Lynne Yakes for editorial assistance.
References
Ayliffe, M. A., Lawrence, G. J., Ellis, J. G., and Pryor, A. J. 1994. Heteroduplex molecules formed between allelic sequences cause nonparental RAPD bands. Nucleic Acids Res. 22:1632-1636.
Bardakci, F., and Skibinski, D. O. F. 1994. Application of the RAPD technique in tilapia fish: Species and subspecies identification. Heredity 73:117-123.
Bowcock, A. M., Ruiz-Linares, A., Tomfohrde, J., Minch, E., Kidd, J. R., and Cavalli-Sforza, L. L. 1994. High resolution of human evolutionary trees with polymorphic microsatellites. Nature 368:455-457.
Bowditch, B. M., Albright, D. G., Williams, J. G. K., and Braun, M. J. 1994. Use of randomly amplified polymorphic DNA markers in comparative genome studies. Meth. Enzymol. 224:294-309.
Bowers, N., Stauffer, J. R., and Kocher, T. D. 1994. Intra- and interspecific mitochondrial DNA sequence variation within two species of rock-dwelling cichlids (Teleostei: Cichlidae) from Lake Malawi, Africa. Mol. Phylogenet. Evol. 3(1):75-82.
Brown, W. M., George, M., Jr., and Wilson, A. C. 1979. Rapid evolution of animal mitochondrial DNA. Proc. Natl. Acad. Sci. USA 76:1967-1971.
Buchanan, F. C., Adams, L. J., Littlejohn, R. P., Maddox, J. F., and Crawford, A. M. 1994. Determination of evolutionary relationships among sheep breeds using microsatellites. Genomics 22:397-403.
Charlesworth, B., Sniegowski, P., and Stephan, W. 1994. The evolutionary dynamics of repetitive DNA in eukaryotes. Nature 371:215-220.
Cichocki, F. P. 1976. "Cladistic History of Cichlid Fishes and Reproductive Strategies of the American Genera Acarichthys, Biotodoma and Geophagus," Vol. 1. Ph.D. thesis, University of Michigan, Ann Arbor, MI.
Cohen, A. S., Soreghan, M. J., and Scholz, C. A. 1993. Estimating the age of formation of lakes: An example from Lake Tanganyika, East African rift system. Geology 21:511-514.
Deka, R., Jin, L., Shriver, M. D., Yu, L. M., Decroo, S., Hundrieser, J., Bunker, C. H., Ferrell, R. E., and Chakraborty, R. 1995. Population genetics of dinucleotide (dC-dA)n·(dG-dT)n polymorphisms in world populations. Am. J. Hum. Genet. 56:461-474.
Devereux, J., and Haeberli, P. 1991. "Genetics Computer Group, Program manual for the GCG package, Version 7," April 1991, Madison, WI.
DiRienzo, A., Peterson, A. C., Garza, J. C., Valdes, A. M., Slatkin, M., and Freimer, N. B. 1994. Mutational processes of simple-sequence repeat loci in human populations. Proc. Natl. Acad. Sci. USA 91:3166-3170.
Ellsworth, D. L., Rittenhouse, K. D., and Honeycutt, R. L. 1993. Artifactual variation in randomly amplified polymorphic DNA banding patterns. BioTechniques 14:214-217.
Felsenstein, J. 1986-1993. "PHYLIP: Phylogenetic Inference Package Version 3.5c." University of Washington.
Franck, J. P. C., Kornfield, I., and Wright, J. M. 1994. The utility of SATA satellite DNA sequences for inferring phylogenetic relationships among the three major genera of tilapiine cichlid fishes. Mol. Phylogenet. Evol. 3:10-16.
Franck, J. P. C., Wright, J. M., and McAndrew, B. J. 1992. Genetic variability in a family of satellite DNAs from tilapia (Pisces: Cichlidae). Genome 35:719-725.
Fryer, G., and Iles, T. D. 1972. "The Cichlid Fishes of the Great Lakes of Africa." Oliver and Boyd, Edinburgh.
Gaemers, P. A. M. 1984. Taxonomic position of the Cichlidae as demonstrated by the morphology of their otoliths. Neth. J. Zool. 34:566-595.
Goldstein, D. B., Linares, A. R., Cavalli-Sforza, L. L., and Feldman, M. W. 1995. An evaluation of genetic distances for use with microsatellite loci. Genetics 139:463-471.
Greenwood, P. H. 1974. Cichlid fishes of Lake Victoria, East Africa: The biology and evolution of a species flock. Bull. Br. Mus. Nat. Hist. (Zool.) Suppl. 6:1-134.
Greenwood, P. H. 1978. A review of the pharyngeal apophysis and its significance in the classification of African cichlid fishes. Bull. Br. Mus. Nat. Hist. (Zool.) 33:297-323.
Greenwood, P. H. 1979. Towards a phyletic classification of the 'genus' Haplochromis (Pisces, Cichlidae) and related taxa. Bull. Br. Mus. Nat. Hist. (Zool.) 35:265-322.
Greenwood, P. H. 1980. Towards a phyletic classification of the 'genus' Haplochromis (Pisces, Cichlidae) and related taxa. II. The species from Lakes Victoria, Nabugabo, Edward, George, and Kivu. Bull. Br. Mus. Nat. Hist. (Zool.) 39:1-101.
Greenwood, P. H. 1984. What is a species flock? In "Evolution of Fish Species Flocks" (A. A. Echelle and I. Kornfield, eds.), pp. 13-20. University of Maine at Orono Press, Maine.
Greenwood, P. H. 1987. The genera of pelmatochromine fishes (Teleostei, Cichlidae). A phylogenetic review. Bull. Br. Mus. Nat. Hist. (Zool.) 53:139-203.
Greenwood, P. H. 1991. Speciation. In "Cichlid Fishes: Behavior, Ecology and Evolution" (M. H. A. Keenleyside, ed.), pp. 86-102. Chapman and Hall, London.
Hearne, C. M., Ghosh, S., and Todd, J. A. 1992. Microsatellites for linkage analysis of genetic traits. Trends Genet. 8:288-294.
Higgins, D. G., Bleasby, A. J., and Fuchs, R. 1992. CLUSTAL V: Improved software for multiple sequence alignment. Cabios 8:189-191.
Jeffreys, A. J., Royle, N. J., Wilson, V., and Wong, Z. 1988. Spontaneous mutation rates to new length alleles at tandem-repetitive hypervariable loci in human DNA. Nature 332:278-281.
Kaufman, L., and Liem, K. F. 1982. Fishes of the suborder Labroidei (Pisces: Perciformes): Phylogeny, ecology and evolutionary significance. Breviora 472:1-19.
Kellogg, K. A., Markert, J. A., Stauffer, J. R., Jr., and Kocher, T. D. 1995. Microsatellite variation demonstrates multiple paternity in lekking cichlid fishes from Lake Malawi, Africa. Proc. R. Soc. Lond. B 260:79-84.
Kimura, M. 1980. A simple method for estimating evolutionary rate of base substitutions through comparative studies of nucleotide sequences. J. Mol. Evol. 16:111-120.
Klein, D., Ono, H., O'Huigin, C., Vincek, V., Goldschmidt, T., and Klein, J. 1993. Extensive MHC variability in cichlid fishes of Lake Malawi. Nature 364:330-334.
Klein, J., Klein, D., Figueroa, F., and O'Huigin, C. 1997. Major histocompatibility complex genes in the study of fish phylogeny. In "Molecular Systematics of Fishes" (T. D. Kocher and C. A. Stepien, eds.). Academic Press, San Diego.
Kocher, T. D., Conroy, J. A., McKaye, K. R., and Stauffer, J. R. 1993. Similar morphologies of cichlid fish in Lakes Tanganyika and Malawi are due to convergence. Mol. Phylogenet. Evol. 2:158-165.
Kocher, T. D., Conroy, J. A., McKaye, K. R., Stauffer, J. R., and Lockwood, S. F. 1995. Evolution of NADH dehydrogenase subunit 2 in east African cichlid fish. Mol. Phylogenet. Evol. 4(4):420-432.
Kornfield, I. L. 1978. Evidence for rapid speciation in African cichlid fishes. Experientia 34:335-336.
Kornfield, I. L. 1991. Genetics. In "Cichlid Fishes: Behavior, Ecology and Evolution" (M. H. A. Keenleyside, ed.), pp. 103-150. Chapman and Hall, London.
Kornfield, I. L., Ritte, U., Richler, C., and Wahrman, J. 1979. Biochemical and cytological differentiation among cichlid fishes of the Sea of Galilee. Evolution 33:1-14.
LeBlanc-Straceski, J. M., Montgomery, K. T., Kissel, H., Murtaugh, L., Tsai, P., Ward, D. C., Krauter, K. S., and Kucherlapati, R. 1994. Twenty-one polymorphic markers from human chromosome 12 for integration of genetic and physical maps. Genomics 19:341-349.
Levinson, G., and Gutman, G. 1987. Slipped-strand mispairing: A major mechanism for DNA sequence evolution. Mol. Biol. Evol. 4(3):203-221.
Litt, M., Hauge, X., and Sharma, V. 1993. Shadow bands seen when typing polymorphic dinucleotide repeats: Some causes and cures. BioTechniques 15(2):280-284.
Livingstone, D. A. 1980. Environmental changes in the Nile headwaters. In "The Sahara and the Nile" (M. A. J. Williams and H. Faure, eds.), pp. 339-359. Balkema, Rotterdam.
McAndrew, B. J., and Majumdar, K. C. 1983. Tilapia stock identification using electrophoretic markers. Aquaculture 30:249-261.
McAndrew, B. J., and Majumdar, K. C. 1984. Evolutionary relationships within three Tilapiine genera (Pisces: Cichlidae). Zool. J. Linn. Soc. 80:421-435.
McKaye, K. R., Kocher, T., Reinthal, P., and Kornfield, I. 1982. Genetic analysis of a sympatric sibling species complex of Petrotilapia Trewavas (Cichlidae, Lake Malawi). Zool. J. Linn. Soc. 76:91-96.
Meyer, A. 1987. Phenotypic plasticity and heterochrony in Cichlasoma managuense (Pisces, Cichlidae) and their implications for speciation in cichlid fishes. Evolution 41(6):1357-1369.
Meyer, A. 1993. Phylogenetic relationships and evolutionary processes in East African cichlid fishes. Trends Ecol. Evol. 8:279-284.
Meyer, A., Kocher, T. D., Basasibwaki, P., and Wilson, A. C. 1990. Monophyletic origin of Lake Victoria cichlid fishes suggested by mitochondrial DNA sequences. Nature 347:550-553.
Moran, P., and Kornfield, I. 1993. Retention of an ancestral polymorphism in the Mbuna species flock (Teleostei: Cichlidae) of Lake Malawi. Mol. Biol. Evol. 10(5):1015-1029.
Muralidharan, K., and Wakeland, E. K. 1993. Concentration of primer and template qualitatively affects products in random-amplified polymorphic DNA PCR. BioTechniques 14(3):362-364.
Nei, M. 1972. Genetic distance between populations. Am. Nat. 949:283-292.
Nishida, M. 1991. Lake Tanganyika as an evolutionary reservoir of old lineages of East African cichlid fishes: Inferences from allozyme data. Experientia 47:974-979.
Oliver, M. K. 1984. "Systematics of African Cichlid Fishes; Determination of the Most Primitive Taxon, and Studies on the Haplochromines of Lake Malawi (Teleostei: Cichlidae)." Ph.D. thesis, Yale University, New Haven, CT.
Ono, H., O'Huigin, C., Tichy, H., and Klein, J. 1993. Major-histocompatibility-complex variation in two species of cichlid fishes from Lake Malawi. Mol. Biol. Evol. 10:1060-1072.
Orita, M., Iwahana, H., Kanazawa, H., Hayashi, K., and Sekiya, T. 1989a. Detection of polymorphisms of human DNA by gel electrophoresis as single-strand conformation polymorphism. Proc. Natl. Acad. Sci. USA 86:2766-2770.
Orita, M., Suzuki, Y., Sekiya, T., and Hayashi, K. 1989b. Rapid and sensitive detection of point mutations and DNA polymorphisms using the polymerase chain reaction. Genomics 5:874-879.
Owen, R. B., Crossley, R., Johnson, T. C., Tweddle, D., Kornfield, I., Davison, S., Eccles, D. H., and Engstrom, D. E. 1990. Major low levels of Lake Malawi and their implications for speciation rates in cichlid fishes. Proc. R. Soc. Lond. B 240:519-553.
Pellegrin, J. 1904. Contribution à l'étude anatomique, biologique et taxonomique des poissons de la famille des cichlidés. Mém. Soc. Zool. Fr. 16:41-402.
Poll, M. 1986. Classification des Cichlidae du lac Tanganyika: Tribus, genres et espèces. Mém. Acad. R. Belg. Cl. Sci. 45:5-163.
Postlethwait, J. H., Johnson, S. L., Midson, C. N., Talbot, W. S., Gates, M., Ballinger, E. W., Africa, D., Andrews, R., Carl, T., Eisen, J. S., Home, S., Kimmel, C. B., Hutchinson, M., Johnson, M., and Rodriguez, A. 1994. A genetic linkage map for the zebrafish. Science 264:699-703.
Regan, C. T. 1906. A revision of the fishes of the South American cichlid genera Cichla, Chaetobranchus and Chaetobranchopsis, with notes on the genera of the American Cichlidae. Ann. Mag. Nat. Hist. 7:230-239.
Regan, C. T. 1920. The classification of the fishes of the family Cichlidae. I. The Tanganyikan genera. Ann. Mag. Nat. Hist. 9:33-53.
Ribbink, A. J. 1991. Distribution and ecology of the cichlids of the
4. Reconstruction of Cichlid Phylogeny
African Great Lakes. In "Cichlid Fishes: Behavior, Ecology and Evolution" (M. H. A. Keenleyside, ed.), pp. 36-59. Chapman and Hall, London. Riley, J., Butler, R., Ogilvie, D., Finniear, R., Jenner, D., Powell, S., Anand, R., Smith, J. C., and Markham, A. F. 1990. A novel rapid method for the isolation of terminal sequence from yeast artificial chromosome (YAC) clones. Nucleic Acids Res. 18:2887-2890. Roy, M. S., Geffen, E., Smith, D., Ostrander, E. A., and Wayne, R. K. 1994. Patterns of differentiation and hybridization in North American wolflike canids, revealed by analysis of microsatellite loci. Mol. Biol. Evol. 11(4):553-570. Sage, R. D., Loiselle, P. V., Basasibwaki, P., and Wilson, A. C. 1984. Molecular versus morphological change among cichlid fishes of Lake Victoria. In "Evolution of Fish Species Flocks" (A. A. Echelle and I. Kornfield, eds.), pp. 185-197. University of Maine at Orono Press. Maine. Sage, R. D., and Selander, R. K. 1975. Trophic radiation through polymorphism in cichlid fishes. Proc. Natl. Acad. Sci. USA 72: 46694673. Saiki, R. K., Gelfland, D. H., Stoffel, S., Scharf, S. J., Higuchi, I. G., Horn, G. T., Mullis, K. B., and Erlich, H. A. 1988. Primer directed enzymatic amplification of DNA with a thermostable DNA polymerase. Science 239: 487-491. Saitou, N., and Nei, M. 1987. The neighbor-joining method: A new method for reconstructing phylogenetic trees. Mol. Biol. Evol. 4: 406-425. Sambrook, J., Fritsch, E. F., and Maniatis, T. 1989. "Molecular Cloning: A Laboratory Manual." Cold Spring Harbor Press, Cold Spring Harbor, NY. Sanger, F., Nicklen, S., and Coulson, A. R. 1977. DNA sequencing with chain-terminating inhibitors. Proc. Natl. Acad. Sci. USA 74: 5463-5467. Schliewen, U. K., Tautz, D., and P~i~ibo, S. 1994. Sympatric speciation suggested by monophyly of crater lake cichlids. Nature 368:629632. Schl6tterer, C., and Tautz, D. 1992. Slippage synthesis of simple sequence DNA. Nucleic Acids Res. 20(2):211-215. Scribner, K. T., Arntzen, J. 
W., and Burke, T. 1994. Comparative analysis of intra- and interpopulation genetic diversity in Bufo bufo, using allozyme, single-locus microsatellite, minisatellite, and multilocus data. Mol. Biol. Evol. 11(5):737-748. Serikawa, T., Montagutelli, X., Simon-Chazottes, D., and Gu6net, J.-L. 1992. Polymorphisms revealed by PCR with single, shortsized, arbitrary primers are reliable markers for mouse and rat gene mapping. Mamm. Genome 3:65- 72. Seyoum, S. 1989. "Stock Identification and the Evolutionary Relationship of the Genera Oreochromis, Sarotherodon and Tilapia (Pisces: Cichlidae) Using Allozyme Analysis and Restriction Endonuclease Analysis of Mitochondrial DNA." Ph.D. thesis, University of Waterloo, Waterloo, Ontario, Canada. Shriver, M. D., Jin, L., Boerwinkle, E., Deka, R., Ferrell, R. E., and Chakraborty, R. 1995. A novel measure of genetic distances for highly polymorphic tandem repeat loci. Mol. Biol. Evol. 12(5): 914-920. Slatkin, M. 1995. A measure of population subdivision based on microsatellite allele frequencies. Genetics 139: 457-462. Smith, J. J., Scott-Craig, J. S., Ledbetter, J. R., Bush, G. L., Roberts, D. L., and Fulbright, D. W. 1994. Characterization of random amplified polymorphic DNA (RAPD) products from Xanthomonas campestris and some comments on the use of RAPD products in phylogenetic analysis. Mol. Phylogenet. Evol. 3(2):135-145.
51
Sodsuk, P., and McAndrew, B. J. 1991. Molecular systematics of three tilapiine genera Tilapia, Sarotherodon and Oreochromis using allozyme data. J. Fish Biol. 39:301-308. Stiassny, M. L. J. 1987. Cichlid familial interrelationships and the placement of the neotropical genus Cichla (Perciformes, Labroidei). J. Nat. Hist. 21:1311-1331. Stiassny, M. L. J. 1991. Phylogenetic intrarelationships of the family Cichlidae: An Overview. In "Cichlid Fishes: Behavior, Ecology and Evolution" (M. H. A. Keenleyside, ed.), pp. 1-35. Chapman and Hall, London. Sturmbauer, C., and Meyer, A. 1992. Genetic divergence, speciation and morphological stasis in a lineage of African cichlid fishes. Nature 358:578-581. Sturmbauer, C., and Meyer, A. 1993. Mitochondrial phylogeny of the endemic mouthbrooding lineages of cichlid fishes from Lake Tanganyika in Eastern Africa. Mol. Biol. Evol. 10:751-768. Sturmbauer, C., Verheyen, E., and Meyer, A. 1994. Mitochondrial phylogeny of the Lamprologini, the major substrate spawning lineage of cichlid fishes from Lake Tanganyika in Eastern Africa. Mol. Biol. Evol. 11:691-703. S(iltmann, H., Mayer, W. E., Figueroa, F., Tichy, H., and Klein, J. 1995. Phylogenetic analysis of cichlid fishes using nuclear DNA markers. Mol. Biol. Evol. 12(6): 1033-1047. Swofford, D. L. 1993. "PAUP: Phylogenetic Analysis Using Parsimony, Version 3.1.1." Computer program distributed by the Illinois Natural History Survey, Champaign, Ill. Trewavas, E. 1973. On the cichlid fishes of the genus Pelmatochromis with a proposal of a new genus for P. congicus; on the relationship between Pelmatochromis and Tilapia and the recognition of Sarotherodon as a distinct genus. Bull. Br. Mus. Nat. Hist. (Zool.) 26: 331-419. Trewavas, E. 1983. Tilapiine fishes of the genera Sarotherodon, Oreochromis and Danakilia. Br. Mus. (Nat. Hist.) Lond. Valdes, A. M., Slatkin, M., and Freimer, N. B. 1993. Allele frequencies at microsatellite loci: The stepwise mutation model revisited. 
Genetics 133: 737- 749. Vandewalle, P. 1971. Comparaison ost6ologique et myologique de cinq Cichlidae Africains et Sud-Americains. Ann. Soc. R. Zool. Belg. 101:259-292. Weber, J. L., and Wong, C. 1993. Mutation of human short tandem repeats. Hum. Mol. Genet. 2:1123-1128. Welsh, J., and McClelland, M. 1990. Fingerprinting genomes using PCR with arbitrary primers. Nucleic Acids Res. 18:7213-7218. Williams, J. G. K., Kubelik, A. R., Livak, K. J., Rafalski, J. A., and Tingey, S. V. 1990. DNA polymorphisms amplified by arbitrary primers are useful as genetic markers. Nucleic Acids Res. 18: 6531-6535. Witte, F., and van Oijen, M. J. P. 1990. Taxonomy, ecology and fishery of Lake Victoria haplochromine trophic groups. Zool. Verh. Leiden 262:1-47. Wooster, R., Cleton-Jansen, A.-M., Collins, N., Mangion, J., Cornelis, R. S., Cooper, C. S., Gusterson, B. A., Ponder, B. A. J., Von Deimling, A., Wiestler, O. D., Cornelisse, C. J., Devilee, P., and Stratton, M. R. 1994. Instability of short tandem repeats (microsatellites) in human cancers. Nat. Genet. 6:152-156. Wright, J. M. 1989. Nucleotide sequence, genomic organization and evolution of a major repetitive DNA family in tilapia (Oreochromis mossambicus/hornorum). Nucleic Acids Res. 17:5071-5079. Zihler, F. 1982. Gross morphology and configuration of digestive tracts of Cichlidae (Teleostei, Perciformes): Phylogenetic and functional significance. Neth. J. Zool. 32:544-571.
CHAPTER 5
Biogeographic Analysis of Pacific Trout (Oncorhynchus mykiss) in California and Mexico Based on Mitochondrial DNA and Nuclear Microsatellites
JENNIFER L. NIELSEN and MONIQUE C. FOUNTAIN
USDA Forest Service Pacific Southwest Research Station and Hopkins Marine Station, Department of Biology, Stanford University, Pacific Grove, California 93950
JONATHAN M. WRIGHT
Marine Gene Probe Laboratory, Department of Biology, Dalhousie University, Halifax, Nova Scotia, Canada B3H 4J1

I. Introduction
At the turn of the century, the Pacific basin trout were traditionally classified as members of the Atlantic lineage Salmo, based on analyses of morphology, life history characteristics, and iteroparity in the Pacific trout that were lacking in other Pacific salmon (Oncorhynchus spp.). The current reclassification of Pacific steelhead, cutthroat, and rainbow trout into the genus Oncorhynchus was based on new morphological characters and associations drawn from molecular genetic data (Smith and Stearley, 1989). As early as 1914, Regan had suggested that the Pacific trout were more closely related to the Pacific salmon (Oncorhynchus) than to the European Salmo. Based on osteological characters, Vladykov (1963) recognized that Pacific basin trout were separable from Atlantic basin Salmo, and Behnke (1965) first reported the near morphological identity of O. mykiss (Asiatic trout) and S. gairdnerii (North American rainbow trout). Genetic and morphological characters reported in many studies have confirmed the Pacific trout as true members of Oncorhynchus (Behnke, 1968; Utter et al., 1973; Kendall and Behnke, 1984; Thomas et al., 1986; Stearley and Smith, 1993; Utter and Allendorf, 1994; see also Phillips and Oakley, 1997). The popular terms "salmon" and "trout" are now generally thought to refer to a flexibility in life history pattern that has evolved independently among separate monophyletic groups, the Pacific Oncorhynchus [i.e., anadromous steelhead and freshwater rainbow trout O. mykiss; anadromous sockeye salmon (O. nerka) and resident kokanee; sea-run and resident cutthroat trout O. clarki], and the Atlantic Salmo (i.e., anadromous and landlocked Atlantic salmon, S. salar; anadromous and resident brown trout, S. trutta). Similar trade-offs in life history traits are also found within Salvelinus (i.e., lacustrine and anadromous char, S. alpinus), suggesting that this flexibility in life history may
be a characteristic with roots ancestral to the split between Salmo and Oncorhynchus (Stearley and Smith, 1993; Foote et al., 1994). Genetic studies have revealed cryptic population structure due to behavior or life history variation that was not obvious from other types of analyses (Bowen et al., 1993; Bowcock et al., 1994). The reclassification of all Pacific anadromous steelhead and resident rainbow trout as O. mykiss has, therefore, led to significant controversy over the taxonomic status and genetic identity of the many subgroups of trout found throughout western North America (Behnke, 1992). Specific interest has evolved around the position of the California golden trout, the McCloud rainbow trout, Baja rainbow trout, the Eagle Lake rainbow trout, and the interior "redband" trout in the lineage of O. mykiss. The first genetic data used to support biogeographic separation of western trout into two major subgroups came from a study of allozymes via electrophoresis analyses conducted by Allendorf (1975). This study documented the geographical separation of western trout around the Cascade Crest (Pacific Crest), dividing O. mykiss into "inland" and "coastal" populations. Allendorf (1975) showed that allozyme allelic frequency differences separated inland and coastal groups of O. mykiss longitudinally over a broad geographic area throughout the western United States. Subsequent molecular studies conducted on the North American coastal distributions of O. mykiss supported genetic similarities between both resident (rainbow trout) and anadromous (steelhead) forms of coastal Pacific trout within geographically proximate locations (Utter et al., 1973; Okazaki, 1984; Parkinson, 1984; Currens et al., 1990; Gall et al., 1990; Reisenbichler et al., 1992). DNA analyses of the intraspecific genetic diversity in coastal O. 
mykiss confirmed the genetic similarity of resident and anadromous life history forms of trout from proximate geographic areas (Wilson et al., 1985; Thomas and Beckenbach, 1989) and have shown significant biogeographic structure at the southern extent of the range (Nielsen et al., 1994b). The latter study used mitochondrial DNA (mtDNA) and nuclear microsatellites to demonstrate a high degree of population differentiation and levels of genetic diversity that were unprecedented for this species. This unique level of genetic diversity found in southern steelhead has been confirmed by allozyme analyses of California coastal stocks by the National Marine Fisheries Service for their scientific status review resulting from a petition for Federal listing of the Pacific steelhead under the Endangered Species Act (Dr. R. Waples, personal communications, National Marine Fisheries Service, Seattle, WA).
DNA studies of Pacific salmonids initially concentrated on mitochondrial DNA markers due to the relatively rapid rate of evolution in this maternally inherited molecule, the ease of extraction and amplification of mtDNA, and a significant literature on the theory and application of mtDNA sequence analyses available to researchers by the end of the 1980s (Avise et al., 1987, and literature therein). Controversy has evolved over the degree and level of phylogenetic resolution available with mtDNA markers due to demonstrated variability in mutation rates for individual parts of the molecule among different taxa and possible saturation of base-point mutations in highly polymorphic regions (Avise et al., 1987, 1994b; Hillis, 1995). Despite such arguments, this molecule has played an important role in high-resolution analyses of population structure in closely related vertebrate groups (Moritz et al., 1987; Stoneking et al., 1991; Avise, 1994 and references therein; Avise et al., 1994a; Moritz et al., 1995). The development of simple protocols for the detection and amplification of short repetitive DNA sequences (i.e., microsatellites; Miklos, 1985; Tautz, 1989; Weber and May, 1989; Moore et al., 1991) provides access to new molecular tools derived from the nuclear genome with unusually high levels of intraspecific polymorphism. Short repetitive DNAs are common throughout the eukaryotic genome, have exceptionally high mutation rates, and generally provide large numbers of alleles useful for the reconstruction of closely related phylogenetic groups (Kelly et al., 1991; Henderson and Petes, 1992; Queller et al., 1993; Estoup et al., 1993; Bowcock et al., 1994). Polymerase chain reaction (PCR) amplification of microsatellites has provided an alternative molecular approach for the analysis of groups sharing recent evolutionary divergence (Burke et al., 1989; Bruford and Wayne, 1993; Queller et al., 1993; Ellegren, 1995). 
Nuclear microsatellite loci have, in general, provided a degree of analysis not previously available at the intraspecific level from mtDNA or allozymes (Bowcock et al., 1994; Goldstein et al., 1995; FitzSimmons et al., 1995; Nielsen, 1996). The function and biochemical mechanisms underlying mutation of simple sequence repeat loci, however, remain unknown and controversial (Long and David, 1980; Di Rienzo et al., 1994). One theoretical mechanism of mutation has been proposed for the microsatellite class of tandem repeats: a stepwise mutation process in which an allele mutates up or down by a small number of nucleotide repeat units (Schlötterer and Tautz, 1992). Variations on the stepwise mutation model underlie two recently developed genetic distance measures designed specifically for microsatellite loci (Goldstein et al., 1995; Slatkin, 1995). These distance measures are closely related in their analytical techniques, but are based on different conceptual interpretations of the stepwise mechanisms leading to repeat polymorphisms. Goldstein et al. (1995) used a strict (single-step) stepwise mutation model to analyze variation in the number of repeats found within a simple DNA sequence. Slatkin (1995), however, developed a two-phase mutation model introduced by Di Rienzo et al. (1994), which allows replication or deletion of more than one repeat unit as a single mutation event. Under the two-phase model, single-step mutations (involving only one repeat unit) are thought to be the most common elements of change, but events involving larger groups of repeat units, inserted or deleted as a single mutational element, are possible (Di Rienzo et al., 1994). Despite the fact that mutational mechanisms in repetitive DNA remain an open question, microsatellite markers have proven useful in many vertebrate population studies (Bruford and Wayne, 1993; Wright, 1993; Bowcock et al., 1994; Morin et al., 1994a,b; Nielsen et al., 1994b; Wright and Bentzen, 1994; Spencer et al., 1995; Gerloff et al., 1995). To date, however, no empirical studies have looked at the implications of the different analytical approaches to microsatellite distance data.
Phylogenies based on single genes or short sequence loci, especially among closely related taxa, can be discordant with organismal phylogenies (Weller et al., 1994). Discrepancies between an individual gene tree and the true phylogeny of an organism can arise from lineage-sorting processes or allelic introgression between closely related groups (Neigel and Avise, 1986; Pamilo and Nei, 1988). The degree of phylogenetic congruence available among independent genetic markers has become an important issue in the interpretation of gene trees in relationship to organismal phylogenies (Birky et al., 1989; Bernatchez and Danzmann, 1993; Avise et al., 1994b; Bernatchez, 1995; Moritz et al., 1995).
Phylogenetic results derived from several independent DNA regions provide a more robust perspective on the genetic history of an individual group or population than any one gene or nucleotide sequence alone (Avise, 1994; Cummings et al., 1995). It is important, however, that the chosen gene or sequence data used to test congruence among phylogenetic information are appropriately matched to the level or degree of phylogenetic divergence in question (Graybeal 1994). This chapter compares genetic diversity for mtDNA and three independent, highly polymorphic nuclear microsatellite markers in putative wild trout and steelhead populations from California and Mexico. DNA data on trout populations from interior as well as coastal locations are presented, and the intraspecific biogeographic resolution available for O. mykiss in
California and Mexico using both mtDNA and nuclear markers is addressed. Inferences available from these molecular data concerning the status of various populations of trout and steelhead are discussed.
II. Material and Methods
A. Sampling Protocol
Coastal steelhead and interior trout (O. mykiss) were sampled noninvasively by taking fin clips (2 mm²) from 354 live fish captured within riverine habitats in California and Mexico (Fig. 1). Tissues were sent as frozen or dried samples to the authors' laboratory from 1990 to 1995 and stored at -70°C until DNA extraction and amplifications were performed. O. mykiss were sampled from stream locations where wild stocks of steelhead and trout have been reported to have received a minimum of hatchery introductions since the mid-1930s [California Department of Fish and Game (CDFG) unpublished records and personal communications; Swift et al., 1993; Gall, 1995; Titus et al., in press]. Streams and rivers included in these analyses were divided into six general geographic localities to aid in the graphic depiction of data (see Appendix I). The northern and southern regions of California were separated at the San Francisco Bay and the interior and coastal populations were separated by the western boundary of the Klamath mountains and the great valley region in the north, the Sierra Nevada range throughout central California, and the transverse range in the south. All coastal steelhead and trout in California are currently classified as O. mykiss irideus (after Behnke, 1992). The north interior collection of California trout included two putative subspecies of trout, the Eagle Lake trout (O. mykiss aquilarum) and the McCloud River redband trout (O. mykiss stonei), but probably contained a diverse mixture of populations with redband and coastal rainbow ancestry (Behnke, 1992). The south interior California trout collection was made up of three reported subspecies: the Kern River rainbow trout (O. mykiss gilberti), Little Kern River rainbow trout (O. mykiss whitei), and California golden trout from the South Fork Kern River (O. mykiss aquabonita). Mexican trout from Baja California Norte (O. mykiss nelsoni) were collected by G. Ruiz-Campos (Facultad de Ciencias, Universidad Autónoma de Baja California). Fin clips taken from trout from the Rio Yaqui basin (an undefined subspecies of O. mykiss, R. R. Miller, personal communications) were sent to the authors' laboratory by B. L. Jensen (U.S. Fish and Wildlife Service, Dexter National Fish Hatchery and Technology Center, Dexter, NM) and by Jose Campoy Favela (Centro Ecologico de Sonora, Hermosillo, Sonora, Mexico). These samples were collected from a headwaters tributary of the Rio La Cueva, a tributary of the Rio Bavispe, which is a tributary of the Rio Yaqui.

FIGURE 1 General location of DNA-sampling sites for steelhead and trout in California and Mexico used in this study. [Map not reproduced. Sites labeled on the map: steelhead Eel River; steelhead north coast; Eagle Lake rainbow; Upper Sacramento River rainbow; steelhead Russian River; McCloud River rainbow; Golden Trout Creek; steelhead SF Bay; SF Kern River golden trout; Little Kern golden trout; Kern River rainbow; central coast steelhead; southern steelhead Big Sur; southern steelhead Santa Ynez R.; southern steelhead Malibu Creek; Baja trout; Rio Yaqui trout.]

B. Mitochondrial DNA
Total genomic DNA was extracted from fin clips using Chelex-100 resin (BioRad) following the methods of Nielsen et al. (1994a). Primers used in this study (S-phe and P2) were developed by W. K. Thomas (University of Missouri, Kansas City) in the late Allan Wilson's laboratory using the methods given in Kocher et al. (1989). These primers are known to amplify a highly variable segment of the mtDNA control region in salmonids. They permit amplification and sequencing of a segment containing 188 bp of the O. mykiss mtDNA control region and 5 bp of the adjacent phenylalanine tRNA gene. 
Primer sequences, amplification and sequencing protocols, and sequence of the entire region amplified by these primers in this species can be found in Nielsen et al. (1994a).
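The control-region sequences obtained this way are compared in this chapter with the two-parameter model of Kimura (1980). The following is a minimal sketch of that distance for one pair of aligned sequences, not the software used by the authors. The function name `k2p_distance` and the choice to skip gaps and ambiguous bases are assumptions made for illustration.

```python
import math

PURINES = {"A", "G"}

def k2p_distance(seq1, seq2):
    """Kimura (1980) two-parameter distance between two aligned sequences.

    Sites where either sequence has a gap or an ambiguous base are skipped
    (an assumption of this sketch).
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    n = transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        n += 1
        if a == b:
            continue
        # Purine<->purine or pyrimidine<->pyrimidine changes are transitions.
        if (a in PURINES) == (b in PURINES):
            transitions += 1
        else:
            transversions += 1
    if n == 0:
        raise ValueError("no comparable sites")
    p = transitions / n    # proportion of transitional differences
    q = transversions / n  # proportion of transversional differences
    term1 = 1 - 2 * p - q
    term2 = 1 - 2 * q
    if term1 <= 0 or term2 <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log(term1 * math.sqrt(term2))
```

Applying the function to every pair of haplotype sequences would give a pairwise matrix of the kind a neighbor-joining program takes as input.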
C. Microsatellites
Three microsatellite loci [Omy77, Morris et al. (1996); Omy207, M. O'Connell, Marine Gene Probe Laboratory (MGPL), Dalhousie University, personal communications; and Ssa289, McConnell et al. (1995)] were chosen for this study based on their level of polymorphism in O. mykiss. Omy77 and Omy207 were developed specifically for O. mykiss at MGPL, Dalhousie University. Ssa289 was developed by MGPL for Atlantic salmon. The sequences for primers amplifying these microsatellite loci appear in the respective literature or are available by request from MGPL. For each locus, primer B was labeled according to protocols given in Nielsen et al. (1994b). The methods of Nielsen et al. (1994b) were used except that each PCR reaction contained 3.75 µl double-distilled H2O, 0.625 µl 10× PCR buffer (670 µl 1 M Tris, 67 µl 1 M MgCl2, 83 µl 2 M AmSO4, 7 µl 14 M β-mercaptoethanol, and 173 µl double-distilled H2O), 0.625 µl 10 mM dNTPs, 0.625 µl 10 µM primer A, 0.32 µl 1 µM primer B, 0.32 µl labeled B primer, and 0.03 µl (0.15 units) Taq polymerase. PCR conditions were 30 cycles of 94°C for 40 sec, 50°C for 1 min, and 72°C for 2 min. Microsatellites were run out on a 6% polyacrylamide gel. Prior to loading the gel, 5 µl of loading buffer [94% formamide, 4% 0.5 EDTA, 0.025% (w/v) both bromphenol blue and xylene cyanol FF] was added to each sample. The size of each microsatellite allele was determined by reference to the M13mp18 sequence, known DNA samples that were rerun on each gel, and a double-stranded reference marker showing the common alleles available for each microsatellite locus. Only unambiguous bands were scored, and in the case of multiple (shadow) bands, the darkest band was scored as the allele. The appearance of stutter bands which overlap between alleles was resolved by comparing the intensity and number of stutter bands for each individual at each locus (O'Reilly and Wright, 1995). To ensure consistency in both the PCR reactions and the scoring of microsatellites, 3.5% of all samples were rerun separately on different gels and scored independently by two people.
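The per-reaction volumes listed above can be scaled to a batch of reactions with simple arithmetic. The sketch below is only an illustration: the reagent volumes are those given in the text (template DNA is not listed there and is not included), while the function name `master_mix` and the 10% pipetting overage are assumptions, not part of the published protocol.

```python
# Per-reaction reagent volumes in microlitres, as listed in the text.
PER_REACTION_UL = {
    "double-distilled H2O": 3.75,
    "10x PCR buffer": 0.625,
    "10 mM dNTPs": 0.625,
    "10 uM primer A": 0.625,
    "1 uM primer B": 0.32,
    "labeled primer B": 0.32,
    "Taq polymerase": 0.03,
}

def master_mix(n_reactions, overage=0.10):
    """Return total volume (ul) of each reagent for n_reactions.

    `overage` is a hypothetical allowance for pipetting loss.
    """
    scale = n_reactions * (1 + overage)
    return {name: round(vol * scale, 3) for name, vol in PER_REACTION_UL.items()}

if __name__ == "__main__":
    # Example: reagents for one 96-well plate.
    for name, vol in master_mix(96).items():
        print(f"{name}: {vol} ul")
```

Summing the listed reagents gives 6.295 µl per reaction before template is added.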
D. Analytical Approach
A pairwise distance matrix was constructed for sequences from the mtDNA control region segment amplified by S-Phe and P2, based on the two-parameter model of Kimura (1980). Phylogenetic analysis was performed on mtDNA data using the unrooted neighbor-joining (NJ) tree procedure from PHYLIP (Felsenstein, 1991) with 1000 bootstrap replicates (Felsenstein, 1985) to assess reproducibility of the NJ mtDNA branching pattern. Previous studies have documented the biogeographic concordance associated with the mtDNA haplotypes in coastal steelhead (Nielsen et al., 1994b; Neeley, 1995). To test for differences in biogeographic distribution of genotypes using nuclear microsatellites, microsatellite data were pooled for individual trout by
known mtDNA haplotype and capture location, where the parenthetical mtDNA haplotype designation refers to the most common haplotype found in that particular geographic population (Appendix II). These geographic-haplotype groups then served as sample units for microsatellite genetic distance analyses and tree development for comparison with the mtDNA NJ tree, allowing the authors to discuss results available from microsatellite data in individual populations with documented mtDNA phylogeographic structures. The trees depicted in these analyses were not intended to reflect historic evolutionary associations among trout populations, but rather to test for genetic congruence in biogeographic data drawn from two independent molecular markers with potentially different evolutionary histories among these populations. Observed and expected values of heterozygotes were calculated for microsatellite data, and a test for Hardy-Weinberg (HW) equilibrium was performed for all populations combined according to the Fisher method described by Louis and Dempster (1987), which provided an estimate of the probability of rejecting the null hypothesis, i.e., HW equilibrium. A pairwise genetic distance matrix was calculated for allelic diversity using both the Slatkin (1995) and the Goldstein et al. (1995) methods for the three microsatellite loci combined. For the Goldstein et al. (1995) distance analyses, the authors used the program available from Dr. E. Minch, Department of Genetics, Stanford University. Rst analyses were performed using a Pascal program developed by M. C. F. that implemented Slatkin's stepwise model for distance analyses. Both distance measures assume a linear expectation of the average-squared distance for each locus (assuming no correlation between mutation rate and repeat score) and use the arithmetic average of mutation rates across loci. Statistics in both methods are equivalent to a general analysis of variance. 
Both methods compute an average sum of squares of the differences in allelic size within each population [SW in Slatkin (1995); D0 in Goldstein et al. (1995)] and the average squared difference between all possible pairs of populations (SB and D1, respectively) to obtain an estimate of variance in allele size in the total population. The basic difference between the two methods involves how they interpret the parameters of the mutation process. Slatkin's Rst [developed under the assumptions of the infinite allele model, Slatkin (1991)] used a ratio of combinations of the mean squared distance which cancels out all parameters of the mutation process [see formula 12 in Slatkin (1995)]. Goldstein et al. (1995) maintain an estimate of the mutation process under the expectation of a strict, single-step (one
repeat unit) shift for each mutation event. Distance data from both methods were used to generate a consensus neighbor-joining tree (PHYLIP, Felsenstein, 1991). One thousand bootstrap replicate trees were generated to assess the reproducibility of branching patterns found in each consensus tree. Analysis of variance (ANOVA) and factor analysis using principal components (PCA) were used to describe biogeographic associations between genotype (mtDNA or microsatellite allelic diversity) and sample location (longitude and latitude). Each factor represented a linear combination of actual mtDNA haplotype or microsatellite allelic frequencies (weighted for sample size) over all genotypes. Factor analyses were based on the variance-covariance matrix for all sampled populations such that the range of components was associated with the proportion of total variance over all locations. The first component was, therefore, associated with the greatest portion of the total variance for all genotypes over all locations, the second component had the second greatest proportion, etc. Least-squares multiple regression analyses were then used to regress the first principal component on latitude or longitude by genotype to graphically depict the correlation between sampling locality and genotype distributions.
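The within- and between-population squared differences in allele size described for the two microsatellite distance measures can be sketched as follows. This is a simplified single-locus illustration with equal sample sizes, not the Pascal program or the Stanford program mentioned above. The function names are hypothetical, and the ratio in `rst_like` follows the general form (S̄ − S_W)/S̄ from Slatkin (1995) without that paper's sample-size corrections.

```python
from itertools import combinations, product

def within_sq_diff(alleles):
    """Average squared difference in repeat number over all pairs of
    allele copies drawn from one population (S_W / D0 style quantity)."""
    pairs = list(combinations(alleles, 2))
    return sum((a - b) ** 2 for a, b in pairs) / len(pairs)

def between_sq_diff(pop_a, pop_b):
    """Average squared difference over all pairs with one allele copy
    from each population (S_B / D1 style quantity)."""
    pairs = list(product(pop_a, pop_b))
    return sum((a - b) ** 2 for a, b in pairs) / len(pairs)

def rst_like(populations):
    """Fraction of the total squared-difference in allele size that lies
    between populations: (S_bar - S_W) / S_bar."""
    s_w = sum(within_sq_diff(p) for p in populations) / len(populations)
    pooled = [a for p in populations for a in p]
    s_bar = within_sq_diff(pooled)
    return (s_bar - s_w) / s_bar

if __name__ == "__main__":
    # Hypothetical repeat counts at one locus in two populations.
    pop_a = [10, 10, 11, 11]
    pop_b = [14, 14, 15, 15]
    print(within_sq_diff(pop_a), between_sq_diff(pop_a, pop_b))
    print(rst_like([pop_a, pop_b]))
```

With several loci, each quantity would be averaged across loci before a distance matrix is built for tree construction.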
III. Results
A. Mitochondrial DNA
Three previously unpublished mtDNA control-region haplotypes, containing novel single-base mutations, were found in this survey of trout populations from California and Mexico (MYS15, MYS16, and MYS18; Table I). MYS15 was found only in golden trout from Golden Trout Creek in the Kern River basin and in Taylor Creek, a tributary to the South Fork Kern River. MYS16 was found in two tributaries of the South Fork Kern River (Fay Creek, Manter Creek), in Ramshaw Meadows on the South Fork Kern River, in Golden Trout Creek, and in Eagle Lake rainbow trout. MYS18 was unique to the trout of the Rio Yaqui basin of northwestern Mexico. Twenty-seven trout from the San Pedro Mártir basin in Baja California were monomorphic for mtDNA haplotype MYS1. All other mtDNA haplotypes found in freshwater trout samples taken from interior California rivers and streams were identical to control-region haplotypes previously reported in coastal anadromous populations (Nielsen et al., 1994b). The frequency distribution of the
JENNIFER L. NIELSEN et al.
TABLE I Mitochondrial Control Region Variable Sites and Nucleotide Changes Found in California Steelhead and Trout (Oncorhynchus mykiss) in 1990-1995 and in Two Populations of Mexican Trout from Baja California and the Rio Yaqui

                  Base pair no. (a)
mtDNA type   No.  1021  1050  1086  1103  1104  1106  1109  1147  1149
MYS1          99  T     T     A     G     A     G     G     C     T
MYS2          17  C     T     A     G     A     A     G     C     T
MYS3         108  T     T     A     G     A     A     G     C     T
MYS5          20  T     C     G     G     C     G     A     C     T
MYS6           7  T     C     G     G     C     G     A     C     C
MYS8          25  T     C     A     G     C     G     A     C     T
MYS12          7  T     C     A     G     C     G     G     C     T
MYS13          8  T     C     G     G     C     G     G     C     T
MYS15          7  T     T     A     A     A     A     G     C     T
MYS16         45  T     T     A     G     A     G     A     T     T
MYS18         11  T     T     A     G     A     G     G     C     C

(a) Base pair numbers follow those published by Digby et al. (1992). The number of fish sequenced for this study is given for each mtDNA type. Mitochondrial haplotypes MYS1-14 are equivalent to haplotypes ST1-14 previously reported in Nielsen et al. (1994b). ST4, ST7, ST9, ST10, ST11, and ST14 were represented by fewer than five confirmed samples each and were, therefore, not included in these analyses.
11 unique mtDNA haplotypes found in this study is given by general geographic location in Fig. 2. An unrooted neighbor-joining tree for control-region mtDNA sequence data summed for haplotype populations is depicted in Fig. 3. This tree divided the trout-steelhead assemblage into four groups supported with bootstrap values >50%, when considered in relationship to the Rio Yaqui trout. Steelhead mtDNA haplotypes found most frequently at the southern extent of the range, i.e., south of Point Conception (MYS5, MYS6, MYS8, and MYS13), and one northern California haplotype (MYS12) showed significant unity with bootstrap values >50%. A 68% bootstrap value supported unity between coastal steelhead (MYS3) and resident trout from the Sacramento River (MYS3), the Little Kern River (MYS3), and two golden trout haplotype populations (MYS3 and MYS15). Genetic unity between the Eagle Lake trout and California golden trout, which shared identical mtDNA haplotypes (MYS16), was supported by a bootstrap value of 83%.

FIGURE 2 Frequency distribution of Oncorhynchus mykiss mtDNA haplotypes given for six general geographic locations. Haplotype numbers are given by streams and geographic areas in Appendix I. [Bar chart of the mtDNA control region: frequency (0-1) against mtDNA haplotype (1, 2, 3, 5, 6, 8, 12, 13, 15, 16, 18) for North Coast, North Interior, South Coast, South Interior, Mexico Coast, and Mexico Interior.]

FIGURE 3 Unrooted phylogenetic tree for a 188-bp mtDNA control region segment inferred from neighbor-joining analysis (PHYLIP) of pairwise distances calculated for 11 mtDNA haplotypes found in anadromous steelhead and resident trout in California and Mexico (1990-1995). For these analyses, parenthetical mtDNA haplotype designations represented the most common haplotype found in each particular population. Bootstrap (1000 replicates) probability values are given in percentages on the tree branches; values >50% are indicated in bold type.

B. Nuclear Microsatellites
The three microsatellite loci used in this study contained dimeric repeats [Omy207 and Ssa289 poly(CA)-poly(GT), and Omy77 poly(CT)-poly(GA)], found in tracts up to 74 repeat units long, with 10-33 alleles expressed per locus (Appendix II). Frequency distributions for microsatellite alleles are given by locus and geographic area in Fig. 4. The combined allelic distribution for the three loci was found to be in Hardy-Weinberg equilibrium over the total sample population (Fisher's exact p = 0.013). The microsatellites developed specifically for O. mykiss, i.e., Omy77 (27 alleles; range 77-141 bp) and Omy207 (33 alleles; range 76-148 bp), were significantly more polymorphic in California and Mexican trout and steelhead than the Ssa289 locus developed for Atlantic salmon (10 alleles; range 89-109 bp). All three loci conformed to the expectation of a single-step allele model, with gaps in the two-base repeat sequence occurring only in the largest alleles for Omy77 and Omy207. Genetic distance measures for the three microsatellite loci combined as calculated by Slatkin (1995) and Goldstein et al. (1995) are given by haplotype population in Table II. Derived distance measures showed similar mean values for both models across all populations. Slatkin's mean Rst value was 0.207, whereas the mean Fst of Goldstein et al. was 0.205. The r of Goldstein et al. (expected duration of linearity of distance for the three loci combined) equaled 299,058 ± 14,732 generations. Neighbor-joining trees developed from microsatellite distance data using both methods were
FIGURE 4 Frequency distributions of Oncorhynchus mykiss alleles from Omy77 (A), Omy207 (B), and Ssa289 (C) microsatellite loci given by geographic area (see Appendix I for sample locations). Frequencies have been pooled by size class (each bin includes five sequential alleles) to aid in graphic resolution. [Three bar charts, one per locus: frequency against allele size (bp) for North Coast, North Interior, South Coast, South Interior, Mexico Coast, and Mexico Interior.]
TABLE II Genetic Distance Measures for Three Microsatellite Loci (Omy77, Omy207, Ssa289) from California and Mexican Trout Populations (a)

Population (haplotype)                 1       2       3       4       5       6       7       8       9      10      11      12      13      14      15      16      17      18
 1. Rio Yaqui (18)                     -    0.40    0.42    0.90    0.33    0.63    0.42    0.17    0.59    0.30    0.44    0.00    0.43    0.11    0.41    0.05    0.13    0.19
 2. CA steelhead (1)               23.10       -    0.04    0.59    0.03    0.22    0.03    0.21    0.40    0.27    0.05    0.10    0.03    0.07    0.16    0.01    0.14    0.27
 3. McCloud (1)                    41.05    4.93       -    0.59    0.04    0.09    0.05    0.28    0.34    0.29    0.05    0.14    0.05    0.01    0.17    0.09    0.18    0.30
 4. Rio Santo Domingo (1)          94.92  109.53   91.92       -    0.59    0.71    0.52    0.40    0.34    0.60    0.57    0.63    0.50    0.59    0.37    0.46    0.60    0.54
 5. CA steelhead (2)               25.96    0.25    3.19  106.21       -    0.07    0.03    0.24    0.34    0.21    0.04    0.10    0.05    0.03    0.19    0.04    0.12    0.25
 6. CA golden (3)                  32.10   15.32   11.36   73.58   12.67       -    0.16    0.02    0.41    0.32    0.17    0.14    0.20    0.31    0.04    0.38    0.08    0.16
 7. CA steelhead (3)               35.46    7.79    2.96   69.52    6.67   15.47       -    0.20    0.26    0.28    0.00    0.14    0.00    0.00    0.07    0.02    0.15    0.25
 8. Kern rainbow (3)               22.46   45.41   47.06   35.31   44.46   21.32   38.37       -    0.02    0.04    0.27    0.18    0.23    0.23    0.20    0.04    0.06    0.04
 9. Little Kern River golden (3)   56.43   64.87   53.29    6.08   62.38   38.83   37.27   15.21       -    0.36    0.31    0.26    0.27    0.22    0.00    0.02    0.25    0.19
10. Sacramento (3)                  7.28   36.89   50.56   79.73   37.98   25.16   46.73    9.56   46.23       -    0.33    0.11    0.30    0.05    0.24    0.12    0.06    0.06
11. CA steelhead (5)               44.09    8.51    3.24   84.11    7.52   22.56    1.15   52.65   49.41   60.04       -    0.13    0.02    0.00    0.09    0.06    0.19    0.30
12. CA steelhead (6)               11.68    3.64    9.03   81.47    3.89    9.55    8.22   23.75   43.24   18.94   12.77       -    0.20    0.00    0.19    0.00    0.02    0.10
13. CA steelhead (8)               42.67   10.00    3.61   73.11    8.75   20.02    0.43   46.01   41.12   56.06    0.40   12.11       -    0.06    0.12    0.09    0.20    0.28
14. CA steelhead (12)              28.43    1.22    3.21  108.65    0.48    9.81    8.52   44.15   64.09   37.59    9.87    4.80   10.95       -    0.18    0.04    0.02    0.16
15. CA steelhead (13)              55.70   34.91   21.37   29.47   32.46   27.98   10.63   32.80   13.24   59.33   15.12   26.95   10.75   34.92       -    0.11    0.18    0.20
16. CA golden (15)                 13.39   12.20   14.44   60.25   11.51    5.34   12.22   10.79   28.12   12.16   20.01    3.01   17.24   11.36   22.48       -    0.08    0.05
17. CA golden (16)                 10.00   11.70   16.64   68.54   11.45    6.57   15.21   11.83   34.02    9.29   23.17    2.63   20.78   11.25   28.72    0.43       -    0.05
18. Eagle Lake (16)                11.53   31.80   37.19   45.57   31.85   18.31   29.26    1.98   20.42    5.48   41.30   13.94   36.31   32.52   30.81    5.73    5.83       -

(a) Distance measures (Rst) obtained according to the Slatkin (1995) method using a stepwise mutation process are given above the diagonal. Distance measures calculated according to Goldstein et al. (1995) using a one-step mutation model are given below the diagonal.
FIGURE 5 Consensus unrooted phylogenetic tree for three microsatellite loci combined (Omy77, Omy207, and Ssa289) inferred from pairwise distances (Rst) resulting from microsatellite distance analysis based on Slatkin (1995) and using neighbor-joining analysis (PHYLIP) of distance values to construct the tree. Bootstrap probability values based on 1000 replicate trees developed from bootstrapping of the original Rst distance data are given in percentages on the tree branches; values >50% are indicated in bold type.
not congruent for most haplotype populations (Figs. 5 and 6). Bootstrap values >50% were rare among the microsatellite NJ branching units, making comparisons between the microsatellite and mtDNA trees difficult. No similar branching patterns were found by analyses of microsatellites that reflected the biogeographic associations developed from the authors' analyses of mtDNA haplotypes. The genetic similarity of the Rio Santo Domingo trout from Baja (MYS1), trout from the Little Kern River (MYS3), and a haplotype found only in southern steelhead (MYS13) was supported with >50% bootstrap values in both microsatellite NJ trees. In both trees, close associations among the coastal steelhead populations (with the exception of haplotype MYS13) and the McCloud River rainbow trout were supported. Only the Goldstein distance method, however, supported this association with bootstrap values >50%. Eagle Lake trout that shared a mtDNA haplotype with the South Fork Kern River golden trout (MYS16) were more closely associated with the Kern River and Sacramento River rainbow trout (both MYS3 haplotypes) using microsatellites. Neither tree supported these associations with high bootstrap values.

FIGURE 6 Unrooted phylogenetic tree for three microsatellite loci combined (Omy77, Omy207, and Ssa289) inferred from pairwise distances resulting from microsatellite distance analysis based on the Goldstein et al. (1995) single-step distance model and using neighbor-joining analysis (PHYLIP) of distance values. Bootstrap (1000 replicates) probability values developed from the Goldstein et al. (1995) program are given in percentages on the tree branches; values >50% are indicated in bold type.

C. Biogeographic Concordance

A significant correlation was observed between mtDNA haplotype variation and both latitude (ANOVA F test < 0.001) and longitude (F test = 0.01), with latitude explaining 46% of the variance within haplotypes and longitude explaining 39% of the variance. Factor analysis of mtDNA haplotype frequency showed that the first principal component explained 72% of the variation across sampling areas, whereas the second factor explained 21% of the haplotype variance. Multiple regression of the mtDNA first principal component on latitude had the highest correlation (r² = 0.74; Fig. 7), with the maximum trend detected between populations above and below 37° latitude (approximate location of Santa Cruz, CA). Regression of the first principal component on longitude gave r² = 0.62. The frequency distribution of microsatellite alleles, however, was weakly associated with longitude (F test = 0.05) and not at all with latitude (F test = 0.46). The first principal component explained 33% of the variation in microsatellite allelic frequencies across all sampling areas, whereas the second factor contributed only 9% of the variance. Multiple regression analyses of the first principal component on longitude (r² = 0.55; Fig. 8) demonstrated a maximum trend in allelic variation around 121° longitude (the approximate boundary of the Sierra Nevada Crest in north-central California). Principal components analysis of genotype distributions using both mtDNA and microsatellite data combined found that the first principal component explained 68% of the variation in genotype frequency, whereas the second factor contributed 31% of the proportionate genotype variance. Factor axis loadings for mtDNA were -0.72 (factor one) and -0.09 (factor two); for microsatellites, factor one and two axis loadings were 0.16 and 0.98, respectively.

FIGURE 7 Regression of the first principal component derived from factor analysis of mtDNA haplotype diversity on latitude (r² = 0.74). The maximum trend was detected between populations above and below 37° latitude (approximate location of Santa Cruz, CA). [Scatter plot: first principal component against degrees latitude.]
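The factor analysis and regression reported in this section can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: it runs an unweighted principal components analysis on the variance-covariance matrix of a small, hypothetical frequency table (rows are sampling locations, columns are haplotypes) and then regresses the first component on latitude. The sample-size weighting described in the methods is omitted, and all data values and function names are invented for the example.

```python
import numpy as np


def pca_first_component(freqs):
    """PCA on the variance-covariance matrix of a locations x genotypes table.
    Returns the first-component scores and the proportion of variance
    explained by each component (largest first)."""
    x = np.asarray(freqs, dtype=float)
    centered = x - x.mean(axis=0)
    cov = np.cov(centered, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]        # reorder to descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    explained = eigvals / eigvals.sum()
    scores = centered @ eigvecs[:, 0]
    return scores, explained


def r_squared(x, y):
    """Coefficient of determination for a least-squares line of y on x."""
    slope, intercept = np.polyfit(x, y, 1)
    residuals = y - (slope * x + intercept)
    return 1.0 - (residuals ** 2).sum() / ((y - y.mean()) ** 2).sum()


# Hypothetical haplotype frequencies at five locations (each row sums to 1).
freqs = np.array([
    [0.1, 0.2, 0.7],
    [0.2, 0.2, 0.6],
    [0.4, 0.3, 0.3],
    [0.7, 0.2, 0.1],
    [0.8, 0.1, 0.1],
])
latitude = np.array([30.0, 33.0, 36.0, 39.0, 41.0])  # hypothetical degrees

scores, explained = pca_first_component(freqs)
print("proportion of variance by component:", explained)
print("r^2 of first component on latitude:", r_squared(latitude, scores))
```

The sign of a principal component is arbitrary, so only the r² value, not the direction of the slope, carries over between runs or software packages.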
IV. Discussion
Biogeographic structure based on analyses of mtDNA and nuclear microsatellites proved to be noncongruent in this study, with no intraspecific phylogenetic relationships supported by both markers with significant bootstrap values. This noncongruence may be explained by the documented differences in how the two markers are associated with geography. Mitochondrial DNA haplotypes showed significant correlation with both longitude and latitude. Nuclear microsatellites, however, correlated only weakly with longitude and not at all with latitude. Although it is widely understood that data from different parts of the genome often evolve differently (Avise, 1994; Huelsenbeck et al., 1996), the influence different evolutionary processes may have on phylogeographic structure within closely related populations is not generally known.

FIGURE 8 Regression of the first principal component derived from factor analysis of microsatellite allelic frequency on longitude (r² = 0.55). The maximum trend was detected between populations east and west of 121° longitude (approximate location of Sierra Nevada Crest in north-central California). [Scatter plot: first principal component against degrees longitude.]

Three mtDNA haplotype bioregions for coastal steelhead were suggested in Nielsen et al. (1994b). Neeley (1995) confirmed these findings using additional mtDNA haplotype data and showed significant genetic subdivision in coastal trout at 38.7° (just above the mouth of the Russian River on the north coast of California) and at 36.7° at the Pajaro River in central California. The analyses of trout mtDNA diversity presented here included interior populations from the McCloud River, the upper Sacramento River, and the Kern River basin, as well as two southern populations from Mexico. These new results support a latitudinal cline in O. mykiss mtDNA haplotype variation, but suggest that the maximum difference in variation for inland and coastal populations occurs north and south of 37° latitude. The resolution of mtDNA frequency distributions in interior trout populations from California would benefit from the addition of samples from the San Joaquin River, which were not available at the time of these analyses.

One mtDNA haplotype (MYS3) was common in anadromous steelhead from the Russian River north of San Francisco Bay to the Carmel River just south of Monterey, California.
This mtDNA haplotype was also found in dominant frequencies in resident trout from the upper Sacramento River, California golden trout from the South Fork Kern River at Ramshaw Meadows, Golden Trout Creek, and Johnson Creek, and in rainbow trout populations from the Kern River and the Little Kern River. These data imply an extensive geographic distribution of this haplotype in the interior populations and suggest a strong genetic relationship between resident and anadromous trout in the Sacramento River drainage and the trout of the Kern River basin. Behnke (1992), in his monograph on native trout, suggested such a linkage between Sacramento River redband trout and the California golden trout based on coloration and other taxonomic characters, which would appear to support the mtDNA findings. According to Behnke (1992), the most primitive redband trout found in the Sacramento River basin is represented by fish from an isolated population found in Sheepheaven Creek near the McCloud River. mtDNA was sequenced from 11 fin clips taken from trout from Sheepheaven Creek that were sent to the author's laboratory by the California Department of Fish and Game. These Sheepheaven Creek fish were monomorphic for mtDNA haplotype MYS1, as were all 54 McCloud River rainbow trout that were sequenced. Haplotype MYS1 was most frequently found in coastal steelhead from northern California. This haplotype has never been found in California golden trout.

McCloud River redband trout had microsatellite alleles that have not been found in coastal steelhead groups. One notable example was the Omy77 allele (Omy77-79), which was common in the upper Sacramento River, Kern River, and Little Kern River rainbow trout, in Eagle Lake trout, and in California golden trout, but has been found in only one steelhead from the Carmel River. A second Omy77 allele (Omy77-121) dominated frequency in the McCloud trout populations and was rarely found in coastal steelhead.

Two new mtDNA haplotypes (MYS15 and MYS16), never seen in coastal populations, were found among golden trout captured in Taylor, Fay, and Manter Creeks, and in the South Fork Kern River at Ramshaw Meadows. Haplotype MYS16 was also found to be monomorphic in Eagle Lake trout. How this haplotype came to occur in this isolated northern interior lake remains unclear. There have been no officially documented fish transfers from the South Fork Kern River to Eagle Lake in recent history (E. Gerstung, California Department of Fish and Game, personal communication). Microsatellite distance analyses did not link these two populations with any statistical rigor. A third unique mtDNA haplotype (MYS18) was found in the Rio Yaqui trout from northwestern Mexico.
This group of fish had a significantly different genetic profile for both mtDNA and microsatellites when compared to the rest of O. mykiss. Several alleles that dominated the microsatellite frequency in the Rio Yaqui fish were found only rarely or not at all in California trout populations. The position of this group in the evolutionary history of Pacific trout has been speculated on in several early studies (Miller, 1950, 1972; Needham and Gard, 1964), but their taxonomic status remains undefined. These genetic findings support a unique identity for this group of trout which deserves further study. The mtDNA haplotype (MYS1), which dominated anadromous steelhead populations in northern California, was also found to be fixed in Rio Santo Domingo rainbow trout from Baja California. It has been
speculated that the Baja rainbow trout originated from the anadromous coastal steelhead of southern California (Ruiz-Campos and Pister, 1995). The rare, but ubiquitous, distribution of the MYS1 haplotype throughout southern California supports a possible historic connectivity between these anadromous stocks and the resident rainbow trout populations of Baja. In an earlier study using electrophoretic analyses of allozymes, Berg (1987) found a unique creatine kinase allele (Ck-2) in Baja trout that was not found in other coastal populations. Microsatellites also paint a different picture of biogeographic associations for the Baja trout. Microsatellite alleles (Omy77-77, Omy77-87, and Omy207-124) show a closer relationship between the Baja fish and trout populations in the Kern River and the South Fork Kern River. Omy77-87 was found in only one fish from Bull Frog Lake on the Little Kern River. Omy77-77 was found only in fish from Dry Meadows Creek on the Kern River. These associations demonstrate a possible evolutionary connection between the Baja trout and the Kern River basin, suggesting an alternate evolutionary path for these fish. Both analyses of microsatellite distance supported the unity of the Baja trout with the rainbow trout of the Little Kern River with high bootstrap values. This lack of congruence between mtDNA and microsatellite allelic frequencies argues against a single Pleistocene radiation for O. mykiss. An alternative hypothesis is two radiations from a Gulf of California refugium as suggested by Behnke (1992), with one contributing to the interior redband/golden trout complex and one to the coastal radiation of steelhead and coastal rainbow trout. It is interesting that three controversial trout populations, McCloud River's Sheepheaven Creek redband trout, the Eagle Lake trout, and the Baja trout, have demonstrable differences in interpretation of their evolutionary associations based on mtDNA and microsatellites. 
It is possible that these unique trout populations represent different ancestral nodes for both radiations. Another possible explanation for the lack of congruence between mtDNA phylogenetic structure and microsatellite data is male-mediated gene flow (Karl et al., 1992). This study found significant differences in population structure in nuclear vs mtDNA assays of sea turtles (Chelonia mydas) and attributed this finding to life history differences between males and females, where females alone demonstrated a strong natal site philopatry in rookery use. Male straying from natal streams during spawning migrations in anadromous salmon is thought to be more pervasive than straying in females (Flemming and Gross, 1994; Quinn and Foote, 1994). Similar behavior and male-mediated gene flow in resident rainbow trout, however, would be limited to
straying among tributaries of the same river basin and would not seem to be a credible cause of the microsatellite allelic panmixia shown in this study across many interior river basins where there is currently no access to the ocean. The artificial transfer of trout from basin to basin could explain such a panmixia, but artificial stock transfers would not be limited strictly to male fish. A more likely mechanistic explanation for the lack of congruence among these molecular markers lies in the fact that microsatellites probably have diverged more rapidly than the mtDNA control region and may, therefore, not be useful in detecting phylogenetic relationships among closely related taxa due to a lack of lineage sorting in these markers. Our comparison of the Slatkin (1995) and Goldstein et al. (1995) distance measures gave no indication as to which of the two methods used for constructing distance matrices for microsatellites might more likely reflect trout phylogeny. The three microsatellite loci used in this study seem to fit the expectation of the singlestep allele model with the exception of a few large alleles in Omy77 and Omy207. The general results of the consensus trees were similar, with few significant branching patterns based on bootstrap analyses of 1000 trees. Analyses of additional polymorphic microsatellite loci may provide a more reliable signal for divergence of O. mykiss, but it is clear from these data that mtDNA control region sequence and microsatellites can give very different evolutionary signals for closely related groups. In a study that inferred phylogenetic trees from 10 vertebrate species, Cummings et al. (1995) suggested that a large number of genes and nucleotide sites are needed to exactly determine phylogenetic relationships. 
The selection of molecular markers used in phylogenetic studies is frequently based on factors related to the historic use of the marker in systematic studies, the functional characteristics of the marker, and the ease of extraction and amplification, but not on the relevance of that marker to the evolutionary history of the population. The conflicting results reported here for mtDNA sequence data and nuclear microsatellites confirm the need to draw phylogenetic inference from several independent markers before reaching conclusions that are presumed to represent the evolutionary history of the organism.

In summary, the biogeographic results derived from mtDNA and microsatellites were not congruent for this study of trout and steelhead populations. The phylogeographic structure for mtDNA was significantly associated with both longitude and latitude in western trout populations. Unlike the conclusion drawn by Phillips and Oakley (1997), intraspecific mtDNA control region data retain significant biogeographic structure, suggesting that control region divergence can serve as a rigorous marker in the documentation of stock structure in this species. Only a weak association, however, was shown between longitude and the frequency of microsatellite alleles. The most significant separation for this marker occurred at the approximate boundary of the Sierra Nevada Crest, weakly supporting the biogeographic subdivision of O. mykiss previously reported by Allendorf (1975) for allozymes in trout. These data suggest that microsatellite, allozyme, and mtDNA data do not reflect the same evolutionary architecture in O. mykiss.

Based on morphological data, Behnke (1992) suggested a Gulf of California refugium for Oncorhynchus during the mid-Pleistocene, approximately 250,000 years ago. With 4.5% mtDNA control region sequence divergence (Nielsen et al., 1994b), the female lineage of O. mykiss appears to have retained significant phylogenetic structure for a far longer period, assuming an expected substitution rate of around 4% per million years (Avise, 1994). Microsatellites, however, with only a weak geographic association between longitude and allelic frequency distributions, seem to represent population structure that has more recently diverged, perhaps during the mid to late Pleistocene (Bailey, 1966) when the Sierra Nevada area was strongly uplifted and tilted to the west. It is interesting, however, to note that factor analyses of the geographic range of samples and genetic data from both molecular markers showed that the first two factors could be used to explain 99% of the genetic variance reported in this study, suggesting that a combination of molecular markers reflecting independent evolutionary histories does a far better job of depicting phylogeography than either one alone.
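The timescale argument above rests on a simple ratio, which can be written out as a short Python sketch. It assumes the quoted 4% per million years is a rate of pairwise sequence divergence and that divergence accumulates linearly. Both are simplifying assumptions made here for illustration, and the function name is hypothetical.

```python
def divergence_time_my(percent_divergence, percent_per_my):
    """Elapsed time in millions of years implied by an observed sequence
    divergence and a constant divergence rate (linear-clock assumption)."""
    return percent_divergence / percent_per_my


# Values quoted in the text: 4.5% divergence, about 4% per million years.
mtdna_time = divergence_time_my(4.5, 4.0)
refugium_age = 0.25  # mid-Pleistocene refugium, in millions of years

print("implied mtDNA divergence time (My):", mtdna_time)
print("ratio to proposed refugium age:", mtdna_time / refugium_age)
```

Under these assumptions the mtDNA divergence implies a timescale several times older than the proposed refugium, which is the sense in which the text describes the female lineage as retaining structure "for a far longer period."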
Acknowledgments
Numerous people were instrumental in collecting tissue from steelhead and trout for this project, clarifying our analytical approach, and editing difficult and cumbersome drafts of this paper. We express our special appreciation to Murice Cardenas, Cindy Carpanzano, Sara Chubb, Bill Cox, Karen Crow, Tom Dowling, Chris Gan, Eric Gerstung, Ed Henke, Buddy Jensen, Wendy Jones, Mat Lectner, Giles Manwaring, Bob Miller, Eric Minch, Steve Nettie, Steve Parmenter, Phil Pister, Dennis Powers, Mike Rode, Gorgonio Ruiz-Campos, Monty Slatkin, Kelley Thomas, Doug Tupper, Steve Turek, and Don Weidlein. We are grateful for the suggestions and corrections made in this manuscript by the editors and two anonymous reviewers.
References
Allendorf, F. W. 1975. "Genetic Variability in a Species Possessing Extensive Gene Duplication: Genetic Interpretation of Duplicate Loci and Examination of Genetic Variation in Populations of
Rainbow Trout." Unpublished Ph.D. dissertation, University of Washington, Seattle, WA. Avise, J. C. 1994. "Molecular Markers, Natural History and Evolution." Chapman & Hall, New York. Avise, J. C., Arnold, J., Ball, R. M., Bermingham, E., Lamb, T., Neigel, J. E., Reeb, C. A., and Saunders, N. C., 1987. Intraspecific phylogeography: The mitochondrial DNA bridge between population genetics and systematics. Annu. Rev. Ecol. Syst. 18: 489-522. Avise, J. C., Nelson, W. S., and Sibley, C. G. 1994a. DNA sequence support for a close phylogenetic relationship between some storks and New World vultures. Proc. Natl. Acad. Sci. USA 91: 5173-5177. Avise, J. C., Nelson, W. S., and Sibley, C. G. 1994b. Why one-kilobase sequences from mitochondrial DNA fail to solve the Hoatzin phylogenetic enigma. Mol. Phylogenet. Evol. 3:175-184. Bailey, E. H. 1966. "Geology of Northern California." USGS Bulletin 190. CA Div. Mines and Geol. Ferry Bldg., San Francisco. Behnke, R. J. 1965. "A Systematic Study of the Family Salmonidae with Special Reference to the Genus Salmo." Doctoral dissertation, University of California, Berkeley, CA. Behnke, R. J. 1968. A new subgenus and species of trout, Salmo (Platysalmo) platycephalus, from south-central Turkey, with comments on the classification of the subfamily Salmonidae. Mitteil. Hamburg. Zool. Mus. Inst. 66:1-15. Behnke, R. J. 1992. "Native Trout of Western North America." Am. Fish. Soc. Mon. Berg, W. J. 1987. "Evolutionary Genetics of Rainbow Trout, Parasalmo gairdnerii (Richardson)." Doctoral dissertation, University of California, Davis, CA. Bernatchez, L. 1995. A role for molecular systematics in defining evolutionarily significant units in fishes. In: "Evolution and the Aquatic Ecosystem: Defining Unique Units in Population Conservation," (J. L. Nielsen, ed.), pp. 114-132 Am. Fish. Soc. Symposium No. 17, Bethesda, MD. Bernatchez, L., and Danzmann, R. G. 1993. 
Congruence in controlregion sequence and restriction site variation in mitochondrial DNA of Brook char (Salvelinus fontinalis Mitchill) Mol. Biol. Evol. 10:1002-1014. Birky, C. W., Fuerst, P., and Maruyama, T. 1989. Organelle gene diversity under migration, mutation, and drift: Equilibrium expectations, approach to equilibrium, effects of heteroplasmic cells, and comparison to nuclear genes. Genetics 121:613-627. Bowcock, A. M., Ruiz-Linares, A., Tomfohrde, J., Minch, E., Kidd, J. R., and Cavalli-Sforza, L. L. 1994. High resolution of human evolutionary trees with polymorphic microsatellites. Nature 368: 455-457. Bowen, B. W., Richardson, J. I., Melan, A. B., Margaritoulis, D., Hopkins Murphy, R., and Avise, J. C. 1993. Population structure of loggerhead turtles (Caretta caretta) in the northwestern Atlantic Ocean and Mediterranean Sea. Conserv. Biol. 7:834-844. Bruford, M. W., and Wayne, R. K. 1993. Microsatellites and their application to population genetic studies. Curr. Opin. Genet. Dev. 3: 939-943. Burke, T., Davies, N. B., Bruford, M. W., and Hatchwell, B. J. 1989. Parental care and mating behavior of polyandrous dunnocks Prunella vulgaris related to paternity by DNA fingerprinting. Nature 338:249-251. Cummings, M. P., Otto, S. P., and Wakeley, J. 1995. Sampling properties of DNA sequence data in phylogenetic analyses. Mol. Biol. Evol. 12(5):814-822. Currens, K. P., Schreck, C. B., and Li, H. W. 1990. AUozyme and morphological divergence of rainbow trout (Oncorhynchus mykiss) above and below waterfalls in the Deschutes River, Oregon. Copeia 1990(3):730-746. Digby, T. J., Gray, M. W., and Lazier, C. B. 1992. Rainbow trout mito-
chondrial DNA: Sequence and structural characteristics of the non-coding region and flanking tRNA genes. Gene 118:197-204. Di Rienzo, A. A., Peterson, A. C., Garza, J. C., Valdes, A. M., Slatkin, M., and Freimer, N. B. 1994. Mutational processes of simple sequence repeat loci in human populations. Proc. Natl. Acad. Sci. USA 91:3166-170. Ellegren, H. 1995. Microsatellites. In "Methods in Molecular Population Genetics for Ecologists" (D. T. Parkin, ed.). Blackwell Sci., Oxford. Estoup, A., Presa, P., Krieg, F., Vaiman, D., and Guyomard, R., 1993. (CT)n and (GT)n microsatellites: A new class of genetic markers for Salmo trutta L. (brown trout). Heredity 71:488-496. Felsenstein, J. 1985. Confidence limits on phylogenies: An approach using bootstrap. Evolution 39: 783-791. Felsenstein, J. 1991. "PHYLIP 3.4--Phylogeny Inference Package Distributed by Author. Department of Genetics SK-10, University of Washington, Seattle, WA. FitzSimmons, N. N., Moritz, C., and Moore, S. S. 1995. Conservation and dynamics of microsatellite loci over 300 million years of marine turtle evolution. Mol. Biol. Evol. 12(3):432-440. Flemming, I. A., and Gross, M. R. 1994. Breeding competition in a pacific salmon (Coho: Oncorhynchus mykiss): Measures of natural and sexual selection. Evolution 48:637-657. Foote, C. J., Mayer, I., Wood, C. C., Clarke, W. C., and Blackburn, J. 1994. On the developmental pathways to anadromony in sockeye salmon, Oncorhynchus nerka. Ca. J. Zool. 72:397-405. Gall, G. A. E. 1995. "California Trout of the Kern River: A Genetic Analysis. Report submitted to California Department of Fish and Game, Inland Fisheries Division, Sacramento, CA. Gall, G. A. E., Bentley, B., and Nuzum, R. C. 1990. Genetic isolation of steelhead rainbow trout in Kaiser and Redwood Creeks, California. Calif. Fish Game 76:216-223. Gerloff, U., Schlotterer, C., Rassmann, K., Rambold, I., Hohmann, G., Fruth, B., and Tautz, D. 1995. 
Amplification of hypervariable simple sequence repeats (microsatellites) from excremental DNA of wild living bonobos (Pan paniscus). Mol. Ecol. 4:515-518. Goldstein, D. B., Linares, A. R., Cavalli-Sforza, L. L., and Feldman, M. W. 1995. An evaluation of genetic distances for use with microsatellite loci. Genetics 139: 463-471. Graybeal, A. 1994. Evaluating the phylogenetic utility of genes: A search for genes informative about deep divergence among vertebrates. Syst. Biol. 43(2): 174-193. Henderson, S. T., and Petes, T. D. 1992. Instability of simple sequence DNA in Saccharomyces cerevisiae. Mol. Cell. Biol. 12:2749-2757. Hillis, D. M. 1995. Approaches for assessing phylogenetic accuracy. Syst. Biol. 44:3-16. Huelsenbeck, J. P., Bull, J. J., and Cunningham, C. W. 1996. Combining data in phylogenetic analyses. TREE 11(4): 152-158. Karl, S. A., Bowen, B. W., and Avise, J. C. 1992. Global population genetic structure and male-mediated gene-flow in the green turtle (Chelonia mydas): RFLP analyses of anonymous nuclear loci. Genetics 131:163-173. Kimura, M. 1980. A simple method for estimating evolutionary rates of base substitutions through comparative studies of nucleotide sequences. J. Mol. Evol. 16:111 - 120. Kelly, R., Gibbs, M., Collick, A., and Jeffreys, A. J. 1991. Spontaneous mutation at the hypervariable mouse microsatellite Ms6-hm: Flanking DNA sequence and analysis of and early somatic events. Proc. R. Soc. Lond. B 245:235-245. Kendall, A. W., Jr., and Behnke, R. J. 1984. Salmonidae: Development and relationships. In "Ontogeny and Systematics of Fishes." (H. G. Moser, W. J. Richards, D. M. Cohen, M. P. Fahay, A. W. Kendall, Jr., and S. L. Richardson, eds.), pp. 142-149. Am. Soc. Ichthyol. Herpetol., Special Publication 1, Allen Press, Lawrence, KS.
Kocher, T. D., Thomas, W. K., and Meyer, A. 1989. Dynamics of mitochondrial DNA evolution in animals: Amplification and sequencing with conserved primers. Proc. Natl. Acad. Sci. USA 86: 6196-6200. Long, E. O. and David, I. B. 1980. Repeated genes in eukaryotes.Ann. Rev. Biochem. 49: 727- 764. Louis, E. J. and Dempster, E. R. 1987. An exact test for Hardy-Weinberg and multiple alleles. Biometrics 43:805-811. McConnell, S. K., O'Reilly, P., Hamilton, L., Wright, J. M., and Bentzen, P. 1995. Polymorphic microsatellite loci from Atlantic salmon (Salmo salar): Genetic differentiation of North American and European populations. Can. J. Fish. Aquat. Sci. 52:18631872. Miklos, G. L. G. 1985. Localized, highly repetitive DNA sequences in vertebrate and invertebrate genomes. In "Molecular Evolutionary Genetics" (R. J. MacIntyre, ed.), pp. 231-241 Plenum Press, New York. Miller, R. R. 1950. Notes on the cutthroat and rainbow trouts with the description of a new species from the Gila River, New Mexico. Occ. Pap. Mus. Zool. Univ. M1529 :1-42. Miller, R. R. 1972. Classification of the native trouts of Arizona, with the description of a new species, Salmo apache. Copeia 1972:401422. Moore, S. S., Sargent, L. L., King, T. J., Mattick, J. S., Georges, M., and Hetzel, D. J. S. 1991. The conservation of dinucleotide microsatellites among mammalian genomes allows the use of heterologous PCR primer pairs in closely related species. Genomics 10:654-660. Morin, P. A., Moore, J. J., Chakraborty, R., Jin, L., Goodall, J., and Woodruff, D. S. 1994a. Kin selection, social structure, gene flow, and the evolution of chimpanzees. Science 265:1193-1201. Morin, P. A., Wallis, J., Moore, J. J., and Woodruff, D. S. 1994b. Paternity exclusion in a community of wild chimpanzees using hypervariable simple sequence repeats. Mol. Ecol. 3:469-478. Moritz, C., Dowling, T. E., and Brown, W. M. 1987. Evolution of animal mitochondrial DNA: Relevance for population biology and systematics. Ann. Rev. Ecol. 
Syst. 18:269-292. Moritz, C., Lavery, S., and Slade, R. 1995. Using allele frequency and phylogeny to define units for conservation and management. In "Evolution and the Aquatic Ecosystem: Defining Unique Units in Population Conservation" (J. L. Nielsen, ed.), pp. 249-262. Am. Fish. Soc. Symposium No. 17, Bethesda, MD. Morris, D. B., Richard, K. R., and Wright, J. M. 1996. Microsatellites from rainbow trout (Oncorhynchus mykiss) and their use for genetic studies of salmonids. Can. J. Fish. Aquat. Sci. 53:120-126. Needham, P. R., and Gard, R. 1964. A new trout from central Mexico: Salmo chrysogaster, the Mexican golden trout. Copeia 1964:169173. Neeley, D. 1995. A statistical evaluation of coastal California steelhead genetic data gathered by J. L. Nielsen et al. and by Trihey and Associates. Prepared for S. P. Cramer & Asso. Submitted to Association of California Water Agencies, Sacramento, CA. Neigel, J. E., and Avise, J. C. 1986. Phylogenetic relationships of mitochondrial DNA under various demographic models of speciation. In "Evolutionary Processes and Theory" (E. Nevo and S. Karlin, eds.), pp. 515-534. Academic Press, New York. Nielsen, J. L. 1996. Molecular genetics and the conservation of salmonid biodiversity: Oncorhynchus at the edge of their range. In "'Molecular Genetic Approaches in Conservation" (T. Smith and R. Wayne, eds.) pp. 383-398. Oxford University Press, London. Nielsen, J. L., Gan, C. A., and Thomas, W. K. 1994a. Differences in genetic diversity for mtDNA between hatchery and wild populations of Oncorhynchus. Can. J. Fish Aquat. Sci. 51(Suppl. 1):290297. Nielsen, J. L. Gan, C. A., Wright, J. M., Morris, D. B., and Thomas, W. K. 1994b. Biogeographic distributions of mitochondrial and
nuclear markers for southern steelhead. Mol. Marine Bio. Biotech. 3:281-293. Okazaki, T. 1984. Genetic divergence and its zoogeographic implications in closely related species Salmo gairdneri and Salmo mykiss. Jap. J. Ichthyol. 31:297-310. O'Reilly, P., and Wright, J. M. 1995. The evolving technology of DNA fingerprinting and its application to fisheries and aquaculture. J. Fish. Biol. 47(Suppl. A) :29-55. Pamilo, P., and Nei, M. 1988. Relationships between gene trees and species trees. Mol. Biol. Evol. 5:568-583. Parkinson, E. A. 1984. Genetic variation in populations of steelhead (Salmo gairdneri) in British Columbia. Can. J. Fish. Aquat. Sci. 41: 1412-1420. Phillips, R. B., and Oakley, T. H. 1997. Phylogenetic relationships among the Salmonidae based on nuclear DNA and mitochondrial DNA sequences. In "Molecular Systematics of Fishes" (T. D. Kocher and C. A. Stepien, eds.). Academic Press, San Diego. Queller, D. C., Strassmann, J. E., and Hughs, C. R. 1993. Microsatellites and kinship. Trends Ecol. Evol. 8:285-288. Quinn, T. P., and Foote, C. J. 1994. The effects of body size and sexual dimorphism on the reproductive behavior of sockeye salmon (Oncorhynchus nerka). Anim. Behav. 48: 751-761. Regan, C. T. 1914. The systematic arrangement of the fishes of the family Salmonidae. Ann. Mag. Nat. Hist. 13(8):405-408. Reisenbichler, R. R., McIntyre, J. D., Solazzi, M. F., and Landing, S. W. 1992. Genetic variation in steelhead of Oregon and Northern California. Trans. Am. Fish. Soc. 121:158-169. Ruiz-Campos, G., and Pister, E. P. 1995. Distribution, habitat, and current status of the San Pedro Martir rainbow trout, Oncorhynchus mykiss nelsoni (Evermann). Bull. S. CA Acad. Sci. 94(2):131148. Schlotterer, C., and Tautz, D. 1992. Slippage synthesis of simple sequence DNA. Nucleic Acids Res. 20: 211-215. Slatkin, M. 1991. Inbreeding coefficients and coalescence times. Genet. Res. 58:167-175. Slatkin, M. 1995. 
A measure of population subdivision based on microsatellite allele frequencies. Genetics 139:457-462. Smith, G. R., and Stearley, R. F. 1989. The classification and scientific names of rainbow and cutthroat trouts. Fisheries 14:4-10. Spencer, P. B. S., Odorico, D. M., Jones, S. J., Marsh, H. D., and Miller, D. J. 1995. Highly variable microsatellites in isolated colonies of the rock-wallaby (Petrogale assimilis) Mol. Ecol. 4:523-525. Stearley, R. F., and Smith, G. R. 1993. Phylogeny of the Pacific trouts and salmon (Oncorhynchus) and genera of the family Salmonidae. Trans. Am. Fish. Soc. 122:1-33. Stoneking, M., Hedgecock, D., Higuchi, R. G., Vigilant, L., and Erlich, H. A. 1991. Population variation of human mtDNA control region sequence detected by enzymatic amplification and sequence-specific oligonucleotide probes. Am. J. Hum. Genet. 48:370-382. Swift, C. C., Haglund, T. R., Ruiz, M., and Fisher, R. N. 1993. The status and distribution of the freshwater fishes of southern California. Bull. South. Calif. Acad. Sci. 92(2): 101-167. Tautz, D. 1989. Hypervariability of simple sequences as a general source for polymorphic DNA markers. Nucleic Acids Res. 17: 6463-6471. Thomas, W. K., and Beckenbach, A. T. 1989. Variation in salmonid mitochondrial DNA: Evolutionary constraints and mechanisms of substitution. J. Mol. Evol. 29:233-245. Thomas, W. K., Withler, R. E., and Beckenbach, A. T. 1986. Mitochondrial DNA analysis of Pacific salmonid evolution. Can. J. Zool. 64:1058-1064. Titus, R. G., Erman, D. C., and Snider, W. M. History and status of steelhead in California coastal drainages south of San Francisco Bay. Hilgardia, in press.
Utter, F. M., Allendorf, F. W., and Hodgins, H. O. 1973. Genetic variability and relationships in Pacific salmon and related trout based on protein variation. Syst. Zool. 22:257-270. Utter, F. M., and Allendorf, F. W. 1994. Phylogenetic relationships among species of Oncorhynchus: A consensus view. Conserv. Biol. 8:864-867. Vladykov, V. 1963. A review of salmonid genera and their broad geographical distribution. Trans. Roy. Bd. Can. 1 (Ser. 4, Sect. 3):459-504. Weber, J., and May, P. 1989. Abundant class of human DNA polymorphisms which can be typed using the polymerase chain reaction. Am. J. Hum. Genet. 44:388-396. Weller, S. J., Pashley, D. P., Martin, J. A., and Constable, J. L. 1994. Phylogeny of noctuoid moths and the utility of combining independent nuclear and mitochondrial genes. Syst. Biol. 43(2):194-211. Wilson, G. M., Thomas, W. K., and Beckenbach, A. T. 1985. Intra- and inter-specific mitochondrial DNA sequence divergence in Salmo: Rainbow, steelhead, and cutthroat trouts. Can. J. Zool. 63:2088-2094. Wright, J. M. 1993. DNA fingerprinting in fishes. In "Biochemistry and Molecular Biology of Fishes" (P. W. Hochachka and T. Mommsen, eds.), Vol. 2, pp. 57-91. Elsevier Press, New York. Wright, J. M., and Bentzen, P. 1994. Microsatellites: Genetic markers for the future. In "Reviews in Fish Biology and Fisheries" (G. R. Carvalho and T. J. Pitcher, eds.). Chapman and Hall, London.
APPENDIX I

mtDNA haplotype by location. Haplotype columns: 1, 2, 3, 5, 6, 8, 12, 13, 15, 16, 18, Total. [Per-haplotype counts for individual locations are not legible in this copy; only the recoverable totals are listed.]

Location (total)

North coast: Van Duzen River (8), Eel River (25), Albion River (8), Navarro River (2), Gualala River (1), Garcia River (1), Russian River (2), Salmon River (4), Usal Creek (3), Cottoneva Creek (3), Howard Creek (3), Redwood Creek (5), Lagunitas Creek (3). Total: 68

North interior: Sacramento River (5), Mears Creek (5), Soda Creek (3), Dog Creek (2), Slate Creek (8), McCloud River (6), Edson Creek (8), Dry Creek (5), Trout Creek (4), Sheepheaven Creek (11), Eagle Lake (10). Total: 67

South coast: San Lorenzo River (4), Zyante Creek (4), Carmel River (6), Santa Ynez River (15), Morro Bay (9), Scott Creek (6), Waddell Creek (6), Santa Rosa Creek (5), Pico Creek (7), Gaviota Creek (6), Malibu Creek (9). Total: 77

South interior: Kern River, Dry Meadows Creek, Freeman Creek, Little Kern River, Bullfrog Lake, Sheep Creek, Willow Creek, South Fork Kern River, Fay Creek, Manter Creek, Ramshaw Meadows, Taylor Creek, Golden Trout Creek, Johnson Creek. Total: 98

Mexican coastal (all haplotype 1): Rio Santo Domingo (12), Arroyo San Rafael (6), Arroyo San Antonio (4), Arroyo La Zanja (3), Arroyo El Potrero (2). Total: 27

Mexican interior (all haplotype 18): Rio Yaqui (11). Total: 11

Column totals by haplotype: 1 = 99; 2 = 17; 3 = 108; 5 = 20; 6 = 7; 8 = 25; 12 = 7; 13 = 8; 15 = 7; 16 = 45; 18 = 11. Total: 354
APPENDIX II

Allele columns by locus, with rows given as Population (haplotype). [Allele counts for individual populations are not legible in this copy.]

Locus = Omy 77. Alleles: 77, 79, 81, 83, 85, 87, 89, 91, 93, 95, 97, 101, 103, 105, 107, 109, 111, 115, 117, 121, 125, 127, 129, 131, 135, 137, 141, Total

Locus = Omy 207. Alleles: 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100, 102, 104, 106, 108, 110, 112, 114, 116, 118, 120, 122, 124, 126, 128, 130, 132, 134, 136, 138, 148, Total

Locus = Ssa 289. Alleles: 89, 91, 93, 95, 97, 101, 103, 105, 107, 109, Total

Population (haplotype): CA steelhead (1), McCloud rainbow (1), Rio Santo Domingo (1), CA steelhead (2), CA golden trout (3), CA steelhead (3), Kern River rainbow (3), Little Kern golden (3), Sacramento rainbow (3), CA steelhead (5), Kern River rainbow (5), CA steelhead (6), CA steelhead (8), Kern River rainbow (8), CA steelhead (12), CA steelhead (13), CA golden trout (15), CA golden trout (16), Eagle Lake rainbow (16), Rio Yaqui trout (18)
CHAPTER 6

Mitochondrial DNA Sequence Variation among the Sand Darters (Percidae: Teleostei)

E. O. WILEY
Museum of Natural History and Department of Systematics and Ecology, University of Kansas, Lawrence, Kansas 66045

ROBERT H. HAGEN
Department of Entomology, University of Kansas, Lawrence, Kansas 66045

I. Introduction

The interface between population genetics and systematics remains one of the most challenging areas in evolutionary biology. Part of the difficulty arises from differing analytical approaches. The goal of most phylogenetic studies is to reconstruct historical (sister group) relationships among taxa through analysis of character distributions. Standard phylogenetic methods work best with characters that are monomorphic within species (Swofford et al., 1996). In contrast, the goals of most population genetic studies are to infer rates and directions of ongoing processes that affect relationships among individuals and populations through analysis of allele frequencies within and among populations. Standard population genetic methods do not include information about the historical relationship among different alleles (Weir, 1996). The contrasting emphases on monomorphic versus polymorphic characters, on character states versus alleles, and on historical versus contemporaneous processes have tended to create barriers between disciplines.

However, since the mid-1980s, there has been increasing convergence in the types of data used for population genetic and systematic studies. At the same time, developments in population genetic theory also have begun to provide bridges between disciplines (e.g., Slatkin and Maddison, 1989; Hudson, 1990; Templeton et al., 1992).

In 1987, Avise and colleagues coined the term "intraspecific phylogeography" for the use of molecular data to reconstruct population histories in relation to geography. The essence of their approach is a three-stage process. Molecular data are obtained from individuals sampled from geographically distinct populations. These data are next used to generate a tree showing genealogical relationships among individuals. Finally, the geographic distribution of individuals is compared with the tree. Avise et al. (1987) argued that patterns of concordance between genealogy and geography should reflect historical events responsible for current distribution of an organism. A fundamental assumption is that molecular data preserve a record of genealogy that is independent of the historical pattern of dispersal or vicariance among populations. Intraspecific phylogeography may offer significant insight into processes occurring at the interface between systematics and population genetics, but much work is still needed to refine methods.

Since 1987, several authors have proposed methods that may allow formal statistical assessment of inferences about population histories. The approaches can be divided into three categories: (1) extensions of the standard F statistics used in population genetic studies (Excoffier et al., 1992; Excoffier and Smouse, 1994); (2) extensions of spatial autocorrelation methods (Bertorelle and Barbujani, 1995); and (3) cladistically based tests of geographic associations (Slatkin and Maddison, 1989; Templeton et al., 1992, 1995). In addition to theoretical work, empirical studies are also needed to assess the utility and robustness of assumptions and methods.

This chapter uses phylogenetic methods to address both inter- and intraspecific variation among sand darters within a single hierarchical framework. The chapter begins with a brief account of cytochrome b and its relevance to the study of inter- and intraspecific variation. The chapter then describes sand darters, a small group of North American percid fishes, before moving on to the results of the study.
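The three-stage comparison of genealogy with geography described in this introduction can be illustrated with a minimal sketch. It is not the procedure of Avise et al. (1987) or of this chapter: the tree, the sample names, and the function names are hypothetical, and "concordance" is reduced here to the simple question of whether the individuals from each location form a single clade on a rooted tree.

```python
# Hypothetical illustration; names and data are invented, not taken from the chapter.

def leaf_set(tree):
    """Return the set of individuals (tips) under a node of a nested-tuple tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    tips = frozenset()
    for child in tree:
        tips |= leaf_set(child)
    return tips


def all_clades(tree):
    """Collect the tip set of every node in the rooted tree, tips included."""
    found = {leaf_set(tree)}
    if not isinstance(tree, str):
        for child in tree:
            found |= all_clades(child)
    return found


def concordance_by_location(tree, location_of):
    """Map each sampling location to True if its individuals form one clade."""
    clades = all_clades(tree)
    members = {}
    for individual, location in location_of.items():
        members.setdefault(location, set()).add(individual)
    return {loc: frozenset(group) in clades for loc, group in members.items()}


# Stage 1: individuals sampled from three hypothetical drainages.
location_of = {"a1": "River A", "a2": "River A",
               "b1": "River B", "b2": "River B",
               "c1": "River C"}
# Stage 2: a hypothetical rooted genealogy of those individuals.
genealogy = ((("a1", "a2"), "b1"), ("b2", "c1"))
# Stage 3: compare the geographic distribution with the genealogy.
result = concordance_by_location(genealogy, location_of)
print(result)
```

In this toy case the River A individuals group together while the River B individuals do not. The statistical approaches cited above go further by asking whether such a pattern departs from chance.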
A. Cytochrome b

Analysis of mitochondrial DNA (mtDNA) sequence variation has become a well-established tool for studying fish evolution (reviewed in Meyer, 1994a). For this study, the authors chose to sequence a portion of the cytochrome b gene. Because the cytochrome b gene is a well-characterized gene that codes for an evolutionarily conservative protein, it has been used in a large number of systematic and population studies. The availability of polymerase chain reaction (PCR) primers that reliably amplify portions of the gene (Irwin et al., 1991), ease of aligning sequences from different species, and the ability to compare results from other studies all contribute to this popularity.

Graybeal (1993) and Meyer (1994b) have issued cautions about the uncritical use of cytochrome b sequences in systematic studies. However, most of the difficulties appear at high levels of divergence, when widely separated taxa are included in the analyses. Krajewski and King (1996), using data from a series of phylogenetic studies on cranes (Gruidae), found that cytochrome b sequences yielded consistent results even with uncorrected divergences of up to 11%. Most studies have used cytochrome b sequences for studies at lower taxonomic levels, including studies not reviewed by Meyer (1994b) on Rivulus (Murphy and Collier, 1996) and Gambusia (Lydeard et al., 1995).

The usefulness of cytochrome b sequences for intraspecific studies is more likely to be limited by lack of variation, although there is no obvious reason why the amount of variation should be less than for any other mitochondrial region. In a comparison of restriction fragment polymorphism with partial cytochrome b sequences, Birt et al. (1995) found comparable levels of mitochondrial variation within Mallotus villosus (Atlantic capelin) population samples from both techniques. Cytochrome b sequences have been used to detect intraspecific variation in five nominal species of rainbow fishes (Melanotaenia: Zhu et al., 1994), in three species of South American rodents (Patton et al., 1996), in the Atlantic cod (Gadus morhua: Carr et al., 1995), and in the Pacific sockeye salmon (Oncorhynchus nerka: Bickham et al., 1995).
II. Systematics of Sand Darters

Percidae comprises some of the more familiar Eurasian and North American freshwater fishes, including the yellow perches (Perca), walleyes and saugers (Stizostedion), the ruffes (Gymnocephalus), the North American darters (Crystallaria, Percina, and Etheostoma), and two darter-like European genera (Zingel and Romanichthys). Darters are the largest percid group with approximately 164 described species distributed throughout eastern North America (Mayden et al., 1992). Sand darters consist of six species of small (a maximum of 50-60 mm standard length), translucent, insectivorous predators that live in clear streams, usually over sand bottoms. They typically bury into the sand and await their prey. Two typical species are shown in Fig. 1.

FIGURE 1 Two members of Etheostoma (Ammocrypta): (a) E. beanii and (b) E. bifascia. From Williams (1975); reproduced with permission of the author and the Bulletin, Alabama Museum of Natural History.

Prior to Simons (1991, 1992) and Wiley (1992), seven species of percids were considered sand darters and placed in their own genus, Ammocrypta (Williams, 1975). One species, the crystal darter (Crystallaria asprella), was shown to be the sister group of Percina + Etheostoma (Simons, 1992; Wiley, 1992). The remaining species were shown to be related to species well embedded in Etheostoma (Simons, 1992), specifically to darters of the subgenera Ioa (monotypic: Etheostoma vitreum) and Boleosoma (five species including the common johnny darter, E. nigrum). Thus, Ammocrypta in the strict sense (s.s.) is now regarded as a subgenus of Etheostoma.

Williams (1975) recognized two species groups within Ammocrypta s.s.: the E. beanii group and the E. pellucidum group (Fig. 2a). The E. beanii group comprised three species. Etheostoma beanii (Jordan) inhabits the Gulf Coastal Plain from the Hatchie River, southwest Tennessee, south along eastern tributaries of the Mississippi River to Lake Pontchartrain, Louisiana, and south and east to the Tombigbee and Alabama rivers of Alabama (Stauffer, 1980a; Fig. 3). Etheostoma bifascia (Williams) is distributed along Gulf Coast drainages in southern Alabama and western Florida from the Perdido River east to the Choctawhatchee, with possible introduction to the Apalachicola River (Stauffer et al., 1980; Fig. 3). Etheostoma clarum (Jordan and Meek) is sporadically distributed from the Neches and Sabine rivers in Texas north through the Mississippi Valley to Minnesota and Wisconsin, with populations in the Green and Cumberland river drainages of Kentucky (Stauffer, 1980b; Fig. 4).

The E. pellucidum group also comprised three species. Etheostoma pellucidum (Agassiz) is the northern species of the group and is found throughout the Ohio River basin south to western Kentucky and north to the southern margin of Lake Huron, around Lake St. Clair, and Lake Erie, with a disjunct population in the central tributaries of the St. Lawrence-Lake Champlain drainage (Hocutt, 1980b; Fig. 4). Etheostoma vivax (Hay) is distributed from the Trinity River basin of eastern Texas, east to the Pascagoula River drainage of Mississippi, and north along the major tributaries of the Mississippi River to southern Missouri and western Tennessee and Kentucky (Stauffer and Hocutt, 1980; Fig. 4). Etheostoma meridianum (Williams) occupies the Tombigbee and Alabama rivers and their tributaries in Mississippi and Alabama, immediately adjacent to the southeastern range of E. vivax (Hocutt, 1980a; Fig. 4).

Williams' (1975) recognition of two groups of sand darters was largely intuitive, i.e., not based on synapomorphic characters. Simons (1992) analyzed the relationships among members of the group with phylogenetic methods using a number of different morphological characters and arrived at a different hypothesis (Fig. 2b). He hypothesized that E. clarum was the basal member of the clade, removing it from the E. beanii group while maintaining the E. pellucidum group sensu Williams (1975). Although Simons (1992) hypothesized that E. meridianum and E. pellucidum were sister species, he acknowledged that support for this hypothesis was weak and that recognition of the three species as closest relatives rested on a single character. The most recent attempt to understand the relationships of sand darters was undertaken by Shaw et al. (1997) using morphology and allozyme data. They removed E. pellucidum to a more basal position, between E. clarum and the remaining four species (Fig. 2c), and hypothesized that E. vivax was the sister of E. beanii + E. bifascia.

FIGURE 2 Three previous hypotheses of the interrelationships of sand darters of Etheostoma (Ammocrypta): (a) Williams (1975), (b) Simons (1992), and (c) Shaw et al. (1997).

This chapter presents a new level of analysis of the sand darters, based on comparison of mitochondrial DNA sequences. Its objectives are threefold: (1) to further test the three different hypotheses of sand darter relationships using molecular characters, (2) to test the utility of mitochondrial DNA sequence data for inferring relationships among populations of E. beanii and E. bifascia, and (3) to see how well the genetic structure of these populations corresponds to their geographic distribution and likely histories of vicariance. A less immediate objective is to explore the relationship between the species-level biogeography of southeastern fishes (Wiley and Mayden, 1985; Swift et al., 1986) and the intraspecific patterns of their component populations (e.g., Avise, 1992; Templeton et al., 1995).
III. Methods and Materials

Specimens were collected in the field and immediately frozen in liquid nitrogen. Specimens examined are listed in Appendix I. Exact localities are available from E. O. Wiley. Tissues were stored at -70 °C until dissected. Approximately 0.1 g of tissue was dissected, and DNA was extracted from freshly frozen tissue by standard chloroform/phenol methods (Maniatis et al., 1982).

PCR (Saiki, 1990) was used to amplify a 425-bp region that included approximately one-third of the N-terminal end of the mitochondrial cytochrome b gene and a portion of the adjacent glutamic acid tRNA gene (Irwin et al., 1991). Primers L14725 (5' CGAAGCTTGATATGAAAAACCATCGTTG 3') (Paabo, 1990) and H15149 (5' AAACTGCAGCCCCTAGAAATATTTGTCCTCA 3') (Kocher et al., 1989) were used. The thermal profile was 94 °C for … sec, 55 °C for … sec, and 70 °C for 2 min 30 sec, with a 4-sec extension for 30 cycles. Amplification products were separated by electrophoresis on NuSieve (FMC) agarose gels. The band containing the amplified DNA was excised from the gel, and the agarose was enzymatically digested with QiaQuick (Qiagen).

The purified PCR product was manually sequenced with the fmol DNA-sequencing system (Promega) using the same primers used for amplification. Results were visualized by autoradiography and scored by visual inspection. Finally, the 250-300-bp sequences obtained from each strand were aligned by eye, then spliced to produce a complete 422-bp sequence. Extensive overlap between sequences facilitated splicing and provided a partial check on the accuracy of individual sequence reads. Rather than sequencing both strands completely for fewer fishes, the number of individual fishes sequenced was increased. Sequences were deposited in GenBank under accession numbers U90569-U90581.

For phylogenetic and population-level analyses, DNA sequence data were reduced to haplotype data. Uncertain bases (marked "?") in individual sequences were assigned the same base found in other individuals of the same species or population. Because fewer than 2% of nucleotides were scored as uncertain, this conservative strategy had little to no effect on assignment of individuals to haplotype classes. Haplotypes were analyzed, alone and together with morphological data taken from Shaw et al. (1997), using PAUP 3.1.1 (Swofford, 1993). A list of morphological characters, organized by species, is shown in Appendix II.

IV. Results
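The substitution tallies reported in the Results separate transitions (purine to purine, or pyrimidine to pyrimidine) from transversions. As a minimal sketch of that classification, the following compares two aligned sequences site by site. It is a pairwise simplification for illustration only, not the authors' tallying procedure across all haplotypes. The function name and the toy sequences are hypothetical, and sites carrying an uncertain base ("?") are simply skipped here.

```python
# Hypothetical illustration; the sequences below are invented, not data from the chapter.

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitutions(seq_a, seq_b):
    """Count (transitions, transversions) between two aligned DNA sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = 0
    transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        # Skip identical sites and any site with a non-ACGT symbol such as "?".
        if a == b or a not in "ACGT" or b not in "ACGT":
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1   # A<->G or C<->T
        else:
            transversions += 1  # purine <-> pyrimidine
    return transitions, transversions


ts, tv = classify_substitutions("ACGTAC", "GTGCAA")
print(f"transitions={ts}, transversions={tv}")
```

Dividing the transition count by the total number of substitutions gives a proportion comparable in kind to the percentages quoted in this section.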
FIGURE 3 Distributions and localities sampled of Etheostoma beanii (west) and E. bifascia (east). Small and probably disjunct populations exist for each species (see map). Distributional data from Stauffer (1980a) and Stauffer et al. (1980). Drainage systems are numbered next to locality dots: 1, Yellow River; 2, Blackwater River; 3, Escambia River (one dot represents two collections); 4, Perdido River (two dots); 5, Alabama River (two dots); 6, Tombigbee River; 7, Escatawpa River; 8, Pascagoula River (two dots).

Mitochondrial DNA sequences were obtained from a total of 91 individual fish. Thirty-five of these were specimens of E. beanii from four river systems representing three drainages, and 34 were E. bifascia from four river systems also representing three drainages. Eleven E. vivax from five drainages; four E. meridianum from its drainage; two specimens each of E. clarum, E. pellucidum, and E. vitreum; and one E. nigrum were also examined. The localities from which specimens of sand darters were taken are shown in Figs. 3 and 4. Complete sequences for haplotypes of each species are shown in Table I, aligned against E. nigrum [selected as the outgroup based on Simons (1992)]. Haplotypes for the four species represented by more than two fish (E. vivax, E. meridianum, E. beanii, and E. bifascia) are shown in Table II. To simplify comparisons, the invariant sites are omitted from Table II.

Substitutions were identified at 116 of 422 sites (27%), of which 83 (72%) were transitions. The majority of the transitions (71%) were C↔T changes. Among E. beanii and E. bifascia, only 21 of the 422 sites varied (5%) and the vast majority of changes (95%) were transitions. Of these, C↔T transitions accounted for 70% (14) of the total. The sequenced region contains 402 base positions of the cytochrome b gene that code for 134 amino acids. Only 6 amino acid differences (4%) were found among all individuals. All amino acid substitutions were conservative, as would be expected for this highly constrained gene (Irwin et al., 1991). Three of the amino acid substitutions occurred in single individuals within two species (E. vivax and E. bifascia). The others occurred as presumed fixed differences supporting internodes on relatively deep branches of the tree (see Section IV,A and Table I, underlined codons). Silent substitutions occurred in 98 of the codons (73%) across all individuals. Among the recently diverged sister species E. beanii and E. bifascia, 17 codons contained silent substitutions (13%). There were no obvious differences among codon or amino acid classes in the relative frequency of substitutions. Only minor differences were encountered in the phylogenetic analyses when differential weighting was employed (see below). The uncorrected frequency of nucleotide differences was used to estimate per site nucleotide substitution rates within species because of the overall low frequency of substitutions (Nei, 1987).

Morphological data used in the total evidence phylogenetic analysis were taken from Simons (1992) and Shaw et al. (1997), and the constraints of those data are described in Appendix II. To save space, these data are presented in Appendix II as data for species, not haplotypes. Morphological data used in the analysis are presented in Table III in coded form organized by species. The numbering system duplicates that of the combined data matrix, which is not presented. Characters 25, 27, 48, 50, and 52 of Shaw et al. (1997) were deleted because they were not relevant to elucidating relationships among species of Ammocrypta. Allozyme data reported by Shaw et al. (1997) could not be combined with the DNA data and were left out of the total evidence analysis.
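The transition/transversion tallies and the uncorrected frequency of nucleotide differences described above can be illustrated with a short sketch. The two 12-bp fragments and the function names (`classify`, `compare`) are hypothetical and are not taken from the study's data; only the definitions (purine-purine or pyrimidine-pyrimidine changes are transitions, and the uncorrected distance is the proportion of differing sites) are standard.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify(a, b):
    """Return 'transition', 'transversion', or None when the bases match."""
    if a == b:
        return None
    same_class = {a, b} <= PURINES or {a, b} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


def compare(seq1, seq2):
    """Count substitution classes and return the uncorrected p-distance."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    counts = {"transition": 0, "transversion": 0}
    for a, b in zip(seq1, seq2):
        kind = classify(a, b)
        if kind is not None:
            counts[kind] += 1
    p_distance = sum(counts.values()) / len(seq1)
    return counts, p_distance


# Hypothetical aligned fragments (illustration only, not study data).
seq_x = "ATGGCAAGCCTC"
seq_y = "ATGGCGAGTCTA"
counts, p = compare(seq_x, seq_y)

# Percentages reported in the text, recomputed from the stated counts.
variable_sites_pct = round(100 * 116 / 422)   # variable sites among 422
transition_pct = round(100 * 83 / 116)        # transitions among variable sites
```

With these inputs the hypothetical pair differs at three sites (two transitions, one transversion), and the recomputed percentages agree with the 27% and 72% given in the text.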
E. O. WILEY AND ROBERT H. HAGEN

FIGURE 4 Distributions and localities sampled for four species of sand darters. The most northern population of E. pellucidum is thought to be disjunct. Distributional data are from Stauffer and Hocutt (1980), Stauffer (1980b), and Hocutt (1980a, b). Ranges are outlined. Localities sampled: black dots, E. meridianum; diagonal striped dots, E. vivax; horizontal striped dot, E. clarum; vertical striped dot, E. pellucidum.

A. Phylogenetic Analyses

Data derived from sequencing were reduced to haplotype vectors (Table II) and analyzed using PAUP 3.1.1 (Swofford, 1993). Four types of analyses were performed. One series of analyses consisted of DNA data and was performed with transversions and transitions equally weighted. Another series was performed in which transversions were weighted by a factor of three over transitions. A third series was performed using these data combined with morphological data. Finally, a series of analyses was performed by subsampling individuals to see if sample size had an effect on the species-level tree. In each case, Etheostoma nigrum was designated as the outgroup species following Simons (1992).
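The reduction of individual sequences to haplotype vectors can be sketched as follows. This is a minimal illustration, not the authors' procedure: the individual names and 4-bp sequences are hypothetical, and filling a "?" with the most common base among the other individuals of the same group is an assumed reading of how uncertain bases were assigned.

```python
from collections import Counter, defaultdict


def fill_uncertain(group):
    """Fill '?' sites using the most common base in the other individuals.

    `group` maps individual name -> aligned sequence for one species or
    population. Sites that are '?' in every other individual are left as '?'.
    """
    filled = {}
    for name, seq in group.items():
        bases = list(seq)
        for i, base in enumerate(bases):
            if base == "?":
                others = Counter(
                    s[i] for n, s in group.items() if n != name and s[i] != "?"
                )
                if others:
                    bases[i] = others.most_common(1)[0][0]
        filled[name] = "".join(bases)
    return filled


def collapse(seqs):
    """Group individuals that share an identical sequence (one haplotype each)."""
    haplotypes = defaultdict(list)
    for name, seq in seqs.items():
        haplotypes[seq].append(name)
    return dict(haplotypes)


# Hypothetical individuals from one population (illustration only).
population = {"fish1": "ACGT", "fish2": "AC?T", "fish3": "ATGT"}
haplotypes = collapse(fill_uncertain(population))
```

Here `fish2` is assigned to the same haplotype as `fish1` once its uncertain site is filled, leaving two haplotypes for three individuals.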
1. Equally Weighted DNA Analyses

A series of phylogenetic analyses was performed using the heuristic search option of PAUP 3.1.1. Each analysis resulted in a group of equally parsimonious trees whose number equaled the maximum number of trees specified for the run. This resulted from the large number of polytomies encountered within E. vivax and within the E. beanii-E. bifascia complex. Varying the maximum-trees setting showed that all analyses converged on two consensus tree topologies, both derived from 225 phylogenetic trees of 211 steps (CI = 0.678; RI = 0.891; RC = 0.604). The basal branches of each tree are shown in Figs. 5 and 6.
6. mtDNA Sequence Variation among Sand Darters
The relationships elucidated among the E. beanii and E. bifascia haplotypes are identical in both cases and are shown in Fig. 7. Tree 1 (Fig. 5) differs from tree 2 (Fig. 6) in the relationship between E. clarum and E. pellucidum. In tree 1, E. clarum is hypothesized as the sister species of all other Ammocrypta, with E. pellucidum as the next branch up the tree. In tree 2 these species are hypothesized as sister species. Both trees hypothesized similar relationships between haplotypes of E. vivax and E. meridianum, with E. vivax being unresolved. Both trees also hypothesize similar relationships among haplotypes of E. beanii and E. bifascia, portraying a complex mixture of within and between species haplotype relationships (Fig. 7). It should be noted that PAUP did not find both tree topologies; rather, a series of manual branch swaps using MacClade (Maddison and Maddison, 1992) were performed to examine likely alternative topologies, resulting in tree 1. If PAUP had found both topologies, a strict consensus analysis would have collapsed the appropriate nodes into a trichotomy. This final act of consensus was not performed because of results obtained in the total evidence analysis, as presented later. A final analysis was performed to determine the cost of grouping each haplotype within its own lineage (forcing the haplotypes to form a priori species-level lineages), resulting in a consensus tree derived from 100 phylogenetic trees of 214 steps (CI = 0.668; RI = 0.886; RC = 0.592). Thus a penalty of three steps is required in order to group E. beanii and E. bifascia haplotypes into their appropriate species-level lineages.
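The tree statistics quoted above are linked by a simple identity: the rescaled consistency index is the product of the consistency index and the retention index (RC = CI × RI). The sketch below recomputes RC for the two sets of trees reported in this section and the step penalty for forcing species-level lineages; the variable and function names are illustrative only.

```python
def rescaled_consistency(ci, ri):
    """Rescaled consistency index, defined as CI multiplied by RI."""
    return ci * ri


# (description, steps, CI, RI, reported RC), values as given in the text.
reported = [
    ("equally weighted DNA trees", 211, 0.678, 0.891, 0.604),
    ("haplotypes forced into species lineages", 214, 0.668, 0.886, 0.592),
]

recomputed = [round(rescaled_consistency(ci, ri), 3) for _, _, ci, ri, _ in reported]

# Extra steps needed to group haplotypes into a priori species lineages.
penalty = reported[1][1] - reported[0][1]
```

Both recomputed values match the reported RC to three decimal places, and the difference in tree length is the three-step penalty noted in the text.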
2. Transversion-Weighted Analyses

Weighting transversions by a factor of three over transitions generated 100 equally parsimonious trees of 301 steps. The only difference between these trees and the equally weighted trees is in the relationship of E. vivax and E. meridianum, where the haplotypes group into their "proper" species-level lineages (Fig. 8). The consensus topology of E. beanii and E. bifascia haplotype relationships was not affected by differential weighting.
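How a 3:1 transversion weighting changes tree length can be illustrated with a small weighted-parsimony calculation. The sketch below uses Sankoff's dynamic-programming approach for a single site on a fixed tree; it is a generic illustration, not a description of PAUP's internals, and the four-taxon tree and its states are hypothetical.

```python
import math

BASES = "ACGT"
PURINES = {"A", "G"}


def step_cost(a, b, tv_weight):
    """Cost of changing a to b: 0 if identical, 1 for a transition,
    tv_weight for a transversion."""
    if a == b:
        return 0
    is_transition = (a in PURINES) == (b in PURINES)
    return 1 if is_transition else tv_weight


def sankoff(node, tv_weight):
    """Return {state: minimum cost of the subtree if `node` has that state}.

    A leaf is a single base; an internal node is a tuple of child nodes.
    """
    if isinstance(node, str):
        return {s: (0 if s == node else math.inf) for s in BASES}
    child_costs = [sankoff(child, tv_weight) for child in node]
    return {
        s: sum(
            min(step_cost(s, t, tv_weight) + costs[t] for t in BASES)
            for costs in child_costs
        )
        for s in BASES
    }


def tree_length(tree, tv_weight):
    """Minimum weighted length of one site on the given tree."""
    return min(sankoff(tree, tv_weight).values())


# Hypothetical site: two purines in one clade, two pyrimidines in the other.
tree = (("A", "G"), ("C", "T"))
equal_length = tree_length(tree, tv_weight=1)
weighted_length = tree_length(tree, tv_weight=3)
```

For this site the equally weighted length is 3 steps (two transitions and one transversion), and the length under 3:1 weighting is 5, because the single transversion now counts three times.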
3. Total Evidence Analyses

Total evidence analyses were conducted using morphological data combined with the haplotype vectors for each haplotype. There are several caveats in this analysis. First, it was assumed that morphological data represent true species-level variation and that no significant geographic variation exists for these data that would fundamentally alter species-level characters. Second, the authors have not examined the specimens from which they derived the DNA haplotype vectors for these morphological characters. Third, because E. beanii and E. bifascia share one haplotype (B1), this vector was duplicated: one vector was assigned the morphological data associated with E. beanii (labeled "Bn1"), whereas the other was assigned morphological data associated with E. bifascia (labeled "Bi1"). Fourth, the data matrix presented by Simons (1992) and modified by Shaw et al. (1997) was not designed to ferret out the autapomorphies of each species but to examine the relationships among species. As a result, the matrix is deficient in terminal autapomorphies that would presumably sort out lineages such as E. beanii and E. bifascia.

Of course, each species is distinctive, with its own unique array of character differences. However, these have not been sorted out and phylogenetically argued, so they were not used here. For example, it is not clear if the two continuous dorsal fin stripes of E. bifascia are derived relative to the single stripe in E. beanii (Williams, 1975). It is clear, however, that E. beanii and E. bifascia form separate lineages. Thus, the differences between gene lineages and species relationships have not been interpreted as evidence that fewer species are involved in the analysis.

Phylogenetic analyses were restricted to a maximum of 300 trees. One strict consensus tree (Fig. 9) was generated from 300 equally parsimonious trees of 241 steps (CI = 0.693; RI = 0.903; RC = 0.626). This tree corroborates the hypothesis that E. pellucidum is more closely related to other Ammocrypta than to E. clarum and that E. clarum is the basal member of the subgenus; i.e., the total evidence tree corroborates those relevant parts of the DNA-only tree 1 shown in Fig. 5 rather than tree 2 in Fig. 6. If the analysis is constrained to make E. pellucidum and E. clarum a monophyletic species pair, the result is a tree of 243 steps (CI = 0.687; RI = 0.901; RC = 0.619). The combined evidence consensus tree is identical to the unweighted consensus topology with regard to the E. vivax and E. meridianum samples.

The combined total evidence consensus tree has the effect of collapsing the relationships among E. beanii and E. bifascia haplotypes that are shown in Fig. 7. The explanation for this is simple: there is a conflict between the morphological autapomorphy shared among individuals of E. beanii (character 138-1) and the history of haplotype evolution. This conflict is sufficient to erase all substructure shown in the DNA-only analyses except the basic grouping of B-17, B-18, and B-19, grouped by their unique array of three haplotypes and the pseudoconvergent morphological character 138-1, an autapomorphy of E. beanii.
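The construction of the combined matrix, including the duplication of the shared haplotype, can be sketched as below. The morphological codes for characters 136 to 138 are copied from Table III and the species in which B-1, B-9, and B-17 were found follow the text, but the three-site DNA vectors, the helper name `combine`, and the labeling rule are hypothetical and serve only as an illustration.

```python
# States of characters 136, 137, and 138, copied from Table III.
MORPHOLOGY = {"E. beanii": "111", "E. bifascia": "110"}

# Hypothetical DNA states at three variable sites (not the real vectors).
DNA = {"B-1": "CTA", "B-9": "CTG", "B-17": "TTA"}

# Species in which each haplotype was observed, following the text.
OBSERVED_IN = {
    "B-1": ["E. beanii", "E. bifascia"],
    "B-9": ["E. bifascia"],
    "B-17": ["E. beanii"],
}

ABBREVIATION = {"E. beanii": "Bn", "E. bifascia": "Bi"}


def combine(dna, observed_in, morphology):
    """Append species-level morphology to each haplotype vector.

    A haplotype found in more than one species is duplicated, one copy per
    species, and relabeled with the species abbreviation (e.g. Bn1, Bi1).
    """
    rows = {}
    for haplotype, species_list in observed_in.items():
        for species in species_list:
            if len(species_list) > 1:
                label = ABBREVIATION[species] + haplotype.split("-")[1]
            else:
                label = haplotype
            rows[label] = dna[haplotype] + morphology[species]
    return rows


combined = combine(DNA, OBSERVED_IN, MORPHOLOGY)
```

The two copies of the shared haplotype, `Bn1` and `Bi1`, carry identical DNA states and differ only in character 138, which is the source of the conflict described in the text.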
TABLE I Complete 422-bp Sequence for Etheostoma Species^a

Taxa aligned (rows of the table): Etheostoma nigrum (KU 23143); E. vitreum (KU 24389); E. vitreum (KU 23144); E. pellucidum (KU 23150, two individuals); E. clarum (KU 23145, two individuals); E. vivax; E. meridianum; E. beanii and E. bifascia.

Reference sequence, E. nigrum (KU 23143), with amino acid translation:

tRNA (Glu), 20 bp preceding the start codon
          TTCAACTACAAAAACCTCTA

Cytochrome b
  1-48    ATGGCAAGCCTCCGAAAAACCCACCCCTTACTAAAAATTGCAAACAAC
          M  A  S  L  R  K  T  H  P  L  L  K  I  A  N  N
 49-108   GCACTTGTTGACCTCCCTGCCCCCTCAAATATTTCGGTATGATGGAATTTTGGTTCCCTT
          A  L  V  D  L  P  A  P  S  N  I  S  V  W  W  N  F  G  S  L
109-168   CTAGGTCTTTGCTTAATCACCCAGATCCTCACCGGACTCTTCTTGGCCATACACTATACC
          L  G  L  C  L  I  T  Q  I  L  T  G  L  F  L  A  M  H  Y  T
169-228   GCCGATATTGCCACAGCCTTTTCATCTGTTGCTCATATTTGCCGGGACGTAAACTACGGA
          A  D  I  A  T  A  F  S  S  V  A  H  I  C  R  D  V  N  Y  G
229-288   TGACTTATCCGTAATATTCATGCCAACGGTGCATCCTTTTTCTTCATCTGCATTTACCTG
          W  L  I  R  N  I  H  A  N  G  A  S  F  F  F  I  C  I  Y  L
289-348   CACATTGGACGAGGCCTGTACTACGGGTCCTACCTTTATAAAGAAACATGAAATATTGGA
          H  I  G  R  G  L  Y  Y  G  S  Y  L  Y  K  E  T  W  N  I  G
349-402   GTAGTTCTTCTGCTACTAGTGATAATGACCGCCTTTGTAGGGTATGTTCTTCCC
          V  V  L  L  L  L  V  M  M  T  A  F  V  G  Y  V  L  P

[Rows for the other taxa, printed as differences from the E. nigrum sequence, are not legible in this copy.]

Amino acid substitutions: (1) N↔D; (2) F↔S; (3) I↔V; (4) I↔L; (5) L↔M; (6) V↔I.
^a Amino acid translation of the cytochrome b partial sequence is given for E. nigrum; underlined codons and superscripts denote nonsynonymous substitutions in other species. For E. beanii, E. bifascia, E. meridianum, and E. vivax sequences, lowercase letters denote intraspecific variation at that position. The complete list of haplotype variation for these four species is given in Table II. Codes for intraspecific variation: y = C or T; r = A or G; w = A or T; s = G or C; m = A or C.
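The amino acid translation given for E. nigrum in Table I can be reproduced with a short sketch. The coding sequence below is the E. nigrum cytochrome b fragment as printed in Table I; the codon table is the vertebrate mitochondrial genetic code (NCBI translation table 2), in which TGA codes for tryptophan and ATA for methionine. The function names are illustrative.

```python
BASES = "TCAG"
# Amino acids for codons in TCAG order (TTT, TTC, TTA, TTG, TCT, ...).
STANDARD = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
VERTEBRATE_MT = "FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIMMTTTTNNKKSS**VVVVAAAADDEEGGGG"


def codon_table(amino_acids):
    """Map each of the 64 codons to its amino acid for the given code."""
    codons = [a + b + c for a in BASES for b in BASES for c in BASES]
    return dict(zip(codons, amino_acids))


def translate(dna, table):
    """Translate complete codons of `dna` in reading frame 1."""
    usable = len(dna) - len(dna) % 3
    return "".join(table[dna[i:i + 3]] for i in range(0, usable, 3))


# E. nigrum cytochrome b fragment (402 bp), copied from Table I.
CYTB = (
    "ATGGCAAGCCTCCGAAAAACCCACCCCTTACTAAAAATTGCAAACAAC"
    "GCACTTGTTGACCTCCCTGCCCCCTCAAATATTTCGGTATGATGGAATTTTGGTTCCCTT"
    "CTAGGTCTTTGCTTAATCACCCAGATCCTCACCGGACTCTTCTTGGCCATACACTATACC"
    "GCCGATATTGCCACAGCCTTTTCATCTGTTGCTCATATTTGCCGGGACGTAAACTACGGA"
    "TGACTTATCCGTAATATTCATGCCAACGGTGCATCCTTTTTCTTCATCTGCATTTACCTG"
    "CACATTGGACGAGGCCTGTACTACGGGTCCTACCTTTATAAAGAAACATGAAATATTGGA"
    "GTAGTTCTTCTGCTACTAGTGATAATGACCGCCTTTGTAGGGTATGTTCTTCCC"
)

protein = translate(CYTB, codon_table(VERTEBRATE_MT))
protein_standard = translate(CYTB, codon_table(STANDARD))
```

Under the mitochondrial code the fragment translates to 134 residues with no stop codons, in agreement with the text; under the standard nuclear code the three in-frame TGA codons would be read as stops, which is why the choice of code matters here.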
TABLE II Haplotypes Observed in E. vivax (V), E. meridianum (M), and E. beanii and E. bifascia (B)^a

Haplotypes (rows of the table): V-1 to V-8; M-1 and M-2; B-1 to B-19.

Variable positions and states of haplotype V-1:

Position  -17 -12  -9  -6  -3  24  48  57  60  63  66  98 108 114 120 121 125 126 135 138 141 147 150 159 174 201 207
V-1         A   C   G   C   C   T   C   T   T   C   T   T   G   C   T   C   T   T   C   C   G   T   C   G   T   A   T

Position  216 219 246 270 276 282 303 304 306 309 315 318 333 339 348 351 352 354 357 361 363 375 387 390 396 399
V-1         T   T   T   C   T   C   A   C   G   C   C   A   A   G   A   A   G   G   C   T   G   A   C   A   C   C

[Rows for haplotypes V-2 to V-8, M-1, M-2, and B-1 to B-19 are not legible in this copy.]

^a Positions that did not vary among these haplotypes are omitted; positions within the sequence correspond to those given in Table I.
FIGURE 5 Basal part of tree 1 of two consensus trees generated by analysis of equally weighted DNA sequence data. The crown is shown in Fig. 7. In this and Figs. 6-9, terminal units are haplotype vectors that represent one to several individuals. N, Etheostoma nigrum; Vt, E. vitreum; C, E. clarum; P, E. pellucidum; V, E. vivax; M, E. meridianum; B, E. beanii and E. bifascia. Dark bars are tagged with the number of transversions and transitions that show homoplasy along a particular internode. Open bars are tagged with the number of unique (CI = 1.0) transitions. Open circles are tagged with the number of unique transversions. Autapomorphous characters are not plotted.

FIGURE 6 Basal part of tree 2 of two consensus trees generated by analysis of equally weighted DNA sequence data. See legend to Fig. 5 for definition of abbreviations.

FIGURE 7 Crown of DNA consensus trees 1 and 2. Relationships among haplotypes of E. beanii (●) and E. bifascia (○) are common to the consensus trees shown in Figs. 5 and 6. Note that the haplotypes do not sort into a priori species lineages.

TABLE III Matrix of Morphological Characters Organized by Species^a

Character  E. nigrum  E. vitreum  E. clarum  E. pellucidum  E. vivax  E. meridianum  E. beanii  E. bifascia
117        0          1           1          1              1         1              1          1
118        0          0           0          0              0         0              1          1
119        0          0           0          0              0         0              1          1
120        0          1           1          1              1         1              1          1
121        0          1           1          1              1         1              1          1
122        0          0           1          1              1         1              1          1
123        0          0           1          1              1         1              1          1
124        1          1           0          0              0         0              0          0
125        0          1           1          1              1         1              1          1
126        0          0           0          1              1         1              0          0
127        0          0           1          1              1         1              1          1
128        0          0           0          1              1         1              1          1
129        0          0           0          0              0         0              1          1
130        0          0           0          1              1         1              1          1
131        1          1           0          0              0         0              0          0
132        1          1           0          0              0         0              0          0
133        0          1           1          1              1         1              1          1
134        0          0           1          1              1         1              1          1
135        0          1           1          1              1         1              1          1
136        0          0           0          0              0         0              1          1
137        0          0           0          0              0         0              1          1
138        0          0           0          0              0         0              1          0
139        0          0           1          0              0         0              1          1
140        0          0           1          0              1         0              1          1

^a See Appendix II for a description of characters.
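The morphological matrix in Table III can be queried directly to see which characters carry signal for particular groups. The sketch below copies the coded states from Table III (species order: E. nigrum, E. vitreum, E. clarum, E. pellucidum, E. vivax, E. meridianum, E. beanii, E. bifascia) and lists the characters whose state is shared by a chosen set of species and by no other species. The helper name `shared_only_by` is hypothetical, and this simple tabulation is not a substitute for the parsimony analysis itself.

```python
SPECIES = ["nigrum", "vitreum", "clarum", "pellucidum",
           "vivax", "meridianum", "beanii", "bifascia"]

# Character number -> states for the eight species, copied from Table III.
MATRIX = {
    117: "01111111", 118: "00000011", 119: "00000011", 120: "01111111",
    121: "01111111", 122: "00111111", 123: "00111111", 124: "11000000",
    125: "01111111", 126: "00011100", 127: "00111111", 128: "00011111",
    129: "00000011", 130: "00011111", 131: "11000000", 132: "11000000",
    133: "01111111", 134: "00111111", 135: "01111111", 136: "00000011",
    137: "00000011", 138: "00000010", 139: "00100011", 140: "00101011",
}


def shared_only_by(matrix, group):
    """Characters whose state is uniform within `group` and absent elsewhere."""
    inside_idx = [SPECIES.index(name) for name in group]
    outside_idx = [i for i in range(len(SPECIES)) if i not in inside_idx]
    result = []
    for character, states in sorted(matrix.items()):
        inside = {states[i] for i in inside_idx}
        outside = {states[i] for i in outside_idx}
        if len(inside) == 1 and inside.isdisjoint(outside):
            result.append(character)
    return result


pair_characters = shared_only_by(MATRIX, ["beanii", "bifascia"])
beanii_only = shared_only_by(MATRIX, ["beanii"])
bifascia_only = shared_only_by(MATRIX, ["bifascia"])
```

The tabulation finds five characters uniting E. beanii and E. bifascia, a single character (138) unique to E. beanii, and none unique to E. bifascia, which is consistent with the statement that the matrix is deficient in terminal autapomorphies.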
FIGURE 8 A portion of the topology of a consensus tree generated by weighting transversions three times as heavily as transitions. Topologies (but not necessarily character distributions) of other parts of the tree are identical to those of consensus trees 1 and 2 shown in Figs. 5, 6, and 7.

4. Nonsynonymous Codon Substitutions

This class of characters does not appear in any of the phylogenetic analyses because the analyses were carried out on the sequence level and thus their contribution to the analyses would amount to using the same information twice. The authors note, however, that this class of changes includes amino acid substitutions that corroborate the monophyly of Ammocrypta [two nonsynonymous substitutions (Table I)] and the monophyly of E. vivax, E. meridianum, E. beanii, and E. bifascia [one nonsynonymous substitution (Table I)]. Three nonsynonymous substitutions were also found within species (Table I).

5. Effects of Sampling Size

Because systematic studies using DNA sequencing are frequently undertaken using one or a few specimens for each taxon, the authors were interested in the effects of sample size on the phylogenetic results generated. To test these effects, 10 replicate phylogenetic analyses using a single individual for each of the ingroup taxa were performed. An exhaustive search was implemented. Tree lengths varied between 150 and 166
FIGURE 9 A total evidence analysis using equally weighted DNA characters and morphological characters. Character annotations are similar to previous figures, except for the morphological characters. Dashes are labeled with the specific morphological character that was unique (CI = 1.0), whereas morphological characters found to be homoplasious are denoted with an "X."
TABLE IV Distribution of Haplotypes among E. beanii and E. bifascia Samples

Columns: species; drainage; haplotypes B-1 through B-19; sample size.
Rows: E. beanii (Pascagoula, Escatawpa, Tombigbee, Alabama; subtotal 35); E. bifascia (Perdido, Escambia, Yellow, Blackwater; subtotal 34); total 69.
[Per-drainage haplotype counts are not legible in this copy.]
steps, whereas rescaled consistency indices varied between 0.765 and 0.816. Skewness (g1) varied between -1.44 and -1.63. Thus, one would judge all the results to carry a significant phylogenetic signal using the skewness criterion of Hillis (1991) and Hillis and Huelsenbeck (1992). The 10 replicates resulted in two tree topologies. Eight of the 10 replicates recovered DNA tree 2 (Fig. 6), which was also found when all individuals were analyzed. Two of the replicates found this tree plus an additional tree that asserts that E. meridianum is the sister species of the E. beanii-E. bifascia pair. None of the results duplicated the sequential basal positions of E. clarum and E. pellucidum shown in tree 1 (Fig. 5). Three additional analyses were performed using three randomly picked E. beanii and E. bifascia, and two each of the other members of the ingroup (because of haplotype diversity this amounted to all other members of the ingroup except E. vivax). Branch and bound analyses were employed to ensure that the most parsimonious tree or set of trees was found. All three replicates were consistent with the tree shown in Figs. 6 and 7, including one analysis that grouped an E. beanii with individual E. bifascia.
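The skewness statistic used in the criterion cited above is the standardized third moment of a distribution of tree lengths; a strongly negative value indicates a long tail of short trees. The sketch below computes g1 from a list of lengths. The list shown is hypothetical, the function name is illustrative, and the critical values needed to judge significance are not reproduced here.

```python
def g1(values):
    """Skewness: third central moment divided by the 1.5 power of the second."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Hypothetical tree lengths with a tail toward short trees (illustration only).
tree_lengths = [150, 158, 160, 161, 162, 163, 163, 164, 164, 165]
skew = g1(tree_lengths)

# A symmetric distribution has zero skewness.
symmetric_skew = g1([1, 2, 3])
```

A left-skewed set of lengths such as the hypothetical one above gives a negative g1, the direction of skew associated with phylogenetic signal under this criterion.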
B. Intraspecific Evolution and Biogeography

1. E. beanii and E. bifascia

A total of 19 mitochondrial haplotypes were detected among the 69 individual E. beanii and E. bifascia sequenced. There was very little overlap in the distribution of these haplotypes among drainages (Table IV). Only 3 haplotypes were found in more than one drainage system: among E. bifascia, haplotypes B-1 and B-9 appeared in the Yellow and Blackwater rivers; and among E. beanii, haplotype B-17 appeared in the Pascagoula and Escatawpa rivers. In both cases the rivers share a common estuary (Fig. 3, see numbers in legend to identify river drainages). The narrow geographic distribution of mitochondrial haplotypes among E. beanii and E. bifascia samples indicates that there is little contemporary gene flow between populations occupying separate river systems. Nucleotide diversities among these samples were estimated as 0.0033 for E. beanii and 0.0028 for E. bifascia using the formula of Nei (1987). These estimates are low, but within the range reported for other species and genes (Nei, 1987; Avise, 1994). Genetic distance between the two species, estimated from these diversities, was 0.0062.

For the second level of analysis, genealogical relationships among haplotypes were added to geographic data. Haplotypes from both species were considered together. Genealogical relationships were inferred from patterns of sequence variation among haplotypes using two methods. The first was the parsimony-based phylogenetic analysis described earlier, which yielded the consensus tree shown in Fig. 7. The haplotypes found in E. beanii and E. bifascia were not separated on this tree. The most basal split occurred between the three haplotypes found only in the Escatawpa and Pascagoula rivers (B-17, B-18, and B-19), which were the westernmost drainages sampled, and all other haplotypes. The other resolved node within this section of the tree, comprising B-1 through B-11, has no obvious biogeographical interpretation: these haplotypes were found in rivers to the east of the Escatawpa and Pascagoula. Clearly, the interspecific phylogenetic tree lacks resolution at the intraspecific level for E. beanii and E. bifascia.

For such closely related haplotypes, construction of an unrooted, minimum spanning network provides an alternative method for genealogical inference (Templeton et al., 1992; Excoffier and Smouse, 1994). In this approach, haplotypes are represented as nodes connected to one another by the minimum number of discrete mutational steps. A minimum spanning network for the E. beanii and E. bifascia haplotypes is shown in Fig. 10 (note figure legend for correct spatial relationships among drainages). The Escatawpa and Pascagoula haplotypes (B-17 through B-19) can be used as examples of the principles used in its construction. In this network, haplotypes B-18 and B-17 are connected by a single line segment, which represents the single nucleotide substitution that distinguishes them (C↔T at position 66; Table II). Haplotype B-19 differs from B-17 by two other substitutions (G↔A at -9 and C↔T at 399; Table II), which are represented by two line segments. The filled circle separating these two segments may be regarded as an unobserved intermediate haplotype.
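The construction principle described for haplotypes B-17, B-18, and B-19 can be sketched with pairwise difference counts and a minimum spanning tree. The three variable sites (-9, 66, and 399) and the number of differences between haplotypes follow the text, but the 0/1 coding of states is an assumption made for illustration, and Prim's algorithm is used here as a simple stand-in: a full minimum spanning network would also retain equally short alternative connections.

```python
# States at sites -9, 66, and 399, coded 0/1 relative to B-17 (assumed coding).
HAPLOTYPES = {"B-17": "000", "B-18": "010", "B-19": "101"}


def differences(a, b):
    """Number of sites at which two aligned haplotype vectors differ."""
    return sum(x != y for x, y in zip(a, b))


def minimum_spanning_edges(haplotypes):
    """Prim's algorithm: return (steps, from, to) edges of a spanning tree."""
    names = list(haplotypes)
    in_tree = {names[0]}
    edges = []
    while len(in_tree) < len(names):
        best = min(
            (differences(haplotypes[u], haplotypes[v]), u, v)
            for u in in_tree
            for v in names
            if v not in in_tree
        )
        edges.append(best)
        in_tree.add(best[2])
    return edges


edges = minimum_spanning_edges(HAPLOTYPES)

# Each edge of k steps implies k - 1 unobserved intermediate haplotypes.
intermediates = sum(steps - 1 for steps, _, _ in edges)
```

The resulting tree connects B-18 to B-17 by one step and B-19 to B-17 by two steps, so B-17 is the internal node, one unobserved intermediate is implied, and B-18 and B-19 are separated by three steps in total.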
E. O. WILEY AND ROBERT H. HAGEN

[Figure 10 labels. Drainages: Yellow & Blackwater, Escambia, Perdido, Tombigbee, Alabama, Escatawpa & Pascagoula. Legend: E. beanii, E. bifascia.]

FIGURE 10 A network of haplotypes observed in E. beanii and E. bifascia. Dots represent unobserved haplotypes one mutational step removed from other dots and/or observed haplotypes. The root occurs between haplotypes B-17 and B-12. Drainages from which haplotypes were observed are overlaid on the network. Note that for graphic reasons the geographic positions of drainages are not accurate. The actual spatial relationships among the drainages from west to east are: Pascagoula, Escatawpa, Tombigbee, Alabama (Tombigbee + Alabama = Mobile Bay drainage), Perdido, Escambia, Blackwater, Yellow rivers.

Together, the three line segments that separate haplotypes B-18 and B-19 represent the three nucleotide substitutions that distinguish this pair. An important distinction between the network shown in Fig. 10 and the tree shown in Fig. 7 is that extant haplotypes in the network, such as B-17, can serve as internal nodes and have multiple branches. These haplotypes may be interpreted as surviving ancestral haplotypes that may have spawned a series of descendant haplotypes. Although the network does not, by itself, imply any ancestor-descendant directionality, support for the basal status of B-17 relative to B-18 and B-19 is provided by the observation that the root of the network, as shown by the phylogenetic tree, requires that B-17 be considered basal and by the fact that the substitutions found in the latter two haplotypes are all unique within the set of E. beanii/E. bifascia haplotypes. The root of the entire E. beanii/E. bifascia clade lies between haplotypes B-17 and B-12, based on the reconstructed ancestral character state vector derived from the interspecific phylogenetic tree shown in Fig. 7. Of these two haplotypes, B-17 is one step from the ancestral node reconstructed by the parsimony analysis whereas B-12 is three steps removed. Another feature of the network in Fig. 10 is the closed loop connecting haplotypes B-1, B-3, B-2, and B-9. The presence of such loops indicates ambiguity among alternative, equally short, pathways connecting the
haplotypes (Excoffier and Smouse, 1994). Two circumstances could produce this condition: either one of the haplotypes arose from independent mutations in two different lineages (i.e., convergence) or one of the line segments represents a direct evolutionary path that did not actually occur (i.e., a case of back mutation resulting in character reversal at a particular site). As examples, haplotype B-2 may have been derived independently from B-9 and B-3 (convergence), or haplotypes B-1 and B-3 may differ by three mutations rather than one (transitions from C to T at positions 147 and 282, followed by reversal to C at 147; Table II).

The rooted version of the network shown in Fig. 10 provides more biogeographic information than the haplotype tree. Haplotypes common to both species (B-1) or with descendants in both species (B-12) are near the middle of the network and have many descendants. Haplotypes represented by single individuals occur at the tips of the branches, as would be expected if these represent recent divergences. As in the tree, the three Escatawpa and Pascagoula haplotypes are clearly distinct, but co-occurring haplotypes in other drainages tend to be clustered on the network. There is an overall west-to-east pattern of haplotype distributions, and most haplotypes are confined to single drainages (e.g., Perdido River populations) or drainages that share a common bay (e.g., Escatawpa and Pascagoula populations). This congruence would be expected of populations that have experienced little gene flow over a long period of time, sufficient for new haplotypes to evolve in situ. The authors note that although contemporary gene flow is lacking, the rooted network apparently does contain some haplotype relationships that imply historical connections between drainages. For example, the Escambia River is linked to both the Yellow and Blackwater and the Perdido through a series of ancestral/descendant haplotypes. Some of these relationships are probably younger than others.
For example, the relationship between B-3 and B-11 suggests a relatively recent gene flow event because B-11 is far removed from any relationship with other Perdido haplotypes.

2. E. vivax and E. meridianum
Eight haplotypes were detected among 11 E. vivax individuals sequenced. The V-2 haplotype occurred in fish from the geographically close samples taken from the Mississippi and St. Francis rivers; the remaining haplotypes appeared only in single drainages. Two haplotypes were observed among the four E. meridianum individuals sequenced (Table V). For these species, the phylogenetic trees based on equally weighted substitutions (Figs. 5 and 6) and the
minimum spanning network (Fig. 11) yield identical information. The eight E. vivax and two E. meridianum haplotypes resolve into five distinct haplotype lineages, none of which can be identified as basal. Each of the haplotype lineages is composed of individuals from the same or interconnected drainages. Greater divergence among haplotypes does limit the effectiveness of the network approach to genealogical reconstruction. When basal haplotypes are missing from the array (due to extinction or insufficient sampling), most of the network reduces to a tree, although information may still be obtained at the branch tips (Templeton et al., 1992). The small sample sizes representing these taxa limit the biogeographic analyses that can be made. The estimated nucleotide diversity among E. vivax was 0.0142, about four times higher than for E. beanii and E. bifascia. However, the geographic area represented by these samples was also considerably larger, which could also account for the difference.

6. mtDNA Sequence Variation among Sand Darters

TABLE V Distribution of Haplotypes among E. vivax and E. meridianum Samples

Columns: Species; Drainage; Haplotype (M-1, M-2, V-1, V-2, V-3, V-4, V-5, V-6, V-7, V-8); Sample size

Species      Drainage      M-1  M-2  Sample size
meridianum   Tombigbee      3    1        4
             Subtotal       3    1        4
vivax        Mississippi    0    0        2
             St. Francis    0    0        2
             Ouachita       0    0        3
             Pearl          0    0        2
             Sabine         0    0        2
             Subtotal       0    0       11
Total                                    15

[Counts for haplotypes V-1 through V-8 are not legible in this copy of the table.]
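The nucleotide diversity values quoted above are attributed to the formula of Nei (1987). As a hedged illustration, one common form of that estimator is the average proportion of differing sites over all pairs of sampled sequences, which for haplotype frequencies x_i and pairwise differences per site d_ij equals n/(n-1) times the sum of x_i x_j d_ij. The sketch below assumes that form, assumes aligned sequences of equal length, and uses hypothetical toy sequences rather than data from this study.

```python
from itertools import combinations


def nucleotide_diversity(sequences):
    """Mean proportion of differing sites over all pairs of sampled sequences.

    sequences: list of aligned, equal-length strings, one per individual.
    """
    if len(sequences) < 2:
        raise ValueError("need at least two sequences")
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("sequences must be aligned to the same length")
    pairs = list(combinations(sequences, 2))
    total = 0.0
    for a, b in pairs:
        differences = sum(x != y for x, y in zip(a, b))
        total += differences / length
    return total / len(pairs)


# Hypothetical sample of four individuals scored at ten sites (toy data).
sample = [
    "ACGTACGTAC",
    "ACGTACGTAC",
    "ACGTACGTAT",
    "ACGAACGTAC",
]
pi = nucleotide_diversity(sample)
```

Because the calculation averages over individuals rather than over distinct haplotypes, common haplotypes are weighted by their frequency in the sample, which is why a sample dominated by a few closely related haplotypes yields a low value.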
[Figure 11 labels. Drainages: Ouachita, Tombigbee, Sabine, Mississippi & St. Francis, Pearl.]

FIGURE 11 A network of haplotypes observed in E. vivax and E. meridianum. Dots represent unobserved haplotypes one mutational step removed from other dots and/or observed haplotypes.

V. Discussion

A. Phylogenetic Analysis
The major difference between the DNA-only analyses and those based on morphology and allozymes presented by Shaw et al. (1997) concerns the phylogenetic placement of E. pellucidum and the E. vivax-E. meridianum species pair. Shaw et al. (1997) hypothesized that E. pellucidum was the sister group of all Ammocrypta except E. clarum, a hypothesis identical to DNA tree 1 (Fig. 5), but not to DNA tree 2 (Fig. 6). If only a few individuals of each species had been sampled, the authors would have consistently drawn a conclusion at variance with the morphological and allozymic data unless they employed an approach that at least minimally incorporated both the DNA and morphological evidence. The authors' DNA-morphology tree favors the basal placement of E. clarum (Fig. 5) rather than a monophyletic group comprising E. clarum and E. pellucidum (Fig. 6). The authors consider this combined analysis tree (Fig. 9) to be their best estimate of the phylogenetic relationships among sand darters in this study because of the principle of total evidence (Kluge, 1989) and thus conclude that Shaw et al. (1997) correctly placed E. clarum and E. pellucidum. Regarding
E. vivax and E. meridianum, the authors consistently arrived at the hypothesis that this species pair is monophyletic regardless of the data and weighting scheme used (Figs. 5, 6, and 9). Because this result draws on the morphological data of Shaw et al. (1997) as well as the DNA data, the authors suggest that it is a more robust phylogenetic hypothesis than the alternative hypothesis that E. vivax is the sister to the E. beanii-E. bifascia pair. The biogeographic history implied by the total evidence phylogeny also provides some corroboration. A biogeographic pattern of a Mobile Bay drainage endemic being closely related to a species found west and north of the Mobile Bay is replicated in both the Hybopsis longirostris species group (Wiley and Titus, 1992) and the Lythrurus roseipinnis species group (Wiley and Siegel-Causey, 1994).
B. Biogeography of E. vivax and E. meridianum

E. vivax is a relatively widespread species and it is not surprising that the populations sampled displayed a diversity of haplotypes. Although two of the E. vivax samples were geographically close (upper Pearl River drainage) to the samples of E. meridianum, they were not particularly close to that species in terms of sequence similarity. The authors hesitate to speculate as to the cause of the diversity observed in E. vivax haplotypes and suggest that a broader survey, including greater sample sizes and better geographic coverage, is needed.
C. Biogeography of E. beanii and E. bifascia
Despite the relative proximity of the rivers from which E. beanii and E. bifascia samples were collected, there was no indication of ongoing gene flow between populations within species as evidenced by haplotypes shared among drainages. The few exceptions were not sufficient to obscure the correlation. AMOVA analysis (Excoffier et al., 1992) does not contribute much additional information because there is little overlap among haplotype distributions. AMOVA may be more appropriate for situations where current gene flow plays a more significant role between populations within species. In contrast to clear differentiation among river drainage systems within species, relationships between the two species were obscure. Clearly, the gene tree shown through parsimony analysis and the haplotype network was not congruent with relationships on the species level, i.e., neither the gene tree nor the haplotype network supports the conclusion that E. beanii and
E. bifascia are separate species, but both morphological (Williams, 1975) and allozyme data (Shaw et al., 1997) leave no doubt that E. beanii and E. bifascia are separate lineages ("good" species). The authors suspect that this incongruence is due to the retention of ancestral haplotypes subsequent to speciation followed by mutational events derived from these haplotypes isolated in separate drainages. Such a scenario would not be expected to mirror evolution on the morphological or allozymic levels. This speculation is based on three lines of evidence. First, when the haplotype network was inspected, the authors observed that the haplotype shared by both species (B-1) is relatively far removed from the root of the network as shown by the phylogenetic analysis and that one found only in E. bifascia (B-12) is between the shared haplotype (B-1) and the ancestral node (Fig. 10). Second, the shared haplotype is not found in geographically contiguous drainages, being found in the Tombigbee (E. beanii) and Yellow and Blackwater (E. bifascia) rivers. Given the lack of gene flow within species between drainages, it is unlikely that this shared haplotype is the result of gene flow. Third, the root of the network, as shown by the phylogenetic tree, is between the two major populations of E. beanii. Although the authors do not doubt that the Pascagoula/Escatawpa system is isolated from the Mobile Bay (Alabama-Tombigbee) drainage, it is believed that this isolation occurred after the speciation event that separated E. beanii and E. bifascia. This is based on the general geologic history of the region, the monophyly of the species pair, and the distinctiveness of the two species, which indicates that any split occurring between populations of E. beanii must have occurred after, not before, the origin of E. beanii and E. bifascia from their common ancestor. If this interpretation is accepted, then the historical gene flow patterns implied by the rooted network can be further resolved (Fig.
10). Older gene flow patterns are represented by haplotypes B-1 and B-12 and denote gene flow that occurred before the origin of either descendant species from their common ancestor. If so, then the ancestor/descendant patterns between these haplotypes and others found in both E. beanii and E. bifascia do not imply recent gene flow between these two species. Furthermore, they do not imply recent gene flow within E. bifascia relative to the relationship between the Blackwater and Yellow rivers on the one hand and the Escambia River on the other hand. Thus, the authors suggest that the mere presence of shared haplotypes cannot be automatically interpreted as evidence of gene flow. Rather, shared haplotypes are better interpreted within a historical context.
Parts of the rooted network give indications of historical gene flow between drainages within species. There are two possible kinds of relationships within species between drainages. First, a haplotype might be shared between drainages. Haplotype B-1, as an ancestral haplotype, may not be indicative of a relationship between the Yellow and the Blackwater rivers, but B-9, a derived haplotype, probably is indicative of that relationship. Second, ancestor/descendant relationships among different haplotypes occupying different drainages might be indicative of a relationship. For example, however we resolve the loop B-1, B-9, B-2, B-3, the relationships of these haplotypes to other haplotypes in the Escambia, Yellow, and Blackwater rivers denote historical gene flow exclusive to these drainages. The Perdido, highly isolated, shows some affinities to the contiguous Escambia through the B-3/B-11 haplotype relationship, but we must be cautious. Haplotype B-1, if interpreted as an ancestral haplotype, is not indicative of a close relationship among the Perdido, Escambia, Yellow, and Blackwater rivers. The link between B-1 and B-12 does not imply a close relationship among the Yellow, Blackwater, and Perdido rivers. Just as shared haplotypes might not be indicative of a relationship between drainages, the link between ancestral and descendant haplotypes may indicate more ancient rather than more recent gene flow.
D. Speciation and Biogeography

E. beanii and E. bifascia are part of a larger story involving replicate speciation among several groups of fishes inhabiting the northern Gulf Coastal Plain. Wiley (1977) suggested that the speciation event involving two sister species of topminnows (Fundulidae), Fundulus nottii and F. escambiae, might be correlated with tectonic events occurring in the region that had apparently shifted drainage patterns from a northeast-southwest direction to a north-south direction (Price and Whetstone, 1977). This event would have separated the Mobile Bay drainage basin (Alabama and Tombigbee rivers) from those immediately to the east (Perdido and Escambia rivers), producing the biogeographic pattern observed in the two species. Wiley and Mayden (1985) reviewed biogeographic distributions of a number of species groups of fresh and brackish water fishes, as well as selected other aquatic vertebrates and invertebrates. The same drainage boundaries corresponded to species limits for a number of groups. Wiley and Mayden (1985) identified six additional species pairs distributed in such a manner that one of the pair occupied the Mobile Bay basin and drainages to the west, whereas the sister species occupied the Perdido and Escambia river drainages and
drainages to the east. These include two groups of topminnows (F. nottii-F. escambiae; F. confluentus-F. pulvereus), two species pairs of darters (E. chlorosoma-E. davisoni; E. beanii-E. bifascia), two species pairs of minnows (L. roseipinnis-L. atripiculus; H. longirostris-Hybopsis sp.), and three species pairs of snakes (Natrix rhombifera-N. taxispilota; N. cyclopion-N. floridans; Farencia reinwardti-F. abacura). Two apparent exceptions to this pattern, a group of Hybopsis minnows (the H. longirostris group) and a group of Lythrurus minnows (the L. roseipinnis group), were shown by subsequent phylogenetic analyses to conform to the pattern (Wiley and Titus, 1992; Wiley and Siegel-Causey, 1994). Additional groups that might be implicated in the pattern include subspecies of the pike Esox americanus (Crossman, 1966), species of the blenny genus Chasmodes (Williams, 1983), and the mosquito fishes Gambusia affinis and G. holbrooki (Wooten et al., 1988; Wooten and Lydeard, 1990; Scribner and Avise, 1993).

What emerges from the phylogenetic studies is a pattern of vicariance. In the case of closely related and relatively recent species pairs such as E. beanii and E. bifascia, dispersal apparently has not altered the original vicariance pattern and it is relatively easy to correlate this pattern with geologic events. When one examines the patterns of the clades to which the various species pairs studied by Wiley and Mayden (1985) belong, it becomes quickly apparent that simple vicariance explanations of deeper nodes are difficult (Wiley and Mayden, 1985), but there is some trace among the fishes. The closest relatives of H. longirostris and Hybopsis sp. live in the Mobile Bay basin and in tributaries west to the Mississippi. The same is generally true of the relatives of L. roseipinnis and L. atripiculus: L. bellus, a Mobile Bay endemic, is the sister species of the pair whereas other relatives are found to the north and west (L. ardens and L. umbratilis). F.
nottii and F. escambiae are a bit different in having an eastern sister species (F. lineolatus), but the three are related to two species that have more western distributions. Thus, finding that E. pellucidum is not, in fact, closely related to either E. vivax or E. meridianum (Shaw et al., 1997) brings the biogeography of the subgenus in closer agreement with what is known of the phylogeny and biogeography of fishes with similar distributions and presumably similar histories. Swift et al. (1986) suggest that the eastern Gulf Coast lowland fish fauna represented by such groups as the F. nottii group and, by extension, Ammocrypta colonized the coastal plain during the late Miocene during a period of low sea levels. Their time table of vicariance suggests a late Miocene-early Pliocene (4-5 mybp) event and certainly an event that happened no later than the late Pliocene [1-2 mybp: see Price and Whet-
stone (1977) for geological evidence involving changes in drainage patterns]. In other groups, Pleistocene events are not correlated with speciation, but with geographic variation (Swift et al., 1986). Thus, differences within species among the various drainages along the Gulf Coastal Plain would be expected, but the authors conclude that they are younger than the vicariance events that produced the species observed in this study. If a relatively recent event (isolation of the Pascagoula/Escatawpa from the Mobile Bay within E. beanii) falls out at the base of a phylogenetic tree while a more ancient event (speciation of E. beanii and E. bifascia) does not, the most likely conclusion is that both species retained an array of haplotypes that were present in their common ancestor and that, given this, there is no expectation that the gene tree will reflect the species tree until such time as these ancestral haplotypes go extinct and more lineage-specific haplotypes emerge. Obviously, 2 - 4 million years has been insufficient time for this to occur.
VI. Summary

DNA sequence data were obtained from the N-terminal end of the mitochondrial cytochrome b gene and a portion of the adjacent glutamine tRNA gene for 89 individuals representing six species of Etheostoma (Ammocrypta) and two closely related species, E. vitreum and E. nigrum. Substitutions were found at 27% of 422 sites and 71% of these were transitions. Six substitutions produce nonsynonymous codons out of the 134 cytochrome b codons included in the sequence. Among the recently diverged sister species E. beanii and E. bifascia, there were 21 substitutions (5%), none of which were fixed between species. Phylogenetic analyses of DNA data with haplotypes as terminal taxa were performed using equally weighted and transversion-weighted matrices. A total evidence analysis was also performed that favored one of the DNA-only trees. In no case were the relationships among the haplotypes of E. beanii and E. bifascia resolved. Rather, some E. beanii haplotypes were basal relative to the remaining E. beanii and all E. bifascia haplotypes. A minimum spanning network of haplotypes was constructed. This network suggests that river drainage systems are isolated from each other and that haplotypes show high drainage system affinity. The placement of the root derived from the phylogenetic analysis onto the network suggests that there has been insufficient time for complete lineage sorting since E. beanii and E. bifascia diverged from their common ancestor. Despite this, the rooted network does suggest biogeographic relationships of populations from some of the drainages within E. bifascia.
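The summary reports the share of substitutions that were transitions. As a small illustration of how that distinction is drawn, the sketch below classifies a substitution as a transition (purine to purine or pyrimidine to pyrimidine) or a transversion (purine to pyrimidine or the reverse) and computes the fraction of transitions in a list. The function names are hypothetical, and the example list is toy input: its first three entries match the kinds of changes mentioned in the text (C to T, G to A), while the last is an invented transversion.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(a, b):
    """Return 'transition' or 'transversion' for a change from base a to base b."""
    a, b = a.upper(), b.upper()
    bases = PURINES | PYRIMIDINES
    if a not in bases or b not in bases:
        raise ValueError("bases must be one of A, C, G, T")
    if a == b:
        raise ValueError("no substitution: bases are identical")
    same_class = {a, b} <= PURINES or {a, b} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


def transition_fraction(substitutions):
    """Fraction of (from, to) substitutions that are transitions."""
    kinds = [classify_substitution(a, b) for a, b in substitutions]
    return kinds.count("transition") / len(kinds)


# Toy input: three transitions and one hypothetical transversion.
example = [("C", "T"), ("G", "A"), ("C", "T"), ("A", "C")]
fraction = transition_fraction(example)
```

A transversion-weighted analysis of the kind mentioned above would give the second class a larger cost than the first; the weights themselves are not stated in this section, so none are assumed here.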
Acknowledgments

We thank Bob Cashner and Steve Stevenson (University of New Orleans), Rick Mayden, Berney Kahajda, and Andrew Simons (University of Alabama), Hank Bart (Tulane University), Doug Siegel-Causey (University of Nebraska), Tim Schmidt (Wayne State University), George Harp (Arkansas State University), and Frank Cross and Kate Shaw (University of Kansas) for their help in collecting specimens in the field. Thanks to Larry Page (Illinois Natural History Survey) for specimens of E. pellucidum. Thanks to Bob Jenkins (Roanoke College) for specimens of E. vitreum. This project was generously supported by grants from the General Research Fund, University of Kansas (3208), and from the National Science Foundation (BSR 8722562 for field work and DEB 9207600 for data collecting and analysis).
References Avise, J. C. 1992. Molecular population structure and the biogeographic history of a regional fauna: A case history with lessons for conservation biology. Oikos 63: 62- 76. Avise, J. C. 1994. "Molecular Markers, Natural History and Evolution." Chapman and Hall, New York. Avise, J. C., Arnold, J., Ball, R. M., Bermingham, E., Lamb, T., Neigel, J. E., Reeb, C. A., and Saunders, N. C. 1987. Intraspecific phylogeography: The mitochondrial DNA bridge between population genetics and systematics. Annu. Rev. Ecol. Syst. 18: 489- 522. Bertorelle, G., and Barbujani, G. 1995. Analysis of DNA diversity by spatial autocorrelation. Genetics 140:811-819. Bickham, J. W., Wood, C. C., and Patton, J. C. 1995. Biogeographic implications of Cytochrome b sequences and allozymes in sockeye (Oncorhynchus nerka). J. Hered. 86:140-144. Birt, T. P., Friesen, V. L., Birt, R. D., Green, J. M., and Davidson, W. S. 1995. Mitochondrial DNA variation in Atlantic capelin, Mallotus villosus: A comparison of restriction and sequence analyses. Mol. Ecol. 4: 771- 776. Carr, S. M., Snellen, A. J., Howse, K. A., and Wroblewski, J. S. 1995. Mitochondrial DNA sequence variation and genetic stock structure of Atlantic cod (Gadus morhua) from bay and offshore locations on the Newfoundland continental shelf. Mol. Ecol. 4: 79-88. Crossman, E. J. 1966. A taxonomic study of Esox americanus and its subspecies in Eastern North America. Copeia 1966(1):1-20. Danzmann, R. E., and Ihssen, P. E. 1995. A phylogeographic survey of brook charr (Salvelinus fontinalis) in Algonquin Park, Ontario based upon mitochondrial DNA variation. Mol. Ecol. 4: 681-697. Excoffier, L., and Smouse, P. E. 1994. Using allele frequencies and geographic subdivision to reconstruct gene trees within a species: Molecular variance parsimony. Genetics 136:343-359. Excoffier, L., Smouse, P. E. and Quattro, J. M. 1992. 
Analysis of molecular variance inferred from metric distances among DNA haplotypes: Application to human mitochondrial DNA restriction data. Genetics 131: 479-491. Graybeal, A. 1993. The phylogenetic utility of cytochrome b: Lessons from bufonid frogs. Mol. Phylo. Evot. 2:256-269. Hocutt, C. H. 1980a. Ammocrypta meridiana Williams. In "Atlas of North American Freshwater Fishes" (D. S. Lee et al., eds.). North Carolina St. Mus. Nat. Hist., Raleigh, NC. Hocutt, C. H. 1980b. Ammocrypta pellucida (Agassiz). In "Atlas of North American Freshwater Fishes" (D. S. Lee et at., (eds.). North Carolina St. Mus. Nat. Hist., Raleigh, NC. Hudson, R. R. 1990. Gene genealogies and the coalescent process. Oxford Surv. Evol. Biol. 7:1-44. Irwin, D. M., Kocher, T. D., and Wilson, A. C. 1991. Evolution of the cytochrome b gene of mammals. J. Mot. Evol. 32:128-144.
Kluge, A. G. 1989. A concern for evidence and a phylogenetic hypothesis of relationships among Epicrates (Boidae, Serpentes). Syst. Zool. 38: 7-25, Kocher, T. D., Meyer, W. K., et al. 1989. Dynamics of mitochondrial DNA evolution in animals. Proc. Natl. Acad. Sci. USA 86:61966200. Krajewski, C., and King, D. G. 1996. Molecular divergence and phylogeny: Rates and patterns of cytochrome b evolution in cranes. Mol. Biol. Evol. 13:21-30. Lydeard, C., Wooten, M. C. and Meyer, A. 1995. Cytochrome b sequence variation and a molecular phylogeny of the live-bearing fish genus Gambusia (Cyprinodontiformes: Poeciliidae). Can. J. Zool. 73:213-227. Maddison, W. P., and Maddison, D. R. 1992. "MacClade." Sinauer, Sunderland, MA. Magoulas, A., Tsimenides, N., and Zouros, E. 1996. Mitochondrial DNA phylogeny and the reconstruction of the population history of a species: The case of the European anchovy (Engraulis encrasicolus). Mol. Biol. Evol. 13:178-190. Maniatis, T., Fristch, E. F., and Sambrook, J. 1982. "Molecular Cloning: A Laboratory Manual." Cold Spring Harbor Laboratory, Cold Spring, NY. Mayden, R. L., B. M. Burr, L. M. Page, and R. R. Miller. 1992. The native freshwater fishes of North America. "Systematics, Historical Ecology and North American Freshwater Fishes" (R. L. Mayden, ed.), pp. 827-863. Stanford Univ. Press, Stanford, CA. Meyer, A. 1994a. DNA technology and phylogeny of fish. In "Genetics and Evolution of Aqauatic Organisms" (A. R. Beaumont, ed.), pp. 219-249. Chapman and Hall, London. Meyer, A. 1994b. Shortcomings of the cytochrome b gene as a molecular marker. Trends Ecol. Evol. 9:278-280. Murphy, W. J., and Collier, G. E. 1996. Phylogenetic relationships within the aplocheiloid fish genus Rivulus (Cyprinodontiformes, Rivulidae): Implications for Caribbean and Central American biogeography. Mol. Biol. Evol. 13:642-649. Nei, M. 1987. "Molecular Evolutionary Genetics." Columbia Univ. Press, New York. Paabo, S. 1990. Amplifying ancient DNA. 
In "PCR Protocols: A Guide to Methods and Applications" (M. A. Innes et al., eds.), pp. 159-166. Academic Press, San Diego. Patton, J. L., da Silva, M. N. F., and Malcolm, J. R. 1996. Hierarchical genetic structure and gene flow in three sypatric species of Amazonian rodent. Mol Ecol. 5: 229- 238. Price, R. C., and Whestone, K. N. 1977. Lateral stream migration as evidence for regional geologic structures in the eastern Gulf Coastal Plain. Southeast. Geol. 18(3):129-147. Saiki, R. K. 1990. Amplification of genomic DNA. In "PCR Protocols: A Guide to Methods and Applications" (M. A. Innis et al., eds.), pp. 13-20. Academic Press, San Diego. Scribner, K. T., and Avise, J. C. 1993. Cytonuclear genetic architecture in mosquitofish populations and the possible roles of introgressive hybridization. Mol. Ecol. 2:139-149. Shaw, K., Simons, A. M. and Wiley, E. O., 1997. A reexamination of the phylogenetic relationships of the sand darters (Teleostei: Percidae). Occas. Publ. Mus. Nat. Hist. Univ. Kansas. Submitted for publication. Simons, A. M. 1991. Phylogenetic relationships of the crystal darter, Crystallaria asprella. Copeia 1991:927- 936. Simons, A. M. 1992. Phylogenetic relationships of the Boleosoma species group (Percidae: Etheostoma). In "Systematics, Historical Ecology and North American Fireshwater Fishes" (R. L. Mayden, ed.), pp. 268-292. Stanford Univ. Press, Stanford, CA. Slatkin, M., and Maddison, W. P. 1989. A cladistic measure of gene flow inferred from the phylogenies of alleles. Genetics 123:603613. Stauffer, J. R. 1980a. Ammocrypta beani Jordan. In "Atlas of North
American Freshwater Fishes" (D. S. Lee et al., eds.), p. 616. North Carolina St. Mus. Nat. Hist., Raleigh, NC. Stauffer, J. R. 1980b. Ammocrypta clara Jordan and Meek. In "Atlas of North American Freshwater Fishes" (D. S. Lee et al., eds.), p. 618. North Carolina St. Mus. Nat. Hist., Raleigh NC. Stauffer, J. R, and Hocutt, C. H. 1980. Ammocrypta vivax Hay. In "Atlas of North American Freshwater Fishes" (D. S. Lee et al., eds.), p. 621. North Carolina St. Mus. Nat. Hist., Raleigh, NC. Stauffer, J. R, Hocutt, C. H. and Gilbert, C. R. 1980. Ammocrypta bifascia Williams. In "Atlas of North American Freshwater Fishes" (D. S. Lee et al., eds.), p. 617. North Carolina St. Mus. Nat. Hist., Raleigh, NC. Swift, C. C., Gilbert, C. R. Bortone, S. A., Burgess, G. H., and Yeger, R. W. 1986. Zoogeography of the freshwater fishes of the southeastern United States: Savannah River to Lake Ponchartrain. In "The Zoogeography of North American Freshwater Fishes" (C. H. Hocutt and E. O. Wiley, eds.), pp. 213-265. WileyInterscience, New York. Swofford, D. L. 1993. PAUP 3.1.1. Computer Program distributed by the Illinois Natural History Survey, Champaign, IL. Swofford, D. L., Olsen, G. J., Waddell, P. J., and Hillis, D. M. 1996. Phylogenetic inference. In "Molecular Systematics" (D. M. Hillis, C. Moritz, and B. K. Mable, eds.), 2nd Ed., pp. 407-514. Sinauer Associates, Sunderland, MA. Templeton, A. R., Crandall, K. A., and Sing, C. F. 1992. A cladistic analysis of phenotypic associations with haplotypes inferred from restriction endonuclease mapping and DNA sequence data. III. Cladogram estimation. Genetics 132: 619-633. Templeton, A. R., Routman, E., and Phillips, C. A. 1995. Separating population structure from population history: A cladistic analysis of the geographical distribution of mitochondrial DNA haplotypes in the tiger salamander, Ambystoma tigrinum. Genetics 140: 767-782. Weir, B. S. 1996. Intraspecific differentiation. In "Molecular Systematics" (D. M. Hillis, C. 
Moritz, and B. K. Mable, eds.), 2nd Ed., pp. 385-406. Sinauer Associates, Sunderland, MA. Wiley, E. O. 1977. The phylogeny and systematics of the Fundulus nottii species group (Teleostei: Cyprinodontidae). Occ. Pap. Mus. Nat. Hist. Univ. Kansas. 67:1-31. Wiley, E. O. 1992. Phylogenetic relationships of the Percidae (Teleostei: Perciformes): A preliminary hypothesis. In "Systematics, Historical Ecology and North American Freshwater Fishes" (R. L. Mayden, ed.), pp. 247-267. Stanford University Press, Stanford, CA. Wiley, E. O., and Mayden, R. L. 1985. Species and speciation in phylogenetic systematics, with examples from the North American fish fauna. Ann. Missouri Bot. Gard. 72:596-635. Wiley, E. O., and Siegel-Causey, D. 1994. A phylogenetic analysis of the Lythrurus roseipinnis species group (Teleostei: Cyprinidae), with comments on the relationship of other Lythrurus. Occas. Pap. Mus. Nat. Hist. Univ. Kansas 171:1-20. Wiley, E. O., and Titus, T. A. 1992. Phylogenetic relationships among members of the Hybopsis dorsalis species group (Teleostei: Cyprinidae). Occas. Pap. Mus. Nat. Hist. Univ. Kansas 152:1-18. Williams, J. D. 1975. Systematics of the percid fishes of the subgenus Ammocrypta, genus Ammocrypta, with descriptions of two new species. Bull. Alabama Mus. Nat. Hist. 1:1-56. Williams, J. T. 1983. Taxonomy and ecology of the genus Chasmodes (Pisces: Blenniidae) with a discussion of its zoogeography. Bull. Florida St. Mus. Biol. Sci. 29(2): 1-100. Wooten, M. C., and Lydeard, C. 1990. Allozyme variation in a natural contact zone between Gambusia affinis and Gambusia holbrooki. Biochem. Syst. Ecol. 18(2/3): 169-173. Wooten, M. C., Scribner, K. T. and Smith, M. H. 1988. Genetic variation and systematics of Gambusia in the Southeastern United States. Copeia 1988(2):283-289.
E. O. WILEY AND ROBERT H. HAGEN
Zhu, D., Jamieson, B. G. M., Hugall, A., and Moritz, C. 1994. Sequence evolution and phylogenetic signal in control-region and cytochrome-b sequences of rainbow fishes (Melanotaeniidae). Mol. Biol. Evol. 11(4):672-683.
Appendix I: Specimens Examined
Specimens are presented by KU number (number of specimens), state, and major drainage. Exact localities are available from E. O. Wiley. Vouchered specimens are the actual specimens used in this chapter, and the series may contain more specimens than were analyzed.
Etheostoma beanii: KU 22898 (6), Alabama, Escatawpa Dr. KU 24380 (5), Alabama, Tombigbee Dr. KU 24382 (6), Alabama, Tombigbee Dr. KU 24381 (6), Alabama, Alabama Dr. KU 24383 (6), Mississippi, Pascagoula Dr. KU 24384 (6), Mississippi, Pascagoula Dr.
Etheostoma bifascia: KU 24385 (6), Alabama, Perdido Dr. KU 24388 (5), Florida, Perdido Dr. KU 22146 (6), Florida, Escambia Dr. (1988 collection). KU 24387 (6), Florida, Escambia Dr. (1989 collection). KU 24386 (6), Alabama, Yellow Dr. KU 24862 (5), Florida, Blackwater Dr.
Etheostoma clarum: KU 23145 (2), Arkansas, Strawberry Dr.
Etheostoma meridianum: KU 23148 (2), Mississippi, Tombigbee Dr. KU 23149 (2), Mississippi, Tombigbee Dr.
Etheostoma nigrum: KU 23143 (1), Kansas, Kansas Dr.
Etheostoma pellucidum: KU 23150 (2), Indiana, Tippecanoe Dr.
Etheostoma vitreum: KU 23144 (1), KU 24389 (1), Virginia, Blackwater Dr.
Etheostoma vivax: KU 24390 (1), Missouri, Mississippi R. KU 23146 (2), Arkansas, St. Francis Dr. KU 24391 (1), Louisiana, Ouachita Dr. KU 24392 (1), Louisiana, Ouachita Dr. KU 24393 (1), Louisiana, Sabine Dr. KU 24394 (1), Mississippi, Pearl Dr.
Appendix II: Morphological Characters
The following characters are taken from Simons (1992) and Shaw et al. (1997). In each case, the observed attribute of individual specimens is considered a character, and homologous characters are organized into a transformation series (TS). Thus a transformation series is a column of data, and a cell in the matrix is the character observed for individuals of a species (see Wiley et al., 1991). This convention circumvents the usual (and inaccurate) convention of treating columns as "characters" and cells as "character states."
TS 117: Ascending process of premaxilla perpendicular to the alveolar process (0) or inclined posteriorly (1).
TS 118: Premaxillary process of maxilla V-shaped (0) or U-shaped (1).
TS 119: Notch lying posteroventral to the articular process of the quadrate shallow or absent (0) or deeply cut into the quadrate body (1).
TS 120: Body of quadrate rounded (0) or rectangular (1).
TS 121: Hyomandibular struts present as cruciform thickenings (0) or reduced to absent (1).
TS 122: Descending process of the hyomandibular long and extending beyond the preopercular groove (0) or short and terminating at the end of the groove (1).
TS 123: Hyomandibular spur absent (0) or present (1).
TS 124: Ventral plate of the urohyal flattened (0) or curved (1).
TS 125: Interhyal articular process of the posterior ceratohyal present (0) or absent (1).
TS 126: Posterior margin of the preopercle smooth (0) or serrate (1).
TS 127: Notch in the anterior angle of the preopercular roofing the articular facet for the interhyal present (0) or absent (1).
TS 128: Opercular spine present (0) or absent (1).
TS 129: Opercular strut extending from the hyomandibular articulatory facet strong (0) or greatly reduced (1).
TS 130: Posterodorsal extension of the subopercle elongate (0) or truncated (1).
TS 131: Mesethmoid thick and expanded anteriorly (0) or thin and not expanded (1).
TS 132: Maxillary ligament inserted on two dorsomedial ridges of the mesethmoid (1) or inserted on two dorsolateral projections (0). Recoded from the three-character transformation series of Shaw et al. (1996).
TS 133: Vomerine teeth present (0) or usually absent (1).
TS 134: Membrane bone on the lateral margin of the nasal well developed (0) or present as a thin slip (1).
TS 135: Remnant of the lateral line canal on the supracleithrum present (0) or absent (1).
TS 136: Postcleithrum 2 present (0) or absent (1).
TS 137: Longitudinal struts on the proximal anal pterygiophores present (0) or absent (1).
TS 138: Processes for the insertion of the infracarinalis medius muscles on the anterior face of the first anal pterygiophore present (0) or absent (1).
TS 139: Body scalation almost complete (0) or reduced laterally to a few scale rows (1).
TS 140: Male anal fin breeding tubercles absent (0) or present (1).
CHAPTER 7
Phylogeographic Patterns in Populations of Cichlid Fishes from Rocky Habitats in Lake Tanganyika
CHRISTIAN STURMBAUER Department of Zoology, University of Innsbruck, A-6020 Innsbruck, Austria
ERIK VERHEYEN Royal Belgian Institute of Natural Sciences, B-1000 Brussels, Belgium
LUKAS RÜBER Zoological Museum of the University of Zürich, Switzerland
AXEL MEYER Department of Ecology and Evolution, State University of New York, Stony Brook, New York 11794
I. Lake Tanganyika and Its Cichlid Species Flock
The cichlid species flocks of the great East African lakes represent the most diverse assemblages of freshwater fishes in the world. Lake Tanganyika is by far the oldest of the three major East African rift lakes, with an estimated age of about 9 to 12 million years (Cohen et al., 1993). Its geological history is relatively well known (reviewed in Tiercelin and Mondeguer, 1991). The lake is formed of three basins, which fused into one large lake about 5 to 6 million years ago. Seismic data show that about 200,000 (Tiercelin and Mondeguer, 1991) to 75,000 (Scholz and Rosendahl, 1988; C. A. Scholz, personal communication) years ago the level of Lake Tanganyika dropped 600 m below its present level, possibly even splitting the lake into three sublakes for several tens of thousands of years. This vicariant event must have had severe effects on several habitats and their fish populations. After this period the lake level rose again, with additional minor fluctuations in more recent history. At present the lake has reached its largest extent owing to the addition of a large tributary river, the Ruzizi, at the northern edge of the lake. This river was formed about 10,000 years ago by the formation of the Virunga volcano chain in Rwanda, which blocked the former connection of this area with the Nile system. The influx of the Ruzizi River also ended a long period of isolation from the Zaire River system and caused an overflow of Lake Tanganyika via the Lukuga into the Lualaba, the upper reaches of the Zaire River.
Although the cichlid flock of Lake Victoria is considered to be monophyletic (Meyer et al., 1990), the Malawi and Tanganyika cichlid flocks are probably of polyphyletic origin (Greenwood, 1981; Nishida, 1991; Sturmbauer and Meyer, 1993; Moran et al., 1994; Sturmbauer et al., 1994; Kocher et al., 1993, 1995). Lake Malawi harbours a small subflock of five endemic species of tilapiine cichlids (Eccles and Trewavas, 1989; Axelrod, 1993), in addition to its subflock of "haplochromines," which is considered monophyletic (Moran et al., 1994). The Lake Tanganyika cichlid flock is composed of several lineages, assigned to 12 tribes (Poll,
1986). The ancestors of some tribes are likely to be older than the lake and probably have colonized the proto-lakes of Tanganyika to radiate in parallel into subflocks. The Victoria and Malawi cichlids are all, without exception, maternal mouthbrooders (females brood their eggs by buccal incubation; reviewed by Barlow, 1991; Keenleyside, 1991), and the Tanganyika flock contains several lineages of mouthbrooders as well as substrate breeders (Nishida, 1991; Sturmbauer et al., 1994; Kocher et al., 1995). The Tanganyikan cichlid fauna is morphologically, ecologically and behaviorally the most diverse species flock of the African lakes (Fryer and Iles, 1972; Greenwood, 1984). Due to its old age, the radiation may be in a highly advanced stage and the phylogeographic history of species and populations may reach back far in time compared to the cichlids of Lake Malawi and Lake Victoria. Several species are split into numerous populations which might have complex histories. Some are likely to be old, and therefore may have highly diversified genetically, to an extent that their history can be deduced from gene sequences.
A. Modeling Adaptive Radiation
The Tanganyika cichlid species flock thus provides an excellent model system to elucidate the evolutionary mechanisms which induce and trigger explosive speciation events. An important aspect of understanding adaptive radiations concerns the mode of speciation that led to their diversification. Specifically, the relative importance of intrinsic biological characteristics such as ecology, anatomy, and behavior versus abiotic factors such as geological history, geographic structuring of the lake basin, barriers to gene flow, and fluctuations of the lake level is controversial. Although abiotic factors are thought to provide or prevent the opportunities for dispersal, biotic factors may define the dispersal capability of each species once an opportunity for gene flow is provided. Abiotic factors may thus be viewed as shape parameters of habitats in the lake ecosystem, defining their location, size, and discontinuity in time and space. Changes in any abiotic parameter might reshape habitats, and existing barriers might be "torn down" at one time whereas others might arise at another. The degree of habitat change extends from small-scale fluctuations to vicariant events affecting almost all habitats and their species communities. Because abiotic factors most likely affect all species of a community equally in their habitats, differences in the distribution patterns among species may arise primarily from species-specific biological differences.
Among biotic factors presumed to affect the dispersal and consequently the amount of gene flow among cichlid populations are ecological specialization and niche partitioning, e.g., habitat specificity, site fidelity or territoriality, homing behavior, and social organization (Fryer and Iles, 1972; McKaye and Gray, 1984; McElroy and Kornfield, 1990; Yanagisawa and Nishida, 1991; Hert, 1992; Sturmbauer and Dallinger, 1995). These species-specific characteristics may influence the extent to which species are split into distinct populations, the extent to which populations are isolated from each other, and the extent to which physical changes affect their population structures. During periods of physical separation, genetic differences between populations will accumulate. Prezygotic isolation mechanisms might evolve as byproducts of genetic isolation, possibly driven by sexual selection on traits involved in social and/or reproductive behavior. This mechanism was suggested for the color patterns of males, which may be the decisive criterion of mate recognition and choice (Mayr, 1984; Dominey, 1984). Given the behavioral diversity in Tanganyikan mouthbrooders, the relative importance of sexual selection may also vary among species or lineages.
II. Speciation and DNA
The comparison of genetic patterns among species assemblages living sympatrically in geographically isolated populations is expected to provide insights into the dynamics of population histories and their evolutionary causes (e.g., reviewed in Avise, 1994). The amount of genetic divergence within and among populations, as well as the frequencies and distributions of different genotypes, provides information about their historical demography. By relating the observed patterns to ecology, habitat specificity, and behavior, the decisive characteristics determining the degree of isolation may be identified for several species on a comparative basis. The goal of such an approach is to identify the causes of isolation in various species and, ultimately, possible patterns shared by groups of species of similar biology.
This chapter combines results of mitochondrial (mt) DNA sequence data presently available for three endemic Tanganyika cichlid lineages: Tropheus (Sturmbauer and Meyer, 1992), Simochromis (Meyer et al., 1996), and the members of the tribe Eretmodini (Rüber, 1994; Verheyen et al., 1996). MtDNA was shown to be a sensitive marker for population differentiation because it evolves 5 to 10 times faster than nuclear DNA (Avise, 1994). It is exclusively maternally transmitted in cichlids, making it more sensitive to population size fluctuations (reviewed in Meyer, 1993; Avise, 1994). This chapter focuses on results based on the mitochondrial control region because it is the most variable region of the entire mitochondrial genome (reviewed in Meyer, 1993, 1994) and thus most suitable for addressing phylogenetic questions at the population level.
All species in this chapter inhabit rock and cobble shores along the lake, where they often occur in sympatry. They all are epilithic algae feeders and are habitat specific to different degrees (Sturmbauer et al., 1992). For all three taxa, geographically distinct populations have been described, distinguishable only by minor, if any, morphological variation, but sometimes by pronounced differences in coloration. DNA sequence data are available for populations of all three taxa along the central eastern coast of Lake Tanganyika.¹ This shoreline contains the major breakpoints which correspond to the locations of the three main basins of the lake and thus covers habitats in shallow water, which are more strongly affected by fluctuations of lake level, as well as habitats situated at very steep shorelines, which were probably not affected by periods of low lake level (Fig. 1A). Additional, as yet unpublished, data were added to the Tropheus data set to increase geographical overlap with the data sets for Simochromis and the Eretmodini in the central eastern region of the lake.
Species of the genus Tropheus are strictly confined to rock habitats for foraging and mating, and have a limited capacity for dispersal across open water (Brichard, 1989; Sturmbauer and Dallinger, 1995). Six nominal species are described (Poll, 1986), some of which have overlapping distributions (Snoeks et al., 1994), and more than 70 distinctly colored "races" have been reported (P. Schupke, personal communication). Samples from 23 localities are included in Fig. 1A.
The genus Simochromis is closely related to Tropheus (Nishida, 1991; Sturmbauer and Meyer, 1992; Kocher et al., 1995) and both genera are classified within the same tribe, the Tropheini (Poll, 1986). The two Simochromis species studied so far, S. babaultii and S. diagramma, appear to be similar to Tropheus in their ecology, but they are typically less abundant. Their number of described geographical "races" is much smaller than that of Tropheus, and some behavioral differences also exist between the taxa of the two genera. Although both sexes of Tropheus are highly sedentary, Simochromis species have been observed to move about in schools, and only dominant males keep territories (Brichard, 1989; C. Sturmbauer et al., unpublished observations). In contrast to Tropheus and to the Eretmodini, Simochromis is sexually dichromatic. Samples from 11 and 13 localities were analyzed for S. babaultii and S. diagramma, respectively.
Species of the tribe Eretmodini are small, stenotopic cichlids and are the only group of cichlids adapted to living in shallow coastal areas exposed to wave action. Three genera and four nominal species, Eretmodus cyanostictus, Spathodus erythrodon, Spathodus marlieri, and Tanganicodus irsacae, have been described. Because eretmodines have a reduced swimbladder, they actually "sit" on the substrate, like the Gobiidae and the marine Blenniidae. The species differ in their ecology and dental morphology: three species were classified as epilithic algae feeders and one species (Tanganicodus) tends to feed on higher proportions of invertebrates (Yamaoka et al., 1986). As in the case of Tropheus, numerous geographically isolated populations have been described for all four species, but color differences among populations are less pronounced than in Tropheus. Samples of 43 specimens from 32 localities were analyzed for the Eretmodini.
¹ The nucleotide sequences in this chapter are available from EMBL/GenBank and are as follows: Z12047 to Z12100 and Z75694 to Z75709 for Tropheus; X90593 to X90638 for the eretmodines; U40524 to U40532 for Simochromis babaultii; and U38808 and U38984 to U38995 for Simochromis diagramma.
A. Genetic Variation in Tropheus
Comparisons of the amount of sequence variation found within the genus Tropheus with that within the haplochromine species flocks of Lake Malawi and Lake Victoria suggested that Tropheus may be roughly twice the age of the whole Malawi species flock and six times the age of the Lake Victoria cichlid flock (Sturmbauer and Meyer, 1992). Tropheus duboisi is the most basal species in the genus, sister group to seven distinct lineages comprising the remaining five presently recognized species (Fig. 1B). Whereas the average corrected sequence divergence among these seven lineages amounted to 9.1% (standard deviation 1.6%), the average genetic divergence among populations of the same mitochondrial lineage was only 3.0% (standard deviation 1.1%). On the basis of the observed short branches, and thus similar levels of genetic divergence, defining the major mitochondrial lineages, and of similar levels among populations within each lineage, two successive radiations are hypothesized. In the primary radiation, Tropheus colonized rocky shores along the entire lake and the seven lineages originated. Relatively recently, each of those lineages underwent secondary radiations, during which geographically separated populations diversified into the patterns presently observed. The major fluctuation of the lake level was suggested as the trigger of the secondary radiations.
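The contrast between divergence among lineages and divergence within lineages can be illustrated with a short calculation. The Python sketch below is only an illustration under stated assumptions: the population names, lineage labels, and distance values are hypothetical placeholders (not the Tropheus data), and the function name mean_divergence is introduced here for convenience.

```python
from itertools import combinations

def mean_divergence(dist, lineage_of):
    """Average pairwise distance within and between lineages.

    dist maps frozenset({a, b}) -> corrected distance for populations a, b;
    lineage_of maps each population name -> its lineage label.
    """
    within, between = [], []
    for a, b in combinations(sorted(lineage_of), 2):
        d = dist[frozenset((a, b))]
        if lineage_of[a] == lineage_of[b]:
            within.append(d)   # both populations belong to the same lineage
        else:
            between.append(d)  # populations belong to different lineages
    return sum(within) / len(within), sum(between) / len(between)

# Hypothetical toy input (NOT data from this chapter).
lineage_of = {"pop1": "X", "pop2": "X", "pop3": "Y", "pop4": "Y"}
dist = {
    frozenset(("pop1", "pop2")): 0.02,
    frozenset(("pop3", "pop4")): 0.04,
    frozenset(("pop1", "pop3")): 0.08,
    frozenset(("pop1", "pop4")): 0.09,
    frozenset(("pop2", "pop3")): 0.09,
    frozenset(("pop2", "pop4")): 0.10,
}
within_mean, between_mean = mean_divergence(dist, lineage_of)
print(f"within: {within_mean:.3f}, between: {between_mean:.3f}")
```

A between-lineage mean several times larger than the within-lineage mean is the kind of pattern that motivates the distinction between an older primary radiation and more recent secondary radiations.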
FIGURE 1 Mitochondrial phylogeny and geographic distribution of the Tanganyikan genus Tropheus. Seventy-two individuals, assigned to six currently recognized species, were sequenced (Poll, 1986): Tropheus moorii, Boulenger, 1898; T. annectens, Boulenger, 1898; T. duboisi, Marlier, 1959; T. brichardi, Nelissen and Thys, 1976; T. kasabae, Nelissen, 1977; and T. polli, Axelrod, 1978. The populations are identified by the name of the nearest village (Brichard, 1989). (A) Map of Lake Tanganyika with localities of 23 Tropheus populations and distribution of the seven recognized major mitochondrial lineages. Stippled areas represent the hypothesized shoreline at a depth of 600 m, illustrating the locations of the three lake basins and the steepness of the shoreline. Numbers of individuals analyzed from each locality are given in parentheses if larger than one. Boxed genotype symbols indicate the presence of two mitochondrial genotypes assigned to different major lineages within a single geographical "race." (B) Molecular phylogeny of the genus Tropheus. The species names are given according to the current taxonomic assignments; a name in quotation marks refers to that used in the text. All populations from which species names are omitted are presently classified as T. moorii. The symbols indicate genetically distinct lineages based on the phylogenetic analysis of mitochondrial sequence data. The primary radiation and the succeeding more recent secondary radiations are indicated by stippled boxes. This phylogeny was obtained by parsimony analysis (using PAUP 3.1.1; Swofford, 1993). A bootstrap consensus tree (Felsenstein, 1985; 1000 replicates, 50% majority rule) based on 442 bp of the control region is shown. T. duboisi, identified as the closest living relative of all other Tropheus species (Sturmbauer and Meyer, 1992), was used as the outgroup. Transversions were weighted nine times over transitions based on the average observed frequency among taxa of up to 5% uncorrected sequence divergence. Three insertions were observed; each was weighted like a single transversion. This analysis included 20 representative taxa and resulted in three equally parsimonious trees [heuristic search with random addition of taxa and 20 replications; ACCTRANS option; weighted tree length, 393 steps; unweighted tree length, 172 base substitutions; consistency index excluding uninformative characters (Kluge and Farris, 1969), 0.50]. Bootstrap values are given on the branches.
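The weighting scheme mentioned in the caption (transversions counted nine times as heavily as transitions) can be illustrated for a single nucleotide site on a fixed tree. The following Python sketch uses the Sankoff dynamic-programming approach to weighted parsimony; it is not the heuristic tree search performed by PAUP, and the function names, the nested-tuple tree encoding, and the toy tree are assumptions made purely for illustration.

```python
BASES = "ACGT"
PURINES = {"A", "G"}

def substitution_cost(x, y, tv_weight=9, ts_weight=1):
    """Cost of changing base x into base y under transition/transversion weights."""
    if x == y:
        return 0
    is_transition = (x in PURINES) == (y in PURINES)  # purine<->purine or pyrimidine<->pyrimidine
    return ts_weight if is_transition else tv_weight

def sankoff(node, tv_weight=9):
    """Return {base: minimum subtree cost if this node carries that base}.

    A leaf is a one-letter string (the observed base); an internal node is a
    2-tuple of child nodes.
    """
    if isinstance(node, str):
        return {b: (0 if b == node else float("inf")) for b in BASES}
    left, right = (sankoff(child, tv_weight) for child in node)
    return {
        b: min(left[c] + substitution_cost(b, c, tv_weight) for c in BASES)
        + min(right[c] + substitution_cost(b, c, tv_weight) for c in BASES)
        for b in BASES
    }

def site_length(tree, tv_weight=9):
    """Weighted parsimony length of one site on a fixed tree."""
    return min(sankoff(tree, tv_weight).values())

# Hypothetical four-taxon tree: one purine pair and one pyrimidine pair.
tree = (("A", "G"), ("C", "T"))
weighted = site_length(tree)                 # two transitions + one transversion
unweighted = site_length(tree, tv_weight=1)  # three substitutions, equally weighted
```

Summing such site lengths over all alignment columns gives the weighted tree length; with tv_weight=1 the same routine yields the unweighted number of substitutions, which is why a caption can report both values for one tree.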
B. Phylogeographic Patterns in Tropheus
The distributions of six of the seven major genetic lineages show a clear phylogeographic pattern. Each lineage is confined to discrete shore sections of a single basin, typically centering along steeply sloping habitats (see Fig. 2). Their genetic distances to each other suggest that these six lineages might have colonized their habitats during the primary radiation, well before the reported period of the low lake level. Due to the stability of their habitats, these lineages are likely to have survived the drop of the lake level intact. When the lake rose again, they did not seem to have significantly expanded their ranges during their secondary radiations.
Only a single lineage (square symbols in Fig. 1B and Fig. 3) is widespread throughout the entire lake, ranging from Bemba at the very northwestern end down to Mpulungu at the very south of Lake Tanganyika (Fig. 1B). Despite this, all members of the "square" lineage, even from distant populations, are genetically closely related, as closely as the populations within each primary lineage. Thus, the square lineage may have significantly expanded its range of distribution during its secondary radiation when the lake level rose after its low period.
FIGURE 2 Mitochondrial phylogeny obtained by parsimony analysis and locations of Tropheus populations with geographically restricted distribution along the three major basins of Lake Tanganyika (indicated by stippled areas on the map). The populations are identified by the name of the nearest village (Brichard, 1989). Numbers of individuals analyzed from each locality are given in parentheses if larger than one. Boxed genotype symbols indicate the presence of two mitochondrial genotypes assigned to different major lineages within a single population. A bootstrap consensus tree (1000 replicates, 50% majority rule) is shown, based on 382 bp of the control region. Five individuals of a lineage from the northern shores (indicated by open triangle symbols) were used as the outgroup based on the analysis presented in Fig. 1B. Transversions were weighted nine times over transitions based on the average observed frequency among taxa of up to 5% uncorrected sequence divergence. Three insertions were observed; each was weighted like a single transversion. This analysis included 22 representative taxa and resulted in 14 equally parsimonious trees (heuristic search with random addition of taxa and 20 replications; ACCTRANS option; weighted tree length, 312 steps; unweighted tree length, 128 base substitutions; consistency index excluding uninformative characters, 0.64). Bootstrap values are given on the branches. The symbols indicate genetically distinct lineages based on the phylogenetic analysis of mitochondrial sequence data. A split of the lake between the central and the southern basin during a period of low lake level, suggested by the distribution of closely related mitochondrial genotypes at opposite shores, is indicated on the map and connected to the relevant part of the phylogeny (lower box). The upper boxed section of the phylogeny indicates the presence of two distinct genotypes assigned to two major mitochondrial lineages in Tropheus "Kirschfleck."
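The bootstrap values reported for the Tropheus phylogenies rest on two simple operations: resampling alignment columns with replacement, and counting how often each clade recurs among the trees built from the resampled data. The Python sketch below illustrates only these two steps; tree reconstruction itself is omitted, and the function names, the representation of a tree as a collection of clades (frozensets of taxon names), and the toy inputs are all hypothetical.

```python
import random
from collections import Counter

def bootstrap_alignment(alignment, rng):
    """Resample alignment columns with replacement (one pseudoreplicate).

    alignment maps taxon name -> aligned sequence (all of equal length).
    """
    n_sites = len(next(iter(alignment.values())))
    cols = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {name: "".join(seq[i] for i in cols) for name, seq in alignment.items()}

def majority_rule(replicate_clades, threshold=0.5):
    """Return {clade: support in %} for clades found in more than `threshold` of replicates."""
    counts = Counter(clade for clades in replicate_clades for clade in set(clades))
    n = len(replicate_clades)
    return {clade: 100 * k / n for clade, k in counts.items() if k / n > threshold}

# Hypothetical toy alignment and one pseudoreplicate drawn from it.
rng = random.Random(0)
toy = {"a": "ACGT", "b": "ACGA"}
replicate = bootstrap_alignment(toy, rng)

# Hypothetical clade sets from four replicate trees (tree building not shown).
replicate_clades = [
    [frozenset("ab"), frozenset("cd")],
    [frozenset("ab"), frozenset("cd")],
    [frozenset("ab"), frozenset("bc")],
    [frozenset("ac")],
]
support = majority_rule(replicate_clades)
```

In this toy case only the clade {a, b} exceeds the 50% threshold and is retained; clades at or below the threshold would be collapsed in the consensus tree.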
FIGURE 3 Mitochondrial phylogeny based on 382 bp of the control region, obtained by parsimony analysis and neighbor joining (Saitou and Nei, 1987), and locations of Tropheus populations with geographically widespread distribution (termed "square lineage" in the text) along the three major basins of Lake Tanganyika (indicated by stippled areas on the map). The populations are identified by the name of the nearest village (Brichard, 1989). The boxed genotype symbols at Wapembe indicate the presence of two mitochondrial genotypes assigned to different major lineages within a single population. A bootstrap consensus tree (1000 replicates in both parsimony and neighbor joining) is shown in which all branches that were found in less than 50% of the replicates of both methods are collapsed. Five individuals of a lineage from the northern shores (indicated by open triangle symbols) were used as the outgroup based on the analysis presented in Fig. 1B. Transversions and transitions were weighted equally, as all taxa of the ingroup are closely related. Three insertions were observed; each was weighted like a single base substitution. In neighbor joining, Jukes-Cantor distances (Jukes, 1980) were used to correct for multiple substitutions. This analysis included 24 representative individuals and resulted in 50 equally parsimonious trees (heuristic search with random addition of taxa and 20 replications; ACCTRANS option; tree length, 97 base substitutions; consistency index excluding uninformative characters, 0.59). Bootstrap values above the branches are from parsimony, whereas those below the branches are from neighbor joining. A split of the lake between the northern and the central basin during a period of low lake level, suggested by this phylogenetic analysis, is indicated on the map and is connected to the relevant part of the phylogeny (boxed section).
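The Jukes-Cantor correction named in the caption converts an observed proportion of differing sites, p, into an estimated number of substitutions per site, d = -(3/4) ln(1 - 4p/3). A minimal Python sketch follows; the function names and the two short sequences are hypothetical, and gaps are simply skipped here, which is a simplification and not the treatment of insertions described in the caption.

```python
import math

def p_distance(seq1, seq2):
    """Proportion of differing sites, skipping positions with a gap in either sequence."""
    pairs = [(x, y) for x, y in zip(seq1, seq2) if x != "-" and y != "-"]
    return sum(x != y for x, y in pairs) / len(pairs)

def jukes_cantor(p):
    """Jukes-Cantor corrected distance for an observed proportion p of differences."""
    if p >= 0.75:
        raise ValueError("p >= 0.75: Jukes-Cantor distance is undefined")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)

# Hypothetical aligned sequences differing at 1 of 10 sites.
seq_a = "ACGTACGTAC"
seq_b = "ACGTACGTAT"
p = p_distance(seq_a, seq_b)
d = jukes_cantor(p)
print(f"p = {p:.3f}, Jukes-Cantor d = {d:.4f}")
```

The corrected distance is always at least as large as the observed proportion, and the gap between the two widens as sequences become more divergent, which is the sense in which the correction accounts for multiple substitutions at the same site.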
In addition, the distribution of mitochondrial genotypes within and among populations of the "square lineage" does not suggest any obvious phylogeographic structure. Among 10 individuals sequenced from the population at Mpulungu at the southern end of the lake, 8 different mitochondrial genotypes were identified, clustering with mitochondrial genotypes of several other localities of this lineage (three individuals were included in the analysis presented in Fig. 3). During their period of geographic expansion the members of the "square lineage" seem to have entered regions that were already colonized by other lineages. Data from a population near Wapembe at the southeastern shore of the lake suggest an introgression event in the history of this population between the indigenous lineage (from the primary radiation of Tropheus) and a second lineage of invaders. Among four individuals sequenced, all were different: three mitochondrial genotypes are assigned to the "square lineage" (square symbols in Figs. 1B and 3), whereas one individual represents a separate and heretofore undescribed major mitochondrial lineage, in addition to the previously recognized lineages (hatched circle symbol in Figs. 1B and 2). This genotype may be a relict of the indigenous lineage from the primary radiation.
The central eastern shore of Lake Tanganyika along the Mahale Mountain range (see Fig. 1A) represents a continuous, steeply sloping rock habitat. Its northern edge at Cape Kungwe is situated close to the border of the northern and the central basin, and its southern edge is at the border between the central and the southern basin. Along this range more than one taxon of Tropheus occurs in sympatry (Snoeks et al., 1994), pointing to high frequencies of secondary contact between multiple lineages at the junctions of the intermediate lake basins.
At Cape Kungwe, the northern edge of the Mahale Mountain range, four sympatric Tropheus species were found: T. duboisi (the sister group of all other Tropheus), Tropheus "yellow," T. polli, and Tropheus "Kirschfleck." Tropheus "yellow" belongs to the "square lineage" with widespread distribution. Two of the three analyzed individuals appear to be most closely related to the individuals sampled at Kavalla and Kabimba on the western shore of the lake, exactly where the northern basin touches the current western shoreline (see boxed section in Fig. 3), suggesting that Lake Tanganyika was actually split into sublakes between the northern and the central basin during the low period.
Three species live sympatrically at the southern edge of the Mahale Mountain range at Cape Kibwesa. Two species, T. polli and T. "Kirschfleck," were also found at Cape Kungwe. The third sympatric Tropheus, Tropheus "Kibwesa" (black filled circle in Figs. 1 and 2), was only found along a stretch of some hundred meters around Cape Kibwesa. These individuals were resolved in a clade together with fish from two populations of the southwestern shore at Moba and Zongwe (see lower boxed section in Fig. 2). Because Tropheus is highly stenotopic and most probably unable to cross several kilometers of open water (Brichard, 1978; Yanagisawa and Nishida, 1991; Sturmbauer and Dallinger, 1995), the observed phylogeographic affinities between populations at opposite shores suggest that Lake Tanganyika was also split into sublakes between the central and the southern basin.
T. polli and Tropheus "Kirschfleck" occur in sympatry all along the shore of the Mahale Mountain range. Of the individuals sampled at Cape Kungwe, T. polli and Tropheus "Kirschfleck" have closely related genotypes which belong to the same mitochondrial lineage (open circles in Fig. 2). The amount of genetic divergence between these two species is in line with other genetic divergences of individuals of the same major genetic lineage which diversified during secondary radiations. However, this close genetic relationship contrasts with their clearly expressed differences in coloration and in the shape of their caudal fins. The individuals of Tropheus "Kirschfleck" sampled at the southern edge of the Mahale Mountain range around Cape Kibwesa were genetically heterogeneous (see upper boxed section in Fig. 2). One genotype is almost identical to that sampled at Cape Kungwe and thus is also closely related to the genotypes found in T. polli. The second genotype belongs to a different major mitochondrial lineage and is closely related to populations at Lupota and Moliro at the southwestern shore of the lake (shaded circles in Figs. 1 and 2). Thus, one species, probably T. polli, may be indigenous and Tropheus "Kirschfleck" might have invaded the area from the opposite shoreline when the lake was split. Introgression or hybridization upon secondary contact between invading Tropheus might explain the observed distribution of genotypes.
C. Genetic Variation in Eretmodines
The mitochondrial phylogeny is not fully concordant with the present taxonomy of the eretmodines, pointing to the existence of as yet undescribed species and, in a few cases, to hybridization among species on secondary contact. This chapter focuses on the phylogeography of mitochondrial genotypes and discusses the taxonomic heterogeneities only briefly, as a study including morphology, mtDNA, and nuclear markers is currently underway to adequately address these problems. Among the four presently recognized species, Spathodus marlieri, Spathodus erythrodon, Eretmodus cyanostictus, and Tanganicodus irsacae, two major mitochondrial lineages have been identified (symbolized as A and B in Fig. 4), which are subdivided into seven clades. Because the average corrected sequence divergences between the two eretmodine lineages are similar to those found among the seven major lineages of the genus Tropheus (9.2%), the Eretmodini may be about the same age as the primary radiation of Tropheus (see Fig. 1B). Eretmodine species occur widely in sympatry with Tropheus, and they also seem to have undergone successive radiations. As in Tropheus, several habitats along the lake may have been colonized during a primary radiation establishing the two major mitochondrial lineages. Each lineage has then undergone secondary radiations, but in contrast to Tropheus, these secondary radiations did not proceed simultaneously in both lineages and may have been triggered by different causes. The Kimura distances between the clades within lineage A range from 3.1 to 3.4%, similar to the distances found in the secondary radiations of Tropheus (3.0%). The considerably higher Kimura distances (4.5 to 7.7%) observed among the four clades within lineage B indicate that their split is considerably older. The phylogram depicted in Fig. 4 also shows that the secondary radiations within lineage A occurred much more recently than the clade formation within lineage B.

FIGURE 4 Mitochondrial phylogeny of the Eretmodini based on 336 bp of the control region obtained by neighbor joining (Saitou and Nei, 1987); locations of sampled populations are shown on the left map, and the geographic distribution of mitochondrial lineages along the three major basins of Lake Tanganyika (indicated by stippled areas on the map) is shown on the center and right maps. Populations are identified either by the name of the nearest village or by the number of the sample location during a collection in 1992. Bootstrap values are given on those branches that were obtained in more than 50% of the replications. Tropheus duboisi was used as the outgroup based on a phylogenetic analysis of several Tanganyikan mouthbrooders (Sturmbauer and Meyer, 1993). Kimura distances were used to correct for multiple substitutions (Kimura, 1980). This analysis included 43 representative individuals. Two major lineages (A and B) were defined on the basis of the phylogeny and the observed genetic distances among clades. Lineage A was subdivided into three clades (A1, A2, and A3), whereas lineage B was subdivided into four clades (B1 to B4). The lineages and clades were defined not only on the basis of the mitochondrial control region reported here, but also on the basis of cytochrome b (Rüber, 1994; Verheyen et al., 1996). The geographic distribution of lineage A is shown on the map in the center, whereas that of lineage B is shown on the right map. A split of the lake between the central and the southern basin during a period of low lake level, suggested by the distribution of clade B3, is indicated on the right map and the relevant section of the phylogeny is boxed. The species names are given according to the current taxonomic assignments: Ec, Eretmodus cyanostictus; Sm, Spathodus marlieri; Se, Spathodus erythrodon; and Ti, Tanganicodus irsacae. Locality names or numbers are given in parentheses. Ec (40)# of clade B4 indicates a morphologically and genetically distinct Eretmodus sympatric with another Eretmodus, Ec (40) of clade A1.
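The Kimura distances quoted for the eretmodine clades correct the observed proportion of differing sites for multiple substitutions at the same site. As a minimal sketch, assuming the two-parameter model of Kimura (1980) applied to one pair of aligned sequences, the correction could be computed as below. The function name, the rule of skipping gapped or ambiguous sites, and the error raised at saturation are illustrative assumptions and are not taken from the chapter or from the software the authors used.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
VALID_BASES = PURINES | PYRIMIDINES


def kimura_2p_distance(seq1, seq2):
    """Kimura two-parameter distance between two aligned DNA sequences.

    P is the proportion of transitions (A<->G, C<->T) and Q the proportion
    of transversions among the compared sites:
        d = 1/2 * ln(1 / (1 - 2P - Q)) + 1/4 * ln(1 / (1 - 2Q))
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")

    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        # Assumption: sites with gaps or ambiguity codes are ignored.
        if a not in VALID_BASES or b not in VALID_BASES:
            continue
        compared += 1
        if a == b:
            continue
        same_class = ({a, b} <= PURINES) or ({a, b} <= PYRIMIDINES)
        if same_class:
            transitions += 1
        else:
            transversions += 1

    if compared == 0:
        raise ValueError("no comparable sites")

    p = transitions / compared
    q = transversions / compared
    term1 = 1.0 - 2.0 * p - q
    term2 = 1.0 - 2.0 * q
    if term1 <= 0.0 or term2 <= 0.0:
        raise ValueError("sequences too divergent for the correction")
    return 0.5 * math.log(1.0 / term1) + 0.25 * math.log(1.0 / term2)


if __name__ == "__main__":
    # Toy sequences, not data from the chapter.
    print(kimura_2p_distance("AAAAAAAAAA", "GAAAAAAAAA"))
```

For small divergences the corrected value stays close to the raw proportion of differences, which is why distances of a few percent, such as those within lineage A, are only slightly affected by the correction.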
D. Phylogeographic Patterns in Eretmodines

Lineage A is subdivided into three clades (A1, A2, and A3; see Fig. 4), each of which has a limited distribution, so that the habitats of each clade can be assigned to the shores of one of the three lake basins. As depicted in the central map of Fig. 4, clade A1 ranges from Burundi to the northern edge of the Mahale Mountain range at Cape Kungwe. Surprisingly, individuals of different species clustered within clade A1. Clade A2 is restricted to the northern half of the Mahale Mountain range and is composed exclusively of E. cyanostictus. Clade A3 ranges from the southern part of the Mahale Mountain range to Cape Mpimbwe and is morphologically homogeneous, comprising only E. cyanostictus.

The individuals assigned to lineage B are subdivided into four clades. Clades B1, B2, and B3 have restricted distributions (see the right map in Fig. 4). Clade B1 consists exclusively of S. erythrodon and is limited to the northeastern shores of the northern basin (circle-triangle symbols in Fig. 4). Clade B2, composed of T. irsacae, is found in the central basin and extends farther north to the southeastern shores of the northern basin. Clade B3 seems to be restricted to the southern edge of the central basin and is found on both the eastern and the western shores of the lake. Clade B4 is also morphologically heterogeneous and, like clade A1, contains individuals classified as Spathodus, Eretmodus, and Tanganicodus (filled circles in Fig. 4). In contrast to clades B1, B2, and B3, clade B4 was found in more than one basin: at the southernmost Tanzanian localities, at the southwestern shore at Kamakonde, and also in a single individual of E. cyanostictus at locality 40 in the northern basin [symbolized as Ec (40)# in Fig. 4]; this individual is morphologically and genetically distinct from a sympatric Eretmodus [Ec (40) of clade A1]. Individual Ec (40)# may also represent a "mitochondrial relict" genotype of a previously more widespread clade because this genotype was resolved at the base of clade B4.

Like the "square lineage" of Tropheus, lineage A seems to have undergone a secondary radiation during which three clades (A1, A2, and A3 in Fig. 4) were formed and the range of distribution was significantly expanded to shores of all three basins. During their spread, populations of clades A1, A2, and A3 may have come into secondary contact with populations of lineage B, which seems to have spread and diversified earlier. At locality 40 (marked by # in the right map in Fig. 4), individuals of three clades (A1, B2, and B4) were found sympatrically.
E. Genetic Variation in Simochromis

The two species analyzed, S. babaultii and S. diagramma, had an average genetic distance of 3.8% (standard deviation 0.6%). Thus, S. babaultii and S. diagramma would seem to be much younger species than most Tropheus and most eretmodines. They probably originated at about the same time as Tropheus and the eretmodines underwent their secondary radiations.
Among 25 sampled individuals of S. babaultii from 11 localities, nine mitochondrial genotypes were identified. The average genetic divergence among these nine genotypes was 0.9% (standard deviation 0.8%). Among 28 sampled individuals of S. diagramma from 13 localities, 13 genotypes with an average genetic distance of 1.2% (standard deviation 0.7%) were identified.
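The summary statistics reported here (number of distinct genotypes, mean pairwise divergence, and its standard deviation) can be reproduced from a set of aligned sequences in a few lines. The following is a minimal sketch under stated assumptions: the names `p_distance` and `summarize_haplotypes` are hypothetical, the default distance is the uncorrected proportion of differing sites, and the sample standard deviation is used. The chapter does not state which estimator of the standard deviation was applied, so that choice is an assumption.

```python
from itertools import combinations
from statistics import mean, stdev


def p_distance(seq1, seq2):
    """Uncorrected proportion of differing sites between two aligned sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    differences = sum(a != b for a, b in zip(seq1, seq2))
    return differences / len(seq1)


def summarize_haplotypes(sequences, distance=p_distance):
    """Collapse individuals into distinct haplotypes and summarize divergence.

    `sequences` holds one aligned sequence per sampled individual.
    Distances are computed among distinct haplotypes only, so identical
    individuals do not pull the average toward zero.
    """
    haplotypes = sorted(set(sequences))
    distances = [distance(a, b) for a, b in combinations(haplotypes, 2)]
    return {
        "n_individuals": len(sequences),
        "n_haplotypes": len(haplotypes),
        "mean_distance": mean(distances) if distances else 0.0,
        # Sample standard deviation (n - 1 denominator); an assumption.
        "sd_distance": stdev(distances) if len(distances) > 1 else 0.0,
    }


if __name__ == "__main__":
    # Toy alignment of four individuals, not data from the chapter.
    toy = ["AAAA", "AAAA", "AAAT", "AATT"]
    print(summarize_haplotypes(toy))
```

Any other pairwise measure, such as a Kimura-corrected distance, can be passed through the `distance` argument without changing the rest of the summary.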
F. Phylogeographic Patterns in Simochromis

The branching order among different haplotypes did not seem to reflect the geographic distribution of populations in either S. babaultii or S. diagramma in any obvious way, although in both species the two most ancestral haplotypes were found in the southernmost populations. The geographic distribution of the mitochondrial genotypes observed in S. babaultii and S. diagramma was variable among haplotypes; however, when more than one individual was sampled from a given locality, both species tended to contain more than one haplotype per locality. Several genotypes were shared among geographically distant localities (Fig. 5), and this sharing did not correlate with the partition of the lake into the three basins.
III. From Patterns toward an Understanding of Processes
A. Habitat Characteristics, Lake Level, and Phylogeography

The phylogeographic patterns of mtDNA variation show striking parallels in Tropheus and the Eretmodini in that both seem to have simultaneously undergone (at least) two consecutive radiations. The majority of these lineages are confined to small sections of a single basin (see Fig. 2 for Tropheus and Fig. 4 for the Eretmodini) and thus show a high degree of intralacustrine endemism (Snoeks et al., 1994). They all originated from a primary radiation, which occurred roughly at the same time in Tropheus and the Eretmodini. The lineages with restricted distribution are likely to have colonized their habitats before the period of the dramatically low lake level. Their habitats were situated at steeply sloping shore sections and may thus have in common that they were probably not wiped out by the drop of the lake level, but shifted along a continuous slope. After the lake rose again, the populations shifted back along the slope and did not seem to have expanded their ranges of distribution significantly.
FIGURE 5 Mitochondrial phylogeny of two species of the genus Simochromis based on 414 bp of the control region, locations of sampled populations, and the geographic distribution of mitochondrial genotypes along the three major basins of Lake Tanganyika (indicated by stippled areas on the maps). Populations are identified according to the number of the sample location during a collection in 1992. Parsimony analysis used equal weights for transversions and transitions, and Tropheus duboisi and Petrochromis trewavasae, both classified within the same tribe (Tropheini), were used as outgroups (Poll, 1986). A heuristic search resulted in minimum length trees of 64 substitutions and a consistency index of 0.79. A bootstrap consensus tree (1000 replicates, 50% majority rule) is shown with bootstrap values on the branches. This analysis included 25 individuals (nine mitochondrial haplotypes) for S. babaultii and 28 individuals (13 mitochondrial haplotypes) for S. diagramma. The geographic distribution of the nine identified haplotypes in S. babaultii is shown on the map on the left, whereas that of the 13 identified haplotypes in S. diagramma is shown on the right map.
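The bootstrap consensus described in the caption of Fig. 5 rests on two steps: resampling alignment columns with replacement (Felsenstein, 1985) and retaining only those groups that recur in more than half of the replicate trees. The sketch below illustrates both steps under stated assumptions. The function names are hypothetical, clades are represented as frozensets of tip labels, and the tree search that turns each pseudoreplicate into a tree is deliberately left out because the chapter does not specify the software used.

```python
import random
from collections import Counter


def resample_columns(alignment, seed=None):
    """Build one bootstrap pseudoreplicate of an alignment.

    `alignment` maps a sequence name to its aligned sequence. Column
    indices are drawn with replacement and applied to every sequence,
    so each resampled site keeps its original pattern across taxa.
    """
    rng = random.Random(seed)
    names = list(alignment)
    n_sites = len(alignment[names[0]])
    columns = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {
        name: "".join(alignment[name][i] for i in columns) for name in names
    }


def majority_rule_support(replicate_clades, threshold=0.5):
    """Return clades present in more than `threshold` of the replicates.

    `replicate_clades` is a list with one entry per replicate tree; each
    entry is an iterable of clades (frozensets of tip labels). The result
    maps each retained clade to its support in percent.
    """
    counts = Counter()
    for clades in replicate_clades:
        counts.update(set(clades))  # count each clade once per replicate
    n_replicates = len(replicate_clades)
    return {
        clade: 100.0 * count / n_replicates
        for clade, count in counts.items()
        if count / n_replicates > threshold
    }


if __name__ == "__main__":
    # Hypothetical clade sets from four replicate trees (toy data).
    bab = frozenset({"BAB1", "BAB2"})
    dia = frozenset({"DIA1", "DIA2"})
    other = frozenset({"DIA1", "DIA3"})
    replicates = [{bab, dia}, {bab}, {bab, other}, {dia}]
    print(majority_rule_support(replicates))
```

In this toy input only the first group passes the 50% majority rule; a group found in exactly half of the replicates is dropped because the rule requires more than half.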
As a second striking parallel, the "square lineage" in Tropheus and the A-lineage in the Eretmodini seem to have simultaneously expanded their distributions to all three basins of the lake because the average genetic divergences within the two groups are small and similar (around 3%). Clade B4 of the eretmodines might also have extended its range at about the same time because their members were found at both the eastern and the western shore of the southern basin and also at locality 40 in the northern basin (see the right map in Fig. 4) and because the average Kimura distances within clade B4 also amounted to 3.2%. It seems reasonable that the spread of all three groups was triggered by the same event, a substantial rise in the lake level. However, there are differences with respect to phylogeographic structuring among populations: no clear phylogeographic pattern emerged in the "square lineage" of Tropheus. If the ancestors of the square lineage
originally inhabited more shallow coasts, their populations must have been most severely affected when the lake level retreated. Several populations, previously spread over a wide stretch of shallow coastline, could have been fused into one large population for the period of the low lake level, which might explain the present genetic heterogeneity of this lineage. When the lake level rose after the low period, the shallow bays were flooded again and quickly recolonized by the square lineage, which now contained several mitochondrial genotypes. The origin of the square lineage may be situated at the southern section of the northern basin and possibly at the western shore of the central basin; these are the only regions where no individuals with genotypes of the other lineages of limited distribution have been found. In contrast to the square lineage of Tropheus, the three clades of the A lineage of the Eretmodini do show strong phylogeographic structuring (Fig. 4, center map); all three clades are genetically and also geographically distinct, suggesting that their parental population may not have been as large and genetically heterogeneous as that of Tropheus and thus may not have originated from a shallow and unstable habitat. The heterogeneity in the genetic patterns of six versus a single lineage in Tropheus highlights the importance of intrinsic characteristics of species for their ability to disperse. The enormous success of the "square lineage" in colonizing newly available habitats, and even in replacing at least one original population after the vicariant event, strongly indicates significant differences in their ecology and/or behavior in comparison to the remaining lineages. One might assume that their occurrence in more unstable habitats in shallower regions of the lake, which were repeatedly affected even by smaller lake level fluctuations, might have forced these populations to become more mobile than those populations situated at steeply sloping habitats which remained intact, regardless of the water level. It is also striking that the square lineage did not seem to be successful in colonizing the majority of stable habitats and in replacing indigenous lineages. This observation might point to the possibility that communities at equilibrium are resilient to invasion.

S. babaultii and S. diagramma are likely to be much younger than Tropheus and the eretmodines. Because none of the genotypes were shared between the two species, lineage sorting may be complete, despite their young age. The genetic distances of about 3% to each other suggest that their origin might also be directly connected to the reported substantial drop of the lake level. The same vicariant event might thus have triggered both speciation in Simochromis and secondary radiation in Tropheus and the Eretmodini. As suggested by the branching order in both S. babaultii and S.
diagramma, Simochromis might have speciated in the southern basin and then colonized habitats in the central and northern basins after the lake level of Tanganyika rose again. The second important observation is that both species of Simochromis lack any phylogeographic structure. This is in contrast to the findings for the six Tropheus lineages with limited distribution and the eretmodines. However, it is strikingly similar to the pattern found for the square lineage of Tropheus. Because none of the mitochondrial genotypes were shared between the two species, all identified genotypes are likely to have arisen after the speciation event, despite their lack of phylogeographic structure. This pattern might be explained by the fact that these genotypes were already present in the founder population, before they spread and colonized shores along all three basins. Populations remained genetically heterogeneous due to the young age of these populations and the species. Moreover, the genetic heterogeneity may be caused by high rates of contemporary gene flow among populations all along the lake, even if they are separated from each other by large distances. This explanation is based on the assumption that Simochromis species are less sedentary than Tropheus and the Eretmodini. Ecological differences, combined with small-scale lake level fluctuations, may result in higher rates of gene flow among populations. In Simochromis, speciation may only occur as a consequence of vicariant events, such as an actual split of the lake, allowing for secondary contacts between more strictly isolated populations, such as those from opposite shores.

A lack of phylogeographic structure similar to that observed in the square lineage of Tropheus and in Simochromis has also been reported for Malawi cichlids (Bowers et al., 1994; Moran and Kornfield, 1993). This parallel to Lake Malawi cichlids may be interesting because most Malawi species, particularly those found at the shallow and thus more unstable southern shore regions, are substantially younger even than Simochromis. DNA sequences of such closely related species tend to differ from each other by too few point mutations and thus will not provide enough synapomorphies for phylogenetic analyses. Due to incomplete lineage sorting, mitochondrial genotypes may even still be shared by different species, which can then only be characterized genetically by different frequency patterns of mitochondrial RFLP haplotypes (Moran and Kornfield, 1995). Thus, more variable genetic markers, particularly microsatellites, may actually be more useful for phylogenetic analyses than DNA sequences (A. Parker and I. Kornfield, personal communication). A study on closely related Malawi cichlids has been undertaken by Kornfield and Parker (1997).
As a future perspective, mtDNA sequences and those microsatellites used for extremely young Malawi species could be analyzed in parallel in Simochromis, allowing for a better understanding of the accumulation of homoplasy in microsatellite markers with an increasing time of divergence and to sort out maternal (mitochondrial) and nuclear (microsatellite DNA) perspectives on these issues.
B. Split of Lake Tanganyika

In both Tropheus and the Eretmodini, closely related genotypes were found at opposite shorelines, despite their restricted ability to disperse through open water. Their low capability of dispersal has been demonstrated at a rock habitat near Rubiza in northern Lake Tanganyika, which emerged in the early 1970s after a storm. This habitat was isolated from the next
rock habitat by 15 km of sand shore (Brichard, 1978). Tropheus, Simochromis and the Eretmodini could not bridge 15 km of sand to colonize these new habitats; they appear to be too philopatric. Thus, it is highly unlikely that any of the three groups can cross several tens of kilometers of open water to colonize shores at opposite sides of the lake. The observation of closely related genotypes on opposite shores corroborates an actual split of Lake Tanganyika during the period of the extremely low lake level which was suggested on the basis of seismic data (Scholz and Rosendahl, 1988; Tiercelin and Mondeguer, 1991). Habitats at the border regions of the basins at either side may have been interconnected by newly forming shorelines through the lake, causing a connection of previously separate populations, so that even philopatric rock cichlids could have established secondary contact across the lake. The reported low stand of the lake level 200,000 to 75,000 years ago is likely to have severely impacted the present distributions of all three groups. However, the observed breakpoints among lineages, or among populations of the same lineage, are different. None of the lineages of the Eretmodini reached a lakewide distribution, indicating that their ability for dispersal is more similar to that of the six lineages of Tropheus with restricted distribution than to that of the "square lineage." The average genetic divergences of the lineages arising during the secondary radiations, especially the finding of closely related populations at both sides of the lake in the border regions of the lake basins, may be suitable to calibrate the rates of sequence divergence in cichlids as soon as more accurate datings of the dramatically low lake level and the split of Lake Tanganyika become available.
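The calibration proposed at the end of this section amounts to dividing an observed pairwise divergence by the age of the vicariant event. As a purely illustrative sketch, the arithmetic can be written as below. The function names are hypothetical, and the inputs in the usage example (a divergence of about 3% and the reported bounds of the low stand, 200,000 and 75,000 years) are placeholders drawn from figures mentioned in the text. The resulting numbers are not estimates made by the authors, who note that more accurate datings are needed; the sketch also assumes that all of the observed divergence arose after the split, which retained ancestral polymorphism would violate.

```python
def divergence_rate_per_myr(divergence_percent, split_age_years):
    """Pairwise sequence divergence rate in percent per million years.

    Assumes the two populations have diverged at a constant rate since
    the split and shared an identical genotype at that time.
    """
    if split_age_years <= 0:
        raise ValueError("split age must be positive")
    return divergence_percent / (split_age_years / 1_000_000)


def rate_bounds(divergence_percent, youngest_age_years, oldest_age_years):
    """Return (slowest, fastest) rates implied by a range of split ages."""
    if youngest_age_years > oldest_age_years:
        raise ValueError("youngest age must not exceed oldest age")
    slowest = divergence_rate_per_myr(divergence_percent, oldest_age_years)
    fastest = divergence_rate_per_myr(divergence_percent, youngest_age_years)
    return slowest, fastest


if __name__ == "__main__":
    # Placeholder inputs for illustration only.
    slow, fast = rate_bounds(3.0, 75_000, 200_000)
    print(f"implied rate between {slow:.1f} and {fast:.1f} % per Myr")
```

The wide gap between the two bounds shows why the text ties any calibration to a more precise date for the low stand.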
C. Secondary Contact, Hybridization, and Speciation

According to the allopatric model of speciation, temporary geographic isolation is required to form reproductive barriers. Whenever two populations come into secondary contact, they may interbreed, depending on whether they still recognize each other as members of the same reproductive unit. Species tend to lose mobility with an increasing level of specialization (Fryer and Iles, 1972). Specialists may be subdivided into populations according to the discontinuous presence of habitats, such as rocky habitats that are interleaved by sandy shores. Data on Tropheus from Wapembe, where two genotypes of two major mitochondrial lineages were found in individuals of a single geographical "race," suggest a past hybridization event between an indigenous lineage and a second lineage of invaders (see Fig. 1). The three individuals assigned to the "square lineage" may represent the descendants of the invaders, whereas the fourth individual seems to represent a separate and previously unidentified primary mitochondrial lineage, a relict of the indigenous lineage.

A potential case of hybridization was found between T. polli and Tropheus "Kirschfleck" at the Mahale Mountain range (see upper boxed section of Fig. 2). One species, T. polli, may be indigenous and Tropheus "Kirschfleck" might have invaded the area from the opposite shoreline when the lake was split. Introgression or hybridization upon secondary contact between invading Tropheus might explain the observed distribution of genotypes. Because both species remained separate and because the observed genetic distances correspond to the splits of the secondary radiations, hybridization may have occurred only during the initial phase after secondary contact, and this potential hybrid zone is limited to this one locality. Prezygotic isolation mechanisms may have been perfected due to reinforcement after secondary contact, as was suggested by Butlin (1991). The second alternative, that hybridization may still be going on at very low frequencies so that species boundaries remain intact despite ongoing hybridization, seems less likely because the Kimura distances among heterospecific individuals found so far were always about 3%. As a third alternative, an ancient polymorphism may have persisted, as was shown for cichlid species of Lake Malawi (Moran and Kornfield, 1993). Which of the alternative scenarios is correct may become known when more representative samples become available and when nuclear markers are included in future analyses.

Speciation without morphological change can be viewed as one of the characteristics of geographically separated sister species living in complex species communities of similar composition (Sturmbauer and Meyer, 1992).
However, sympatry of four morphologically monomorphic species has only been reported for Tropheus so far (Snoeks et al., 1994; C. Sturmbauer et al., personal observations). Whenever sympatric, they are segregated by water depth (Kohda and Yanagisawa, 1992). One species is invariably found in the uppermost water layer to a depth of about 2 m, whereas the other species shifts to a greater water depth of 2-6 m. For example, wherever T. polli and Tropheus "Kirschfleck" live sympatrically, T. polli always inhabits the upper zone and Tropheus "Kirschfleck" the deeper zone. The sympatric occurrences of Tropheus species that are virtually identical morphologically may point to the possibility that long-term coexistence may occur by depth segregation rather than morphological change. In the Eretmodini, which only inhabit the uppermost shore sections, sympatric taxa always seem to differ morphologically and hence ecologically, pointing to the possibility that ecological diversification may be more important for the coexistence of two eretmodines than for two Tropheus.

Most eretmodines (it has not been reported for S. marlieri) have a highly complicated system of mouthbrooding (Kuwamura et al., 1989). They are sexually monochromatic and live in permanent pairs. The female broods the eggs and transfers the newly hatched fry to the male, which completes the mouthbrooding. In contrast, Tropheus forms temporary pairs in the territory of the male, and the female leaves the male's territory after spawning takes place (Yanagisawa and Nishida, 1991; Sturmbauer and Dallinger, 1995). Tropheus seems to live sedentarily in densely packed communities where both sexes are territorial (Yanagisawa and Nishida, 1991; Sturmbauer and Dallinger, 1995). Color patterns may play a much more important role in the social system of Tropheus, and sexual selection may thus be more intense than in the eretmodines. Simochromis species have been observed to move about in schools, and only dominant males keep territories (Brichard, 1989; C. Sturmbauer et al., unpublished observations). In contrast to Tropheus and the Eretmodini, they are sexually dichromatic and more similar to the cichlids of Lake Victoria and Lake Malawi. They may be much more mobile than Tropheus and the eretmodines, which could also explain their lack of phylogeographic structure.
D. Evolutionary Characteristics of Mitochondrial DNA

One of the major characteristics of mtDNA results from its one-dimensional path of coalescence: mtDNA phylogenies do not necessarily reflect the true species phylogenies. Sampling error among (maternally inherited) genotypes, hybridization upon secondary contact, differential extinction due to random genetic drift, or the retention of ancient polymorphisms can result in para- or polyphyly of mitochondrial genotypes in a single biological species. However, the paraphyletic placement of taxa does not necessarily imply that mitochondrial phylogenies are incorrect; paraphyly is an inevitable consequence of the evolution of a new species: if one out of several populations of a species undergoes speciation and the other populations remain reproductively compatible, the "older" species becomes paraphyletic. The possible paraphyletic placement of S. erythrodon in lineages B1 and B3 may thus represent a consequence of the speciation of Tanganicodus (in clade B2) within an originally conspecific assemblage (see Fig. 4). In conclusion, populations may
have strikingly different histories. Although hybridization upon secondary contact may better explain the observed patterns in some populations (such as those along the break points of the lake basins), different alternatives may apply to other situations. Large-scale patterns of the most frequent genotypes are likely to be detected even with small sample sizes. A more finely structured understanding of the history of populations, and also of the underlying processes, will emerge with increasing sample sizes. The parallel analysis of mitochondrial and nuclear markers will add a second, nonlinear dimension to population studies, resulting from the genetics of nuclear DNA. Taking advantage of the information recorded in both genomes will be an important task for future studies.
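The argument that speciation within one of several populations leaves the "older" species paraphyletic can be checked mechanically on any rooted tree. The sketch below is a toy illustration under stated assumptions: trees are written as nested tuples of tip labels, the function names are hypothetical, and the example topology is invented to mirror the verbal argument. It is not the topology of Fig. 4.

```python
def tip_set(tree):
    """Return the set of tip labels below a node.

    A tree is either a string (a tip) or a tuple of subtrees.
    """
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(tip_set(child) for child in tree))


def clades(tree):
    """Yield the tip set of every node, from the root down to the tips."""
    yield tip_set(tree)
    if not isinstance(tree, str):
        for child in tree:
            yield from clades(child)


def is_monophyletic(tree, group):
    """True if some node of the rooted tree has exactly `group` as its tips."""
    return frozenset(group) in set(clades(tree))


if __name__ == "__main__":
    # Invented example: one population of an old species gave rise to a
    # new species, while a second population of the old species persists.
    toy_tree = (("old_pop_A", "new_species"), "old_pop_B")

    old_species = {"old_pop_A", "old_pop_B"}
    print(is_monophyletic(toy_tree, old_species))
    print(is_monophyletic(toy_tree, {"old_pop_A", "new_species"}))
```

In the toy tree the two populations of the old species do not form a group of their own, because one of them is more closely related to the new species; the tree is nevertheless a correct record of the branching history.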
IV. Conclusions

The authors' comparative results emphasize the following hierarchy of factors influencing speciation:

1. Most lineages of Tropheus and of the eretmodines show a high degree of intralacustrine endemism. This observation has important implications for taxonomists, and future taxonomic work should account for it. Genetic data should be incorporated into species descriptions.
2. A split of Lake Tanganyika during the period of a low lake level, suggested on the basis of seismic data, is independently corroborated by DNA sequences of Tropheus and the eretmodines. This major vicariant event affected the phylogeography of all investigated groups of rock-dwelling cichlids.
3. This general trend was overlaid by patterns resulting from specific characteristics of habitats, which determine how severely abiotic factors act on each population and thus might explain some of the differences observed.
4. The degree of philopatry may be the dominating biological characteristic of a species, differentially affecting the phylogeographic structures among populations.
5. Finally, each population may adapt to living in specific habitats by evolving new ecological and behavioral characteristics which may also affect their dispersal ability.
References Avise, J. C. 1994. "Molecular Markers, Natural History and Evolution." Chapman and Hall, New York. Axelrod, H. R. 1993. "The Most Complete Colored Lexiconof Cichlids." T.F.H.,Neptune City, NJ. Barlow, G. W., 1991.Mating systems among cichlid fishes. In "Cich-
110
CHRISTIAN STURMBAUER et al.
lid Fishes: Behaviour, Ecology and Evolution" (M. H. A. Keenleyside, ed.), pp. 173-190. Chapman and Hall, London. Bowers, N., Stauffer, J. R. and Kocher, T. D. 1994. Intra- and interspecific mitochondrial DNA sequence variation within two species of rock-dwelling cichlids (Teleostei: Cichlidae) from Lake Malawi. Mol. Phyl. Evol. 3:75-82. Brichard, P. 1978. Un cas d'isolement de substrats rocheux au milieu de fonds de sable dans le nord du lac Tanganyka. Rev. Zool. Afr. 92:518-524. Brichard, P. 1989. "Cichlids of Lake Tanganyika." T.F.H. Publications, Neptune City, NJ. Butlin, R. 1991. Reinforcement of premating isolation. In Speciation and Its Consequences" (D. Otte and J. A. Endler, eds.), pp. 158197. Sinauer, Sunderland, MA. Cohen, A. S., Soreghan, M. J. and Scholz, C. A. 1993. Estimating the age of formation of lakes: An example from Lake Tanganyika, East-African rift system. Geology 21:511-514. Dominey, W. J. 1984. Effect of sexual selection and life history on speciation: Species flocks in African cichlids and Hawaiian Drosophila. In "Evolution of Fish Species Flocks (A. A. Echelle and I. Kornfield, eds.), pp. 231-249. University of Maine at Orono Press, Orono. Eccles, D. H., and Trewavas, E. 1989. "Malawian Cichlid Fishes: The Classification of Some Haplochromine Genera." Lake Fish Movies, Herten. Felsenstein, J. 1985. Confidence limits on phylogenies: An approach using the bootstrap. Evolution 39: 783-791. Fryer, G., and Iles, T. D. 1972. "The Cichlid Fishes of the Great Lakes of Africa: Their Biology and Evolution." Oliver and Boyd, Edinburgh. Greenwood, P. H. 1981. "The Haplochromine Fishes of the East African Lakes." Kraus International Publications, M(inchen. Greenwood, P. H. 1984. African cichlids and evolutionary theories. In "Evolution of Fish Species Flocks" (A. A. Echelle and I. Kornfield, eds.), pp. 141-154. University of Maine at Orono Press, Orono. Hert, E. 1992. 
Homing and home-site fidelity in rock-dwelling cichlids (Pisces: Teleostei) of Lake Malawi, Africa. Environ. Biol. Fish. 33:229-237. Jukes, T. H. 1980. Silent nucleotide substitutions and the molecular evolutionary clock. Science 210:973-978. Keenleyside, M. H. A. 1991. Parental care. In "Cichlid Fishes: Behaviour, Ecology and Evolution" (M. H. A. Keenleyside, ed.), pp. 191-208. Chapman and Hall, London. Kimura, M., 1980. A simple method for estimating evolutionary rate of base substitution through comparative studies of nucleotide sequences. J. Mol. Evol. 16:111-120. Kluge, A. G. and Farris, J. S. 1969. Quantitative phyletics and the evolution of anurans. Syst. Zool. 18:1-32. Kocher, T. D., Conroy, J. A. McKaye, K. R. and Stauffer, J. R. 1993. Similar morphologies of cichlid fish in Lakes Tanganyika and Malawi are due to convergence. Mol. Phyl. Evol. 2:158-165. Kocher, T. D., Conroy, J. A., McKaye, K. R., Stauffer, J. R., and Lockwood, S. F. 1995. Evolution of NADH dehydrogenase subunit 2 in East African cichlid fish. Mol. Phyl. Evol. 4: 420-432. Kohda, M., and Yanagisawa, Y. 1992. Vertical distribution of two herbivorouos cichlid fishes of the genus Tropheus in Lake Tanganyika, Africa. Ecol. Freshwater Fish. 1:99-103. Kornfield, I., and Parker, A. 1997. Molecular systematics of a rapidly evolving species flock: The Mbuna of Lake Malawi and the search for phylogenetic signal. In "Molecular Systematics of Fishes" (T. D. Kocher and C. A. Stepien, eds.). Academic Press, San Diego. Kuwamura, T., Nagoshi, M. and Sato, T. 1989. Female-to-male shift of mouthbrooding in a cichlid fish, Tanganicodus irsacae, with notes on breeding habits of two related species in Lake Tanganyika. Environ. Biol. Fish. 24:187-198.
Mayr, E. 1984. Evolution of fish species flocks: A commentary. In "Evolution of Fish Species Flocks" (A. A. Echelle and I. Kornfield, eds.), pp. 3-11. University of Maine at Orono Press, Orono. McElroy, D. M., and Kornfield, I. 1990. Sexual selection, reproductive behavior, and speciation in the Mbuna species flock of Lake Malawi (Pisces: Cichlidae). Environ. Biol. Fish. 28:273-284. McKaye, K. R., and Gray, W. N. 1984. Extrinsic barriers to gene flow in rock-dwelling cichlids of Lake Malawi: Macrohabitat heterogeneity and reef colonization. In "Evolution of Fish Species Flocks" (A. A. Echelle and I. Kornfield, eds.), pp. 169-183. University of Maine at Orono Press, Orono. Meyer, A. 1993. Evolution of mitochondrial DNA in fishes. In "Molecular Biology Frontiers, Biochemistry and Molecular Biology of Fishes" (P. W. Hochachka and T. P. Mommsen, eds.), Vol. 2, pp. 1-38. Elsevier, Amsterdam. Meyer, A. 1994. DNA technology and phylogeny of fish: Molecular phylogenetic studies of fish. In "Genetics and Evolution of Aquatic Organisms" (A. R. Beaumont, ed.), pp. 219-249. Chapman and Hall, London. Meyer, A., Knowles, L., and Verheyen, E. 1996. Widespread geographic distribution of mitochondrial haplotypes in Lake Tanganyika rock-dwelling fishes. Mol. Ecol. 5:341-350. Meyer, A., Kocher, T. D., Basasibwaki, P., and Wilson, A. C. 1990. Monophyletic origin of Lake Victoria cichlid fishes suggested by mitochondrial DNA sequences. Nature 347:550-553. Moran, P., and Kornfield, I. 1993. Retention of an ancestral polymorphism in the Mbuna species flock (Teleostei: Cichlidae) of Lake Malawi. Mol. Biol. Evol. 10:1015-1029. Moran, P., and Kornfield, I. 1995. Were population bottlenecks associated with the radiation of the Mbuna species flock (Teleostei: Cichlidae) of Lake Malawi? Mol. Biol. Evol. 12:1085-1093. Moran, P., Kornfield, I., and Reinthal, P. N. 1994. Molecular systematics and radiation of the haplochromine cichlids (Teleostei: Perciformes) from Lake Malawi. Copeia 1994:274-288. 
Nishida, M. 1991. Lake Tanganyika as an evolutionary reservoir of old lineages of East African cichlid fishes: Inferences from allozyme data. Experientia 47:974-979. Poll, M. 1986. Classification des cichlidae du lac Tanganika. Tribus, genres et espèces. Acad. Roy. de Belg. Mémoires de la classe des sciences, Collection in-8°, 2e série, T. XLV, Fascicule 2:1-163. Rüber, L. 1994. "Phylogenetic and Phylogeographic Patterns in the Endemic Tanganyikan Cichlid Tribe Eretmodini, Inferred from mtDNA Sequences." Masters thesis, Zoological Museum of the University of Zürich, Switzerland. Saitou, N., and Nei, M. 1987. The neighbor-joining method: A new method for reconstructing phylogenetic trees. Mol. Biol. Evol. 4:406-425. Scholz, C. A., and Rosendahl, B. R. 1988. Low lake stands in Lakes Malawi and Tanganyika, delineated with multifold seismic data. Science 240:1645-1648. Snoeks, J., Rüber, L., and Verheyen, E. 1994. The Tanganyika problem: Taxonomy and distribution of its ichthyofauna. In "Speciation in Ancient Lakes" (K. Martens, B. Goddeeris, and G. Coulter, eds.), Adv. in Limnol. 44:357-374. Sturmbauer, C., and Dallinger, R. 1995. Diurnal variation of spacing and foraging behaviour in Tropheus moorii (Cichlidae) in Lake Tanganyika, eastern Africa. Neth. J. Zool. 45:386-401. Sturmbauer, C., Mark, W., and Dallinger, R. 1992. Ecophysiology of Aufwuchs-eating cichlids in Lake Tanganyika: Niche separation by trophic specialization. Environ. Biol. Fish. 35:283-290. Sturmbauer, C., and Meyer, A. 1992. Genetic divergence, speciation and morphological stasis in a lineage of African cichlid fishes. Nature 359:578-581. Sturmbauer, C., and Meyer, A. 1993. Mitochondrial phylogeny of the endemic mouthbrooding lineages of cichlid fishes of Lake Tanganyika, East Africa. Mol. Biol. Evol. 10:751-768.
Sturmbauer, C., Verheyen, E., and Meyer, A. 1994. Mitochondrial phylogeny of the Lamprologini, the major substrate spawning lineage of cichlid fishes from Lake Tanganyika in eastern Africa. Mol. Biol. Evol. 11:691-703. Swofford, D. L. 1993. "Phylogenetic Analysis Using Parsimony (PAUP) Version 3.1.1." Smithsonian Institution, Washington, D.C. Tiercelin, J. J., and Mondeguer, A. 1991. The geology of the Tanganyika Trough. In "Lake Tanganyika and Its Life" (G. W. Coulter, ed.), pp. 7-48. Oxford University Press, London, Oxford and New York.
Verheyen, E., Rüber, L., Snoeks, J., and Meyer, A. 1996. Mitochondrial phylogeography of rock dwelling cichlid fishes reveals evolutionary influence of historic lake level fluctuations in Lake Tanganyika, Africa. Phil. Trans. Roy. Soc. Ser. B. 351:797-805. Yamaoka, K., Hori, M., and Kuratani, S. 1986. Ecomorphology of feeding in "goby-like" cichlid fishes in Lake Tanganyika. Physiol. Ecol. Jap. 23:17-29. Yanagisawa, Y., and Nishida, M. 1991. The social and mating system of the maternal mouthbrooder Tropheus moorii (Cichlidae) in Lake Tanganyika. Jap. J. Ichthyol. 38:43-50.
CHAPTER 8
Fish Biogeography and Molecular Clocks: Perspectives from the Panamanian Isthmus
ELDREDGE BERMINGHAM, S. SHAWN McCAFFERTY, and ANDREW P. MARTIN
Smithsonian Tropical Research Institute, Balboa, Republic of Panama
I. Introduction
Panama provides a rich landscape against which to study the evolution of fish and molecules. This is because roughly 3 million years ago part of Panama rose to complete the Central American fusion of the North and South American continents. In the process, a marine connection between the eastern Pacific Ocean and the Caribbean Sea was severed. Thus, the Pliocene rise of the Isthmus of Panama initiated, or perhaps continued, an evolutionary experiment of grand scale (Rubinoff and Leigh, 1990). Jordan (1908) developed his "law of geminate species" based on his observations of marine fish "sister" taxa found on either side of the Panamanian isthmus. Seventy years later Vawter and coworkers (1980) used Jordan's geminate taxa to test and support the "molecular clock" hypothesis.
The Central American Isthmus has also played a prominent role in discussions pertaining to the dispersal and diversification of neotropical freshwater fishes. Darlington (1957) suggested that the South American freshwater fish fauna had been derived from Asian immigrants filtering through North and Central America, although several years later he began changing his view (Darlington, 1964). Myers (1966) and Miller (1966) formally recognized the importance of Panama's narrow isthmus as a terrestrial corridor along and through which the primary freshwater fishes of South America have invaded Central America. Bussing (1976, 1985) presented a more complete analysis of the origin of the region's freshwater fish fauna and supported an early Tertiary arrival in Central America for some characiform fishes. Nevertheless, he also recognized that the vast majority of primary freshwater fish reached Central America sometime in the Pliocene. Because the invasion of Central America by South American freshwater fish is believed recent, molecule-based biogeographic analyses provide one of the best opportunities to chronicle a major colonization episode and the subsequent assembly and diversification of a modern freshwater fish fauna.
This chapter focuses on fish biogeography, particularly the geography of conspecific populations of tropical marine and freshwater fishes. At the core of this discussion lies the idea that molecule-based phylogenetic hypotheses are particularly useful for historical biogeographic analyses of recent earth history events (Bermingham et al., 1992). It is often the case that morphologically monotypic species with ranges that cross one or more centers of endemism exhibit molecular divergence. For example, mitochondrial DNA (mtDNA) phylogeography has demonstrated convincingly that differentiated conspecific populations can provide historical information about a region (Bermingham and
Avise, 1986; Avise et al., 1987; Bermingham et al., 1992, 1996; Avise, 1994; Seutin et al., 1994; Joseph et al., 1995). Furthermore, to the extent that molecular clocks are reliable, one can establish the approximate chronology of branch points in molecular phylogenies (Bermingham et al., 1992; Page, 1991, 1993; Bermingham and Lessios, 1993; Knowlton et al., 1993). In turn, these chronologies can be used to test hypotheses related to geologically dated vicariant events (Sarich and Wilson, 1967; Beverly and Wilson, 1985; Bermingham et al., 1992; Joseph et al., 1995). Molecular data permit one to assess whether both the branching pattern and the general timing of speciation events conform to vicariant models of species diversification. Fishes collected in the marine and freshwater environments of Panama have provided opportunities to study fish molecular clocks and historical biogeography within a recent time frame of earth history. This time frame is appropriate to empirically document the role that historical contingency plays in the assembly and maintenance of biological communities. Such historical reconstructions help to convey the dynamics of evolutionary changes in species and of ecological changes in communities (Cornell and Lawton, 1992; Ricklefs and Schluter, 1993). It is already the case that studies of tropical species have documented levels of genetic divergence between geographic populations that are generally high in comparison to temperate species (Capparella, 1991; Hackett and Rosenberg, 1990; Escalante-Pliego, 1991; Patton and Smith, 1992; Peterson et al., 1992; daSilva and Patton, 1993; Seutin et al., 1993, 1994; Joseph et al., 1995; Brawn et al., 1996; Bermingham et al., 1996). At the very least, these results have important implications concerning our knowledge of species diversity and the conservation of tropical communities.
II. Temporal Scaling: The Panama Isthmus and Molecular Clocks
The first goal of this chapter is to provide insight into the mechanics and reliability of mitochondrial molecular clocks functioning across shallow spans of time. Generally speaking, we are interested in taxa that have diversified in the Miocene or later. For the study of such recent speciation events one can only rarely rely on paleontological evidence to yield age estimates of taxic origins (Grande, 1985; Patterson, 1975). Temporal scaling of branch points in phylogenies is critically important for constraining hypotheses on correlation or causation between evolutionary and earth history events (Page, 1991, 1993; Lundberg, 1993) and strongly indicates an important role for molecular clocks in
studies of species with recent origins. Most biologists recognize the positive relationship between molecular differentiation and time. However, the linearity of that relationship (Zuckerkandl and Pauling, 1965) over different time scales and across different taxa and molecules is under constant review. Our own studies of fish mitochondrial clocks (Martin et al., 1992; H. A. Lessios and E. Bermingham, manuscript in preparation; E. Bermingham and S. S. McCafferty, manuscript in preparation) are nascent and mostly unpublished, but provide an affirmative prognosis concerning their utility to fish systematists with an interest in the biogeographical relationships of closely related species groups. Our studies of fish mtDNA clocks use marine taxa (not only fish) separated by the Central American Isthmus (sharks: Martin et al., 1992; sea urchins: Bermingham and Lessios, 1993; Alpheus shrimp: Knowlton et al., 1993; teleost fishes: H. A. Lessios and E. Bermingham, manuscript in preparation; E. Bermingham and S. S. McCafferty, manuscript in preparation). For more than two decades it has been recognized that "geminate" marine species separated by the Central American Isthmus provide an unmatched occasion to study the molecular evolution of taxa that have been separated for roughly 3 million years, a time period of special interest to students of speciation (Lessios, 1979, 1981; Vawter et al., 1980; Bermingham and Lessios, 1993; Knowlton et al., 1993). The marine vicariant event caused by the uplift of the Choco and Chorotega blocks in the region of present-day Panamá (Duque-Caro, 1990) is particularly well dated (Fig. 1). Biostratigraphic correlations of molluscan fossil faunas (Coates et al., 1992), changes in coiling direction of planktonic foraminifera (Keigwin, 1978), extinction of foraminifera (Keigwin, 1982), and changes in oxygen and carbon isotope ratios in the two oceans (Keigwin, 1982) all agree in dating this event between 2.9 and 3.5 million years ago (Ma). 
Slightly younger dates (2.5-2.7 Ma) from South American mammal fossils found in North America (Lundelius, 1987) and North American fossils found in South America (Marshall, 1988) confirm that the terrestrial corridor connecting the two continents was completed in the Pliocene. The rise of the Panamanian isthmus permitted a realization of the classic Mayr (1963) "dumbbell" model of population division as it rendered the continuous ranges of marine taxa into two large populations (presumably without secondary contact), resulting in many pairs of closely related marine species, one on either side of Central America. Because they are numerous, phylogenetically diverse, and are thought to have been separated by the same geologic event, the geminate species pairs provide a remarkable array of organisms in which to investigate nucleotide substitution processes and rate variation in molecular clocks. Geminate taxa are often found to have similar geographic ranges and ecological distributions in their respective oceans. In addition, most geminate taxa have planktonic larvae, increasing the probability that the rise of the Panama Isthmus interrupted gene flow "simultaneously" in species that had been previously panmictic (see later; Shulman and Bermingham, 1995).
FIGURE 1 Paleogeographic reconstructions of the Central American Isthmus region in the middle Miocene (16-15 Ma), late Miocene (7-6 Ma), and late Pliocene (ca. 3 Ma). Emergent land is represented by oblique parallel lines, shelf sediments by dots, and abyssal oceanic sediments by horizontal parallel lines. Labeled features include the pre- and post-Cocos Ridge basin configurations and the Tempisque-San Carlos-N. Limon, Canal, and Atrato corridors. After Coates and Obando (1996).
The most rigorous framework for testing the molecular clock hypothesis follows an approach similar to that outlined by Muse and Weir (1992). Under the assumption of a molecular clock, transisthmian taxa should have accumulated approximately equal numbers of nucleotide substitutions since the two species last shared a common ancestor. In other words, $\mu_P t = \mu_A t$, where $\mu$ is the substitution rate at the DNA or protein level, $t$ is the time since the two species last shared a common ancestor, and the subscripts P and A denote the ocean affinity (Pacific or Atlantic) of each species. Because $t$ is identical for the species compared, the hypothesis becomes $\mu_P = \mu_A$. The variance in accumulation of nucleotide substitutions across independent geminate pairs is approximately equal to the mean number of substitutions per site since a single time of origin for all taxa, $\mu = \mathrm{Var}(\mu)$. Muse and Weir (1992) review test statistics that can be used to refute the hypothesis of a molecular clock.
Molecular clock studies utilizing the isthmus depend on two assumptions concerning "geminate" taxa that usually go untested. The first regards geminate pairs as sister taxa. This assumption can be tested through a rigorous phylogenetic analysis of related species. Improved phylogenetic analyses of geminate species and closely related congeneric taxa have the added benefit that they permit a "relative rate" approach (Sarich and Wilson, 1967) for testing equalities of evolutionary rates as suggested by Muse and Weir (1992). The second assumption requires that geminate taxa were separated at the same time. Critical testing of this assumption would require a detailed and taxonomically rich fossil record. Except in rare cases, however, a paleontological perspective on geminate origins is simply not available.
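The expectation stated above, that the variance in accumulated substitutions across independent geminate pairs should roughly equal the mean when all pairs share one splitting time, can be illustrated with a small numerical sketch. The function name `dispersion_index` and the substitution counts are hypothetical and are not data from this chapter; the variance-to-mean ratio is a generic summary and not one of the specific test statistics reviewed by Muse and Weir (1992).

```python
from statistics import mean, variance


def dispersion_index(substitution_counts):
    """Ratio of sample variance to mean for substitution counts.

    Under a Poisson-like clock with a single splitting time the ratio is
    expected to be near 1; values far above 1 point to unequal rates or
    unequal ages among the pairs.
    """
    if len(substitution_counts) < 2:
        raise ValueError("need counts from at least two geminate pairs")
    avg = mean(substitution_counts)
    if avg <= 0:
        raise ValueError("mean substitution count must be positive")
    return variance(substitution_counts) / avg


# Hypothetical numbers of substitutions separating five geminate pairs
# (illustrative values only, not data from this chapter).
same_age_pairs = [25, 31, 24, 38, 32]
mixed_age_pairs = [4, 9, 30, 33, 120]

print(round(dispersion_index(same_age_pairs), 2))
print(round(dispersion_index(mixed_age_pairs), 2))
```

In this toy example the first set gives a ratio close to 1, whereas the second gives a ratio far above 1, which is the pattern expected if the pairs differ in rate, in age, or in both.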
Therefore, failure to accept the null hypothesis in isthmian-based tests of molecular clocks may arise either because substitution rates differ across geminate species or because geminate pairs split at very different times. The absence of a good fossil record led Bermingham and Lessios (1993) to suggest an approximate test, here termed the "concordant measures" approach, of the second assumption. The approximation uses multiple, "independent" data sets to identify pairs of transisthmian species unlikely to have been split by the Pliocene rise of the isthmus. For example, concordant and relatively deep measures of divergence in mtDNA, allozymes, and behavior in three of seven "geminate" pairs of Alpheus shrimp led us to argue that these species pairs must have separated prior to the rise of the isthmus (Knowlton et al., 1993). On the other hand, congruent and shallow allozyme and mtDNA divergence values in some fish geminates (where interspecific genetic divergences are approximately equal to intraspecific genetic divergences) suggest circumtropical gene flow in these species. That some "geminate" taxa were almost certainly separated by events other than the Pliocene completion of the Panama Isthmus should not be surprising given that the identification of geminates has resulted more from investigator gestalt than objective morphological criteria. Thus, allozyme and mtDNA studies may provide the most objective (and best comparatively grounded) data available for determining the probable geminate (and sister group) status of transisthmian taxa. Molecular clock studies that include "geminate" pairs that allozyme and mtDNA data concordantly suggest pre- or postdate the Pliocene isthmian barrier are extraordinarily conservative (and potentially flawed). Ultimately, we wish to compare the nucleotide substitution process in transisthmian pairs of similar age without diminishing our inferences through the inclusion of older and younger taxa less easily associated with particular earth history events. In the remainder of the discussion on fish molecular clocks, a combination of "concordant measure" approximations and statistical methods is used to center our focus on transisthmian pairs of equivalent "age." Using homologous mtDNA cytochrome oxidase I (COI) sequence data for multiple transisthmian pairs of Alpheus shrimp (Knowlton et al., 1993) and fishes (E. Bermingham and S. S. McCafferty, manuscript in preparation), nucleotide substitution processes can be compared across arthropods and fishes. Comparison
of branch length distributions between transisthmian pairs suggests that Alpheus shrimp have greater levels of between-geminate sequence divergence than fishes. When we focus on the cluster of species that "concordant measures" graphical analysis suggest diverged coincident with the Pliocene rise of the Isthmus, shrimp nucleotide substitution rates are roughly twice those of fishes whether determined for all nucleotide sites or fourfold degenerate sites. These results suggest that there are differences in substitution rates between shrimp and fishes. Other studies have reported differences in substitution rates between arthropods and vertebrates (Britten, 1986; Vawter and Brown, 1986), and various hypotheses including generation time (Li et al., 1987; Gaut et al., 1992) and metabolic rate (Thomas and Beckenbach, 1989; Avise et al., 1992; Martin and Palumbi, 1993) have been proposed to account for rate variation in nucleotide substitution patterns (see Hillis et al., 1996). Insufficient data exist to test alternative hypotheses of rate variation. Nevertheless, a notable difference between the sequences from shrimp and fishes is the relative frequency of AT and GC nucleotides. To determine if nucleotide composition may be related in some fashion with rates of divergence in geminate shrimp versus fishes, the nucleotide skew (Perna and Kocher, 1995a) and nucleotide bias (Irwin et al., 1991) were studied in the two groups. Figure 2 graphically shows the GC skew, AT skew, and nucleotide bias at supposedly neutral fourfold degenerate sites for eight geminate pairs of fishes (see later) and four geminate pairs of Alpheus
FIGURE 2 Nucleotide composition statistics for comparisons of geminate fishes and Alpheus shrimp. The left-hand portion of the figure shows the nucleotide bias at fourfold degenerate sites for each geminate pair. The filled bars represent nucleotide bias from the Atlantic taxa whereas the hollow bars represent the Pacific taxa within each comparison. Nucleotide bias is estimated according to Irwin et al. (1991). The right-hand side of the figure shows GC and AT skew at fourfold degenerate sites calculated according to Perna and Kocher (1995a). Pairs plotted: Rypticus saponaceus/bicolor, Lutjanus apodus/argentiventris, Anisotremus virginicus/taeniatus, Abudefduf saxatilis/troschelii, Holocanthus ciliaris/passer, Paranthias furcifer/colonus, Gerres cinereus, Scorpaena plumieri/mystes, Alpheus formosus/panamensis, Alpheus cylindricus, Alpheus paracrinitus (s)/rostratus, and Alpheus paracrinitus (ns)/paracrinitus.
(Knowlton et al., 1993). The average nucleotide bias is 0.291 ± 0.058 for geminate fishes and 0.314 ± 0.084 for geminate shrimp. The average AT skew is 0.289 ± 0.158 versus 0.344 ± 0.115 and the average GC skew is -0.573 ± 0.129 versus -0.322 ± 0.180 for geminate fishes and shrimp, respectively. Although there is little difference in nucleotide bias between the two groups, there appear to be slight differences for both AT and GC skews. These skew values are clearly discernible in the frequency of the four nucleotides at fourfold synonymous sites (fourfold synonymous sites were determined independently for each geminate pair). Fishes and shrimp both show the characteristic reduction of guanine residues at fourfold sites. However, a significant reduction of cytosine residues was observed at fourfold sites in shrimp as compared to fish. Martin (1995) hypothesized that an increase in AT nucleotides accompanies an increase in DNA damage rates resulting from mutagenic oxygen radicals that are by-products of metabolism. The accelerated rate of nucleotide substitution and shift toward A + T nucleotides observed in shrimp relative to fish may be indicative of differences in rates of endogenous DNA damage between these taxa. Whatever the underlying mechanisms are that set the pace and pattern of nucleotide substitution, the comparison between shrimp and fishes underscores the fact that rates of molecular evolution can differ among taxonomic groups.
Turning our attention to a more complete examination of the transisthmian fishes, mtDNA COI sequence divergence was analyzed in the 19 "geminate" fish pairs listed in Table I. A test of the molecular clock based on the maximum likelihood method (Felsenstein, 1993) was used to determine if any of the geminate pairs were not evolving in a clock-like fashion. Log likelihoods were calculated for a topology with the molecular clock constraint both enforced ($ML_{MC}$) and relaxed ($ML_{NC}$). The likelihood values were tested using $2(\ln ML_{NC} - \ln ML_{MC})$, which is distributed as $\chi^2$ with $(n - 2)$ degrees of freedom, where $n$ is the number of terminal nodes. No significant differences were found for any of the 19 geminate fish pairs, leading to the assumption that the mtDNA COI region in the separate geminate taxa is evolving in a clock-like fashion.
TABLE I Divergence Values and Maximum Likelihood Test Results for Geminate Marine Fishes^a
Species                                       n^b    K2^c             Ks^d             ML^e             p^f    Cluster^g
Melichthys niger                              2/4    0.0018 ± 0.0028  0.0040 ± 0.0080  0.0035 ± 0.0023  0.350  A
Diodon hystrix                                3/5    0.0058 ± 0.0050  0.0120 ± 0.0120  0.0082 ± 0.0021  0.379  A
Alutera scripta                               4/2    0.0087 ± 0.0030  0.0230 ± 0.0050  0.0400 ± 0.0042  0.391  A
Mulloidichthys martinicus vs M. dentatus      5/5    0.0126 ± 0.0030  0.0320 ± 0.0060  0.0363 ± 0.0024  0.909  A
Abudefduf sordidus vs A. concolor             6/8    0.0142 ± 0.0030  0.0410 ± 0.0080  0.0324 ± 0.0016  0.446  A
Anisotremus surinamensis vs A. interruptus    6/6    0.0162 ± 0.0041  0.0310 ± 0.0040  0.0359 ± 0.0011  0.982  A
Rypticus saponaceus vs R. bicolor             4/8    0.0317 ± 0.0033  0.0850 ± 0.0060  0.1338 ± 0.0040  0.518  B
Lutjanus apodus vs L. argentiventris          6/11   0.0348 ± 0.0060  0.1020 ± 0.0108  0.1268 ± 0.0031  0.092  B
Anisotremus virginicus vs A. taeniatus        3/3    0.0436 ± 0.0052  0.1270 ± 0.0178  0.1633 ± 0.0082  0.073  B
Abudefduf saxatilis vs A. troschelii          10/8   0.0449 ± 0.0036  0.1360 ± 0.0110  0.1752 ± 0.0015  0.998  B
Holocanthus ciliaris vs H. passer             4/8    0.0487 ± 0.0016  0.1460 ± 0.0030  0.2204 ± 0.0042  0.319  B
Paranthias furcifer vs P. colonus             2/10   0.0479 ± 0.0033  0.1460 ± 0.0080  0.1564 ± 0.0024  0.788  B
Gerres cinereus                               4/3    0.0507 ± 0.0029  0.1530 ± 0.0070  0.1955 ± 0.0062  0.893  B
Scorpaena plumieri vs S. mystes               3/5    0.0550 ± 0.0050  0.1670 ± 0.0120  0.2829 ± 0.0117  0.344  B
Chromis multilineata vs C. atrilobata         4/8    0.0935 ± 0.0089  0.3020 ± 0.0280  0.5591 ± 0.0128  0.566  C
Chaetodon striatus vs C. humeralis            4/8    0.1040 ± 0.0030  0.3570 ± 0.0110  0.6293 ± 3.0107  0.062  C
Priacanthus cruentatus                        6/4    0.1071 ± 0.0100  0.3910 ± 0.0090  2.6690 ± 0.0429  0.324  E
Thalassoma bifasciatum vs T. lucasanum        4/6    0.1087 ± 0.0050  0.3630 ± 0.0170  1.9390 ± 0.0671  0.369  D
Ophioblennius atlanticus vs O. steindachneri  8/8    0.1237 ± 0.0048  0.4310 ± 0.0190  1.5240 ± 0.0433  0.131  D
^a Based on ca. 650 bp of the COI gene region of mitochondrial DNA sequenced using standard manual (Lessios et al., 1995) or automated methods (S. S. McCafferty and E. Bermingham, manuscript in preparation).
^b Number of individuals sequenced (Atlantic/Pacific).
^c Average geminate divergence at all sites based on the two parameter model of Kimura (1980).
^d Average corrected percentage divergence at synonymous sites (Li et al., 1985).
^e Average maximum likelihood estimate of divergence at fourfold degenerate sites using a transition/transversion ratio of 3.5 and correcting for unequal base composition and heterogeneity in rates of substitution among sites (γ distribution, α = 0.25; Swofford et al., 1996).
^f Probability values < 0.05 indicate that maximum likelihood values with and without the molecular clock enforced are significantly different from each other (i.e., reject the molecular clock; Felsenstein, 1993).
^g Group association based on a UPGMA of the difference in average ML distance between geminates.
To test whether all 19 of the geminate fish pairs were evolving at an equal rate, maximum likelihood was used to estimate divergence in pairwise comparisons between all transisthmian individuals representing each geminate species pair. However, use of all pairwise distances would violate the assumption of independent replication (if we consider the among geminate comparisons simply to be replicates estimating some true parametric value). Instead, minimum and maximum divergence values were used to account for the range of variation within each geminate pair. Variation in substitution rates among sites was accounted for using a γ distribution, unequal frequencies of nucleotides were accounted for using the empirical frequencies, and transition/transversion ratios estimated from data were used with a four-category substitution model (Swofford et al., 1996). Equivalent results were obtained for fourfold degenerate sites using a simple Kimura (1980) two parameter model. Estimates of divergence were used instead of branch lengths from a recent common ancestor because (a) no appropriate outgroup is available for each geminate pair and (b) it has been demonstrated that the sequences are evolving according to a molecular clock. Therefore, on average, the branch lengths from the most recent common ancestor should be approximately equal and the average divergence should be correlated with these branch lengths. A significant difference was found among group means ($H_{adj}$ = 35.071, p = 0.009) using a Kruskal-Wallis test (Sokal and Rohlf, 1981). To find homogeneous subgroups within the data, an Unweighted Pair Group Method using Arithmetic Averages (UPGMA) of the Euclidean distance among the average divergences between geminate pairs was used to search for natural clusters in the data (Sneath and Sokal, 1973; Rohlf, 1993). The resulting phenogram should not be confused with a phylogenetic hypothesis of relationship among the taxa. The UPGMA phenogram is merely a tool for grouping geminate pairs that display similar levels of divergence and for this reason has not been presented here.
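The grouping step described above can be sketched as follows. This is a minimal pure-Python illustration of average-linkage (UPGMA) clustering on scalar divergences, assuming the distance between two geminate pairs is the absolute difference of their average divergences. The function name `upgma_groups` is hypothetical, the code is not the software the authors used, and the six input values are the ML divergences of two pairs each from clusters A, B, and C in Table I.

```python
def upgma_groups(divergences, n_groups):
    """Average-linkage (UPGMA) clustering of scalar divergence values.

    Returns a sorted list of groups; each group is a sorted list of
    indices into `divergences`.
    """
    if not 1 <= n_groups <= len(divergences):
        raise ValueError("n_groups must be between 1 and the number of values")
    clusters = [[i] for i in range(len(divergences))]

    def average_distance(a, b):
        # UPGMA distance: mean of all pairwise distances between members
        total = sum(abs(divergences[i] - divergences[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_groups:
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                d = average_distance(clusters[x], clusters[y])
                if best is None or d < best[0]:
                    best = (d, x, y)
        _, x, y = best
        merged = sorted(clusters[x] + clusters[y])
        # Drop the two merged clusters and append their union
        clusters = [c for k, c in enumerate(clusters) if k not in (x, y)]
        clusters.append(merged)
    return sorted(clusters)


# ML divergences for six geminate pairs taken from Table I
# (two each labeled A, B, and C there).
ml_divergence = [0.0035, 0.0082, 0.1338, 0.1268, 0.5591, 0.6293]
print(upgma_groups(ml_divergence, 3))
```

For these six values the three groups returned correspond to the A, B, and C labels in Table I. How many groups to keep is a separate decision, which the chapter bases on Kruskal-Wallis tests of homogeneity within and between clusters.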
However, the cophenetic correlation of the cluster analysis was 0.94, suggesting that the data lend themselves to hierarchical clustering. Four clusters and one singleton were found. These are marked in Table I as clusters A, B, C, D, or E (the singleton). Each of these clusters is homogeneous within groups when tested using the Kruskal-Wallis test (A: p = 0.238; B: p = 0.178; C: p = 0.439; D: p = 1.000; E: NA). When clusters A and B or B and C are combined, they form significantly heterogeneous groups. The heterogeneity in levels of divergence just described may be due to a number of processes. One explanation would simply recognize that "geminate" pairs exhibit a considerable range of divergence dates. However, it is also well accepted that nucleotide base composition differences among taxa can have a profound effect both on estimates of divergence and on phylogenetic reconstructions (e.g., Saccone et al., 1989; Sidow and Wilson, 1990; Lockhart et al., 1992, 1994; Hasegawa and Hashimoto, 1993; Steel et al., 1993; Perna and Kocher, 1995b). To provide some insight into nucleotide compositional effects on the patterns of divergences found in the geminate fish data set, the levels of divergence at synonymous sites (Ks; Li et al., 1985) were compared to differences in the nucleotide composition for each geminate pair separately. Differences in nucleotide composition were determined in three ways: the Euclidean distance of the average nucleotide frequency between each geminate pair (termed nucleotide distance), the difference in GC and AT skew (Perna and Kocher, 1995a) between each geminate pair (termed ΔGC and ΔAT), and the difference in nucleotide bias (Irwin et al., 1991) between each geminate pair (termed ΔBias). All estimates of nucleotide frequency, skew, and bias were based on the nucleotide composition at fourfold degenerate sites. This fish geminate data set permits 19 independent comparisons for each test.
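The composition measures named above can be sketched in a few lines. The chapter cites Perna and Kocher (1995a) and Irwin et al. (1991) for these statistics but does not restate the formulas, so the forms used here are assumptions: skew as (G - C)/(G + C) and (A - T)/(A + T), and bias as (2/3) times the summed absolute deviation of the four base frequencies from 0.25. All function names and the two short strings of "fourfold sites" are hypothetical.

```python
from math import sqrt

BASES = "ACGT"


def base_frequencies(sites):
    """Frequencies of A, C, G, T in a string of (fourfold degenerate) sites."""
    upper = sites.upper()
    counts = {b: upper.count(b) for b in BASES}
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no A, C, G, or T characters found")
    return {b: counts[b] / total for b in BASES}


def _skew(x, y):
    if x + y == 0:
        raise ValueError("skew is undefined when both bases are absent")
    return (x - y) / (x + y)


def gc_skew(freq):
    # Assumed form: (G - C) / (G + C)
    return _skew(freq["G"], freq["C"])


def at_skew(freq):
    # Assumed form: (A - T) / (A + T)
    return _skew(freq["A"], freq["T"])


def nucleotide_bias(freq):
    # Assumed form: (2/3) * sum of |f_i - 0.25|; 0 = even, 1 = one base only
    return (2.0 / 3.0) * sum(abs(freq[b] - 0.25) for b in BASES)


def nucleotide_distance(freq_1, freq_2):
    # Euclidean distance between two base-frequency vectors
    return sqrt(sum((freq_1[b] - freq_2[b]) ** 2 for b in BASES))


# Hypothetical fourfold sites for the Atlantic and Pacific member of a pair
atlantic = base_frequencies("AAAACCCTTG")
pacific = base_frequencies("AAACCCCTTG")

delta_gc = abs(gc_skew(atlantic) - gc_skew(pacific))
delta_at = abs(at_skew(atlantic) - at_skew(pacific))
delta_bias = abs(nucleotide_bias(atlantic) - nucleotide_bias(pacific))
print(delta_gc, delta_at, delta_bias, nucleotide_distance(atlantic, pacific))
```

Taking absolute differences for the Δ values is a further assumption; one such set of four numbers per geminate pair would then be rank-correlated with the divergence estimates.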
Significant, although low, correlations are found between levels of divergence at synonymous sites and the nucleotide distance and the A skews (Table II). These associations suggest that nucleotide composition differences may indeed have an effect on estimating levels
TABLE II Matrix of Spearman's Rank Correlation between Euclidean Distance in Nucleotide Frequencies (dn), Difference in GC Skew (ΔGC), Difference in AT Skew (ΔAT), Difference in Nucleotide Bias (ΔBias), the Average Estimate of Divergence at All Sites (K2), the Average Divergence at Synonymous Sites Only (Ks), and the Average Divergence Estimated Using the LogDet Paralinear Distance (LogDet)
          dn      ΔGC     ΔAT     ΔBias   K2      Ks      LogDet
dn        1.000
ΔGC       0.773   1.000
ΔAT       0.813   0.511   1.000
ΔBias     0.509   0.460   0.278   1.000
K2        0.686   0.470   0.515   0.269   1.000
Ks        0.674   0.457   0.504   0.257   0.994   1.000
LogDet    0.769   0.608   0.589   0.457   0.926   0.933   1.000
of divergence in the geminate fish data. However, correlations were significant whether the model of divergence used assumed equal frequencies of the four nucleotides (e.g., Kimura's two parameter) or accounted for unequal nucleotide frequencies (LogDet; Lockhart et al., 1994; Swofford et al., 1996). If nucleotide composition effects were the primary determinant of variation in levels of divergence between geminate pairs, then it would be reasonable to expect that the correlation between divergence and nucleotide distance (or Δ skews) should approach zero when differences in nucleotide composition are removed by use of the LogDet metric. Clearly this is not the case, implicating some other process or processes underlying the heterogeneity in levels of divergence found for geminate fish. However, in some other teleost data sets (unpublished results), it does appear that compositional divergence may indeed significantly bias estimates of genetic divergence and phylogenetic relationships at deeper time intervals than those presented here.
Molecular clocks will remain an enigma until the mechanisms responsible for molecular evolution are fully revealed. The natural experiment initiated by the rise of the Isthmus of Panama provides an opportunity to study the process of molecular evolution across a large number of evolutionarily independent taxa, but it is an experiment that has not been fully utilized to date. Nonetheless, even imperfect or imperfectly manipulated experiments yield results that can be used cautiously. One such result from our research relevant to this book is an evolutionary rate estimate for mtDNA COI in fishes. Based on the taxa in group B (Table I), an estimate of roughly 1.2% sequence divergence per million years at all sites between recently separated fish taxa has been derived (3.3% at synonymous sites). Preliminary data on other mitochondrial regions show similar levels of divergence.
For example, estimates of roughly 1.3% sequence divergence per million years for the entire ND2 and ATPase6 genes at all sites (3.5 and 3.4%, respectively, at synonymous sites only) have been determined for these Pliocene geminates (E. Bermingham and S. S. McCafferty, manuscript in preparation). Used with caution, these estimates provide a framework for estimating divergence times for a fairly broad taxonomic range of teleost fishes encompassed by our studies of geminate marine fishes.
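How such rate estimates translate into divergence times can be sketched in a few lines of Python. The sketch assumes a strict molecular clock and treats the quoted rates as pairwise rates (percentage sequence divergence between two lineages per million years), so that time is simply divergence divided by rate. The function name and the example divergence value are hypothetical; only the COI rates come from the text.

```python
# Pairwise COI rates quoted in the text (% sequence divergence per million years).
COI_RATE_ALL_SITES = 1.2
COI_RATE_SYNONYMOUS = 3.3

def divergence_time_my(divergence_pct, rate_pct_per_my):
    """Estimated time since separation, in million years, under a strict clock.

    divergence_pct  : observed pairwise sequence divergence, in percent
    rate_pct_per_my : pairwise divergence rate, in percent per million years
    """
    if rate_pct_per_my <= 0:
        raise ValueError("rate must be positive")
    if divergence_pct < 0:
        raise ValueError("divergence cannot be negative")
    return divergence_pct / rate_pct_per_my

# Hypothetical example: two populations differing by 0.6% at all COI sites.
example_divergence = 0.6
print(divergence_time_my(example_divergence, COI_RATE_ALL_SITES), "million years")
```

Because the calibration rests on a single group of geminate pairs, any date obtained this way inherits the uncertainty in both the rate and the timing of the isthmian closure, which is why the estimates are best used with caution.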
III. Geographic Scaling: The Panama Isthmus and Caribbean Fish
This section introduces two themes that will be carried through the remainder of this chapter. First, conspecific populations, if differentiated, can provide historical information about a region (Rosen, 1978; Chernoff, 1982; Bermingham and Avise, 1986). Second, molecules, particularly mtDNA, are well suited for reconstructing the evolutionary relationships among conspecific populations. In other words, for species or species groups demonstrating little or no phylogenetically informative morphological variation, molecules can provide an α taxonomy (albeit that of the molecule being analyzed) that can be easily and immediately placed in a phylogenetic context. Thus, molecules provide an objective measure of the geographic scale over which phylogenetically informative differentiation is occurring.
Biogeographic studies of tropical fishes began with a regional assessment of dispersal, gene flow, and endemism in populations representing eight species of Caribbean reef fishes (Shulman and Bermingham, 1995). The reef fish species varied in two life history traits that may affect dispersal ability and thus population genetic structure: egg type (pelagic and nonpelagic) and length of planktonic (usually larval) life (Table III). Six populations for each species were sampled from widely separated locales in both the northern and the southern current tracks within the Caribbean. mtDNA restriction endonuclease analyses were used to estimate the degree of genetic differentiation among conspecific populations. For two genera (Abudefduf and Ophioblennius), the putative eastern Pacific geminate taxa (A. troschelii and O. steindachneri) were analyzed to provide a transisthmian mtDNA divergence measure against which to assess intra-Caribbean mtDNA distances. For each of the eight species of Caribbean reef fishes, the predominant mtDNA haplotype was widespread. Mean sequence divergence observed among conspecific Caribbean mtDNA haplotypes in each of the eight fish species was low, less than 0.7% for all but one species.
This level of divergence is roughly one order of magnitude less than mtDNA divergence between Caribbean/East Pacific sister taxa. Even relatively rare mtDNA haplotypes tended to be broadly distributed. Populations located in different major surface currents were no more differentiated from one another than populations occupying the same current track. These results suggest that there is considerable gene flow throughout the Caribbean and that current tracks in the Caribbean have not acted as barriers to gene flow through evolutionary time (Shulman and Bermingham, 1995). This study suggests that the low levels of population subdivision found, although of potential ecological importance, are relatively insignificant in an evolutionary context (Moritz, 1994; McMillan and Bermingham, 1996). Comparisons to sister taxa separated by the Central
TABLE III Life History Data for Eight Species of Coral Reef Fishes(a)
                                             Length of larval life in days(c)
Species                    Egg location(b)   Mean    Range      Reference                              p value(d)      Geographic structure(e)
Stegastes leucostictus     Benthic           20.1    19-21      Wellington and Victor (1989)           <0.001 (17.2)   Very strong
                                             28.5    27-30      Thresher and Brothers (1989)
Ophioblennius atlanticus   Benthic           28.6    28-29      E. Brothers (personal communication)   0.347 (0.3)     None
Abudefduf saxatilis        Benthic           18.2    17-20      Wellington and Victor (1989)           0.595 (0)       None
                                             27.2    25-29      Thresher and Brothers (1989)
  from drift algae                           33.9    30-55(f)   E. Brothers (personal communication)
Gnatholepis thompsoni      Benthic           81.5    59-122     E. Brothers (personal communication)   0.009 (8.2)     Very strong
Haemulon flavolineatum     Pelagic           15      13-20      McFarland et al. (1985)                0.455 (0.1)     None
Halichoeres bivittatus     Pelagic           24.1    22-26      Victor (1986)                          0.002 (7.9)     Strong
Holocentrus ascensionis    Pelagic           48.7    46-50      E. Brothers (personal communication)   0.561 (0)       None
Thalassoma bifasciatum     Pelagic           49.3    38-78      Victor (1986)                          0.920 (0)       None
(a) Adapted from Shulman and Bermingham (1995). (b) Type of egg (pelagic/nonpelagic). (c) Mean and range of larval life span taken from published sources and personal communications. (d) Probability of significance of between-population variation in genetic diversity (and percentage of variation between populations) from the AMOVA model (Excoffier et al., 1992). (e) Relative scale of the evidence for geographic structuring. (f) Collected from Sargassum in Florida.
American Isthmus allowed us to exclude the possibility that mtDNA similarity across conspecific populations in the Caribbean was due to reduced rates of mtDNA evolution. Mitochondrial DNA evolutionary rates consistent with those observed for transisthmian fish (E. Bermingham and S. S. McCafferty, manuscript in preparation) and other fish species (Martin et al., 1992; Meyer, 1993) indicated that the mtDNA haplotypes surveyed in each species probably coalesce in the Pleistocene. In comparison to conspecific populations of freshwater fish species (Lepomis spp.) broadly distributed across the southeastern United States (Bermingham and Avise, 1986), it is apparent that mtDNA lineage extinction in these coral reef fishes has been especially rapid. In the freshwater fishes, mtDNA lineages were presumably buffered against loss as a result of genetic isolation in discontinuous riverine habitats. It is apparent that an analogous process has not acted across discontinuous coral reef habitats.
The high rates of mtDNA lineage extinction in coral reef fishes probably resulted from two processes. First, the extremely low, and presumably highly stochastic, survival rate of pelagic fish larvae (Leis, 1991) suggested that female reef fish vary considerably with regard to the number of daughters that replace them. High variance in reproductive success among females has been theoretically shown to cause the rapid pruning of mtDNA trees leading to decreased times to coalescence (Avise et al., 1984). Second, Pleistocene reductions in sea level almost certainly led to smaller reef areas and presumably lower population sizes in most reef-associated organisms (Shulman and Bermingham, 1995). These processes working in concert have probably resulted in long-term effective population sizes that are considerably smaller than the present-day population sizes of many Caribbean reef fishes.
The study of Caribbean reef fishes also focused on the effects of two early life history attributes on the genetic architectures of coral reef fishes. Using AMOVA and its associated permutation testing routine (Excoffier et al., 1992), a statistically significant population subdivision was observed for three Caribbean fish species (Table III): Stegastes leucostictus (nonpelagic eggs; short planktonic life), Gnatholepis thompsoni (nonpelagic eggs; long planktonic life), and Halichoeres bivittatus (pelagic eggs; short planktonic life). However, the between-population variance accounted for only 8-17% of the total variance (the remaining 83-92% occurred within geographic populations). The results suggested that neither egg type nor length of larval life was a simple predictor of geographic structure in reef fish populations. In reconciling the apparent paradox of extensive gene flow among Caribbean reef fishes with their striking species richness, one might consider that present-day sea levels, currents, and patterns of gene flow are not representative of the past marine environments in which much of the species richness observed today developed (Vermeij, 1978).
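As a rough illustration of how a percentage of variation between populations, such as those reported in Table III, can be obtained, the following Python sketch partitions squared haplotype distances into among-population and within-population components for a single-level design, following the general scheme of Excoffier et al. (1992). The haplotype labels, the simple 0/1 distance function, and the sample composition are hypothetical, and the permutation test used to obtain p values is not included.

```python
def amova_one_level(populations, sq_dist):
    """Single-level AMOVA variance components.

    populations : list of lists, one haplotype label per sampled individual
    sq_dist     : function returning the squared distance between two haplotypes
    Returns (sigma2_among, sigma2_within, phi_st).
    """
    individuals = [h for pop in populations for h in pop]
    n_total = len(individuals)
    n_pops = len(populations)

    def ssd(group):
        # Sum of squared distances over unordered pairs, divided by group size.
        total = 0.0
        for i in range(len(group)):
            for j in range(i + 1, len(group)):
                total += sq_dist(group[i], group[j])
        return total / len(group)

    ssd_total = ssd(individuals)
    ssd_within = sum(ssd(pop) for pop in populations)
    ssd_among = ssd_total - ssd_within

    ms_within = ssd_within / (n_total - n_pops)
    ms_among = ssd_among / (n_pops - 1)

    # Weighted average sample size, which corrects for unequal population samples.
    n0 = (n_total - sum(len(p) ** 2 for p in populations) / n_total) / (n_pops - 1)

    sigma2_within = ms_within
    sigma2_among = (ms_among - ms_within) / n0
    phi_st = sigma2_among / (sigma2_among + sigma2_within)
    return sigma2_among, sigma2_within, phi_st

def zero_one(h1, h2):
    """Hypothetical squared distance: 0 for identical haplotypes, 1 otherwise."""
    return 0.0 if h1 == h2 else 1.0

# Hypothetical sample: two populations sharing both haplotypes at different frequencies.
sample = [["a", "a", "a", "b"], ["a", "b", "b", "b"]]
among, within, phi = amova_one_level(sample, zero_one)
print("percentage of variation among populations:", 100 * phi)
```

In practice the squared distances would come from restriction-site or sequence differences among haplotypes, and significance would be judged by recomputing the statistic after randomly reassigning individuals to populations.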
IV. Geographic Scaling: The Panama Isthmus and the Circumtropical Abudefduf (Teleostei: Pomacentridae) Species Groups
To continue our investigations of the geographic scale of genetic differentiation in tropical marine fishes, the molecular α taxonomy and phylogenetic relationships for two circumtropically distributed species groups in the damselfish genus Abudefduf have also been described (Lessios et al., 1995; E. Bermingham et al., manuscript in preparation). To the extent that marine species richness can be explained by orthodox models of allopatric speciation, the Caribbean results suggest that tropical fish diversification in the marine realm might punctuate periods of relative stasis. Building on the transisthmian and Caribbean studies, we hoped to investigate this possibility through historical biogeographic comparison of two circumtropically distributed species groups in the genus Abudefduf, the so-called A. saxatilis and the A. sordidus groups (Hensley, 1978). The A. saxatilis species group contains A. saxatilis (Caribbean Sea and Atlantic Ocean), A. troschelii (Eastern Pacific), A. abdominalis (Hawaii), and A. vaigiensis (Indo-West Pacific including the Red Sea). The A. sordidus group includes A. taurus (Caribbean Sea and Atlantic Ocean), A. concolor (eastern Pacific excluding the Gulf of California), A. declivifrons (Gulf of California), and A. sordidus (Hawaii and the Indo-West Pacific including the Red Sea). Both these species groups are characterized by a complete circumtropical distribution and each contains a transisthmian pair of species.
In a study of the transisthmian members of the A. sordidus group, it was shown that the recognized taxonomy (Thomson et al., 1979; Allen, 1991) of the A. sordidus group underestimated the diversity of the group by one species based on combined analysis using allozymes, sequence data from a fragment of the mitochondrial COI region, and morphological data (Lessios et al., 1995). Our investigation resurrected A. declivifrons (Gill, 1862), which was also identified by Hensley (1978). Three points need to be emphasized. First, molecules can reinforce careful morphological and meristic analyses (and vice versa), leading to a more robust α taxonomy. Second, molecular data naturally lead to a phylogenetic assessment of the sister group status of transisthmian species. Third, phylogenetic hypotheses relating geographic populations and closely related species are critical to historical biogeographic analyses of the earth's recent marine history.
Findings for the eastern Pacific and Caribbean members of the A. sordidus group have been augmented by results which elaborate the historical relationships within both the A. sordidus and the A. saxatilis groups (E. Bermingham et al., manuscript in preparation). The Abudefduf taxa investigated and their collection locations are presented in Table IV and Fig. 3. Analyses of partial mtDNA cytochrome b and COI nucleotide data (~1300 bases in total) have identified two additional phylogenetic (mtDNA) lineages in the A. saxatilis group. Here we specifically refer to mtDNA lineages and not species because the complementary allozyme and morphological analyses have not been undertaken (as was the case for the resurrection of A. declivifrons).
TABLE IV Geographic Location, Sample Size, and Clade Designation of the Abudefduf saxatilis Group and A. sordidus Group Samples
Species            Location                    n(a)   Clade(b)
A. saxatilis group
A. saxatilis       Ascension Island            4      A
A. saxatilis       Belize                      2      B
                   Bocas del Toro, Panama      2      B
                   Los Roques, Venezuela       4      B
                   Brazil                      2      B
A. abdominalis     Hawaii                      4      C
A. troschelii      Clarion Islands             2      D
                   Isla San Pedro, Mexico      2      D
                   Isla Cocos, Costa Rica      2      D
                   Panama                      2      D
                   Galapagos                   2      D
A. vaigiensis      Guam                        3      F
                   Kosrae                      2      F
                   Papua New Guinea            3      E
                   Eastern Australia           7      E/F
                   Solomon Islands             2      E
                   Western Australia           5      E
                   Taiwan                      1      E
                   Red Sea                     5      E
A. sordidus group
A. taurus          Puerto Rico                 2      G
                   Panama                      4      G
                   Los Roques, Venezuela       4      G
A. concolor        Panama                      3      H
                   Galapagos                   2      H
A. declivifrons    Mexico                      4      I
A. sordidus        Hawaii                      3      J
                   Kosrae                      2      J
                   Papua New Guinea            2      J
                   Eastern Australia           2      J
                   Western Australia           5      J
                   Taiwan                      1      J
                   Red Sea                     2      J
(a) Number of individuals sequenced for approximately 650 bp of the COI gene region of mtDNA using standard manual (Lessios et al., 1995) or automated (S. S. McCafferty and E. Bermingham, manuscript in preparation) methods. (b) Clade designation based on phylogenetic analysis of sequence data.
FIGURE 3 Location and clade designation of Abudefduf saxatilis and A. sordidus samples. The circles represent locations where members of the A. saxatilis group were collected; the triangles represent locations where A. sordidus samples were collected. The letters correspond to the clade designations in Table IV.
Genetic analyses provide compelling evidence that A. vaigiensis includes two highly divergent mtDNA clades (Fig. 4). A derived clade was found in the western Pacific and Indian Ocean ranging from the Great Barrier Reef to the Red Sea. The more ancestral mtDNA clade was observed only in the western Pacific (Coral Sea, Guam, and Kosrae). In addition, an isolated population of A. saxatilis collected on Ascension Island in the mid-Atlantic displays a level of sequence divergence when compared to all other populations of A. saxatilis which is similar to that observed between the transisthmian species pair (A. troschelii and A. saxatilis). Finally, the distinctiveness of the Hawaiian species (A. abdominalis, n = 14), the geographic extent of the eastern Pacific species (A. troschelii, n = 35, Galapagos to Gulf of California), and the Caribbean distribution of A. saxatilis (n = 67, coastal Brazil to Bermuda; Shulman and Bermingham, 1995) are fully supported by both mtDNA sequence and restriction fragment length polymorphism (RFLP) analyses. In the A. sordidus group, mtDNA phylogenetic lineages corresponded closely to currently recognized taxa, including A. declivifrons (Fig. 5). Both mtDNA RFLP and sequence analysis confirm the widespread geographic distribution of A. sordidus sensu stricto from Hawaii to the Red Sea.
The mtDNA results permit the species and geographic populations in both the A. sordidus and the A. saxatilis groups to be placed in a phylogenetic context. In turn, these phylogenetic hypotheses can be utilized in historical biogeographic analysis. Visual inspection of Figs. 4 and 5 presents no compelling case for a common historical basis for the pattern of diversification seen in the A. sordidus and A. saxatilis groups. Both the branching orders and the branching times (under the assumption of a molecular clock) lack congruence except that an Indo-West Pacific mtDNA clade is ancestral in both species groups. If a common mtDNA molecular clock exists for these taxa, the estimated dates of divergence of the two geminate pairs (A. troschelii and A. saxatilis versus A. concolor and A. taurus) are quite different (however, see previous section on geminate comparisons). The relatively reduced mtDNA divergence observed for the concolor/taurus transisthmian pair is paralleled by a reduced allozyme distance in this pair as well (Vawter et al., 1980; H. A. Lessios and E. Bermingham, manuscript in preparation). Two marine species groups can do relatively little to elucidate the historical processes which underlie the diversification of tropical marine fishes across the earth's vast tropical seascape. Yet 15 years after the publication of Victor Springer's (1982) provocative and informative
FIGURE 4 Neighbor-joining tree summarizing the relationships among taxa in the Abudefduf saxatilis group. Pairwise divergences were estimated using the two parameter model of Kimura (1980) for 570 bp of the COI mtDNA region. The pomacentrid Amblyglyphidodon curacao was used as an outgroup for rooting the tree. The triangles summarizing the terminal nodes within each clade have a depth proportional to the longest branch length and a height proportional to the number of OTUs found in that clade. Branch lengths plus or minus their standard deviations are shown above the branches. The probability that the branch length was greater than zero is presented below the line. Probabilities were calculated according to Rzhetsky and Nei (1994a) using the program METREE (Rzhetsky and Nei, 1994b). Equivalent topologies were found using weighted and unweighted parsimony analysis, minimum evolution, and maximum likelihood, and regardless of whether estimates of divergence were corrected for unequal frequencies of nucleotides (Tamura and Nei, 1993; Lockhart et al., 1994), corrected for unequal rates of evolution among sites (Tamura and Nei, 1993), or based on a simple two parameter model (Kimura, 1980).
monograph on Pacific Plate biogeography in which he relied mostly on "inferences and intuitive assessments of relationships" of shorefishes, our studies of Abudefduf represent one of the few phylogenetic assessments of species relationships in a genus with a Pacific Plate and/or circumtropical distribution. Meeting Springer's challenge "to correct, elaborate, falsify or corroborate" his study of pattern and process in tropical shorefish speciation requires that phylogenetic hypotheses be accumulated for many more species groups. Our investigations of Abudefduf relationships suggest that mtDNA-based phylogenies provide one useful means to that end.
FIGURE 5 Neighbor-joining tree summarizing the relationship among the various taxa in the Abudefduf sordidus group. Refer to the legend for Fig. 4 for details.
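The two parameter model of Kimura (1980) cited in the legends for Figs. 4 and 5 corrects an observed proportion of differences for multiple substitutions while allowing transitions and transversions to occur at different rates. A minimal Python sketch of the pairwise calculation is given below, using the standard form d = -(1/2) ln(1 - 2P - Q) - (1/4) ln(1 - 2Q), where P and Q are the proportions of sites differing by a transition and by a transversion. The function name and the handling of gaps and ambiguous bases (such sites are simply skipped) are assumptions made for illustration, and the standard deviations shown on the trees are not computed.

```python
from math import log

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def k2p_distance(seq1, seq2):
    """Kimura two-parameter distance between two aligned DNA sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")

    transitions = 0
    transversions = 0
    compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        # Skip gaps and ambiguous bases (an assumption of this sketch).
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1

    if compared == 0:
        raise ValueError("no comparable sites")

    p = transitions / compared    # proportion of transitional differences
    q = transversions / compared  # proportion of transversional differences
    term1 = 1 - 2 * p - q
    term2 = 1 - 2 * q
    if term1 <= 0 or term2 <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * log(term1) - 0.25 * log(term2)

# Hypothetical 10-bp alignment with a single transition (A <-> G) at the first site.
print(k2p_distance("ACGTACGTAC", "GCGTACGTAC"))
```

A matrix of such pairwise distances is the input to neighbor-joining; tree construction itself is outside the scope of this sketch.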
V. Geographic Scaling: The Panama Isthmus and Neotropical Freshwater Fishes
Generally speaking, it is also the case that phylogenetic analyses of geographic populations and closely related species of tropical freshwater fishes are unavailable. Therefore, we have virtually no knowledge of the geographic scale of genetic differentiation across tropical freshwater organisms. Although our studies of Panamanian freshwater fish are preliminary, they have revealed a considerable degree of phylogeographic structuring in species exhibiting distributions that span large distances across physically isolated drainages. Levels of mtDNA differentiation between populations of some neotropical species are typical of interspecific or even intergeneric differences among temperate fish species. Such marked genetic divergence among populations of neotropical fish suggests that high levels of genetic divergence between populations may be a general feature of tropical species. Figure 6 shows a phylogenetic reconstruction of 80 Roeboides (Characiformes) individuals representing
FIGURE 6 Neighbor-joining tree summarizing the relationship among 80 individuals of Roeboides collected from Costa Rica, Panama, and South America. Pairwise divergences were estimated using the two parameter model of Kimura (1980) for 852 bp spanning the entire mtDNA ATPase6 and 8 genes. R. affinis individuals from Peru were used as outgroup sequences in rooting the tree (not included in figure). Refer to the legend for Fig. 4 for tree details.
five nominal species collected widely from Venezuela to northern Costa Rica (refer to the map in Fig. 7). Several points can be made using the Roeboides phylogeny that are generally valid for most of the neotropical freshwater fish species that have been studied. First, reciprocal monophyly of mtDNA lineages is observed between different regional assemblages of Roeboides. Second, although mtDNA clades were easily identified and typically separated from their geographic or genetic neighbors by roughly 2.5% sequence divergence, phylogenetic resolution was weak. This can be taken as evidence of rapid colonization of the emerging Panama landscape, followed by in situ diversification of mtDNA lineages. Third, the group identified as Costa Rica, North, and Atlantic includes the species R. ilsae and R. guatemalensis. The group labeled Mandinga, Tuira, Pacora contains representatives of R. guatemalensis and R.
occidentalis. Thus, when viewed from an mtDNA perspective, the named Roeboides species are both poly- and paraphyletic. Fourth, the geographical distribution of mtDNA variation often shows a poor congruence with the ranges of named species, and it has been concluded that taxonomic distinction provides poor estimates of genetic divergence in lower Central American fish. Finally, to the degree that mtDNA in fish evolves in an approximately clock-like manner, it appears that not all taxa within a group are historically and evolutionarily equivalent. This finding has important consequences for the study of biogeographic patterns, and it has been pointed out elsewhere that cross-taxa biogeographic analyses would benefit from having phylogenetic branch points sorted by age (Bermingham et al., 1992).
Our results also provide examples of recent expansions of specific lineages within the ranges of old established or extinct populations and suggest a pattern of alternation between geographical expansion and quiescence. Although genetic analyses of additional fish species are required to assess the generality of this pattern, data to date suggest that different populations of the same species may be in different phases of colonizing activity at a specific time. For example, in the siluriform genus Pimelodella, derived mtDNA haplotypes have a limited geographic distribution in eastern Pacific Panama, in contrast to the more widespread distribution of ancestral haplotypes in the rest of Panama. If founder effects resulting in reduced within-population genotypic variability accompany colonization, the pattern of haplotypic diversity suggests that the spread of Pimelodella within Panama occurred in two waves: an older one that extended to central Panama and a more recent progression, probably from the Atrato through the Bayano.
Because the Tuira is the largest of Panama's rivers, its reduced mtDNA variability may be explained most readily by a founder effect resulting from recent colonization instead of a postfounding bottleneck. Prominent barriers to dispersal in lower Central America are evidenced by congruence between phylogeographic patterns across species. For example, the Sona Peninsula in central Pacific Panama marks either the end of a distribution (Hypopomus) or a genetic disjunction (Pimelodella: A. P. Martin and E. Bermingham, manuscript in preparation; Roeboides: E. Bermingham and A. P. Martin, manuscript in preparation; Aequidens: S. S. McCafferty et al., manuscript in preparation) in the species examined to date. The cause of this phylogenetic break may relate to a transverse range of hills that bisect Panama in this region. Early results somewhat belie the depauperate condition of the middle American primary freshwater fish
FIGURE 7 Location of major drainages in Costa Rica, Panama, and Colombia where samples of Roeboides were collected.
fauna which led Myers (1966) to state that he could "see no escape from the conclusion that Central America possessed no obligatory freshwater ostariophysans until the Pliocene or even the Pleistocene, since which time the most aggressive and ubiquitous of all characoid genera (Astyanax) has, in a geological sense, raced northward to the Rio Grande, trailed a little more slowly by Hyphessobrycon, Brycon, Roeboides, Gymnotus, and a few others." Furthermore, the significant mtDNA divergence and reciprocal monophyly of fish mtDNA lineages among lower Central American drainage basins foreshadow an analysis of the pattern and rate of freshwater fish exchange that took place before and following the Pliocene completion of the Panamanian isthmus.
VI. Concluding Remarks
Surveying large numbers of individuals across moderate numbers of species with overlapping distributions should be a goal in evolutionary biology for both theoretical and applied reasons. On the theoretical side, species richness may be influenced more strongly by extrinsic biogeographical relationships and historical circumstances than by such intrinsic, local processes as competition and predation (Ricklefs, 1987; Cornell and Lawton, 1992; Ricklefs and Schluter, 1993). The sheer magnitude of systematic description required in the tropics indicates a pervasive role for molecular systematics if we are to determine the dependence of local richness on regional species richness in tropical ecosystems. On the practical side, molecular genetic analyses can provide a reasonably rapid means for surveying regional biotic diversity. Indices of species richness, sometimes taking into account abundance, have been the traditional measures of diversity. When used to make decisions regarding the preservation of biodiversity, however, it has been argued that these indices fail because they consider all species to be equal or nearly equal. Erwin (1991), Vane-Wright et al. (1991), and others (Crozier, 1992; Faith, 1992; Weitzman, 1992; Solow et al., 1993; reviewed by Krajewski, 1994) have suggested that phylogenetic history and/or genetic diversity should be used in biodiversity indices to emphasize the phylogenetic and genetic distinctiveness of some groups compared to others. To the degree this view is adopted by conservation biologists, molecular systematics will undoubtedly be called upon to provide measures of taxonomic distinctiveness. The resulting taxic diversity measures, when coupled to detailed knowledge of organismal distribution patterns, can be used to identify priority areas for conservation (Vane-Wright et al., 1991).
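One index of this kind, the phylogenetic diversity of Faith (1992), is commonly computed as the summed length of the branches that connect a set of taxa on a tree. The Python sketch below illustrates a rooted variant in which each taxon is traced back to the root and every branch is counted once. The tree, its branch lengths, and the function name are hypothetical and serve only to show why two sets with the same number of taxa can differ in the diversity they represent.

```python
def phylogenetic_diversity(parent, branch_length, taxa):
    """Summed length of all branches on the paths from the given taxa to the root.

    parent        : dict mapping each non-root node to its parent node
    branch_length : dict mapping each non-root node to the length of the branch above it
    taxa          : iterable of tip names to include
    """
    visited = set()
    total = 0.0
    for taxon in taxa:
        node = taxon
        # Walk toward the root, stopping once a branch has already been counted.
        while node in parent and node not in visited:
            visited.add(node)
            total += branch_length[node]
            node = parent[node]
    return total

# Hypothetical rooted tree: tips A and B are close relatives, tip C is distinctive.
parent = {"A": "X", "B": "X", "X": "root", "C": "root"}
branch_length = {"A": 0.5, "B": 0.5, "X": 1.0, "C": 3.0}

print(phylogenetic_diversity(parent, branch_length, ["A", "B"]))
print(phylogenetic_diversity(parent, branch_length, ["A", "C"]))
```

Under such a measure, an area holding a closely related pair of taxa contributes less than an area holding two distantly related taxa, which is the sense in which species are not treated as equal.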
Acknowledgments
The research reported in this chapter results from collaborations initiated by EB, SM, and Haris Lessios on marine fishes and AM and EB on freshwater fishes. We gratefully acknowledge the financial support of the Smithsonian Institution (Tupper Postdoctoral Fellowship to AM and the STRI Molecular Systematics program), the National Geographic Society, and NSF (BSR-8607403 to Myra Shulman and EB). We thank the following for granting scientific collecting/research permits: INRENARE, Panama; The Comarcas of the Kuna, Ngobe, Embera, and Waunaan; Recursos Marinos, Panama; Ministerio de Recursos Naturales, Energia, y Minas, Costa Rica; and the Museo Nacional de Colombia. Most of all, we owe a very heartfelt thanks to the following people for extensive help in the field and laboratory: Heidi Banford, Bill Bussing, German Galvis, Luifer Garcia, Nimiadina Gomez, Myra Shulman, Ross Robertson, and Gustavo Ybazeta.
References
Allen, G. R. 1991. "Damselfishes of the World." Mergus Publishers, Germany.
Avise, J. C. 1994. "Molecular Markers, Natural History and Evolution." Chapman and Hall, New York.
Avise, J. C., Neigel, J. E., and Arnold, J. 1984. Demographic influences on mitochondrial DNA lineage survivorship in animal populations. J. Mol. Evol. 20:99-105.
Avise, J. C., Bowen, B. W., Lamb, T., Meylan, A. B., and Bermingham, E. 1992. Mitochondrial DNA evolution at a turtle's pace: Evidence for low genetic variability and reduced microevolutionary rate in the Testudines. Mol. Biol. Evol. 9:457-473.
Avise, J. C., Arnold, J., Ball, R. M., Bermingham, E., Lamb, T., Neigel, J. E., Reeb, C. A., and Saunders, N. C. 1987. Intraspecific phylogeography: The mitochondrial DNA bridge between population genetics and systematics. Annu. Rev. Ecol. Syst. 18:489-522.
Bermingham, E., and Avise, J. C. 1986. Molecular zoogeography of freshwater fishes in the southeastern United States. Genetics 113:939-965.
Bermingham, E., and Lessios, H. 1993. Rate variation of protein and mtDNA evolution as revealed by sea urchins separated by the Isthmus of Panama. Proc. Natl. Acad. Sci. USA 90:2734-2738.
Bermingham, E., Rohwer, S., Wood, C., and Freeman, S. 1992. Vicariance biogeography in the Pleistocene and speciation in North American wood warblers: A test of Mengel's model. Proc. Natl. Acad. Sci. USA 89:6624-6628.
Bermingham, E., Seutin, G., and Ricklefs, R. E. 1996. Regional approaches to conservation biology: RFLPs, DNA sequences, and Caribbean birds. In "Molecular Genetic Approaches to Conservation Biology" (T. Smith and R. Wayne, eds.), pp. 104-124. Oxford University Press, London.
Beverly, S. M., and Wilson, A. C. 1985. Ancient origin for Hawaiian Drosophilinae inferred from protein comparisons. Proc. Natl. Acad. Sci. USA 82:4753-4757.
Brawn, J. D., Collins, T. M., Medina, M., and Bermingham, E. 1996. Associations between physical isolation and geographical variation within three species of Neotropical birds. Mol. Ecol. 5:33-46.
Britten, R. J. 1986. Rates of DNA sequence evolution differ between taxonomic groups. Science 231:1393-1398.
Bussing, W. A. 1976. Geographic distribution of the San Juan ichthyofauna of Central America with remarks on its origin and ecology. In "Investigations of the Ichthyofauna of Nicaraguan Lakes" (T. B. Thorson, ed.), pp. 157-175. University of Nebraska, Lincoln, NE.
Bussing, W. A. 1985. Patterns of distribution of the Central American ichthyofauna. In "The Great American Biotic Interchange" (F. G. Stehli and S. D. Webb, eds.), pp. 453-473. Plenum, New York.
Capparella, A. P. 1991. Neotropical avian diversity and riverine barriers. In "Acta XX Congressus Internationalis Ornithologici," pp. 307-316. Washington, D.C.
Chernoff, B. 1982. Character variation among populations and the analysis of biogeography. Am. Zool. 22:425-439.
Coates, A. G., Jackson, J. B. C., Collins, L. S., Cronin, T. M., Dowset, H. J., Bybell, L. M., Jung, P., and Obando, J. A. 1992. Closure of the Isthmus of Panama: The near-shore marine record of Costa Rica and western Panama. Bull. Geol. Soc. Am. 104:814-828.
Coates, A. G., and Obando, J. A. 1996. The geologic evolution of the Central American isthmus. In "Evolution and Environment in Tropical America" (J. Jackson, A. F. Budd, and A. G. Coates, eds.), pp. 21-56. The University of Chicago Press, Chicago, IL.
Cornell, H. V., and Lawton, J. H. 1992. Species interactions, local and regional processes, and limits to the richness of ecological communities: A theoretical perspective. J. Anim. Ecol. 61:1-12.
Crozier, R. H. 1992. Genetic diversity and the agony of choice. Biol. Conserv. 61:11-15.
Darlington, P. J. 1957. "Zoogeography: The Geographical Distribution of Animals." Wiley, New York.
Darlington, P. J. 1964. Drifting continents and Late Paleozoic geography. Proc. Natl. Acad. Sci. USA 52:1084-1091.
daSilva, M., and Patton, J. 1993. Amazonian phylogeography: mtDNA sequence variation in arboreal echimyid rodents (Caviomorpha). Mol. Phylogenet. Evol. 2:243-255.
Duque-Caro, H. 1990. Neogene stratigraphy, paleoceanography and paleogeography in northwest South America and the evolution of the Panama Seaway. Palaeogeogr. Palaeoclimatol. Palaeoecol. 77:203-234.
Erwin, T. L. 1991. An evolutionary basis for conservation strategies. Science 253:758-761.
Escalante-Pliego, B. P. 1991. Genetic differentiation in yellowthroats (Parulinae: Geothlypis). In "Acta XX Congressus Internationalis Ornithologici," pp. 333-343. Washington, D.C.
Excoffier, L., Smouse, P. E., and Quattro, J. M. 1992. Analysis of molecular variance inferred from metric distances among DNA haplotypes: Application to human mitochondrial DNA restriction data. Genetics 131:479-491.
Faith, D. P. 1992. Conservation evaluation and phylogenetic diversity. Biol. Conserv. 61:1-10.
Felsenstein, J. 1993. "Phylogeny Inference Package (PHYLIP) 3.5 edition." University of Washington, Seattle, WA.
Gaut, B. S., Muse, S. V., Clark, W. D., and Clegg, M. T. 1992. Relative rates of nucleotide substitution at the rbcL locus of monocotyledonous plants. J. Mol. Evol. 35:292-303.
Gill, T. N. 1862. Catalogue of fishes of lower California in the Smithsonian Institution, collected by Mr. J. Xantus. Proc. Acad. Nat. Sci. Philadelphia 14:140-151.
Grande, L. 1985. The use of paleontology in systematics and biogeography, and a time control refinement for historical biogeography. Paleobiology 11:234-243.
Hackett, S. J., and Rosenberg, K. V. 1990. Comparison of phenotypic and genetic differentiation in South American antwrens (Formicariidae). Auk 107:473-489.
Hasegawa, M., and Hashimoto, T. 1993. Ribosomal RNA trees misleading? Nature 361:23.
Hensley, D. A. 1978. "Revision of the Indo-West Pacific Species Abudefduf (Pisces: Pomacentridae)." Unpublished Ph.D. dissertation, University of South Florida, Tampa, FL.
Hillis, D. M., Mable, B. K., and Moritz, C. 1996. Applications of molecular systematics: The state of the field and a look to the future. In "Molecular Systematics" (D. M. Hillis, C. Moritz, and B. K. Mable, eds.), 2nd Ed., pp. 515-543. Sinauer, Sunderland, MA.
Irwin, D. M., Kocher, T. D., and Wilson, A. C. 1991. Evolution of the cytochrome b gene of mammals. J. Mol. Evol. 32:128-144.
Jordan, D. S. 1908. The law of geminate species. Am. Nat. XLII(494):73-80.
Joseph, L., Moritz, C., and Hugall, A. 1995. Molecular support for vicariance as a source of diversity in rainforest. Proc. R. Soc. Lond. B 260:177-182.
Keigwin, L. D. 1978. Pliocene closing of the Isthmus of Panama based on biostratigraphic evidence from nearby Pacific Ocean and Caribbean Sea cores. Geology 6:630-634.
Keigwin, L. D. 1982. Isotopic paleoceanography of the Caribbean and east Pacific: Role of Panama uplift in late Neogene time. Science 217:350-353.
Kimura, M. 1980. A simple method for estimating evolutionary rate of base substitutions through comparative studies of nucleotide sequences. J. Mol. Evol. 16:111-120.
Kimura, M. 1983. "The Neutral Theory of Molecular Evolution." Cambridge University Press, Cambridge, England.
Knowlton, N., Weigt, L. A., Solórzano, L. A., Mills, D. K., and Bermingham, E. 1993. Divergence in proteins, mitochondrial DNA, and reproductive compatibility across the Isthmus of Panama. Science 260:1629-1632.
Krajewski, C. 1994. Phylogenetic measures of biodiversity: A comparison and critique. Biol. Conserv. 69:33-39.
Leis, J. M. 1991. The pelagic stage of reef fishes. In "The Ecology of Fishes on Coral Reefs" (P. F. Sale, ed.), pp. 183-230. Academic Press, San Diego.
Lessios, H. A. 1979. Use of Panamanian sea urchins to test the molecular clock. Nature 280:599-601.
Lessios, H. A. 1981. Divergence in allopatry: Molecular and morphological differentiation between sea urchins separated by the Isthmus of Panama. Evolution 35:618-634.
Lessios, H. A., Allen, G. R., Wellington, G. M., and Bermingham, E. 1995. Genetic and morphological evidence that the Eastern Pacific damselfish Abudefduf declivifrons is distinct from A. concolor (Pomacentridae). Copeia 1995(2):277-288.
Li, W.-H., Tanimura, M., and Sharp, P. M. 1987. An evaluation of the molecular clock hypothesis using mammalian DNA sequences. J. Mol. Evol. 25:330-342.
Li, W.-H., Wu, C.-I., and Luo, C.-C. 1985. A new method for estimating synonymous and nonsynonymous rates of nucleotide substitution considering the relative likelihood of nucleotide and codon changes. Mol. Biol. Evol. 2:150-174.
Lockhart, P. J., Howe, C. J., Bryant, D. A., Beanland, T. J., and Larkum, A. W. D. 1992. Substitutional bias confounds inference of cyanelle origins from sequence data. J. Mol. Evol. 34:153-162.
Lockhart, P. J., Steel, M. A., Hendy, M. D., and Penny, D. 1994. Recovering evolutionary trees under a more realistic model of sequence evolution. Mol. Biol. Evol. 11:605-612.
Lundberg, J. G. 1993. African-South American freshwater fish clades and continental drift: Problems with a paradigm. In "Biotic Relationships between Africa and South America" (P. Goldblatt, ed.), pp. 156-198. Yale University Press, New Haven, CT.
Lundelius, E. L. 1987. The North American quaternary sequence. In "Cenozoic Mammals of North America" (M. O. Woodburne, ed.), pp. 211-235. Univ. Calif. Press, Los Angeles, CA.
Marshall, L. G. 1988. Land mammals and the great American interchange. Am. Sci. 76:380-388.
Martin, A. P. 1995. Metabolic rate and directional nucleotide substitution in animal mitochondrial DNA. Mol. Biol. Evol. 12(6):1124-1131.
Martin, A. P., Naylor, G. J. P., and Palumbi, S. R. 1992. Rates of mitochondrial DNA evolution in sharks are slow compared with mammals. Nature 357:153-155.
Martin, A. P., and Palumbi, S. R. 1993. Body size, metabolic rate, generation time, and the molecular clock. Proc. Natl. Acad. Sci. USA 90:4087-4091.
Mayr, E. 1963. "Animal Species and Evolution." Belknap Press, Cambridge, MA.
McFarland, W. N., Brothers, E. B., Ogden, J. C., Shulman, M. J., and Bermingham, E. L. 1985. Recruitment patterns in young French grunts Haemulon flavolineatum (family Haemulidae) at St. Croix, U.S.V.I. Fish. Bull. 83:413-426.
McMillan, W. O., and Bermingham, E. 1996. The phylogeographic pattern of mitochondrial DNA variation in the Dall's porpoise Phocoenoides dalli. Mol. Ecol. 5:47-61.
Meyer, A. 1993. Evolution of mitochondrial DNA of fishes. In "The Biochemistry and Molecular Biology of Fishes" (P. W. Hochachka and P. Mommsen, eds.), pp. 1-38. Elsevier, Amsterdam.
Miller, R. R. 1966. Geographical distribution of freshwater fish fauna of Central America. Copeia 1966(4):773-802.
Moritz, C. 1994. Defining evolutionary significant units for conservation. Trends Ecol. Evol. 9:373-375.
Muse, S. V., and Weir, B. S. 1992. Testing for equality of evolutionary rates. Genetics 132:269-276.
Myers, G. S. 1966. Derivation of the freshwater fish fauna of Central America. Copeia 1966(4):766-773.
Page, R. D. M. 1991. Clocks, clades, cospeciation: Comparing rates of evolution and timing of cospeciation events in host-parasite assemblages. Syst. Zool. 40:188-198.
Page, R. D. M. 1993. Genes, organisms, and areas: The problem of multiple lineages. Syst. Biol. 42(1):77-84.
Patterson, C. 1975. The distribution of Mesozoic freshwater fishes. Mem. Mus. Natl. Hist. Nat. Ser. Paris. A Zool. 88.
Patton, J., and Smith, M. F. 1992. mtDNA phylogeny of Andean mice: A test of diversification across ecological gradients. Evolution 46:174-183.
Perna, N. T., and Kocher, T. D. 1995a. Patterns of nucleotide composition at fourfold degenerate sites of animal mitochondrial genomes. J. Mol. Evol. 41:353-358.
Perna, N. T., and Kocher, T. D. 1995b. Unequal base frequencies and the estimation of substitution rates. Mol. Biol. Evol. 12(2):359-361.
Peterson, A. T., Escalante, P., and Navarro, A. 1992. Genetic variation and differentiation in Mexican populations of common bush-tanagers and chestnut-capped brush-finches. Condor 94:244-253.
Ricklefs, R. E. 1987. Community diversity: Relative roles of local and regional processes. Science 235:167-171.
Ricklefs, R. E., and Schluter, D. 1993. Species diversity: An introduction to the problem. In "Species Diversity in Ecological Communities: Historical and Geological Perspectives" (R. E. Ricklefs and D. Schluter, eds.), pp. 1-10. University of Chicago Press, Chicago, IL.
Rohlf, F. J. 1993. "NTSYS-pc: Numerical Taxonomy and Multivariate Analysis System." Exeter Software, Applied Biostatistics, Setauket, New York.
Rosen, D. E. 1978. Vicariant patterns and historical explanation in biogeography. Syst. Zool. 27:158-188.
Rubinoff, I., and Leigh, E. G. 1990. Dealing with diversity: The Smithsonian Tropical Research Institute and tropical biology. Trends Ecol. Evol. 5(4):115-118.
Rzhetsky, A., and Nei, M. 1994a. A simple method for estimating and testing minimum-evolution trees. Mol. Biol. Evol. 9(5):945-967.
Rzhetsky, A., and Nei, M. 1994b. METREE: A program package for inferring and testing minimum-evolution trees. CABIOS 10(4):409-412.
Saccone, C., Pesole, G., and Preparata, G. 1989. DNA microenvironments and the molecular clock. J. Mol. Evol. 29:407-411.
Sarich, V. M., and Wilson, A. C. 1967. Immunological time scale for hominid evolution. Science 158:1200-1203.
Seutin, G., Brawn, J., Ricklefs, R. E., and Bermingham, E. 1993. Genetic divergence among populations of a tropical passerine, the Streaked Saltator (Saltator albicollis). Auk 110:117-126.
Seutin, G., Klein, N. K., Ricklefs, R. E., and Bermingham, E. 1994. Historical biogeography of the bananaquit (Coereba flaveola) in the Caribbean region: A mitochondrial DNA assessment. Evolution 48(4):1041-1061.
Shulman, M. J., and Bermingham, E. 1995. Early life histories, ocean currents, and the population genetics of Caribbean reef fishes. Evolution 49(5):897-910.
Sidow, A., and Wilson, A. C. 1990. Compositional statistics: An improvement of evolutionary parsimony and its deep branches in the tree of life. J. Mol. Evol. 31:51-68.
Sneath, P. H. A., and Sokal, R. R. 1973. "Numerical Taxonomy." Freeman, San Francisco.
Sokal, R. R., and Rohlf, F. J. 1981. "Biometry." Freeman, San Francisco.
Solow, A. R., Broadus, J. M., and Tonring, N. 1993. On the measurement of biological diversity. J. Environ. Econ. Manag. 24:60-68.
Springer, V. G. 1982. "Pacific Plate Biogeography, with Special Reference to Shorefishes." Smithsonian Institution Press, Washington, D.C.
Steel, M. A., Lockhart, P. J., and Penny, D. 1993. Confidence in evolutionary trees from biological sequence data. Nature 364:440-442.
Swofford, D. L., Olsen, G. J., Waddell, P. J., and Hillis, D. M. 1996. Phylogenetic inference. In "Molecular Systematics" (D. M. Hillis, C. Moritz, and B. K. Mable, eds.), 2nd Ed., pp. 407-514. Sinauer, Sunderland, MA.
Tamura, K., and Nei, M. 1993. Estimation of the number of nucleotide substitutions in the control region of mitochondrial DNA in humans and chimpanzees. Mol. Biol. Evol. 10:512-526.
Thomas, W. K., and Beckenbach, A. T. 1989. Variation in salmonid mitochondrial DNA: Evolutionary constraints and mechanisms of substitution. J. Mol. Evol. 29:233-245.
Thomson, D. A., Findley, L. T., and Kerstitch, A. N. 1979. "Reef Fishes of the Sea of Cortez." Wiley, New York.
Thresher, R. E., and Brothers, E. B. 1989. Evidence of intra- and interoceanic regional differences in the early life history of reef-associated fishes. Mar. Ecol. Prog. Ser. 57:187-205.
Vane-Wright, R. I., Humphries, C. J., and Williams, P. H. 1991. What to protect? Systematics and the agony of choice. Biol. Conserv. 55:235-254.
Vawter, A. T., Rosenblatt, R., and Gorman, G. C. 1980. Genetic divergence among fishes of the eastern Pacific and the Caribbean: Support for the molecular clock. Evolution 34:705-711.
Vawter, L., and Brown, W. M. 1986. Nuclear and mitochondrial DNA comparisons reveal extreme rate variation in the molecular clock. Science 234:194-196.
Vermeij, G. J. 1978. "Biogeography and Adaptation." Harvard University Press, Cambridge, MA.
Victor, B. C. 1986. Duration of the planktonic larval stage of one hundred species of Pacific and Atlantic wrasses (family Labridae). Mar. Biol. 90:317-326.
Weitzman, M. L. 1992. On diversity. Quart. J. Econ. 107:363-405.
Wellington, G. M., and Victor, B. C. 1989. Planktonic larval duration of one hundred species of Pacific and Atlantic damselfishes (Pomacentridae). Mar. Biol. 101:557-567.
Zuckerkandl, E., and Pauling, L. 1965. Evolutionary divergence and convergence in proteins. In "Evolving Genes and Proteins" (V. Bryson and H. J. Vogel, eds.), pp. 97-166. Academic Press, New York.
CHAPTER

9 The Utility of Mitochondrial DNA Control Region Sequences for Analyzing Phylogenetic Relationships among Populations, Species, and Genera of the Percidae
JOSEPH E. FABER and CAROL A. STEPIEN
Department of Biology
Case Western Reserve University
Cleveland, Ohio 44106
I. Introduction
The rapid mutation rate and predominantly maternal inheritance of vertebrate mitochondrial (mt)DNA (Brown et al., 1979; Wilson et al., 1985) provide a valuable tool for evaluating evolutionary genetic divergence (Stepien and Kocher, Chapter 1). Comparison of sequences among vertebrates shows that both relatively fast and slowly evolving areas lie within the mtDNA control region (Brown, 1986; Lee et al., 1995). Control region DNA sequences may therefore reveal evolutionary relationships at various taxonomic levels (Moritz et al., 1987).

Recent studies of fishes illustrate the use of the mtDNA control region to study problems involving different evolutionary time scales. For example, variability in sequences has been used to detect the evolutionarily recent population structure of the Dover sole Microstomus pacificus and the thornyhead Sebastolobus alascanus in biogeographic provinces of the Pacific continental slope (Stepien, 1995), the spotted sand bass Paralabrax maculatofasciatus between the Pacific Coast and Sea of Cortez (Stepien, 1995), and white sturgeon Acipenser transmontanus among rivers of the Pacific coast of North America (Brown et al., 1993). Lack of nucleotide diversity in this rapidly evolving region has also suggested stock depression and genetic bottlenecks in the white sturgeon A. transmontanus (Brown et al., 1993) and lack of substructure of Atlantic cod Gadus morhua populations in the north Atlantic Ocean (Arnason and Rand, 1992). At higher taxonomic levels, nucleotide divergence in control region sequences was used to construct phylogenies of relationships among species and genera of morphologically variable and homoplastic African rift lake cichlids (Meyer et al., 1990; Sturmbauer and Meyer, 1992, 1993). The more slowly evolving central conserved section of the control region appears to contain phylogenetically reliable information up to the family level and higher in teleost fishes, despite rapid evolutionary rates of flanking "left" and "right" domains of the control region (Lee et al., 1995; Stepien, 1995). Thus, research involving several different fish taxa indicates that the mtDNA control region "bridges the gap" (Avise et al., 1987) between phylogenetics and population genetics.

However, little research has been conducted to determine the rates and patterns of mtDNA control region nucleotide evolution and the phylogenetic signal at various taxonomic levels within single lineages. Phylogenetic relationships among species and genera of rainbow fishes (Melanotaeniidae) were investigated, but only a small portion (approximately 330 bp) of the left domain of the mtDNA control region was utilized (Zhu et al., 1994). Relationships among genera and among higher taxonomic levels were reviewed by Lee et al. (1995), but among- and within-species relationships were not addressed.

The purpose of this chapter is to compare the genetic divergence of mtDNA control region sequences within and among closely related species and genera in the teleost fish family Percidae. Gene trees from control region data are compared to morphology-based phylogenies, and population genetic statistics are compared to hypotheses of population structure derived from geological evidence and tagging studies, to test phylogenetic signal (phylogenetic information in the data set) and to determine the utility of mtDNA control region sequences across a range of evolutionary time scales.
A. Morphological Evolution of the Percidae

The holarctic family Percidae includes 162 described species in 10 genera (Nelson, 1994). The darters, comprising the genera Ammocrypta, Crystallaria, Etheostoma, and Percina, are Nearctic, and Gymnocephalus, Percarina, Zingel, and Romanichthys are Palearctic in distribution. Perca and Stizostedion are native to both North America and Eurasia. Several species, including the ruffe Gymnocephalus cernuus, the perch Perca fluviatilis, and the walleye Stizostedion vitreum, have become established outside their historical ranges through accidental and intentional introductions (Nelson, 1994), which may affect geographic patterns of genetic diversity and local adaptations of resident stocks (Billington and Hebert, 1991).

The taxonomy of the Percidae was described by Collette (1963) and Collette and Banarescu (1977). Two subfamilies are recognized, and suprageneric relationships are presently disputed among four different morphological phylogenetic hypotheses, which are given in Fig. 1 (reviewed in Coburn and Gaglione, 1992). Collette's (1963) subfamily Percinae included the tribes Percini (Gymnocephalus, Perca, and Percarina) and Etheostomatini (Ammocrypta, Crystallaria, Etheostoma, and Percina), and the subfamily Luciopercinae contained the Luciopercini (Stizostedion) and Romanichthyini (Romanichthys and Zingel) (Fig. 1A). Wiley (1992) alternatively divided the Percidae into the subfamilies Percinae (including only Perca) and Etheostomatinae (including all other genera except Percarina, which was not analyzed) (Fig. 1B). Page (1985) hypothesized a similar suprageneric phylogeny based on reproductive behavior characters, including the subfamilies Percinae (the tribe Percini including Gymnocephalus, Perca, and Percarina) and Etheostomatinae (the tribes Romanichthyini, including Romanichthys and Zingel; Luciopercini, including Stizostedion; and Etheostomatini, including Ammocrypta, Crystallaria, Etheostoma, and Percina) (Fig. 1C).

FIGURE 1 Hypotheses of phylogenetic relationships among taxa of the teleost fish family Percidae: (A) Collette (1963) and Collette and Banarescu (1977); (B) Wiley (1992); (C) Page (1985); and (D) Coburn and Gaglione (1992). Adapted from Coburn and Gaglione (1992).

The phylogenies of Page (1985) and Wiley (1992) differed at the generic level: Wiley included only Perca in the Percini whereas Page included Perca, Gymnocephalus, and Percarina. Disparate phylogenetic relationships were also suggested within the Etheostomatinae, as Page (1985) hypothesized that the Luciopercini is the sister group of Etheostomatini and both share a common ancestor with Romanichthyini (Fig. 1C). Wiley (1992) regarded Romanichthys as the sister taxon of Etheostomatini, and the clade containing both as the sister group to Zingel (Fig. 1B). Wiley (1992) also suggested that Stizostedion is the sister group to the Zingel-Romanichthys-Etheostomatini clade and that Gymnocephalus is, in turn, the sister clade to the Stizostedion-Zingel-Romanichthys-Etheostomatini group (Fig. 1B). Coburn and Gaglione (1992) presented a phylogeny similar to Wiley (1992), except for placing Romanichthys as the sister group to Zingel, which is the sister taxon to Etheostomatini, and placing Gymnocephalus as the sister taxon to Perca in the Percinae (Fig. 1D).

Phylogenetic relationships at lower taxonomic levels in the Percidae are poorly understood. Bailey and Gosline (1955) recognized subgenera, and Page (1981) and Bailey and Etnier (1988) examined subgeneric relationships in Etheostoma, but group assignments differed between these studies. Essentially, few phylogenetic studies have attempted to resolve relationships within this large genus (n = 150 species; Nelson, 1994).
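The verbal descriptions of Fig. 1B and Fig. 1D above can be hard to compare by eye. The following is a minimal sketch, not part of the original analysis, that encodes those two hypotheses as nested tuples and lists the clades they share. The tuple encoding, the tip label `Darters` (standing in for the tribe Etheostomatini), and the function name `clades` are illustrative assumptions, and the Coburn and Gaglione topology is one reading of the sentence above (Romanichthys plus Zingel as sister to the darters). Percarina is omitted because Wiley (1992) did not analyze it.

```python
# Hypothetical encoding of two hypotheses from the text as nested tuples.
# "Darters" stands for the tribe Etheostomatini treated as a single tip.
WILEY_1992 = ("Perca",
              ("Gymnocephalus",
               ("Stizostedion",
                ("Zingel", ("Romanichthys", "Darters")))))

# One reading of the description of Fig. 1D (an assumption, see lead-in).
COBURN_GAGLIONE_1992 = (("Perca", "Gymnocephalus"),
                        ("Stizostedion",
                         (("Romanichthys", "Zingel"), "Darters")))


def clades(tree):
    """Return all internal clades below the root as frozensets of tip names."""
    found = set()

    def tips(node):
        if isinstance(node, str):          # a tip
            return frozenset([node])
        members = frozenset().union(*(tips(child) for child in node))
        found.add(members)                 # record this internal node
        return members

    all_tips = tips(tree)
    found.discard(all_tips)                # the root clade is shared trivially
    return found


shared = clades(WILEY_1992) & clades(COBURN_GAGLIONE_1992)
for clade in sorted(shared, key=len):
    print(sorted(clade))
```

Under this encoding the two hypotheses agree on the grouping of Zingel, Romanichthys, and the darters, and on that group plus Stizostedion; they differ in the arrangement within the first group and in the position of Gymnocephalus.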
B. Genetics of the Percidae

Genetic relationships of most percid taxa have not been studied. However, some North American taxa have been examined using allozyme electrophoresis and restriction fragment length polymorphism (RFLP) analysis of mtDNA. Allozyme polymorphisms indicated that the darter genus Percina was the sister group of a clade containing other morphologically derived darter genera, including Crystallaria, Ammocrypta, and Etheostoma (Page and Whitt, 1973a,b; Page, 1974). Simons (1989, 1992) used mtDNA RFLPs to find that species assigned to Ammocrypta (sand darters) by Bailey and Gosline (1955) are not monophyletic and instead should be assigned subgeneric status in Etheostoma. Wiley and Hagen (Chapter 6) sequenced the cytochrome b gene of mtDNA to test intra- and interspecific relationships of sand darters. Intraspecific variability of allozymes in Etheostoma has also been used to resolve biogeographic variability, both among drainage systems (among populations; Wiseman et al., 1978) and within drainages (among and within populations; Echelle et al., 1975).

Evolutionary relationships among and within species of the genus Stizostedion have been studied using genetic techniques. The morphological hypothesis of the European taxa being the sister group to the North American taxa and the evolutionary hypothesis of North American colonization from Eurasia via Beringia during the Pliocene (Collette and Banarescu, 1977) have been tested and supported with allozyme and mtDNA RFLP data by Billington et al. (1990, 1991). The majority of genetic research on this genus, however, has focused on determining the population structure of the economically important North American walleye, S. vitreum. Identification of allozyme polymorphisms and mtDNA RFLP haplotypes has resolved broad-scale biogeographic patterns across much of the North American range of the walleye, including the Great Lakes (Billington and Hebert, 1988; Ward et al., 1989; Todd, 1990; Billington et al., 1992; Billington and Strange, 1995). However, among more closely spaced sites, representing populations (stocks) identified by tag and recapture methods (Ferguson and Derksen, 1971; Bodaly, 1980), allozyme analyses appear to lack the resolving power necessary to discern significant population-level divergences (i.e., within the Great Lakes; Todd, 1990; Hawley et al., 1991). MtDNA RFLP analysis has revealed population divergence among female walleye from spawning sites in two closely spaced tributaries in Lake Erie (Mercker and Woodruff, 1996), suggesting that walleye population markers can be identified.

Population genetic studies of the yellow perch, P. flavescens, based on allozymes have suggested that variability is low or nonexistent in Green Bay, Lake Michigan (Leary and Booke, 1982), and in Lake Erie, Lake Champlain, and Lake Oneida (Strittholt et al., 1988). RFLP analysis of mtDNA indicates variability in a small sample of yellow perch from Lake Erie (Billington, 1993), suggesting that useful population markers may be available in the mitochondrial genome.
II. Materials and Methods

A. Collection of Specimens

Eight species representing five genera of the family Percidae, including the banded darter Etheostoma zonale, bluebreast darter Etheostoma camurum, ruffe Gymnocephalus cernuus, yellow perch Perca flavescens, blackside darter Percina maculata, sauger Stizostedion canadense, walleye S. vitreum, and zander S. lucioperca, were collected from sites in eastern North America and Eurasia (see Section V). Intraspecific variation was studied in four species. One hundred and seventeen specimens of S. vitreum were collected from spawning sites in four tributary rivers and one nearshore reef in Lake Erie, as well as from one tributary to Lake St. Clair (Fig. 2). Twenty G. cernuus were collected from the site of recent introduction in Lake Superior (ca. 1987; Simon and Vondruska, 1989) and from one site near St. Petersburg, Russia. Four S. canadense and 10 P. flavescens were also collected to examine genetic variability in these species. Specimens were collected by seine, electroshocking, or hook and line. Whole individuals or tissues (fin, muscle, eggs, or liver) were either immediately frozen at -80°C or preserved in 95% ethanol in the field. Frozen samples were stored at -80°C, and ethanol-preserved materials were stored at room temperature prior to DNA extraction.

FIGURE 2 Collection sites of spawning walleye, Stizostedion vitreum, and relative frequencies of eight mtDNA control region sequence haplotypes in Lake Erie and Lake St. Clair, including the Clinton River, Maumee River, Sandusky River, Grand River, and Van Buren Bay.

B. Genetic Analysis

Whole small fishes or tissues of larger individuals were frozen and ground in liquid nitrogen using a cylindrical mortar and pestle. DNA was extracted in a guanidine thiocyanate buffer and purified using proteinase K, RNase, phenol, and chloroform for each individual following standard protocols (Stepien et al., 1993; Stepien, 1995). The entire mtDNA control region was amplified in three sections using conserved primers (Kocher et al., 1989; Meyer et al., 1990) and the polymerase chain reaction (PCR). The 5' end or "left" domain of the control region from tRNA-Pro to the central conserved section was amplified using oligonucleotide primers L15926, 5'-TCA AAG CTT ACA CCA GTC TTG TAA ACC-3' (Kocher et al., 1989), and H16498, 5'-CCT GAA GTA GGA ACC AGA TG-3' (Meyer et al., 1990). The 3' end or "right" domain of the control region from the central conserved section to tRNA-Phe was amplified with the light strand complement of H16498, L16498 5'-CAT CTG GTT CCT ACT TCA GG-3', and H503 5'-GCA CGA GAT TTA CCA ACC C-3' (Titus and Larson, 1995). One hundred and sixty-eight nucleotides of the central conserved section were amplified with primers designed from sequences conserved among fishes sequenced in this study, using the light chain primer L16378, 5'-AAT GTA GTA AGA GCC TA-3', and the heavy chain primer H16578, 5'-GGG TAA CGA GGA GTA TG-3'.

Heavy chain primers were end-labeled with biotin (Hultman et al., 1989) and the strands were separated using Dynabeads M-280 streptavidin (Dynal Corp., Oslo, Norway) for single strand sequencing (Hultman et al., 1989; Uhlen, 1989). Both strands were sequenced separately using diluted PCR primers with Sanger dideoxy sequencing (Sanger et al., 1977) and Sequenase PCR product sequencing kits (Amersham/U.S. Biochemical Corp., Cleveland, OH). Sequencing reactions were run on 6% polyacrylamide gels for 2, 5, and 8 hr in order to resolve approximately 600 bp from the primer and were visualized by autoradiography.
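The primer L16498 is described in this section as the light strand complement of H16498. As a quick illustration of what that means, the sketch below (not part of the original protocol) reverse-complements the H16498 sequence quoted in the text and compares it with the quoted L16498 sequence. The function name `reverse_complement` is an illustrative choice; the two primer sequences are copied from this section.

```python
# Translation table mapping each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


# Primer sequences as quoted in the text (spaces removed), written 5' to 3'.
H16498 = "CCT GAA GTA GGA ACC AGA TG".replace(" ", "")
L16498 = "CAT CTG GTT CCT ACT TCA GG".replace(" ", "")

print(reverse_complement(H16498))
print(reverse_complement(H16498) == L16498)
```

The comparison holds for the sequences as printed, which is a useful sanity check when primer sequences are retyped from a publication.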
C. Data Analysis

DNA sequences were read into a Macintosh computer using an IBI/Kodak digitizer, aligned using MacVector-AssemblyLIGN software (International Biotechnologies, Inc., 1992), checked by hand, and aligned among species to identify evolutionarily conserved and variable sections. Percentage nucleotide composition and relative proportion of polymorphic nucleotides (pn; Nei, 1987) were calculated by hand. Secondary folding structures of nucleotide sequences were explored using the RNAdraw program (Version 1.01; Matzura and Wennborg, 1995).

Phylogenies were analyzed for different sections of the control region, based on apparently variable evolutionary rates and phylogenetic signal among these sections in primates (Kocher and Wilson, 1991) and teleost fishes (Lee et al., 1995). In addition to calculating phylogenies for data from the entire control region, the rapidly evolving left domain from tRNA-Pro to the central conserved section, the slowly evolving central conserved section, and the rapidly evolving right domain from the central conserved section to tRNA-Phe were considered separately.

Phylogenies were analyzed using two methods: distance analysis of percentage sequence divergence using pairwise genetic distances, and parsimony analysis of character states (cladistics). Gamma (γ) genetic distances that approximate substitution rate variation among nucleotide sites in mtDNA control regions (Kocher and Wilson, 1991; Tamura and Nei, 1993) were calculated using the Tamura-Nei model of nucleotide substitution, which accounts for variable substitution rates between purines and pyrimidines (Tamura and Nei, 1993), using MEGA (Molecular Evolutionary Genetics Analysis, Version 1.01; Kumar et al., 1993).
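To give a feel for what a gamma correction does to a pairwise distance, here is a minimal sketch. It deliberately swaps in the much simpler Jukes-Cantor model rather than the Tamura-Nei model used in the chapter, so it illustrates the general idea and does not reproduce the MEGA calculation. The function names and the example proportion of differing sites are illustrative assumptions; the two shape values are the ones reported in this section.

```python
import math


def p_distance(seq_a, seq_b):
    """Proportion of aligned sites that differ, skipping gap positions."""
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    return sum(a != b for a, b in pairs) / len(pairs)


def jc_distance(p):
    """Jukes-Cantor distance assuming equal rates at all sites."""
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def jc_gamma_distance(p, shape):
    """Jukes-Cantor distance with gamma-distributed rates among sites.

    A smaller shape value means stronger rate heterogeneity and a larger
    correction for multiple substitutions at the fast sites.
    """
    return 0.75 * shape * ((1.0 - 4.0 * p / 3.0) ** (-1.0 / shape) - 1.0)


p = 0.10  # hypothetical proportion of differing sites
print(round(jc_distance(p), 4))
for shape in (0.50, 0.11):  # shape values reported in this section
    print(shape, round(jc_gamma_distance(p, shape), 4))
```

For the same observed difference, the distance grows as the shape value shrinks, which is why the choice of shape parameter for each section of the control region matters.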
A γ parameter of 0.11 was used for analysis of the entire control region and separate analysis of the central conserved section (Kocher and Wilson, 1991; Tamura and Nei, 1993), and a parameter of 0.50 was implemented for separate analysis of the more rapidly evolving left and right domains (Wakeley, 1993). Distance phylogenies were inferred with a neighbor-joining algorithm (Saitou and Nei, 1987), and support for individual nodes of resulting trees was examined with 1000 bootstrap replicates (Felsenstein, 1985).

Most parsimonious relationships among species and genera were determined with the branch and bound algorithm using PAUP (Phylogenetic Analysis Using Parsimony, Version 3.1.1; Swofford, 1993), and support for nodes of cladograms was estimated using 1000 bootstrap replications (Felsenstein, 1985) and 50% majority rule consensus analysis of the shortest trees (Margush and McMorris, 1981). The percid species with the greatest pairwise genetic distance between itself and all other species examined (the zander, Stizostedion lucioperca) was utilized as the outgroup for parsimony analysis because no mtDNA control region sequence data were available for closely related outgroup taxa. Parsimony and distance trees were both midpoint rooted.

Exhaustive maximum parsimony searches (excluding intraspecific variability) were conducted using PAUP Version 3.1.1 (Swofford, 1993) for the entire control region and for the left domain, central conserved section, and right domain separately. The skewness of the frequency distribution of tree lengths relative to the most parsimonious tree length (g1 statistic) was used to determine phylogenetic signal in the data set (Hillis and Huelsenbeck, 1992). A topology-dependent cladistic permutation tail probability analysis (T-PTP; Faith, 1991) was also used to determine if most parsimonious trees differed significantly from morphologically derived phylogenetic hypotheses (Fig. 1). Sequence data were randomly permuted 1000 times, and the frequency distribution of 1000 branch and bound tree lengths was calculated using T-PTP in PAUP* (Swofford, 1996).
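The permutation step can be sketched as follows. In permutation tail probability tests as generally described, the states of each character (alignment column) are shuffled among taxa independently, which removes hierarchical signal while keeping the composition of every column. This is a minimal sketch of that randomization only; it is not the PAUP* implementation, and the function name, the toy alignment, and the taxon labels are hypothetical.

```python
import random


def permute_characters(alignment, rng):
    """Shuffle each alignment column independently among taxa.

    alignment maps taxon name -> aligned sequence (all of equal length).
    Returns a new dict with the same taxa and the same states per column.
    """
    taxa = list(alignment)
    columns = zip(*(alignment[taxon] for taxon in taxa))
    shuffled_columns = []
    for column in columns:
        states = list(column)
        rng.shuffle(states)            # reassign this character's states
        shuffled_columns.append(states)
    return {
        taxon: "".join(col[i] for col in shuffled_columns)
        for i, taxon in enumerate(taxa)
    }


# Hypothetical toy alignment; real input would be the aligned sequences.
toy = {
    "taxon_A": "AACGT",
    "taxon_B": "AACGA",
    "taxon_C": "TTCGA",
    "taxon_D": "TTGGA",
}
rng = random.Random(1)                 # fixed seed for repeatability
replicate = permute_characters(toy, rng)
for taxon, seq in replicate.items():
    print(taxon, seq)
```

Repeating this many times and recording the tree length obtained for each replicate gives the null distribution against which an observed length difference is judged.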
Tree length changes between most parsimonious branch and bound trees and the morphological trees (excluding taxa not surveyed in this study) were determined in MacClade (Version 3.0; Maddison and Maddison, 1992), and these lengths were compared to the frequency distributions of tree length differences between the 1000 randomly permuted trees and the most parsimonious trees to test the statistical support of control region data for their relationships in morphological hypotheses (for a discussion of this method see Bernardi, Chapter 12). T-PTP analyses were repeated for the entire control region and for the left domain, central conserved section, and right domain to test their relative phylogenetic signals. Intraspecific variability was analyzed using population genetic statistics. Haplotypic diversity (h) was calculated following Nei (1987). Geographic heterogeneity in frequency distributions of haplotypes was analyzed with a χ² test, using a Monte Carlo simulation approach with 1000 randomizations to account for small sample sizes and empty cells in the contingency matrix (Roff and Bentzen, 1989), with the MONTE option in REAP (Restriction Enzyme Analysis Package, Version 4.0; McElroy et al., 1992). In addition, molecular variability was ascribed to variance components including among regions, among populations within regions, and within populations, and the Fst analog Φst (Weir and Cockerham, 1984) was calculated using AMOVA (Analysis of Molecular Variance, Version 1.53; Excoffier et al., 1992).
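The population genetic statistics described above can be illustrated with a minimal Python sketch. It is not the REAP or AMOVA code used in the study. The function names, the toy sample, and the random seed are hypothetical, and the standard error is assumed to be the binomial estimate sqrt(p(1 - p)/N) for N randomizations. Haplotype diversity is shown in two forms, with and without the n/(n - 1) sample-size correction, because the text does not state which form was applied.

```python
import random
from collections import Counter


def haplotype_diversity(haplotypes, unbiased=True):
    """Haplotype diversity h = 1 - sum(x_i^2), optionally scaled by n/(n-1)."""
    n = len(haplotypes)
    sum_sq = sum((count / n) ** 2 for count in Counter(haplotypes).values())
    h = 1.0 - sum_sq
    return h * n / (n - 1) if unbiased else h


def chi_square(sites, haplotypes):
    """Pearson chi-square for the site-by-haplotype contingency table."""
    n = len(sites)
    row_totals = Counter(sites)
    col_totals = Counter(haplotypes)
    cells = Counter(zip(sites, haplotypes))
    total = 0.0
    for site, row_n in row_totals.items():
        for hap, col_n in col_totals.items():
            expected = row_n * col_n / n
            total += (cells[(site, hap)] - expected) ** 2 / expected
    return total


def monte_carlo_chi_square(sites, haplotypes, n_rand=1000, seed=0):
    """Randomization test: shuffle haplotypes among individuals and count
    how often the shuffled chi-square reaches the observed value."""
    rng = random.Random(seed)
    observed = chi_square(sites, haplotypes)
    shuffled = list(haplotypes)
    hits = 0
    for _ in range(n_rand):
        rng.shuffle(shuffled)
        if chi_square(sites, shuffled) >= observed - 1e-9:
            hits += 1
    p = hits / n_rand
    se = (p * (1.0 - p) / n_rand) ** 0.5  # assumed binomial standard error
    return observed, p, se


# Hypothetical toy sample: two sites, each fixed for a different haplotype.
toy_sites = ["A"] * 10 + ["B"] * 10
toy_haps = ["h1"] * 10 + ["h2"] * 10
print(haplotype_diversity(toy_haps, unbiased=False))
print(monte_carlo_chi_square(toy_sites, toy_haps))
```

Shuffling haplotype labels among individuals keeps the site sample sizes and overall haplotype counts fixed, which is why the approach tolerates small samples and empty cells better than comparing the statistic to a tabulated χ² distribution.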
III. Results

Sequences are deposited in GenBank (accession numbers 90617-90624). Alignment of mtDNA control region sequences reveals a common pattern of conserved sections with intervening variable sections (Fig. 3). The left domain flanking tRNA-Pro contains a putative termination associated sequence (TAS) 5'-AAA CTA TTC TTT-3', which appears homologous to that of other vertebrates (e.g., mouse; Doda et al., 1981). In the downstream 3' direction, alignment reveals conserved sequences that also appear homologous to conserved sequence blocks (CSBs) of mammals (Southern et al., 1988; Saccone et al., 1991) and other fishes (Lee et al., 1995), including the central conserved section, a pyrimidine rich "tract," and conserved sequence boxes 2 and 3 (CSB-2, 3). The total length of the control region is similar among the percids analyzed, ranging from 908 nucleotides in yellow perch (P. flavescens) to 1248 nucleotides in walleye (S. vitreum) (Fig. 3). Variability in length is primarily due to variable numbers of repeated sequences (e.g., tandem repeats; Buroker et al., 1990) located immediately downstream of the TAS (Fig. 3). Sequences of 10 to 12 nucleotides, similar in sequence to the TAS, are repeated from 5 times for 50 total nucleotides in zander (S. lucioperca) to 34 times for 388 nucleotides in walleye. Nucleotide sequences of tandem repeats vary both among species and within species. For example, in sauger (S. canadense) the same motif is repeated 18 times for all individuals (n = 10), but for walleye the same motif is repeated from 7 to 14 times, followed by 17 to 20 repeats that are variable in both primary sequence and number among individuals (n = 117). Analyses of folding kinetics indicate that tandem repeats may potentially form hairpin-shaped secondary folding structures with negative associated energies (Fig. 4). Excluding tandem repeats, lengths of control region sequences are more similar among species of percids,
JOSEPH E. FABER AND CAROL A. STEPIEN
[Figure 3 diagram labels: tRNA-Pro; TAS; repeated sequences (50-388 nucleotides); central conserved section; CSB-2; CSB-3; tRNA-Phe; 855-861 nucleotides excluding repeats; 908-1248 nucleotides total length; walleye substitutions at positions 114, 165, 225, 250, 573, and 646; ruffe substitutions at positions 85, 87, 201, and 396.]
FIGURE 3 Structure and sites of variability in the mtDNA control region of fishes from the teleost fish family Percidae. The control region is flanked by sequences that code for transfer RNA. The length of the control region varies between 908 and 1248 nucleotides and consists of several conserved sections, including the termination associated sequence (TAS; Doda et al., 1981; Southern et al., 1988), central conserved section (Lee et al., 1995), and conserved sequence boxes 2 and 3 (CSB-2 and CSB-3; Southern et al., 1988; Saccone et al., 1991), intervened by variable sections, including tandem repeats and base substitutions. Length heterogeneity is due to variable numbers of tandem repeats, totaling 50 to 388 nucleotides. Sites of intraspecific base substitutions for the walleye Stizostedion vitreum and the ruffe Gymnocephalus cernuus are indicated.
FIGURE 4 Secondary structures of tandem repeat sequences in the mtDNA control region of fishes from the family Percidae, determined using DNAdraw (Matzura and Wennborg, 1995). Concentric circles denote 5' end of sequence. (A) Folding structure of tandem repeats (274 bases) for sauger, Stizostedion canadense, with estimated free energy of -139.80 kJ (37°C). A similar stem and loop-shaped structure was found for walleye, S. vitreum. (B) Folding structure of tandem repeats (132 bases) for blackside darter, Percina maculata, with estimated free energy of -278 kJ (37°C). Similar hairpin-shaped structures were found for the banded darter Etheostoma zonale, bluebreast darter E. camurum, zander S. lucioperca, yellow perch Perca flavescens, and ruffe Gymnocephalus cernuus.
ranging from 855 bp in the bluebreast darter E. camurum to 861 bp in sauger S. canadense. Alignment of nonrepeated sequences shows 196 variable sites (pn = 0.227), with 61 transitions, 82 transversions, 26 that show both transitions and transversions, 20 insertions/deletions, 6 that include both transversions and insertions/deletions, and 1 site that includes transitions, transversions, and insertions/deletions. Higher levels of nucleotide polymorphism are found in the left domain (pn = 0.282) and in the right domain (pn = 0.248), with a considerably lower level in the central conserved section (pn = 0.088). Despite sequence variability among species, nucleotide compositions are similar, with high A-T content and significantly different frequencies among the four nucleotides than expected at random (χ², df = 3, p < 0.05). For example, nucleotide frequency ratios for sauger (S. canadense) are (G)0.14:(A)0.34:(T)0.29:(C)0.23; these ratios differ by no more than two percentage points among species.
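The composition test mentioned above can be illustrated with a short Python sketch of a χ² goodness-of-fit test against equal base frequencies. This is an assumed reading of "expected at random", not a reconstruction of the authors' calculation. The sauger frequency ratios come from the text. Using the 861-nucleotide nonrepeated length to convert ratios into approximate counts is an assumption, as are the function and variable names.

```python
def chi_square_gof(observed_counts):
    """Chi-square goodness of fit against equal expected counts per category."""
    total = sum(observed_counts.values())
    expected = total / len(observed_counts)
    return sum((obs - expected) ** 2 / expected
               for obs in observed_counts.values())


# Frequency ratios reported for sauger; the length is an assumed denominator.
sauger_freqs = {"G": 0.14, "A": 0.34, "T": 0.29, "C": 0.23}
assumed_length = 861
approx_counts = {base: f * assumed_length for base, f in sauger_freqs.items()}

statistic = chi_square_gof(approx_counts)
CRITICAL_DF3_P05 = 7.815  # tabulated chi-square critical value, df = 3, p = 0.05
print(round(statistic, 2), statistic > CRITICAL_DF3_P05)
```

Under these assumptions the statistic is far above the df = 3 critical value, which is consistent with the reported departure from equal frequencies.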
A. Phylogenetic Analysis

Genetic distance analysis of the entire control region (excluding tandem repeats) produced the tree shown in Fig. 5. Two primary branches were supported, with one containing walleye (S. vitreum) and sauger (S. canadense) as the sister group to zander (S. lucioperca) and
[Figure 5 tree tip labels: walleye haplotypes 1-8 (Stizostedion vitreum); sauger (S. canadense); zander (S. lucioperca); yellow perch (Perca flavescens); ruffe from Lake Superior and from Russia (Gymnocephalus cernuus); banded darter (Etheostoma zonale); bluebreast darter (E. camurum); blackside darter (Percina maculata).]
FIGURE 5 MEGA (Molecular Evolutionary Genetic Analysis; Kumar et al., 1993) neighbor-joining distance tree for the mtDNA control region of eight species of fishes from the family Percidae. Distances were estimated using pairwise γ genetic distances (α = 0.50) and the Tamura-Nei model of nucleotide substitution (Tamura and Nei, 1993); the tree is midpoint rooted. Distances are listed above branches, and values for 1000 bootstrap replications (Felsenstein, 1985) are below branches.
yellow perch (P. flavescens), and the other clade depicting ruffe (G. cernuus) as the group sister to the darters, with the banded darter (Etheostoma zonale) as the sister taxon to the bluebreast (E. camurum) and blackside darter (Percina maculata). All relationships, except those among walleye haplotypes, were supported by bootstrap values greater than 60%. Genetic distance analysis of the right domain alone produced a tree with the same topology and similar bootstrap values (not shown), except that the blackside darter was the sister group to the other two species of darters (bootstrap = 39%). Similar trees were produced using the left domain and the central conserved section alone; however, the position of yellow perch and ruffe and relationships among the darters differed. The left domain supported sister group relationships between zander and ruffe (bootstrap = 37%) and between yellow perch and the walleye/sauger (bootstrap = 37%). The central conserved section supported a sister group relationship between ruffe and walleye/sauger (bootstrap = 72%), the blackside darter was the sister group of the other two darter species (bootstrap = 46%), and the yellow perch was the sister taxon to all other species.
Cladistic analysis produced six most parsimonious trees of 313 steps, and the 50% majority rule consensus cladogram is shown in Fig. 6. Excluding the walleye haplotypes produced one most parsimonious tree with the same topology (not shown). All relationships supported by distance analysis were also supported by cladistics. For example, two primary clades were produced: one containing walleye (S. vitreum) as the group sister to sauger (S. canadense), and the zander (S. lucioperca) as the group sister to yellow perch (P. flavescens) and the other containing ruffe (G. cernuus) as the taxon sister to the darters, with the banded darter (E. zonale) as the taxon sister to the bluebreast (E. camurum) and blackside darter (P. maculata). Separate cladistic analyses of the control region by section supported the same general topology (trees not shown). However, the right domain (one shortest tree, 170 steps) suggested that the bluebreast darter is the sister group of the other two species of darters (bootstrap = 99%), the left domain (50% majority rule of 3 trees, 112 steps) suggested that ruffe is the sister group of the clade containing walleye/sauger and the darters (bootstrap = 67%), and the central conserved section (50% majority rule of 3 trees,
[Figure 6 cladogram tip labels: sauger (Stizostedion canadense); walleye haplotypes 1-8 (S. vitreum); zander (S. lucioperca); yellow perch (Perca flavescens); ruffe from Lake Superior and from Russia (Gymnocephalus cernuus); banded darter (Etheostoma zonale); bluebreast darter (E. camurum); blackside darter (Percina maculata).]
FIGURE 6 PAUP (Phylogenetic Analysis Using Parsimony, Version 3.1.1; Swofford, 1993) branch and bound 50% majority-rule consensus (Margush and McMorris, 1981) tree of the six most parsimonious trees of 313 steps, for the mtDNA control region of eight species of fishes from the family Percidae, with the zander (Stizostedion lucioperca) as the outgroup. The tree is midpoint rooted. Consensus values are listed above branches, and values for 1000 bootstrap replications (Felsenstein, 1985) are below branches.
18 steps) suggested that yellow perch is included in the same clade with the darters (bootstrap = 55%). The g1 skewness statistic was -0.859, indicating significant skew of the most parsimonious (shortest) tree relative to all other trees produced (p < 0.05; Hillis and Huelsenbeck, 1992). The g1 statistic was also significant (p < 0.05) for separate analyses of the left and right domains and the central conserved section. T-PTP tests indicated that control region data supported relationships among taxa (Figs. 5 and 6) that were not significantly different than those supported by morphology (Fig. 1). In other words, for the five genera surveyed, control region data statistically supported relationships found in the morphological trees. Similar levels of support for morphological hypotheses were found for the left and right domains (p > 0.05), but central conserved section data produced relationships that were significantly different than those indicated by morphology (p < 0.05).
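The g1 statistic reported above is the skewness of the distribution of tree lengths. The following Python sketch assumes the usual moment definition, g1 = m3 / m2^(3/2), and does not reproduce PAUP's implementation. The function name and the tree lengths in the example are hypothetical and are not taken from the data set.

```python
def g1_skewness(tree_lengths):
    """Moment-based skewness g1 = m3 / m2**1.5 of a list of tree lengths."""
    n = len(tree_lengths)
    mean = sum(tree_lengths) / n
    m2 = sum((x - mean) ** 2 for x in tree_lengths) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in tree_lengths) / n  # third central moment
    return m3 / m2 ** 1.5


# Hypothetical tree lengths: most trees are long, a few are much shorter,
# so the distribution has a tail toward short trees (negative g1).
hypothetical_lengths = [100, 107, 117, 122, 125, 127, 127, 128, 129, 129]
print(g1_skewness(hypothetical_lengths))
```

A negative g1 means that few trees are nearly as short as the shortest one, which Hillis and Huelsenbeck (1992) interpret as evidence of phylogenetic signal; significance still has to be judged against their critical values.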
B. Population Genetics

Seven intraspecific polymorphic nucleotide sites (pn = 0.007) were identified in the control region of walleye (S. vitreum), with four sites in the left domain (nucleotides 114, 165, 225, and 250; Fig. 3), and two rarer substitutions (found in one or two individuals) in the right domain (nucleotides 573 and 646; Fig. 3). For 117 individuals examined, eight primary haplotypes (excluding differences in tandem repeats) were identified. The most parsimonious relationships among these haplotypes were uncertain (Fig. 6), and genetic distance analysis support for these relationships was weak, with bootstrap values less than 50% (Fig. 5). Most haplotypes were distributed widely among sites in Lake St. Clair and Lake Erie (h = 0.707 ± 0.025), with haplotypes 2 and 3 found in each population sampled (Fig. 2). However, four haplotypes were identified only in the Sandusky River or Van Buren Bay populations. The most variance in molecular data was attributable to within-population variability, followed by variability between regions and lakes. Both within-population and between-region variance and associated Φ statistics were significantly different than expected by chance (Table Ia). Among-population variability accounted for the smallest variance component, and Φst, although greater than zero, was not significantly different from that expected by chance (0.05 < p < 0.10; Table Ia). However, haplotype frequencies were not distributed randomly among sites (χ², p < 0.001 ± 0.001; Table Ib), and additional unplanned pairwise χ² tests suggested significantly different frequencies of haplotypes between sites (p < 0.05), except between the Maumee River and Grand River and between Van Buren Bay and the Sandusky River (Table Ib). Four nucleotide substitutions were identified in the control region of ruffe, G. cernuus, with three in the left domain at nucleotides 85, 87, and 201, and one in the central conserved section at nucleotide 396 (Fig. 3). Two haplotypes were identified, one in the North American Lake Superior region (n = 10) and one in St. Petersburg, Russia (n = 10) (h = 0.500). Intraspecific sequence variability was not found in the control regions of sauger, S. canadense (n = 4), or yellow perch, P. flavescens (n = 10).

TABLE Ia Hierarchical Analyses of Molecular Variance among Haplotypes of Walleye (Stizostedion vitreum)^a

Variance component                 Variance   % total variance   P^b       Φ statistic
Among regions               σa²    0.0628     11.08%             <0.001*   Φct = 0.100
Among populations/regions   σb²    0.0214     3.78%              0.096     Φsc = 0.042
Within populations          σc²    0.4827     85.14%             <0.001*   Φst = 0.149

a From Excoffier et al. (1992).
b Probability of finding a greater variance component and Φ statistic than the observed values by chance. An asterisk denotes significance at p < 0.05.

TABLE Ib χ²^a and Standard Error Values, and Their Significance, for Haplotype Frequency Distributions of Walleye (S. vitreum) among All Sites and for Unplanned Pairwise Comparisons of Frequency Distributions between Sites^b

                 Lake St. Clair   Lake Erie
                 Clinton River    Maumee River   Sandusky River   Grand River   Van Buren Bay
Clinton River    -                0.001*         0.001*           0.001*        0.001*
Maumee River     0.001            -              0.018*           0.216         0.022*
Sandusky River   0.001            0.004          -                0.006*        0.801
Grand River      0.001            0.013          0.002            -             0.019*
Van Buren Bay    0.001            0.005          0.013            0.004         -

a Monte Carlo simulation (Roff and Bentzen, 1989).
b Among all sites: df = 4, χ² = 49.44, p = 0.001 ± 0.001*. Tabled matrix includes p values in the upper right and associated standard errors in the lower left. An asterisk denotes significance at p < 0.05.
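The variance components in Table Ia can be related to the percentages and Φ statistics with a small Python sketch. It assumes the standard definitions of Excoffier et al. (1992): Φct = σa²/σT², Φsc = σb²/(σb² + σc²), and Φst = (σa² + σb²)/σT², where σT² is the sum of the three components. It only redoes the arithmetic and does not reproduce the AMOVA program; the function name is hypothetical.

```python
def amova_summary(var_among_regions, var_among_pops, var_within_pops):
    """Percent of total variance and Phi statistics from variance components."""
    total = var_among_regions + var_among_pops + var_within_pops
    percent = {
        "among_regions": 100.0 * var_among_regions / total,
        "among_populations_within_regions": 100.0 * var_among_pops / total,
        "within_populations": 100.0 * var_within_pops / total,
    }
    phi = {
        "ct": var_among_regions / total,
        "sc": var_among_pops / (var_among_pops + var_within_pops),
        "st": (var_among_regions + var_among_pops) / total,
    }
    return percent, phi


# Variance components as listed in Table Ia.
percent, phi = amova_summary(0.0628, 0.0214, 0.4827)
for name, value in percent.items():
    print(f"{name}: {value:.2f}%")
for name, value in phi.items():
    print(f"Phi_{name} = {value:.3f}")
```

With the tabled components these definitions reproduce the percentages to within rounding, along with Φsc ≈ 0.042 and Φst ≈ 0.149. Φct computed this way is about 0.111 rather than the tabled 0.100, a difference that cannot be resolved from the text available here.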
IV. Discussion

The mtDNA control region of vertebrate taxa, including bony fishes, contains highly variable sequences punctuated by conserved sections that are involved in mtDNA replication (reviewed by Lee et al., 1995). Apparently, homologous conserved sections revealed in this study imply similar function in the mtDNA control region of the Percidae. Tandem repeats identified in Percidae are also similar to those identified in several other vertebrate taxa (for a review see Hoelzel et al., 1994). Repeats have been documented at several sites in the vertebrate control region, but are known from only the RS1 (repeat site 1; Hoelzel et al., 1994) in some fishes, including Pacific white sturgeon A. transmontanus (Buroker et al., 1990) and Atlantic cod G. morhua (Arnason and Rand, 1992). This study found that tandem repeats occur at the RS1 in all the Percidae examined. Several mechanisms (reviewed in Arnason and Rand, 1992) have been suggested to explain the formation, maintenance, and variability (in primary sequence and repeat number) of tandem repeats, including slipped strand mispairing (Levinson and Gutman, 1987), recombination (Rand and Harrison, 1989), and illegitimate elongation (Buroker et al., 1990). Essentially, these hypotheses require the secondary folding of sequences in or near the TAS, causing imperfect termination of strand replication that produces repeated TAS-like sequences. Energetically favorable hairpin structures associated with tandem repeats in the Percidae (Fig. 4) support these hypotheses.
Comparison of control region sequences also reveals different amounts of phylogenetic signal and phylogenetic noise at different taxonomic levels in the Percidae. The data set appears to contain more phylogenetic noise at higher taxonomic levels and more phylogenetic signal at lower taxonomic levels.
A. Comparisons among Genera

Mitochondrial DNA control region sequences show phylogenetic utility at the generic level in the family Percidae. Significant skew of tree lengths from the exhaustive cladistic search indicates phylogenetic signal in the data set, which should support at least some "correct" lineages (Hillis and Huelsenbeck, 1992), and T-PTP analysis indicates significant support in data for aspects of all four morphological hypotheses of relationships among genera (Fig. 1). Genetic distance and most parsimonious tree topologies support congruent relationships among all species examined (Figs. 5 and 6), and both show pike-perches (Stizostedion) and darters (Etheostoma and Percina) correctly assigned to separate clades. Despite support for phylogenetic signal in the data set, inconsistencies between the genetic distance and cladistic trees and the morphological alternatives indicate phylogenetic noise. For example, Stizostedion is paraphyletic, with S. lucioperca (zander) as the sister group to P. flavescens (yellow perch) in the distance and cladistic trees. The sister relationship of yellow perch and zander is inconsistent with all morphological hypotheses and appears unlikely. Finally, the sister relationship of ruffe to the darters (distance and cladistic analyses) supports only Collette and Banarescu's (1977) morphological hypothesis (Fig. 1A), which is not widely accepted. Our study is presently missing several genera (including the probably extinct Percarina, the darters Ammocrypta and Crystallaria, and the European darters Romanichthys and Zingel), which precludes the effective testing of alternative morphological hypotheses. For example, the hypotheses of Page (1985) and Coburn and Gaglione (1992) (Figs. 1C and 1D, respectively) differ only in placement of the European darters, which are not included here.
In addition, low sample size (few genera) and/or inherent sensitivity problems of the T-PTP test (Alroy, 1994) may have falsely provided statistical support of our data set for all four morphological hypotheses (Fig. 1).
B. Comparisons among Species

Two species of Stizostedion and one of the two genera of darters examined (Etheostoma) were paraphyletic
in both genetic distance and cladistic trees (Figs. 5 and 6), suggesting that the phylogenetic signal may be imperfect at this taxonomic level. However, both distance and cladistic trees support the well-accepted hypothesis of a sister relationship between S. vitreum (walleye) and S. canadense (sauger), with both more distantly related to S. lucioperca (zander) (Collette and Banarescu, 1977). Evolutionary divergence times estimated from genetic distances appear to support divergence time estimates from other molecular data. If divergence time is estimated at 1 million years per 2% sequence divergence (Brown et al., 1979), as has been assumed for several other taxa of bony fishes (Thomas et al., 1986; Grewe et al., 1990; Bermingham and Avise, 1986), our distance data (Fig. 5) suggest that North American walleye and sauger and the European zander last shared a common ancestor 4.75 ± 1.45 million years before present (mybp), supporting the hypothesis of North American colonization from Eurasia via Beringia during the Pliocene, previously supported with allozyme and mtDNA RFLP data (Billington et al., 1990, 1991). Genetic distances also estimate a divergence time of 3.85 ± 0.90 mybp between walleye and sauger, similar to that proposed using allozymes and mtDNA RFLPs by Billington et al. (1990, 1991). Although both genetic distance and most parsimonious trees show the bluebreast and blackside darters (E. camurum and P. maculata, respectively) as more closely related to each other than to the banded darter (E. zonale), bootstrap values indicate little support for this relationship. Alternatively, the relationships of Percina may need to be reexamined. Genetic distances are consistent with the hypothesis that darters have originated and diversified in North America since the Pliocene (reviewed by Page, 1983). Assuming 2% sequence divergence per million years (Brown et al., 1979), divergence time estimates among darters range from 3.70 ± 0.78 mybp between E. camurum and P. maculata to 5.02 ± 1.02 mybp between E. zonale and P. maculata.
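The clock calibration used above is a simple proportionality, sketched below in Python under the stated rate of 2% sequence divergence per million years. The function names are hypothetical. The divergence values in the example are back-calculated from the reported times purely to show the arithmetic and are not distances taken from the study.

```python
RATE_PER_MY = 0.02  # 2% pairwise sequence divergence per million years


def divergence_time_my(sequence_divergence, rate_per_my=RATE_PER_MY):
    """Divergence time in millions of years for a pairwise divergence
    expressed as a proportion (0.095 means 9.5%)."""
    return sequence_divergence / rate_per_my


def implied_divergence(time_my, rate_per_my=RATE_PER_MY):
    """Pairwise divergence (proportion) implied by a divergence time."""
    return time_my * rate_per_my


# Back-calculation: 4.75 My corresponds to 9.5% divergence at this rate.
print(implied_divergence(4.75))
print(divergence_time_my(0.095))
```

Because the relationship is linear, the uncertainty scales the same way: a standard error of 1.45 My corresponds to 2.9% divergence under this calibration.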
C. Comparisons among Populations

Sequencing the mtDNA control region of walleye detected divergence at a finer geographic scale in Lake Erie and Lake St. Clair than either allozyme or mtDNA RFLP analyses (Todd, 1990; Billington and Hebert, 1988; Ward et al., 1989; Billington et al., 1992). Although the majority of molecular variance was found within populations (Clinton River, Maumee River, Sandusky River, Grand River, and Van Buren Bay) and between regions (Lakes Erie and St. Clair), the nonrandom distribution of haplotypes among populations strongly suggests that walleye do not mate randomly and probably home to natal spawning sites. The population substructure is most easily definable if unique genetic traits arise in isolated populations over long evolutionary time periods (e.g., among sunfish populations in the southeastern United States; Bermingham and Avise, 1986). Unique populations may also arise as indicated by nonrandom frequency distributions of genetic markers due to random extinctions of lineages in isolated populations (Avise, 1994; e.g., Caribbean reef fishes; Shulman and Bermingham, 1995). Most present-day genetic variability of Great Lakes walleye probably originated before or during the Wisconsin glaciation, which lasted from at least 1,000,000 years bp to approximately 10,000 years bp (Pielou, 1991), when populations were putatively isolated in glacial refugia (Billington et al., 1992). The present-day nonrandom distribution of walleye haplotype frequencies in Lake Erie may be due to the postglacial dispersal of walleye haplotypes, followed by random extinction of haplotypes among subsequently isolated spawning populations. Work in progress demonstrates that control region sequences are also useful at a broader geographic range in walleye, with fixed nucleotide substitutions among geographic regions that were probably colonized by populations from different glacial refugia. Some of these differences were also found with allozymes and mtDNA RFLPs by Billington et al. (1990, 1991). Lack of within-population variation in ruffe suggests genetic drift or founding events in the populations sampled. Low variability in Eurasia may be due to genetic drift, and low variability in the recently introduced (ca. 1987) North American sample may be due to a founder effect (a small number of founding individuals in cargo vessel ballast discharge). Alternatively, small sample sizes may have precluded the detection of within-population variability.
Distinct fixed genetic differences at least indicate that the North American Lake Superior and the Russian Lake Komsomolskoe populations are genetically divergent and that the Lake Superior introductions did not originate from the Lake Komsomolskoe region of the ruffe's native range. Lack of variability in yellow perch (n = 10) and sauger (n = 4) is also likely due to small sample sizes, but may reflect lack of true population variability. For example, yellow perch populations tend to experience boom and bust cycles (Ney, 1978), which may reduce genetic variability (Strittholt et al., 1988). However, previously reported allozyme variability in sauger from the Ohio River (White, 1993) and mtDNA RFLP variability in yellow perch from Lake Erie (Billington, 1993), indicate that variability should be detected in larger sample sizes. Alternatively, the Percidae
as a family may be characterized by low population-level variability in the mtDNA control region.

D. General Discussion
Relatively poor resolution of relationships among genera and higher resolution of phylogenetic and biogeographic patterns among and within species indicate phylogenetic noise and reduced signal in the control region data set among taxa that have diverged for longer evolutionary periods of time. Unlikely associations among distantly related taxa are probably due to multiple mutations at polymorphic nucleotide sites, which is evidenced by a relatively high nucleotide polymorphism (pn = 0.227) and a transversion to transition ratio of 2.2:1. DNA sequences with nucleotide polymorphism greater than 0.25 and transversion to transition ratios greater than approximately 1:10 may accumulate mutations and homoplasies (reversals of character state changes) that obscure accurate phylogeny reconstruction (Brown, 1983). Genetic distance phylogenetic analyses of mtDNA sequences may also be inaccurate when genetic distances approach or exceed 15% (Brown, 1983; e.g., pairwise distances ranged from about 15 to 28.8% between the bluebreast darter and yellow perch). Once p values exceed 15%, the molecular clock calibration becomes nonlinear (Brown, 1983) and phylogenetic resolution is reduced by homoplasy (Moritz et al., 1987). Alternatively, genetic distance estimates among species of Stizostedion and darters (Etheostoma and Percina) that probably diverged for shorter evolutionary periods were less than 15%, suggesting that fewer homoplasies (and less phylogenetic noise) have probably accumulated. Although the molecular clock estimate for mtDNA divergence is only generally applicable among different regions of the molecule and among different taxa (reviewed by Avise, 1994), these results indicate that control region sequences are at least partially predictive for more closely related genera and species of pike-perches and darters. 
More recent divergences of walleye and ruffe populations preclude homoplasy in control region sequences within species, indicating that they are accurate population genetic markers. Separate analyses of control region sections indicated that the central conserved section (CCS) evolves more slowly than do the left and right domains. The central conserved section exhibited lower pn and produced phylogenetic trees with very different topologies. Lower polymorphism in the CCS and apparent usefulness in other phylogenetic studies up to the family level (Lee et al., 1995) indicate fewer homoplasies and probable phylogenetic utility at higher taxonomic
levels. However, both cladistic and genetic distance trees supported unrealistic phylogenies, and T-PTP analysis of the CCS data set did not support any of the four currently accepted morphological phylogenetic hypotheses (Fig. 1). These results probably do not indicate the true phylogenetic utility of the CCS due to few available characters; only 15 genetic distance and 7 cladistically informative characters were found out of 168 total nucleotides. Control region sequences, though, may help to clarify relationships that cannot be clearly resolved using more slowly evolving genes. For example, rapidly evolving left domain data have been used in tandem with more slowly evolving cytochrome b data to examine the phylogeny of cichlids (Sturmbauer and Meyer, 1993) and rainbow fishes (Zhu et al., 1994). It may also prove instructive to include control region CCS data with such analyses, as this section evolves more slowly than the rest of the control region. Conversely, the left and right domains (including most of the control region) evolve at higher rates, with relatively high nucleotide polymorphism (pn) and support for phylogenetic trees approximately the same as that with the entire control region. Population genetic markers for walleye and ruffe were found primarily in the left domain, suggesting that this section may house population genetic markers in other percids. Population genetic analyses of other fishes, including the pleuronectid Dover sole Microstomus pacificus (Stepien, 1995), scorpaenid thornyhead Sebastolobus alascanus (Stepien, 1995), serranid spotted sand bass Paralabrax maculatofasciatus (Stepien, 1995), acipenserid white sturgeon A. 
transmontanus (Brown et al., 1993), xiphiid swordfish Xiphias gladius (Bremer et al., 1995), and the salmonid rainbow trout (steelhead) Oncorhynchus mykiss (Nielson et al., Chapter 5), each identify population genetic markers in the left domain of the mtDNA control region, further supporting its population genetic utility across a wide range of fish taxa. The utility of mtDNA sequences for population genetics and biogeography for several taxa of bony fishes indicate that mtDNA control regions of other teleosts may evolve at similarly rapid rates. We expect that mtDNA control region sequences will continue to prove valuable for elucidating within-species and congeneric relationships. Higher-level phylogenies that rely solely on control region data will probably be affected by homoplasy and may suggest unrealistic associations among taxa. It is prudent (although too rarely practiced) to examine variability above and below the taxonomic level being studied in order to evaluate whether the level of variability is appropriate for the problem. Relative evolutionary rates of DNA sequences can only be elucidated through multispecies
comparisons as have been performed here, and such comparisons will undoubtedly help determine phylogenetic signal and utility for other regions of the mitochondrial and nuclear genomes.
V. Material Examined

Etheostoma zonale: Tuscarawas River, Ohio, May 1994, n = 1.
Gymnocephalus cernuus: St. Louis River, Lake Superior, Wisconsin, April 1994, n = 10; Komsomolskoe Lake, St. Petersburg, Russia, 1994, n = 10.
Perca flavescens: South Bass Island, Lake Erie, Ohio, May 1993, n = 2; Sandusky Bay, Lake Erie, Ohio, May 1993, n = 8.
Percina maculata: Shade River, Ohio, May 1994, n = 1.
Stizostedion canadense: Hannibal lock spillway, Ohio River, Ohio, January 1993, n = 2; Racine lock spillway, Ohio River, Ohio, December 1994, n = 2.
Stizostedion lucioperca: unknown European site, 1994, n = 1.
Stizostedion vitreum: Clinton River, Lake St. Clair, Michigan, April 1995, n = 24; Maumee River, Lake Erie, Ohio, April 1993, n = 23; Sandusky River, Lake Erie, Ohio, April 1993, n = 24; Grand River, Lake Erie, Ohio, April 1993, n = 25; Van Buren Bay, Lake Erie, Dunkirk, New York, April 1993, n = 21.
Acknowledgments

We thank C. Baker, C. Knight, R. Knight, T. Bader, and the Ohio Division of Wildlife; D. Einhouse and the New York Department of Environmental Conservation; and R. Haas and the Michigan Division of Wildlife for collecting spawning walleye. The Ohio Division of Wildlife also provided zander. Ruffe were provided by J. Gunderson and Minnesota Sea Grant. Other samples were collected by J.E.F. and C.A.S. under Ohio scientific permits. Thanks also to N. Billington for help in identifying our zander sample. M. Chandler (C.A.S. laboratory) collected ruffe DNA sequence data and A. Hubers provided technical assistance. E. Bermingham, M. Coburn, T. Kocher, R. Wilson, and an anonymous reviewer provided constructive criticism of this manuscript. This work was supported by Ohio Sea Grant Project RF-726750 (1994), National Sea Grant Project RF-707294 (1995-1998), and Lake Erie Protection Fund Grant LEPF-07-94 (1995-1998) to C. Stepien.
References

Alroy, J. 1994. Four permutation tests for the presence of phylogenetic structure. Syst. Biol. 43:430-437.
Arnason, E., and Rand, D. M. 1992. Heteroplasmy of short tandem repeats in mitochondrial DNA of Atlantic cod, Gadus morhua. Genetics 132:211-220.
Avise, J. C. 1994. "Molecular Markers, Natural History and Evolution." Chapman and Hall, New York.
Avise, J. C., Arnold, J., Ball, R. M., Bermingham, E., Lamb, T., Neigel, J. E., Reeb, C. A., and Saunders, N. C. 1987. Intraspecific phylogeography: The mitochondrial DNA bridge between population genetics and systematics. Annu. Rev. Ecol. Syst. 18:489-522.
Bailey, R. M., and Gosline, W. A. 1955. Variation and systematic significance of vertebral counts in the American fishes of the family Percidae. Misc. Publ. Mus. Zool. Univ. Michigan 93:1-44.
Bailey, R. M., and Etnier, D. A. 1988. Comments on the subgenera of darters (Percidae) with descriptions of two new species of Etheostoma (Ulocentra) from southeastern United States. Misc. Publ. Mus. Zool. Univ. Michigan 175:1-48.
Bermingham, E., and Avise, J. C. 1986. Molecular zoogeography of freshwater fishes in the southeastern United States. Genetics 113:939-965.
Billington, N. 1993. Genetic variation in Lake Erie yellow perch (Perca flavescens) demonstrated by mitochondrial DNA analysis. J. Fish. Biol. 41:941-950.
Billington, N., Barrette, R. J., and Hebert, P. D. N. 1992. Management implications of mitochondrial DNA variation in walleye stocks. N. Am. J. Fish. Mng. 12:276-284.
Billington, N., Danzmann, R. G., Hebert, P. D. N., and Ward, R. D. 1991. Phylogenetic relationships among four members of Stizostedion (Percidae) determined by mitochondrial DNA and allozyme analyses. J. Fish. Biol. 39:251-258.
Billington, N., and Hebert, P. D. N. 1988. Mitochondrial DNA variation in Great Lakes walleye (Stizostedion vitreum) populations. Can. J. Fish. Aquat. Sci. 45:643-654.
Billington, N., and Hebert, P. D. N. 1991. Mitochondrial DNA diversity in fishes and its implications for introductions. Can. J. Fish. Aquat. Sci. 48:80-94.
Billington, N., Hebert, P. D. N., and Ward, R. D. 1990. Allozyme and mitochondrial DNA variation among three species of Stizostedion (Percidae): Phylogenetic and zoogeographic implications. Can. J. Fish. Aquat. Sci. 47:1093-1102.
Billington, N., and Strange, R. M. 1995. Mitochondrial DNA analysis confirms the existence of a genetically divergent walleye population in northeastern Mississippi. Trans. Am. Fish. Soc. 124:770-779.
Bodaly, R. A. 1980. Pre- and post-spawning movements of walleye, Stizostedion vitreum, in southern Indian Lake, Manitoba. Can. Tech. Rep. Fish. Aquat. Sci. 931:1-30.
Bremer, J. P. A., Baker, A. J., and Mejuto, J. 1995. Mitochondrial DNA control region sequences indicate extensive mixing of swordfish (Xiphias gladius) populations in the Atlantic Ocean. Can. J. Fish. Aquat. Sci. 52:1720-1732.
Brown, G. G. 1986. Structural conservation and variation in the D-loop-containing region of vertebrate mitochondrial DNA. J. Mol. Biol. 192:503-511.
Brown, J. R., Beckenbach, A. T., and Smith, M. J. 1993. Intraspecific DNA sequence variation of the mitochondrial control region of white sturgeon (Acipenser transmontanus). Mol. Biol. Evol. 10:326-341.
Brown, W. M. 1983. Evolution of animal mitochondrial DNA. In "Evolution of Genes and Proteins" (M. Nei and R. K. Koehn, eds.). Sinauer, Sunderland, MA.
Brown, W. M., George, M., Jr., and Wilson, A. C. 1979. Rapid evolution of animal mitochondrial DNA. Proc. Natl. Acad. Sci. USA 76:1967-1971.
Buroker, N. E., Brown, J. R., Gilbert, T. A., O'Hara, P. J., Beckenbach, A. T., Thomas, W. K., and Smith, M. J. 1990. Length heteroplasmy of sturgeon mitochondrial DNA: An illegitimate elongation model. Genetics 124:157-163.
Coburn, M. M., and Gaglione, J. I. 1992. A comparative study of Percid scales (Teleostei: Perciformes). Copeia 1992:986-1001.
Collette, B. B. 1963. The subfamilies, tribes, and genera of the Percidae (Teleostei). Copeia 1963:615-623.
Collette, B. B., and Banarescu, P. 1977. Systematics and zoogeography of the fishes of the family Percidae. J. Fish. Res. Board Can. 34:1450-1463.
Doda, J. N., Wright, C. T., and Clayton, D. A. 1981. Elongation of displacement-loop strands in human and mouse mitochondrial DNA is arrested near specific template sequences. Proc. Natl. Acad. Sci. USA 78:6116-6120.
Echelle, A. A., Echelle, A. F., Smith, M. H., and Hill, L. G. 1975. Analysis of genic continuity in a headwater fish, Etheostoma radiosum (Percidae). Copeia 1975:197-204.
Excoffier, L., Smouse, P. E., and Quattro, J. M. 1992. Analysis of molecular variance inferred from metric distances among DNA haplotypes: Application to human mitochondrial DNA data. Genetics 131:479-491.
Faith, D. P. 1991. Cladistic permutation tests for monophyly and nonmonophyly. Syst. Zool. 40:366-375.
Felsenstein, J. 1985. Confidence limits on phylogenies: An approach using the bootstrap. Evolution 39:783-791.
Ferguson, R. G., and Derksen, A. J. 1971. Migration of adult and juvenile walleyes (Stizostedion vitreum vitreum) in southern Lake Huron, Lake St. Clair, Lake Erie and connecting waters. J. Fish. Res. Board Can. 28:1133-1142.
Grewe, P. M., Billington, N., and Hebert, P. D. N. 1990. Phylogenetic relationships among members of Salvelinus inferred from mitochondrial DNA divergence. Can. J. Fish. Aquat. Sci. 47:984-991.
Hawley, G. J., Dehayes, D. H., and Labar, G. W. 1991. "Genetic Analysis of Lake Champlain and New York Walleye Populations." Report to Vermont Dept. of Fish and Wildlife.
Hillis, D. M., and Huelsenbeck, J. P. 1992. Signal, noise, and reliability in molecular phylogenetic analyses. J. Hered. 83:189-195.
Hoelzel, A. R., Hancock, J. M., and Dover, G. A. 1991. Evolution of the cetacean mitochondrial D-loop region. Mol. Biol. Evol. 8:475-493.
Hoelzel, A. R., Lopez, J. V., Dover, G. A., and O'Brien, S. J. 1994. Rapid evolution of a heteroplasmic repetitive sequence in the mitochondrial DNA control region of carnivores. J. Mol. Evol. 39:191-199.
Hultman, T., Stahl, S., Hornes, E., and Uhlen, M. 1989. Direct solid phase sequencing of genomic and plasmid DNA using magnetic beads as solid support. Nucleic Acids Res. 17:4937-4946.
International Biotechnologies, Inc., Kodak. 1992. Assembly LIGN Sequence Assembly Software.
Kocher, T. D., Thomas, W. K., Meyer, A., Edwards, S. V., Paabo, S., Villablanca, F. X., and Wilson, A. C. 1989. Dynamics of mitochondrial DNA evolution in animals: Amplification and sequencing with conserved primers. Proc. Natl. Acad. Sci. USA 86:6196-6200.
Kocher, T. D., and Wilson, A. C. 1991. Sequence evolution of mitochondrial DNA in humans and chimpanzees: Control region and a protein-coding region. In "Evolution of Life" (S. Osawa and T. Honjo, eds.), pp. 391-413. Springer-Verlag, New York.
Kumar, S., Tamura, K., and Nei, M. 1993. "MEGA, Molecular Evolutionary Genetics Analysis, Vers. 1.01." Institute of Molecular Evolutionary Genetics, The Pennsylvania State University, University Park, PA.
Leary, R., and Booke, H. E. 1982. Genetic stock analysis of yellow perch from Green Bay and Lake Michigan. Trans. Am. Fish. Soc. 111:52-57.
Lee, W., Conroy, J., Howell, W. H., and Kocher, T. D. 1995. Structure and evolution of teleost mitochondrial control regions. J. Mol. Evol. 41:54-66.
Levinson, G., and Gutman, G. A. 1987. Slipped-strand mispairing: A major mechanism for DNA sequence evolution. Mol. Biol. Evol. 4:203-221.
Maddison, W. P., and Maddison, D. R. 1992. "MacClade Version 3.0. Analysis of Phylogeny and Character Evolution." Sinauer, Sunderland, MA.
Margush, T., and McMorris, F. R. 1981. Consensus n-trees. Bull. Math. Biol. 43:239-244.
Matzura, R., and Wennborg, G. 1995. RNAdraw Version 1.01.
McElroy, D., Moran, P., Bermingham, E., and Kornfield, I. 1992. REAP: An integrated environment for the manipulation and phylogenetic analysis of restriction data. J. Hered. 83:157-158.
Mercker, R. J., and Woodruff, R. C. 1996. Molecular evidence for divergent breeding groups of walleye (Stizostedion vitreum) in tributaries to western Lake Erie. J. Great Lakes Res. 22:280-288.
Meyer, A., Kocher, T. D., Basasibwaki, P., and Wilson, A. C. 1990. Monophyletic origin of Lake Victoria cichlid fishes suggested by mitochondrial DNA sequences. Nature 347:550-553.
Moritz, C., Dowling, T. E., and Brown, W. M. 1987. Evolution of animal mitochondrial DNA: Relevance for population biology and systematics. Annu. Rev. Ecol. Syst. 18:269-292.
Nei, M. 1987. "Molecular Evolutionary Genetics." Columbia University Press, New York.
Nelson, J. S. 1994. "Fishes of the World." Wiley, New York.
Ney, J. J. 1978. A synoptic review of yellow perch and walleye biology. Am. Fish. Soc. Spec. Publ. 11:1-12.
Page, L. M. 1974. The subgenera of Percina (Percidae: Etheostomatini). Copeia 1974:66-86.
Page, L. M. 1981. The genera and subgenera of darters (Percidae, Etheostomatini). Occ. Pap. Mus. Nat. Hist. Univ. Kansas 90:1-69.
Page, L. M. 1983. "Handbook of Darters." TFH Publications, Inc., Champaign, IL.
Page, L. M. 1985. Evolution of reproductive behaviors in Percid fishes. Ill. Nat. Hist. Surv. 33:275-295.
Page, L. M., and Whitt, G. S. 1973a. Lactate dehydrogenase isozymes, malate dehydrogenase isozymes and tetrazolium oxidase mobilities of darters (Etheostomatini). Comp. Biochem. Physiol. B 44:611-623.
Page, L. M., and Whitt, G. S. 1973b. Lactate dehydrogenase isozymes of darters and the inclusiveness of the genus Percina. Ill. Nat. Hist. Surv. 82:1-7.
Pielou, E. C. 1991. "After the Ice Age: The Return of Life to Glaciated North America." University of Chicago Press, Chicago, IL.
Rand, D. M., and Harrison, R. G. 1986. Mitochondrial DNA transmission genetics in crickets. Genetics 114:955-970.
Roff, D. A., and Bentzen, P. 1989. The statistical analysis of mitochondrial DNA polymorphisms: χ² and the problem of small samples. Mol. Biol. Evol. 6:539-545.
Saccone, C., Pesole, G., and Sbisa, E. 1991. The main regulatory region of mammalian mitochondrial DNA: Structure-function model and evolutionary pattern. J. Mol. Evol. 33:83-91.
Saitou, N., and Nei, M. 1987. The neighbor-joining method: A new method for reconstructing phylogenetic trees. Mol. Biol. Evol. 4:406-425.
Sanger, F., Nicklen, S., and Coulson, A. R. 1977. DNA sequencing with chain-terminating inhibitors. Proc. Natl. Acad. Sci. USA 74:5463-5467.
Shulman, M. J., and Bermingham, E. 1995. Early life histories, ocean currents, and the population genetics of Caribbean reef fishes. Evolution 49:897-910.
Simon, T. P., and Vondruska, J. T. 1989. Larval identification of the ruffe, Gymnocephalus cernuus (Linnaeus) (Percidae: Percini), in the St. Louis River estuary, Lake Superior drainage basin, Minnesota. Can. J. Zool. 69:436-441.
Simons, A. M. 1989. "The Phylogenetic Relationships of the Sand Darters (Teleostei: Percidae)." Masters thesis, University of Kansas, Lawrence, KS.
Simons, A. M. 1992. Phylogenetic relationships of the Boleosoma species group (Percidae: Etheostoma). In "Systematics, Historical Ecology, and North American Freshwater Fishes" (R. L. Mayden, ed.), pp. 268-292. Stanford University Press, Stanford, CA.
Southern, S. O., Southern, P. J., and Dizon, A. E. 1988. Molecular and phylogenetic studies with a cloned dolphin mitochondrial genome. J. Mol. Evol. 28:32-42.
Stepien, C. A. 1995. Population genetic divergence and geographic patterns from DNA sequences: Examples from marine and freshwater fishes. In "Evolution and the Aquatic Ecosystem: Defining Unique Units in Population Conservation" (J. L. Nielsen, ed.), pp. 263-287. American Fisheries Society, Bethesda, MD.
Stepien, C. A., Dixon, M. T., and Hillis, D. M. 1993. Evolutionary relationships of the fish families Clinidae, Labrisomidae, and Chaenopsidae: Congruence between DNA sequence and allozyme data. Bull. Mar. Sci. 52:873-896.
Strittholt, J. R., Guttman, S. I., and Wissing, T. E. 1988. Low levels of genetic variability of yellow perch (Perca flavescens) in Lake Erie and selected impoundments. In "The Biogeography of the Island Region of Western Lake Erie" (J. F. Downhower, ed.). Ohio State University Press, Columbus, OH.
Sturmbauer, C., and Meyer, A. 1992. Genetic divergence, speciation and morphological stasis in a lineage of African cichlid fishes. Nature 358:578-581.
Sturmbauer, C., and Meyer, A. 1993. Mitochondrial phylogeny of the endemic mouthbrooding lineages of cichlid fishes from Lake Tanganyika in Eastern Africa. Mol. Biol. Evol. 10:751-768.
Swofford, D. L. 1993. PAUP (Phylogenetic Analysis Using Parsimony) vers. 3.1 for Macintosh Computers. Ill. Nat. Hist. Surv., Champaign, IL.
Swofford, D. L. 1996. "PAUP* (Phylogenetic Analysis Using Parsimony): Preliminary Test Version." Sinauer, Sunderland, MA.
Tamura, K., and Nei, M. 1993. Estimation of the number of nucleotide substitutions in the control region of mitochondrial DNA in humans and chimpanzees. Mol. Biol. Evol. 10:512-526.
Thomas, W. K., Withler, R. E., and Beckenbach, A. T. 1986. Mitochondrial DNA analysis of Pacific salmonid evolution. Can. J. Zool. 64:1058-1064.
Titus, T. A., and Larson, A. 1995. A molecular phylogenetic perspective on the evolutionary radiation of the salamander family Salamandridae. Syst. Biol. 44:125-151.
Todd, T. N. 1990. "Genetic Differentiation of Walleye Stocks in Lake St. Clair and Western Lake Erie." U.S. Department of the Interior, Fish and Wildlife Service, Fish and Wildlife Technical Report 28:1-19.
Uhlen, M. 1989. Magnetic separation of DNA. Nature 340:733-734.
Wakely, J. 1993. Substitution rate variation among sites in hypervariable region I of human mitochondrial DNA. J. Mol. Evol. 37:613-623.
Walberg, M. W., and Clayton, D. A. 1981. Sequence and properties of the human KB cell and mouse L cell D-loop regions of mitochondrial DNA. Nucleic Acids Res. 9:5411-5420.
Ward, R. D., Billington, N., and Hebert, P. D. N. 1989. Comparison of allozyme and mitochondrial variation in populations of walleye, Stizostedion vitreum. Can. J. Fish. Aquat. Sci. 46:2074-2084.
Weir, B. S., and Cockerham, C. C. 1984. Estimating F-statistics for the analysis of population structure. Evolution 38:1358-1370.
White, M. M. 1993. "An Evaluation of the Genetic Integrity of Ohio River Walleye and Sauger Stocks." Report to the Ohio Department of Natural Resources.
Wiley, E. O. 1992. Phylogenetic relationships of the Percidae (Teleostei: Perciformes): A preliminary hypothesis. In "Systematics, Historical Ecology, and North American Freshwater Fishes" (R. L. Mayden, ed.), pp. 247-267. Stanford University Press, Stanford, CA.
Wilson, A. C., Cann, R. L., Carr, S. M., George, M., Jr., Gyllensten, U. B., Helm-Bychowski, K. M., Higuchi, R. G., Palumbi, S. R., Prager, E. M., Sage, R. D., and Stoneking, M. 1985. Mitochondrial DNA and two perspectives on evolutionary genetics. Biol. J. Linn. Soc. 26:375-400.
Wiseman, E. D., Echelle, A. A., and Echelle, A. F. 1978. Electrophoretic evidence for subspecific differentiation and intergradation in Etheostoma spectabile (Teleostei: Percidae). Copeia 1978:320-327.
Zhu, D., Jamieson, B. G. M., Hugall, A., and Moritz, C. 1994. Sequence evolution and phylogenetic signal in control-region and cytochrome b sequences of rainbow fishes (Melanotaeniidae). Mol. Biol. Evol. 11:672-683.
CHAPTER 10

Phylogenetic Relationships among the Salmoninae Based on Nuclear and Mitochondrial DNA Sequences

RUTH B. PHILLIPS and TODD H. OAKLEY
Department of Biological Sciences
University of Wisconsin-Milwaukee
Milwaukee, Wisconsin 53201
I. Introduction

Clarification of salmonid relationships is important for conservation of these fishes, many of which are threatened or endangered. In this chapter, major questions concerning the relationships of the fishes in the subfamily Salmoninae are reviewed and conflicting data are evaluated. Trees based on recently obtained molecular data from sequencing specific nuclear and mitochondrial genes are presented and explanations for conflicts are discussed. Assuming that the consensus tree based on molecular data is correct, the implications for the evolution of chromosomes and life history traits are considered.

Salmonid fishes have been one of the most intensively studied fish groups. There are three subfamilies of salmonid fishes: Coregoninae (whitefishes and ciscoes), Thymallinae (graylings), and Salmoninae (lenoks, huchen, trouts, chars, and salmons). These fishes have been the subject of many systematic studies using morphological (Behnke, 1989; Sanford, 1990; Stearley and Smith, 1993), karyological (Hartley, 1987; Cavender, 1984; Cavender and Kimura, 1989; Phillips et al., 1989), ontogenetic (Pavlov, 1980; Kendall and Behnke, 1984), and allozyme markers (Utter and Allendorf, 1994; Crane et al., 1994). Molecular data including sequences from both mitochondrial and nuclear genes have been used in phylogenetic analyses of Oncorhynchus (Thomas et al., 1986; Shedlock et al., 1992; Domanico and Phillips, 1995; McKay et al., 1996) and Salvelinus (Grewe et al., 1990; McVeigh and Davidson, 1991; Pleyte et al., 1992; Phillips et al., 1994). Analysis of these molecular data has resulted in clarification of the species relationships in these two genera.

Despite all of these studies, major questions remain concerning the relationships among genera, species, subspecies, and populations of these fishes. Salmonid relationships are especially problematic for systematists for the following reasons. First, these fishes underwent a rapid adaptive radiation following tetraploidization around 50-100 million years ago (Allendorf and Thorgaard, 1984). Rapid radiations are characterized by star phylogenies, and branches are supported by only a few shared derived characters (synapomorphies). Second, hybridization and introgression appear to have been quite common in this group (reviewed in Utter and Allendorf, 1994), leading to inconsistencies between maternally inherited characters [such as mitochondrial (mt) DNA] and characters that are biparentally inherited. Finally, recolonization of lakes released from glaciation within the past 10,000 years has resulted in assemblages of sympatric morphotypes in different degrees of reproductive isolation in different northern lakes; these are especially
TABLE I  Scientific and Common Names of Fishes of the Subfamily Salmoninae

Scientific name          Common name          Range
Brachymystax lenok       Lenok                Russia
Hucho hucho hucho        Danube salmon        Danube River
H. hucho taimen          Taimen               Russia
H. perryi                Huchen               Japan
Salvelinus fontinalis    Brook trout          North America
S. namaycush             Lake trout           North America
S. alpinus               Arctic char          Circumpolar regions
S. malma                 Dolly Varden char    North Pacific
S. confluentus           Bull trout           North America
S. leucomaenis           Spotted char         Japan, Russia
Salmo trutta             Brown trout          Europe
S. salar                 Atlantic salmon      Atlantic Ocean
Salmothymus              Adriatic salmon      Eastern Europe
Oncorhynchus mykiss      Rainbow trout        Western North America, Asia
O. clarki                Cutthroat trout      Western North America
O. tshawytscha           Chinook salmon       North Pacific
O. kisutch               Coho salmon          North Pacific
O. keta                  Chum salmon          North Pacific
O. nerka                 Sockeye salmon       North Pacific
O. gorbuscha             Pink salmon          North Pacific
common in fishes of the genus Salvelinus (reviewed in Savvaitova, 1989, 1995), leading to a large number of species and subspecies being named in this genus.

A. Intergeneric Relationships
1. Taxonomy and Status of Proposed Genera

The subfamily Salmoninae contains between five and nine extant genera. There is general agreement regarding the status of five genera: Brachymystax, Hucho, Salvelinus, Salmo, and Oncorhynchus (Table I). Four other genera, Acantholingua, Parahucho, Salmothymus, and Salvethymus, have been proposed, but a consensus has not been reached with respect to their status. All of the disputed genera have been erected for single species with restricted ranges with the exception of Salmothymus, which has been proposed to include from one (Hadzisce, 1961) to as many as five species (Stearley and Smith, 1993). The competing hypotheses for the disputed genera are listed in Table II.

TABLE II  Summary of Disputed Generic Classifications

Disputed genus: Salvethymus
Specific name: svetovidovi
Range: Endemic to the Siberian Lake Elgygytgyn
Hypotheses:
1. Salvethymus is a monotypic genus (Chereshnev and Skopets, 1992)
2. Salvethymus is a subspecies of Salvelinus derived from a sympatric form of Arctic char (Chereshnev and Skopets, 1992; Glubokovsky et al., 1992; reviewed in Glubokovsky and Frolov, 1994)

Disputed genus: Salmothymus
Specific name: From one (obtusirostris) to as many as five species
Range: The species obtusirostris is native to rivers in Bosnia-Herzegovina which flow into the Adriatic Sea
Hypotheses:
1. Salmothymus is a monotypic genus (Hadzisce, 1961)
2. Salmothymus is a subgenus of Salmo (Behnke, 1968)
3. Salmothymus is a genus containing two species: S. ohridana and S. (Acantholingua) obtusirostris (Svetovidov, 1975)
4. Salmothymus is a genus containing as many as five species (Stearley and Smith, 1993)

Disputed genus: Acantholingua
Specific name: ohridana
Range: Endemic to Lake Ohrid in Macedonia
Hypotheses:
1. Acantholingua is a monotypic genus (Hadzisce, 1961)
2. Acantholingua is a subgenus of Salmo (Behnke, 1968)
3. Acantholingua is a subgenus of Salmothymus (Svetovidov, 1975)

Disputed genus: Parahucho
Specific name: perryi
Range: Japan
Hypotheses:
1. Parahucho is a subgenus of Hucho (Vladykov, 1963)
2. Recognition of Parahucho is not necessary, and perryi is a member of the genus Hucho (Jordan and Snyder, 1902; Stearley and Smith, 1993)
3. Parahucho is a monotypic genus (Viktorovsky et al., 1985; Dorofeyeva, 1989; Phillips et al., 1995a)

As a result of recent genetic data (Chereshnev and Skopets, 1992), a consensus has been attained regarding the position of Salvethymus. Salvethymus svetovidovi is an endemic species discovered recently in Lake Elgygytgyn (Chereshnev and Skopets, 1992). Despite its unusual primitive morphology, which led initially to its being placed in a new genus, genetic distances based on allozyme data suggest that it is very closely related to the two forms of Arctic char found in the same lake and it should be considered a subgenus of Salvelinus (reviewed in Glubokovsky and Frolov, 1994; Savvaitova, 1995). The karyotype of S. svetovidovi is highly derived (Frolov, 1992, 1993), which also contradicts the original hypothesis that the fish is more primitive than other extant species of Salvelinus.

The status of another controversial genus, Parahucho, has also been the subject of osteological (reviewed in Holcik, 1988; Dorofeyeva, 1989), chromosomal (Kang and Park, 1973; Viktorovsky et al., 1985; Rab and Liehman, 1982), and allozyme (Osinov, 1991) studies. This controversy involves the relationships of three species: Brachymystax lenok, Hucho hucho sp., and Hucho perryi or Parahucho perryi, referred to as perryi in the following discussion. The three hypotheses are illustrated in Fig. 1. The first possibility (Fig. 1A) is that perryi and H. hucho are closely related species and Parahucho is not a valid genus (Stearley and Smith, 1993). The other two alternatives support the legitimacy of Parahucho. If perryi is not closely related to H. hucho, as in Fig. 1B, this would support recognition of Parahucho. The third possibility (Fig. 1C) is that B. lenok and H. hucho are more closely related than are H. hucho and perryi. If this is the case, placing perryi in the genus Hucho would make Hucho a paraphyletic taxon. Placing perryi in Parahucho would prevent this unnatural grouping.

FIGURE 1  Three hypotheses regarding the relationships among Brachymystax lenok, Hucho hucho, and H. (or Parahucho) perryi. (A) H. hucho and perryi are closely related sibling species. Parahucho is not necessary and perryi should be H. perryi (Stearley and Smith, 1993). (B) H. hucho and perryi are diverged enough that perryi can be placed in its own genus. (C) H. hucho and B. lenok are sister species, so including perryi in Hucho makes Hucho paraphyletic.

Both karyotypic and allozyme data support the third hypothesis. Osinov (1991) calculated genetic distance based on allozymes and found that B. lenok and H. hucho are the closest pair (Table III). Karyotypic data also support this hypothesis because perryi has a very different karyotype (2n = 62) compared with H. hucho (2n = 82-84) and B. lenok (2n = 92), which share karyotypic features not present in perryi (reviewed in Rab et al., 1994). H. hucho is divided into two major subspecies: H. hucho hucho from the Danube in eastern Europe
and H. hucho taimen from Siberia (reviewed in Holcik, 1982a,b). The presence of natural hybrids between H. hucho taimen and B. lenok in the Amur river system reported by Hsieh et al. (1959) is also consistent with a close relationship between these two species.

2. Molecular Data on Disputed Genera

Molecular data are available only for the problem of the controversial genus Parahucho. Collection of fresh specimens of the other disputed genera has been difficult because they are located in remote areas. Museum specimens are available for all of them and could be the focus of future studies using ancient DNA techniques. Data relevant to the Parahucho problem are derived from two nuclear genes: ribosomal DNA (rDNA) and intron C of growth hormone 2 (GH-IIC), as well as mitochondrial DNA restriction fragment length polymorphisms (RFLPs) (Table III). Restriction maps of the ribosomal DNA from B. lenok, H. hucho taimen, and perryi were prepared and a phylogenetic analysis of these data was done (Phillips et al., 1995a) using maximum parsimony as implemented in PAUP (Swofford, 1993). These results (Fig. 2) support the placement of perryi in a separate genus Parahucho and suggest that B. lenok is a sister species to H. hucho taimen. Evidence supporting the genus Parahucho was also obtained from analysis of the sequences of GH-IIC. B. lenok and the two H. hucho subspecies have an 80-bp insertion at base pair 28, which is not found in any other salmonids, including perryi (T. H. Oakley and R. B. Phillips, manuscript in preparation). Genetic distances based on allozymes, mtDNA RFLPs, nuclear rDNA RFLPs, and the nuclear GH-IIC all support the placement of perryi in a separate genus, Parahucho (Table III).

TABLE III  Genetic Distances among the Huchonini

                  lenok/   lenok/   lenok/   taimen/  taimen/  hucho/   blunt/
                  taimen   hucho    perryi   hucho    perryi   perryi   sharp
Allozyme (a)      0.335             0.891             0.755             0.103
mtDNA RFLP (b)    3.25              7.08              6.54              2.24
rDNA RFLP (c)     1.97              3.89              2.68
GH2 C (d,e)       2.45     2.46     4.49     1.11     3.57     3.57
GH2 C (d,f)       1.77     1.77     4.50     1.32     3.58     3.58

(a) Data from Osinov (1991). (b) Data from Shedko and Ginatulina (1993). (c) Data from Phillips et al. (1995a). (d) Growth hormone 2 intron C data from T. H. Oakley and R. B. Phillips (manuscript in preparation). (e) Distances were calculated including the 80-bp insertion. (f) Distances were calculated without the 80-bp insertion.

FIGURE 2  Relationships among Brachymystax, Hucho, and Salvelinus based on maximum parsimony analysis of molecular data from RFLPs of ribosomal DNA. Majority-rule consensus cladogram based on 500 bootstrap replications using the branch and bound search option of PAUP. Numbers represent bootstrap percentages (Phillips et al., 1995a).
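The distance argument for the three taxa with complete data in Table III (lenok, taimen, and perryi) can be sketched as follows. The sketch assumes that the values in each complete column of the table are listed in the row order allozyme, mtDNA RFLP, rDNA RFLP. It only picks the smallest pairwise distance in each data set; that is a simple phenetic comparison, not the parsimony analysis reported in the text, and the function names are illustrative.

```python
# Pairwise distances among lenok, taimen, and perryi, transcribed from Table III.
# Assumption: values in each complete column follow the row order of the table.
TABLE_III = {
    "allozyme": {
        ("lenok", "taimen"): 0.335,
        ("lenok", "perryi"): 0.891,
        ("taimen", "perryi"): 0.755,
    },
    "mtDNA RFLP": {
        ("lenok", "taimen"): 3.25,
        ("lenok", "perryi"): 7.08,
        ("taimen", "perryi"): 6.54,
    },
    "rDNA RFLP": {
        ("lenok", "taimen"): 1.97,
        ("lenok", "perryi"): 3.89,
        ("taimen", "perryi"): 2.68,
    },
}


def closest_pair(pairwise):
    """Return the pair of taxa separated by the smallest distance."""
    return min(pairwise, key=pairwise.get)


def most_distant_taxon(pairwise):
    """Return the single taxon left out of the closest pair (three-taxon case)."""
    taxa = {taxon for pair in pairwise for taxon in pair}
    remaining = taxa - set(closest_pair(pairwise))
    if len(remaining) != 1:
        raise ValueError("expected exactly three taxa")
    return next(iter(remaining))


for marker, pairwise in TABLE_III.items():
    print(marker, closest_pair(pairwise), most_distant_taxon(pairwise))
```

Under these assumptions every marker gives the same answer: lenok and taimen form the closest pair and perryi is the most distant taxon, which is the pattern the text cites in support of Parahucho.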
3. Relationships among Genera

There is a consensus that the genus Oncorhynchus is the most derived genus in the subfamily Salmoninae, followed by the genus Salmo and then the genus Salvelinus (Fig. 3). Four major hypotheses have been proposed for the relationships among Brachymystax, Hucho, and Salvelinus (Fig. 3): (1) each genus is monophyletic and branches off separately from the main stem (Fig. 3A, Norden, 1961); (2) all three genera form a monophyletic group (Fig. 3B, Kendall and Behnke, 1984); (3) Brachymystax, Hucho, and Parahucho form a monophyletic group, and Salvelinus is on a separate branch (Fig. 3C, Dorofeyeva, 1989); and (4) Brachymystax is on the most basal branch, and Hucho (including perryi) and Salvelinus form a separate monophyletic group (Fig. 3D, Stearley and Smith, 1993). Part of the confusion probably stems from the fact that some of the investigators examined only one or two of the species within the genus Hucho, and the species in this genus do not appear to be a monophyletic group as described earlier.

FIGURE 3  Four major hypotheses for relationships among Brachymystax, Hucho, and Salvelinus: (A) Norden (1961), (B) Kendall and Behnke (1984), (C) Dorofeyeva (1989), and (D) Stearley and Smith (1993).
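The statement that Hucho, if it includes perryi, may not be a monophyletic group can be made concrete with a small sketch. Rooted trees are written as nested tuples, and a group counts as monophyletic when some clade contains exactly its members. The two topologies are an illustrative encoding of the verbal descriptions of hypotheses A and C in Fig. 1, not a transcription of the published figure.

```python
def leaves(node):
    """Return the set of terminal taxa below a node."""
    if isinstance(node, str):
        return frozenset([node])
    return frozenset().union(*(leaves(child) for child in node))


def clades(node):
    """Yield the leaf set of every clade (subtree) in a rooted tree."""
    yield leaves(node)
    if not isinstance(node, str):
        for child in node:
            yield from clades(child)


def is_monophyletic(tree, group):
    """True if some clade of the tree contains exactly the taxa in group."""
    return frozenset(group) in set(clades(tree))


# Illustrative encodings (assumed topologies, rooted with an outgroup).
# Hypothesis A: perryi and H. hucho are sister species.
hypothesis_a = ("outgroup", ("B. lenok", ("H. hucho", "perryi")))
# Hypothesis C: B. lenok and H. hucho are sister species.
hypothesis_c = ("outgroup", ("perryi", ("B. lenok", "H. hucho")))

hucho_with_perryi = {"H. hucho", "perryi"}
print(is_monophyletic(hypothesis_a, hucho_with_perryi))
print(is_monophyletic(hypothesis_c, hucho_with_perryi))
```

On the second topology the smallest clade containing both H. hucho and perryi also contains B. lenok, so a genus Hucho that includes perryi fails the test. Removing perryi to Parahucho leaves only groups that pass it.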
Brachymystaxlenok sharp-snoutedform
5t
Brachymystaxlenok blunt-snoutedform Hucho hucho taimen
4. Molecular Data on Relationships among Genera Molecular data relevant to intergeneric relationships have been obtained from sequences of the GH-IIC (T. H. Oakley and R. B. Phillips, manuscript in preparation). Salmonid fishes have at least two unlinked growth hormone genes (GH-I and GH-II). There are six exons and five introns in the GH genes. The two largest introns are C and D, which average 750 bp for GH-IC, 450 bp for GH-IIC, 1100 bp for GH-ID, and 1300 bp for GHIID (Devlin, 1993; Blackhall, 1994). The GH-IIC intron was sequenced from the two morphotypes of B. lenok, the two subspecies of Hucho (H. hucho hucho and H. hucho taimen), P. perryi, three species of Salvelinus, Salmo trutta, Oncorhynchus kisutch, and Oncorhynchus gorbuscha. These sequences were combined with previously published GH-IIC sequences from Salmo salar (Johansen et al., 1989), five other species of Oncorhynchus (Devlin, 1993; D u e t al., 1993), and two sex-linked pseudogenes from O. kisutch and O. tshawytscha (Du et al., 1993; Forbes et al., 1994) and were analyzed with maxim u m parsimony using the PAUP program. The tree based on this analysis is shown in Fig. 4. This analysis shows that 13 synapomorphies support the placement of P. perryi with Salvelinus, Salmo, and Oncorhynchus and suggests that these other genera radiated rapidly. The additional GH intron sequences should contain enough informative sites to resolve these relationships. Although rDNA RFLP data are available for almost all of the species in the Salmoninae, there are very few informative sites for intergeneric relationships (Phillips et al., 1992, 1995a,b). Maximum parsimony analysis of the ITS1 rDNA sequences produced one most parsimonious tree in which Salmo and Oncorhynchus form one clade and Parahucho is on a branch between this clade and Salvelinus (Fig. 5).
FIGURE 4 Intergeneric relationships based on maximum parsimony analysis of molecular data from the GH-IIC sequences. Strict consensus tree of the 12 most parsimonious trees obtained using the branch and bound search option of PAUP. Numbers represent synapomorphies.
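The parsimony criterion behind trees such as Fig. 4 can be illustrated with a minimal sketch of Fitch's algorithm, which counts the fewest state changes a fixed tree requires for each aligned site. This is not the PAUP implementation, and the four-taxon alignment and the names `toy`, `tree_a`, and `tree_b` are hypothetical; they are not GH-IIC data.

```python
def fitch_length(tree, states):
    """Minimum number of state changes for one character on a rooted binary tree.

    tree   -- nested 2-tuples of taxon names, e.g. (("A", "B"), "C")
    states -- dict mapping each taxon name to its observed state (e.g. a base)
    """
    changes = 0

    def visit(node):
        nonlocal changes
        if isinstance(node, str):        # leaf: state set is the observed state
            return {states[node]}
        left, right = (visit(child) for child in node)
        shared = left & right
        if shared:                       # children agree: no change needed here
            return shared
        changes += 1                     # children disagree: count one change
        return left | right

    visit(tree)
    return changes


def tree_length(tree, alignment):
    """Sum the Fitch lengths over all aligned sites (columns)."""
    taxa = list(alignment)
    n_sites = len(alignment[taxa[0]])
    return sum(
        fitch_length(tree, {t: alignment[t][i] for t in taxa})
        for i in range(n_sites)
    )


# Hypothetical four-taxon alignment, for illustration only.
toy = {"out": "AAGT", "sp1": "ACGT", "sp2": "ACTT", "sp3": "GCTA"}
tree_a = ("out", ("sp1", ("sp2", "sp3")))   # sp2 and sp3 as sister taxa
tree_b = ("out", ("sp2", ("sp1", "sp3")))   # sp1 and sp3 as sister taxa

print(tree_length(tree_a, toy), tree_length(tree_b, toy))
```

In this toy case the third site is a synapomorphy uniting sp2 and sp3, so `tree_a` is one step shorter than `tree_b`. A full search, such as the branch and bound option mentioned in the caption, would score every candidate topology in this way.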
B. Relationships among Chars of the Genus Salvelinus
1. Summary of Morphological, Karyological, and Allozyme Data
Behnke (1980) and Cavender (1980) have proposed that the genus Salvelinus includes six major morphologically distinct species. Three major lineages in North America have been designated as subgenera by Behnke (1980) (Fig. 6A). These are the subgenus Cristovomer with one species, S. namaycush (lake trout), confined to lakes in northern North America; the subgenus Baione with one species, S. fontinalis (brook trout), found in
FIGURE 5 Intergeneric relationships based on maximum parsimony analysis of molecular data from the rDNA ITS1. Majority-rule consensus tree based on 500 bootstrap replications using the branch and bound search option of PAUP. Numbers represent bootstrap percentages.
RUTH B. PHILLIPS AND TODD H. OAKLEY
FIGURE 6 Suggested relationships among chars of the genus Salvelinus: (A) Behnke (1984), (B) Stearley (1990), (C) Cavender and Kimura (1989), and (D) Crane et al. (1994).
streams in eastern and southern North America; and the Arctic char complex, named subgenus Salvelinus. The subgenus Salvelinus includes S. alpinus (Arctic char), with a circumpolar distribution in the Arctic; S. malma (Dolly Varden char), which occurs sympatrically with S. alpinus in the North Pacific; and S. confluentus (bull trout), which is found in the North American Rocky Mountains. Initially, S. confluentus was confused with S. malma because they are very similar in external appearance, but a detailed osteological analysis by Cavender (1978, 1980) showed that they are distinct species. In North America, S. malma is found along the coast from the Olympic peninsula through northern Alaska and S. confluentus is found inland from southern Washington to the Yukon Territory (Haas and McPhail, 1991). In eastern Asia, another distinct species, S. leucomaenis (Japanese char), was considered to be more closely related to S. namaycush based on morphology and life history data by Savvaitova (1980) and Viktorovsky (1975), but was placed in the subgenus Salvelinus by Behnke (1965). Many additional char species and subspecies have been named by various Russian researchers (Glubokovsky, 1976; Glubokovsky and Chereshnev, 1982; Glubokovsky and Frolov, 1994), but Savvaitova (1980, 1995) makes a convincing argument that most of these are recently derived forms of the S. alpinus-S. malma complex. Savvaitova (1995) suggests that Salvethymus svetovidovi from Lake Elgygytgyn should be recognized as an additional species in the genus Salvelinus. The description of a fossil (Salvelinus larsoni, formerly Paleolox larsoni) with characteristics of both Parahucho and Salvelinus (Smith et al., 1982) suggested that P. perryi might be a good outgroup for the phylogenetic analysis of Salvelinus species. A close relationship has been confirmed by genetic distances based on allozymes and molecular data.
A cladistic analysis of Salvelinus using osteological characters (Stearley, 1992) has produced the cladogram shown in Fig. 6B. There is weak support for most of the nodes, and the placement of S. namaycush and S. confluentus as sister species is not supported by other data. In fact, a different cladogram for Salvelinus was obtained by Cavender and Kimura (1989) based on osteological data, which places S. confluentus as a sister species to S. leucomaenis from Asia (Fig. 6C). Cavender and Kimura (1989) obtained the same cladogram for Salvelinus using karyotype data as they did with osteological data, and this was similar to the one obtained by Phillips et al. (1989) using karyological data. Crane et al. (1994) completed a comprehensive study of allozymes in this genus and analyzed the data using the Fitch-Margoliash algorithm (Fig. 6D). This study has confirmed a close relationship between S. malma and S. alpinus and between S. leucomaenis and S. confluentus. S. fontinalis was closest to the outgroup, P. perryi, followed by S. namaycush.
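The least-squares idea behind the Fitch-Margoliash method can be sketched as follows: a candidate tree with branch lengths is scored by how closely its path-length (patristic) distances match the observed pairwise distances, with each squared deviation weighted by the inverse square of the observed distance. The tree, branch lengths, and distances below are hypothetical and are not the allozyme data of Crane et al. (1994); a complete implementation would also optimize branch lengths and search over topologies.

```python
def patristic(parent, a, b):
    """Path length between leaves a and b.

    parent -- dict mapping each non-root node to (parent_node, branch_length)
    """
    def dist_to_ancestors(node):
        total, seen = 0.0, {node: 0.0}
        while node in parent:
            up, length = parent[node]
            total += length
            seen[up] = total
            node = up
        return seen

    da, db = dist_to_ancestors(a), dist_to_ancestors(b)
    # The shortest combined path runs through the most recent common ancestor.
    return min(da[n] + db[n] for n in da if n in db)


def fitch_margoliash_ss(parent, observed):
    """Weighted sum of squares: sum of (D - d)^2 / D^2 over all observed pairs."""
    return sum(
        (d_obs - patristic(parent, a, b)) ** 2 / d_obs ** 2
        for (a, b), d_obs in observed.items()
    )


# Hypothetical tree ((A,B)X,(C,D)Y)R with illustrative branch lengths.
tree = {
    "A": ("X", 0.02), "B": ("X", 0.03),
    "C": ("Y", 0.01), "D": ("Y", 0.04),
    "X": ("R", 0.01), "Y": ("R", 0.02),
}
additive = {("A", "B"): 0.05, ("C", "D"): 0.05, ("A", "C"): 0.06,
            ("A", "D"): 0.09, ("B", "C"): 0.07, ("B", "D"): 0.10}
noisy = dict(additive)
noisy[("A", "C")] = 0.08   # one distance that no longer fits this tree

print(fitch_margoliash_ss(tree, additive), fitch_margoliash_ss(tree, noisy))
```

Distances that fit the tree exactly give a score of essentially zero, whereas the perturbed distance raises the score; the preferred tree under this criterion is the one with the lowest value.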
2. Molecular Data from Mitochondrial Genes
Trees based on data from mitochondrial genes differ from those based on allozymes and morphological data. The analysis based on restriction site data of the mtDNA molecule (Grewe et al., 1990) found a close relationship among S. alpinus, S. malma, and S. confluentus, but it did not include the Asian species S. leucomaenis. McVeigh and Davidson (Memorial University, Newfoundland, personal communication) have examined 289 bp of the cytochrome b gene in all six species with P. perryi as an outgroup. Their data suggest that S. malma, S. alpinus, and S. confluentus are very closely related (1- to 2-bp differences), but none of them is closely related to the Asian species S. leucomaenis (13-bp differences from the others). The 351-bp mtND3 gene has been sequenced from all of the Salvelinus species, including one member of each of the subspecies in the S. alpinus-S. malma (Arctic char-Dolly Varden) complex (Phillips et al., 1995b). The ND3 gene is evolving faster than cytochrome b in all of the Salvelinus species and P. perryi (Phillips et al., 1995b). However, almost all of the nucleotide differences among species within the genus are transition mutations, suggesting that the mtDNA is not very diverged. Phylogenetic analysis of the ND3 sequence data confirms the close relationship among S. malma, S. alpinus, and S. confluentus, but S. namaycush is located on a branch next to S. alpinus. A maximum parsimony analysis of all of the available mtDNA data produced a tree in which most of the branches were unresolved, suggesting that recent genetic contact has occurred among these species. In the neighbor-joining tree based on these combined mtDNA data (Fig. 7A), bull trout is placed within the S. alpinus-S. malma complex. These results
FIGURE 7 Relationships among species of the genus Salvelinus based on molecular data. (A) Neighbor-joining tree calculated using MEGA based on the two-parameter distance of Kimura (1980) from the combined sequences of the mtND3 gene and a portion of the cytochrome b gene (data from Domanico and Phillips, 1995; H. P. McVeigh and W. S. Davidson, personal communication). Numbers represent bootstrap percentages based on 500 replications. (B) Majority-rule cladogram based on a maximum parsimony analysis of the sequences from the ITS1 and ITS2 of the nuclear ribosomal DNA (Phillips et al., 1994) using the branch and bound search option of PAUP. Numbers represent bootstrap values based on 500 replications.
conflict with those obtained from allozymes and nuclear DNA sequences from the ribosomal DNA spacers (see later).
3. Molecular Data from Nuclear Genes
When the internal transcribed spacers of the ribosomal DNA were sequenced from the six species of Salvelinus, they were found to be more divergent than the mtDNA ND3 gene in these same species (Phillips et al., 1994; Table IV). These sequences were easy to align because the sequence divergence between any two species pairs did not exceed 7%. Most of the length variation was produced by runs of G's and C's which show
TABLE IV Pairwise Nucleotide Differences in Selected Genes in Salmonid Fishes

                          Selected genes (a)
Species                   mtDNA ND3 (b)   mtDNA cyt b (c)   rDNA ITS1-2 (d,e)   GH-IIC (d,f)
Pink/sockeye              0.091           0.048             0.024, 0.062        0.020, 0.017
Rainbow/sockeye           0.117           0.093             0.037, 0.073        0.028, 0.027
Hucho/lake trout          0.111           0.070             0.87, 0.137         0.020, 0.037
Lake trout/Arctic char    0.034           0.025             0.050, 0.065        0.02, 0.03
Arctic char-NWT/Alaska    0.017           0.010             0.03

(a) mtDNA, mitochondrial DNA; rDNA ITS1-2, ribosomal DNA internal transcribed spacers 1 and 2; GH2, growth hormone 2.
(b) Data from Thomas and Beckenbach (1989) and Domanico and Phillips (1995).
intraspecific variation and were eliminated from the analysis. The single most parsimonious tree based on an analysis of combined data from the ITS1 and ITS2 is shown in Fig. 7B. In this tree there are three pairs of sister taxa: S. alpinus and S. malma, S. confluentus and S. leucomaenis, and S. fontinalis and S. namaycush. Bootstrap values indicate strong support for the three sister groups, but relatively weak support for the arrangement of these subgroups. The same tree topology was obtained using the neighbor-joining and maximum likelihood methods from PHYLIP (Felsenstein, 1992). The topology of this tree is identical to the topology of the trees obtained by Cavender and Kimura (1989) (Fig. 6C) with osteological and karyological data. The tree obtained from the analysis of the sequences of the ribosomal transcribed spacers (Fig. 7B) is similar to the one obtained from allozyme data (Fig. 6D), except that S. fontinalis and S. namaycush are on the same branch rather than on adjacent branches of the tree. It is important to note that strong support was obtained from both ITS1 and ITS2 sequences for a sister relationship between these two species and that synapomorphic sites were found to be distributed fairly evenly along both sequences (Phillips et al., 1994).
The GH-IIC intron in S. namaycush and S. alpinus has been sequenced. The introns of S. namaycush and S. alpinus are 625 bp and have a 171-bp insertion not found in P. perryi, the outgroup species for the genus. The introns of the other species have been amplified and their sizes noted. This insertion is apparently not present in the intron of S. leucomaenis, which is approximately the same size as the intron in P. perryi (425 bp). The sizes of the introns in the other species suggest that the 171-bp insertion is also present in S. malma and S. fontinalis, the sister species to S. alpinus and S. namaycush, respectively, and is absent in S. confluentus, which is consistent with its being the sister species to S. leucomaenis. These data support the tree derived from rDNA data (Fig. 7B), including the hypothesis that S. leucomaenis and S. confluentus are the basal species in Salvelinus.
4. Analysis of the Conflict between Data from Mitochondrial DNA and Nuclear DNA
The trees based on mitochondrial data differ from those based on nuclear genes. The mitochondrial data include RFLPs of the entire molecule (Grewe et al., 1990), 289 bp of the cytochrome b gene (McVeigh and Davidson, 1991, personal communication), and 351 bp from the ND3 gene (Phillips et al., 1995b). In the trees based on these data, S. confluentus (bull trout) is placed either within the S. alpinus-S. malma complex (Fig. 7A) or in an unresolved polytomy with most of the other species, suggesting recent gene flow. This is in contrast to the trees based on nuclear genes (e.g., Fig. 7B), which give strong support to a sister relationship between S. confluentus and S. leucomaenis from Japan. If extensive hybridization had occurred between S. malma and S. confluentus, a small genetic distance between S. confluentus and the species in the S. alpinus-S. malma complex would be expected; this result was obtained for both mtDNA and nuclear genes. Evidence has been obtained for hybridization between S. malma and S. confluentus and also for introgression of mtDNA (McPhail and Taylor, 1995). This is the most likely explanation for the conflict with data from nuclear genes. A hybrid zone between these two species was analyzed using fixed RFLP differences between the two species in two nuclear genes (rDNA ITS1 and the growth hormone gene) and RFLPs in the maternally inherited mtDNA. All of the F1 hybrids had mtDNA from S. confluentus, suggesting unidirectional
hybridization. In addition, some fish that were "pure" S. malma in morphology and nuclear genotypes had the mtDNA of S. confluentus. Introgression and fixation of the mtDNA genome of S. alpinus (Arctic char) in an allopatric population of S. fontinalis (brook trout) have also been reported (Bernatchez et al., 1995). The fish were unambiguously identified as brook trout from morphology and allozyme data. Restriction analysis of the entire mitochondrial genome from 48 fish from Lake Alain with eight enzymes and of the ND5/6 segment with three enzymes confirmed that these fish have the mtDNA typical of S. alpinus in the area, even though the latter species is not present in the watershed today. Finally, evidence for hybridization between S. namaycush and S. alpinus in the high Arctic has been reported (Wilson and Hebert, 1993) and may be the reason for the small genetic distance between these two species based on mtND3 sequences.
In summary, maximum parsimony analysis of combined mtDNA data for all of the Salvelinus species generates a tree in which most of the branches are unresolved, suggesting recent gene flow among these species. In contrast to this result, mtDNA data contained many informative sites for the relationships among Pacific salmon, although the nuclear GH-IIC intron appears to be evolving at the same rate in both genera (Table IV). It is well known that mtDNA can be "captured" by closely related species as a result of introgressive hybridization (reviewed in Avise, 1994). As described earlier, several examples are best explained by unidirectional introgression of mtDNA following hybridization for different Salvelinus species pairs. Therefore, the authors believe that the tree based on the nuclear rDNA sequences is probably the correct one, especially since it is consistent with other genetic data, including allozymes and the GH-IIC intron.
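The logic of combining diagnostic nuclear markers with maternally inherited mtDNA can be sketched as a simple classification rule. The allele codes ("M" for S. malma, "C" for S. confluentus), the function names, and the example genotypes are hypothetical illustrations, not the scoring scheme of the studies cited above; with only two diagnostic loci, a fish heterozygous at both can be called only "F1-like".

```python
def nuclear_class(genotypes):
    """Classify diagnostic nuclear genotypes.

    genotypes -- list of two-letter strings, one per locus, using the
                 hypothetical allele codes 'M' (S. malma) and 'C' (S. confluentus)
    """
    alleles = [set(g) for g in genotypes]
    if all(a == {"M"} for a in alleles):
        return "malma"
    if all(a == {"C"} for a in alleles):
        return "confluentus"
    if all(a == {"M", "C"} for a in alleles):
        return "F1-like"
    return "later-generation hybrid"


def describe(genotypes, mtdna):
    """Combine the nuclear class with the mtDNA type ('M' or 'C')."""
    cls = nuclear_class(genotypes)
    if cls in ("malma", "confluentus") and mtdna != cls[0].upper():
        # Nuclear genome of one species, mitochondrial genome of the other.
        return f"{cls} nuclear genotype with introgressed mtDNA"
    if cls == "F1-like":
        mother = "S. confluentus" if mtdna == "C" else "S. malma"
        return f"F1-like hybrid, mother {mother}"
    return cls


# Hypothetical fish scored at two nuclear loci (e.g., ITS1 and GH) plus mtDNA.
print(describe(["MC", "MC"], "C"))
print(describe(["MM", "MM"], "C"))
```

If every F1-like fish in a sample carries the same maternal type, hybridization is unidirectional in the sense described above, and a fish with a pure nuclear genotype but the other species' mtDNA is a candidate case of mtDNA introgression.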
If hybridization has occurred repeatedly among these chars, it would make it difficult to determine the correct branching pattern from mtDNA data. Analysis of sequence data from other nuclear genes such as the other GH introns should be done to confirm the phylogeny based on rDNA sequences. This is important because there have been a few cases described in other organisms in which concerted evolution of nuclear ribosomal sequences has resulted in only one of the parental rDNA types being retained in hybrids (Hillis et al., 1991; Wendel et al., 1995).
5. Relationships among Taxa in the S. alpinus-S. malma Complex
The relationships among the various populations of S. alpinus and S. malma in the North Pacific to each other and to the other Asian chars have been controversial (reviewed in Behnke, 1980, 1989; Cavender, 1980; Johnson, 1980; Glubokovsky and Frolov, 1994; Savvaitova, 1995). In Europe and Asia, a large number of species and subspecies have been named by different authors (Glubokovsky and Chereshnev, 1982; Glubokovsky and Frolov, 1994), whereas Savvaitova (1980, 1989, 1995) has maintained that these would be more accurately classified as morphotypes in the S. alpinus-S. malma complex. Savvaitova (1995) suggests that the high Arctic, malmoid, and alpinus morphotypes may have evolved repeatedly in many areas throughout the region. One of the reasons for the large number of subspecies is the tendency of S. alpinus to diverge into morphologically and ecologically different forms within a single lake. Lakes with two to four morphologically divergent forms of char are found throughout their circumpolar range. Genetic evidence suggests that most of these forms have diverged since the last glaciation (reviewed in Behnke, 1989; Savvaitova, 1989, 1995). Because very few molecular data are available yet on the Russian chars, this chapter will discuss only North American species.
Behnke (1984) has summarized the evidence for distinct S. alpinus and S. malma species groups, which were probably separated during an early Pleistocene glacial period when the ancestral char was divided into a northern (S. alpinus) and a southern (S. malma) group. The two species, which occur sympatrically throughout the North Pacific basin, differ in morphology and life history and are reproductively isolated. The lack of gene flow between S. alpinus and S. malma has been confirmed by the finding of fixed allozyme differences by Gharrett et al. (1991) between sympatric populations of the two species in Karluk Lake, Alaska, and by Reist et al. (1997) between allopatric Alaskan populations. In addition, the karyotype of S. malma (2n = 82) is different from that of S. alpinus (2n = 78) in regions where the two species are sympatric (Cavender and Kimura, 1989; Phillips et al., 1989; unpublished results).
In North America, S. malma and S. alpinus are further subdivided by Behnke (1984) into several subspecies (Table V). Behnke's (1984) suggestion of a close relationship between eastern North American (S. alpinus oquassa) and western European (S. alpinus alpinus) subspecies is supported by allozyme data that show that the S. alpinus population from New Brunswick is more closely related to the population from Norway than to other forms of S. alpinus (Crane et al., 1994). S. malma, which is almost always anadromous in North America, is divided into a northern (S. malma malma) and southern form (S. malma lordi), which Behnke (1984) and McPhail (1961) believe were isolated by a later glacial episode. The southern form has fewer gill rakers and fewer vertebrae than the northern
TABLE V North American Taxa in the S. alpinus-S. malma Complex

Species       Subspecies    Range
S. alpinus    oquassa       New England lakes
              erythrinus    High Arctic, east of Mackenzie River
              taranetzi     Lakes in Alaska and Russia, west of Mackenzie River
S. malma      lordi         Southcentral Alaska-British Columbia, the "southern" form
              malma         Northern Alaska, Russia, the "northern" form
form, which has a different karyotype (2n = 78) (R. B. Phillips, unpublished observations). Crane et al. (1994) found a fixed difference at one allozyme locus between the northern and southern forms of S. malma in North America. The northern S. malma malma was originally considered a "western Arctic" form of S. alpinus by McPhail (1961) and McPhail and Lindsey (1970). Detailed meristic studies by Morrow (1980) and McCart (1980) and combined allozyme and meristic studies by Reist et al. (1997) suggested that this form is more like S. malma lordi from south central Alaska than the high Arctic S. alpinus erythrinus, although some federal agencies still consider it a form of S. alpinus.
6. Molecular Data from mtDNA and rDNA for the S. alpinus-S. malma Complex
An RFLP analysis of the rDNA from several individuals from at least two populations of each subspecies of S. alpinus showed that there is a restriction site difference between S. alpinus erythrinus and S. alpinus taranetzi in the 5' external spacer (5' ETS) of the ribosomal DNA, which is adjacent to the 18S coding region (Phillips and Pleyte, 1991). In order to further examine subspecific relationships, the two internal transcribed spacers of the rDNA (the ITS1 and ITS2) were sequenced from one individual of each of Behnke's proposed subspecies (Table V) of S. alpinus and S. malma in North America (Phillips et al., 1995b). The sequence of the ITS2 was very conserved and contained no informative sites. There were six phylogenetically informative sites in the ITS1, and a maximum parsimony analysis of the data has produced a tree in which there are three subgroups: (1) S. alpinus alpinus; (2) S. alpinus erythrinus, S. alpinus taranetzi, and S. malma lordi; and (3) S. malma malma and S. albus from the Kamchatka River (Fig. 8). S. albus was included because Behnke (1965) had suggested that it was closely related to S. confluentus. The branch with S. albus and S. malma malma was supported by five synapomorphies (a bootstrap value of 100%) and is the only branch with significant support. This result suggests that S. albus from the Kamchatka River is closely related to S. malma malma and is not a sister species to S. confluentus as Behnke (1984) hypothesized. This result is also consistent with allozyme data (Osinov, personal communication), karyological data (Frolov, 1991), and osteological data (Glubokovsky and Chereshnev, 1982). The authors are currently obtaining additional sequence data from the ITS1 of Russian chars, which should allow them to evaluate the various hypotheses regarding Asian and North American species.
FIGURE 8 Relationships among members of the S. alpinus-S. malma complex based on molecular data. Majority-rule cladogram based on a maximum parsimony analysis of sequences of the nuclear ribosomal ITS1 using the branch and bound search option of PAUP. Numbers are bootstrap values based on 500 replications (Phillips et al., 1995a).
Because the North American populations are closely related, it was thought that a rapidly evolving region such as the ND3 gene from the mtDNA might be more appropriate for determining subspecific relationships. The 351-bp mtND3 gene was sequenced from each of the six different subspecies of the S. alpinus-S. malma complex (Phillips et al., 1995b). Results show that the sequence of the ND3 gene from each of the three geographically separate subspecies of S. alpinus differs from the others at 3-5 bp, but the sequences of all of the subspecies of S. alpinus and S. malma from northwestern North America are identical, except for that of the southern S. malma lordi, which differs at 1 bp from the others. Thus there is support for the subspecific designation of the two forms of S. malma from both nuclear (ITS1) and mitochondrial genes (ND3). It is surprising that the sequences of the rDNA spacers are more diverged than the mitochondrial ND3 gene in the Salvelinus species. This is in contrast to the Oncorhynchus species, in which the ND3 gene is evolving three times faster than the rDNA spacers (Table IV). The authors propose that the rate of evolution of rDNA spacers may be faster in species (e.g., S. alpinus) with
the rDNA at multichromosomal locations compared to species (e.g., Oncorhynchus sp.) in which all of the copies of the rDNA are found at one chromosomal location (Phillips et al., 1995b). However, the apparently slower rate of evolution of the mtDNA in Salvelinus may also be the result of introgression of mtDNA. The rate of sequence divergence in the GH-IIC intron is similar in Oncorhynchus and Salvelinus (Table IV), so both hypotheses may be correct.
Analysis of mtDNA of sympatric morphotypes of S. alpinus found in the same lake has also revealed that such morphotypes are very closely related. The four morphotypes found in Lake Thingvallavatn, Iceland, share polymorphisms in mtDNA (Volpe, personal communication). The benthic and pelagic morphotypes from Loch Rannoch had identical cytochrome b sequences. An RFLP analysis of the mtDNA genome revealed that the two morphotypes were identical except for a polymorphism with HindIII in which the A pattern was fixed in the benthic fish and 90% of the pelagic fish had the B pattern (Hartley et al., 1992).
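The contrast in relative rates between the two genera can be checked directly against the values in Table IV. The sketch below assumes that the two numbers in the "rDNA ITS1-2" column refer to ITS1 and ITS2, in that order; the table itself does not state this, and the dictionary and function names are illustrative.

```python
# Pairwise differences copied from Table IV; the ITS1/ITS2 assignment of the
# paired rDNA values is an assumption made for this illustration.
table_iv = {
    "Pink/sockeye":           {"ND3": 0.091, "ITS1": 0.024, "ITS2": 0.062},
    "Rainbow/sockeye":        {"ND3": 0.117, "ITS1": 0.037, "ITS2": 0.073},
    "Lake trout/Arctic char": {"ND3": 0.034, "ITS1": 0.050, "ITS2": 0.065},
}


def nd3_to_spacer_ratios(row):
    """Ratio of ND3 divergence to the divergence of each rDNA spacer."""
    return {spacer: row["ND3"] / row[spacer] for spacer in ("ITS1", "ITS2")}


for comparison, row in table_iv.items():
    ratios = nd3_to_spacer_ratios(row)
    print(comparison, {k: round(v, 2) for k, v in ratios.items()})
```

Under this reading, ND3 divergence is more than three times the ITS1 divergence for both Oncorhynchus comparisons, whereas for the Salvelinus comparison (lake trout/Arctic char) the ratio falls below one for both spacers.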
C. Relationships among Pacific Trouts of the Genus Oncorhynchus
1. Summary of Morphological, Karyological, and Allozyme Data
Pacific trouts were originally classified in the genus Salmo, although morphological and life history similarities with the Pacific salmon were noted by Regan (1914). As a solution to this problem, Vladykov (1963) proposed that Pacific trouts be included in a new subgenus, Parasalmo. Pacific trouts and Pacific salmon form a monophyletic group in trees based on allozyme data (Tsuyuki and Roberts, 1966). A reconsideration of all the data led Smith and Stearley (1989) to propose inclusion of the Pacific trouts in the genus Oncorhynchus; this change was accepted by the AFS-ASIH Committee on Names of Fishes in 1988. The relationships among Pacific trouts have been reviewed (Behnke, 1992; Smith and Stearley, 1989; Stearley and Smith, 1993; Utter and Allendorf, 1994). A summary of the data, referred to as a "consensus tree" (Fig. 9), has been prepared by Utter and Allendorf (1994), although they did not use any algorithm in its construction. Morphological data support the monophyly of the four subgroups of Oncorhynchus clarki (cutthroat trout), which together form a sister group to Oncorhynchus mykiss (rainbow trout) and its allied species. However, allozyme data place O. mykiss within a group of O. clarki lewisi (westslope cutthroat trout), O. clarki henshawi (Lahontan cutthroat trout), and O. clarki clarki (coastal cutthroat trout) that is distinct from the O. clarki bouvieri (Yellowstone cutthroat trout) group (Allendorf and Leary, 1988). This has been interpreted as the result of introgression between O. mykiss and O. clarki. Among the three major karyotypic groups in O. clarki, the coastal group has 2n = 68, the westslope group has 2n = 66, and the other two groups have 2n = 64 (reviewed in Behnke, 1992). The diploid chromosome number of O. mykiss varies from 2n = 58 to 2n = 64, with the populations having 2n = 64 found only on the California coast (Thorgaard, 1983). Morphological data have shown that O. apache (Apache trout), O. gilae (Gila trout), and O. chrysogaster (Mexican golden trout) have primitive characters not present in O. mykiss (reviewed in Smith and Stearley, 1989), whereas allozyme data place them all together in a monophyletic group.
FIGURE 9 Consensus tree of relationships among species in the genus Oncorhynchus (from Utter and Allendorf, 1994). No mathematical algorithm was used in the construction of this tree.
2. Molecular Data from mtDNA Genes
Mitochondrial RFLP data support the monophyly of the O. clarki group (R. N. Williams, Boise State University, personal communication), but place O. mykiss in a group with two Pacific salmon: O. kisutch (coho salmon) and O. tshawytscha (chinook salmon) (Thomas et al., 1986; McVeigh and Davidson, 1991). Shedlock et al. (1992) have sequenced the D-loop of Pacific trout and salmon but did not find any synapomorphies between O. clarki and O. mykiss. However, examination of transition/transversion data (Table VI) suggests that the mtDNA D-loop may be too diverged to be useful for phylogenetic analysis of the genus Oncorhynchus. The percentage sequence divergence is similar for all pairwise comparisons, even if only transversions are used. Data in Table VI show that the protein coding genes, ND3 and ATPase6, are evolving more slowly and that O. clarki and O. mykiss form a clade in the tree based on these data (Thomas and Beckenbach, 1986).
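The transition/transversion (Ts/Tv) comparison referred to above can be reproduced from the counts in Table VI. The sketch uses only the four species comparisons for which all four genes have an entry in the table; the dictionary and function names are illustrative. A low Ts/Tv ratio relative to other genes is one commonly used indication that transitions have become saturated by multiple substitutions at the same site.

```python
# (transitions, transversions) copied from Table VI for the comparisons in
# which all four genes are reported.
table_vi = {
    "Rainbow/coho":    {"D-loop": (19, 9),  "ATPase6": (36, 8),
                        "ND3": (28, 8), "Cytochrome b": (11, 1)},
    "Rainbow/pink":    {"D-loop": (37, 19), "ATPase6": (44, 13),
                        "ND3": (35, 6), "Cytochrome b": (20, 7)},
    "Rainbow/sockeye": {"D-loop": (28, 15), "ATPase6": (36, 13),
                        "ND3": (31, 7), "Cytochrome b": (13, 3)},
    "Pink/sockeye":    {"D-loop": (25, 19), "ATPase6": (37, 12),
                        "ND3": (27, 5), "Cytochrome b": (14, 9)},
}


def ts_tv_ratios(counts):
    """Transition/transversion ratio for each gene in one species comparison."""
    return {gene: ts / tv for gene, (ts, tv) in counts.items()}


for comparison, counts in table_vi.items():
    ratios = ts_tv_ratios(counts)
    lowest = min(ratios, key=ratios.get)
    print(comparison, {g: round(r, 2) for g, r in ratios.items()}, "lowest:", lowest)
```

In each of these four comparisons the D-loop has the lowest Ts/Tv ratio of the four genes, which is consistent with the suggestion that it is too diverged for this level of analysis.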
3. Molecular Data from Nuclear Genes
Phillips et al. (1992) have examined RFLPs in the nuclear ribosomal DNA from 17 salmonid species including O. mykiss and O. clarki and found one site shared by these two species but not by Pacific salmon; however, this site is also shared by two species of char. Blackhall (1994) sequenced intron D of both growth hormone genes from O. mykiss, O. clarki bouvieri, and O. clarki lewisi. A phylogenetic analysis was done on sequence data from these three taxa along with two Pacific salmon (O. kisutch and O. tshawytscha) and Salmo salar (Atlantic salmon). The GH-I introns form one monophyletic group and the GH-II introns form another group. This is expected because salmonids are ancestral tetraploids (reviewed in Allendorf and Thorgaard, 1984). Within each group of introns, there is support for an O. mykiss-O. clarki clade with bootstrap values of 93 and 94%. Not enough informative sites were available to resolve the relationships within the Pacific trout group.
D. Relationships among the Pacific Salmon of the Genus Oncorhynchus
1. Summary of Morphological, Karyological, and Allozyme Data
The major point of uncertainty in the phylogeny of the Pacific salmon is the relationship of the three most derived species, shown as a trichotomy in the consensus tree prepared by Utter and Allendorf (1994). Trees based on morphological data (Stearley, 1992) and allozyme data (Utter et al., 1973) support a sister relationship between O. gorbuscha (pink salmon) and O. nerka (sockeye salmon), whereas life history and the mtDNA RFLP studies (see later) support a sister relationship between O. gorbuscha (pink salmon) and O. keta (chum salmon) (reviewed in Smith, 1992).
2. Molecular Data from mtDNA
In contrast to morphological and allozyme data, phylogenetic analysis of RFLPs of the mtDNA from the five Pacific salmon species of North America (Thomas et al., 1986) gave strong support to a sister relationship between O. gorbuscha and O. keta, with five out of seven synapomorphies supporting this relationship. Unlike the RFLP results, analysis of data from the mt D-loop generated by Shedlock et al. (1992) gave weak support to a sister relationship between O. gorbuscha and O. keta. Because one would expect congruence among trees based on data from the same molecule (mtDNA), sequence data from two additional mtDNA genes were examined (Domanico and Phillips, 1995). Because mitochondrial genes evolve rapidly, Ts/Tv ratios and percentage sequence divergence between outgroups and closely related species were compared
TABLE VI Pairwise Nucleotide Differences in mtDNA Genes between Pacific Salmon Species (Transitions/Transversions)

                    Species comparison
Gene                Rainbow/coho   Rainbow/pink   Rainbow/sockeye   Chinook/coho   Pink/sockeye   Pink/chum
D-loop (a)          19/9           37/19          28/15             20/10          25/19          40/23
ATPase6 (b)         36/8           44/13          36/13             28/5           37/12          26/1
ND3 (b)             28/8           35/6           31/7              17/4           27/5           31/2
Cytochrome b (c)    11/1           20/7           13/3                             14/9

(a) Data from Shedlock et al. (1992).
(b) Data from Thomas and Beckenbach (1989) and Domanico and Phillips (1995).
(c) Data from McVeigh and Davidson (1991).
(Table VI) to determine which genes would be best for a combined analysis. Inspection of data from the D-loop showed that the percentage sequence divergence was similar for all taxa throughout the matrix, suggesting saturation due to multiple hits. The D-loop analysis was based on sequences from only one individual per species, and considerable intraspecific variation in the D-loop sequences of O. mykiss has been found (J. Nielsen, personal communication). Sequences with a high frequency of intraspecific variation are probably not appropriate for interspecific phylogenetic analysis unless large numbers of individuals are sampled for each taxon, so these data were not used in the combined analysis of mtDNA sequence data. Thomas and Beckenbach (1989) had sequenced several genes including ATPase6, ND3, and COII from most of the Pacific salmon but did not include O. keta. Examination of Ts/Tv ratios and the number of informative sites found in these genes suggested that the ND3 and ATPase6 genes might be the most useful for a phylogenetic analysis of Oncorhynchus. A 512-bp portion of the ATPase6 gene from all six species was sequenced, including both Asian and North American representatives of O. gorbuscha, O. nerka, and O. keta. The ND3 gene was also sequenced from Asian and North American representatives of the three derived species and a combined analysis was done on these data (Domanico and Phillips, 1995). The results are in agreement with RFLP data and give strong support for the sister relationship between O. gorbuscha and O. keta (Fig. 10A). In the calculation of the neighbor-joining tree using MEGA (Kumar et al., 1993) and the maximum likelihood tree using PHYLIP (Felsenstein, 1992), corrections were made for the Ts/Tv ratio for the ND3 gene, which is evolving a little faster than ATPase6 in the Oncorhynchus species. If this correction is not made, ND3 data give weak support to the O. gorbuscha-O. nerka clade.
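A distance that treats transitions and transversions separately, such as the two-parameter distance of Kimura (1980) used for the neighbor-joining trees in this chapter, can be computed from pairwise counts as sketched below. The example uses the pink/sockeye ND3 counts from Table VI (27 transitions, 5 transversions) and assumes, as an illustration, that they were scored over the full 351-bp ND3 gene; the function names are hypothetical and this is not the MEGA implementation.

```python
import math


def p_distance(ts, tv, n_sites):
    """Uncorrected proportion of sites that differ."""
    return (ts + tv) / n_sites


def k2p_distance(ts, tv, n_sites):
    """Kimura (1980) two-parameter distance.

    d = -(1/2) * ln[(1 - 2P - Q) * sqrt(1 - 2Q)], where P and Q are the
    proportions of sites showing transitions and transversions.
    """
    p = ts / n_sites
    q = tv / n_sites
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


# Pink/sockeye ND3 counts from Table VI; the 351-bp length is an assumption.
ts, tv, n_sites = 27, 5, 351
print(round(p_distance(ts, tv, n_sites), 3))    # about 0.091
print(round(k2p_distance(ts, tv, n_sites), 3))  # about 0.099
```

The uncorrected value agrees with the pink/sockeye ND3 entry in Table IV (0.091), and the corrected distance is somewhat larger because the formula allows for multiple substitutions at the same site.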
An analysis of RFLP data from Russian species (Ginatulina et al., 1988) gave different rates of divergence among the three advanced salmon compared to a similar study on North American salmon (Thomas et al., 1986). This result and the fact that mtDNA divergence times based on 1-2% sequence divergence per million years were not consistent with estimates of species divergence based on fossil evidence (reviewed in Smith, 1992) led Smith to suggest that hybridization and introgression had occurred in opposite directions in Asia and North America. However, our sequence data have not provided any support for this hypothesis. We have found that the sequences of the ND3 and ATPase6 genes were either identical or 1 bp different between Asian and North American representatives of the same species. More recent work on mtDNA divergence rates in fishes (reviewed in Martin et al., 1992)
[Figure 10: two tree diagrams (panels A and B) with bootstrap values at the nodes; taxa labeled include O. masou, O. mykiss, O. kisutch, O. tshawytscha, O. gorbuscha, O. keta, and O. nerka.]
FIGURE 10 Relationships among members of the Pacific salmon based on molecular data. (A) This tree was based on a maximum parsimony analysis of sequence data from the ITS1 and ITS2 of the rDNA using PAUP (bootstraps below the node) (Domanico, 1994) and a neighbor-joining analysis of the two-parameter distance of Kimura (1980) based on combined data from the mtND3 and ATPase6 genes using MEGA (bootstraps above the node) (Domanico and Phillips, 1995). (B) Maximum parsimony tree based on combined sequence data from nuclear genes (GH-IIC and GH-IID; ITS1 and ITS2 of the rDNA) and mitochondrial genes (ND3, ATPase6) (Domanico et al., 1996). Bootstraps above the node are from a neighbor-joining analysis (MEGA) whereas bootstraps below the node are based on a maximum parsimony analysis using PAUP. Bootstrap values are based on 500 replications.
indicates that rates of evolution may be up to five times slower in fishes, so that fossil data suggesting a minimum time of 5.5-5.7 million years divergence for these Pacific salmon species are consistent with mtDNA sequence data.

3. Molecular Data from Nuclear Genes
A phylogenetic analysis of restriction map data on the ribosomal DNA for the Pacific salmon (Phillips et al., 1992) produced very weak support for a sister relationship between O. gorbuscha and O. nerka. In order to obtain more definitive data, the ribosomal transcribed spacers (the ITS-1 of 575 bp and the ITS-2 of 375 bp) were sequenced for the five species of Pacific salmon. The aligned sequences were analyzed using maximum parsimony with O. masou (masu salmon) or
O. mykiss (rainbow trout) as the outgroup. One best tree (Fig. 9A) was obtained pairing O. kisutch and O. tshawytscha, and O. gorbuscha and O. keta. Strong support for the gorbuscha-keta relationship was also obtained from a phylogenetic analysis of data from the GH-IID intron. The intron ranged in size from 1166 to 1376 bp in these species (with the exception of the O. masou intron, which is 634 bp) (McKay et al., 1996). Thus sequence data from nuclear genes support combined mtDNA data in suggesting that O. gorbuscha and O. keta are sister species. A phylogenetic analysis was done on a combined data set of 3702 bp of aligned sequence from nuclear and mitochondrial genes including data from the internal transcribed spacers (ITS1 and ITS2) of the nuclear ribosomal DNA, the sequences of two introns of the second nuclear growth hormone gene (GH-IIC and GH-IID) (Devlin, 1993; Du et al., 1993; Forbes et al., 1994; McKay et al., 1996), and the sequences of the ATPase6 and ND3 genes of the mtDNA (Domanico and Phillips, 1995). This analysis gives strong support to the tree obtained from the mitochondrial sequence data (Domanico et al., 1996) (Fig. 10B). Support for the O. gorbuscha-O. keta sister relationship is also obtained by analysis of three families of salmonid tRNA-derived SINE sequences (Kido et al., 1991). The HpaI family is found in all salmonids, the FokI family is found only in species of the genus Salvelinus, and the SmaI family (Kido et al., 1991) is found only in O. gorbuscha and O. keta. The consensus phylogenetic tree obtained for Pacific salmon was also supported by an analysis of subfamilies of the HpaI sequences and their genomic locations (Murata et al., 1993, 1996). For example, the HpaI-51 subfamily was found only in O. kisutch and O. tshawytscha, whereas the Hpa-19 subfamily was found only in O. gorbuscha, O. keta, and O. nerka. Finally, the O.
gorbuscha-O. keta sister relationship is also supported by shared life history traits (reviewed in Stearley, 1992). Karyotype data are inconclusive. Although the chromosome number of O. gorbuscha (2n = 52) is closer to that of O. nerka (2n = 56) than to that of O. keta (2n = 74) (Simon, 1963), chromosome arm homologies are not known. Reexamination of morphological (Stearley, 1992) and allozyme data (Utter et al., 1973) reveals that the results were based on only a few informative characters so that the trees were not strongly supported. There was only one synapomorphy between O. gorbuscha and O. nerka in allozyme data, and the four synapomorphies in the morphological data set involved traits that were age, size, or trophic dependent. The one synapomorphy in the rDNA RFLP data involved an ApaI site in the intergenic spacer region (IGS). ApaI recognizes the sequence GGGCCC, and intraspecific variation has been found in runs of these bases in the
ITS1 and ITS2, which are now routinely eliminated from interspecific comparisons. Examination of the GH-IIC sequences revealed that there were very few informative sites for the relationships among the three advanced Pacific salmon (Domanico, 1994), although the longer GH-IID intron contained enough informative sites to generate a tree with high bootstrap values (McKay et al., 1996). It has been concluded that the three species have branched off closely in time, but that O. gorbuscha and O. keta are sister species. The hypothesis that the three species diverged within a short time period is in agreement with the fossil data of Smith et al. (1982), which show that lineages resembling each of the three modern species (O. gorbuscha, O. nerka, and O. keta) were present 5-7 million years ago.
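The notion of an "informative site" used in this section can be made concrete with a short sketch. Under the usual parsimony criterion, an alignment column is informative only if at least two different character states each occur in at least two taxa; a state found in a single taxon cannot favor one tree over another. The function names, taxon labels, and sequences below are hypothetical and serve only as an illustration; the sketch assumes an alignment of equal-length sequences and ignores gaps and ambiguity codes.

```python
from collections import Counter


def is_parsimony_informative(column):
    """True if at least two states each occur in at least two taxa."""
    counts = Counter(base for base in column if base in "ACGT")
    shared_states = [state for state, n in counts.items() if n >= 2]
    return len(shared_states) >= 2


def informative_sites(alignment):
    """Return 0-based positions of parsimony-informative columns.

    `alignment` maps taxon name -> aligned sequence (all the same length).
    """
    seqs = [seq.upper() for seq in alignment.values()]
    if len({len(seq) for seq in seqs}) != 1:
        raise ValueError("all sequences must have the same aligned length")
    return [
        position
        for position, column in enumerate(zip(*seqs))
        if is_parsimony_informative(column)
    ]


# Hypothetical toy alignment, not real sequence data.
toy_alignment = {
    "taxon_A": "ACGTAC",
    "taxon_B": "ACGTAT",
    "taxon_C": "ATGCAT",
    "taxon_D": "ATACAC",
}

if __name__ == "__main__":
    print(informative_sites(toy_alignment))  # [1, 3, 5]
```

In the toy alignment, position 2 is variable but uninformative because the variant state occurs in only one taxon. A short intron with few sites of the informative kind will tend to give poorly supported trees however many variable sites it contains.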
II. Conclusion

A major concern in molecular systematics is to find genes or gene regions that are evolving at rates appropriate for the systematic problems in question (Graybeal, 1994). Table IV shows a comparison of sequence divergence for several nuclear and mitochondrial genes in salmonid fishes. The nuclear ribosomal DNA spacer sequences are most useful at the interspecific and subspecific levels; the growth hormone introns are more appropriate for more distantly related species and intergeneric relationships. The rDNA spacer regions appear to be evolving at a faster rate in the genus Salvelinus, which has a multichromosomal location of rDNA compared with Oncorhynchus species, which have all of the copies at one chromosomal location. Among the protein-coding genes of the mitochondrial DNA, the ND3 and ATPase6 genes contain more informative sites for closely related species than the cytochrome b gene and are useful at the interspecific level if corrections for Ts/Tv ratios are made. Evaluation of mtDNA D-loop data from the Pacific salmon suggests that this region is too saturated with homoplasic nucleotide changes to be useful for interspecific comparisons.

Molecular data have been used successfully to resolve several systematic problems in the Salmoninae. The genus Parahucho has been validated and relationships among species in Salvelinus and Oncorhynchus clarified. In general, nuclear DNA data have been in agreement with allozyme data in cases where enough informative sites are available. Several areas of disagreement between mtDNA and nuclear data have been found. These cases can best be explained by hybridization and introgression. When hybridization occurs infrequently but over a long period, the result can
be more dramatic on the mitochondrial genome, which is inherited by all of the offspring. This has apparently happened in the chars. The sequences of the mitochondrial genes of several of the North American species are more similar than would be expected from data from nuclear genes. However, if introgression occurred in the past, but allopatric populations have been separate for some time, one might get a better separation of subspecies using mtDNA data. This may be the case with Pacific trouts.

If accurate phylogenies can be constructed from molecular data, these phylogenies can be used to infer patterns of evolution of other characters, such as morphological, ontogenetic, physiological, cytogenetic, or life history traits (Brooks and McLennan, 1991), all of which are capable of rapid change. Stearley (1992) has discussed the evolution of life history traits in salmonids based on a phylogeny derived from osteological data. He concludes that there was a freshwater origin of the salmonid fishes and a general trend toward anadromy. Although the authors' tree based on molecular data differs from Stearley (1992) in some details on the species relationships within Salvelinus and within Oncorhynchus, there are many similarities in the two trees, and the authors' tree supports an even closer correlation between phylogeny and life history traits. In fact, P. perryi is the only anadromous species among the huchen, and the authors' analysis of molecular data indicates that it belongs in a separate genus, Parahucho, which is closely related to the other more advanced genera. R. B. Phillips and P.
Rab (manuscript in preparation) discuss the implications for chromosome evolution in the salmonids, assuming that the phylogeny based on current molecular data is correct. There is a general tendency toward smaller diploid chromosome numbers in the more derived taxa with a few exceptions (reviewed in Hartley, 1987). This is consistent with a hypothesis based on data from mammals (Qumsiyeh, 1994), which suggests that smaller diploid numbers may be associated with more constant environments (adult life in the marine compared to freshwater environment in salmonid fishes). Evidence shows that genetic recombination is reduced in organisms with smaller diploid numbers and this could be an adaptive trait for fish that spend most of their lives in the more constant marine environment. There are a few cases in which major chromosome rearrangements have occurred in one of a pair of closely related species. One example is that of S. trutta (2n = 80), usually confined to fresh water, and S. salar (2n = 56), an anadromous species. Another is that of Salvelinus elgyticus (2n = 76-77) and S. svetovidovi (2n = 62), which are found in Lake Elgygytgyn. Several other examples of karyotypic divergence among sympatric morphotypes have been reported and may be the result of selection for reproductive-isolating mechanisms between incipient species. Future work in molecular systematics should provide a framework for the elucidation of the mechanisms involved in the evolution of chromosomes, life histories, and other traits.

References
Allendorf, F. W., and Leary, R. F. 1988. Conservation and distribution of genetic variation in a polytypic species, the cutthroat trout. Conserv. Biol. 2:170-184.
Allendorf, F. W., and Thorgaard, G. H. 1984. Tetraploidy and the evolution of salmonid fishes. In "Evolutionary Genetics of Fishes" (B. J. Turner, ed.), pp. 1-53. Plenum Press, New York.
Avise, J. C. 1994. "Molecular Markers, Natural History and Evolution." Chapman and Hall, New York.
Balon, E. K. 1980. Comparative ontogeny of charrs. In "Charrs: Salmonid Fishes of the Genus Salvelinus" (E. K. Balon, ed.), pp. 703-720. Junk Publishers, The Hague.
Behnke, R. J. 1965. "A Systematic Study of the Family Salmonidae with Special Reference to the Genus Salmo." Doctoral dissertation, University of California, Berkeley.
Behnke, R. J. 1968. A new subgenus and species of trout: Platysalmo platycephalus from southcentral Turkey with comments on the classification of the subfamily Salmoninae. Mitteilungen Hamburgischen Zool. Museum Instit. 66:1-15.
Behnke, R. J. 1980. A systematic review of the genus Salvelinus. In "Charrs: Salmonid Fishes of the Genus Salvelinus" (E. K. Balon, ed.), Vol. 1, pp. 441-481. Junk Publishers, The Hague.
Behnke, R. J. 1984. Organizing the diversity of the Arctic char complex. In "Biology of the Arctic Charr" (L. Johnson, R. McV. Clarke, and K. E. Marshall, eds.). Proceedings of the International Symposium on Arctic Charr, University of Manitoba Press, Winnipeg.
Behnke, R. J. 1989. Interpreting the phylogeny of Salvelinus. Physiol. Ecol. Japan Special Vol. 1:35-48.
Behnke, R. J. 1992. "Native Trout of Western North America." Am. Fish Mon. No. 6.
Bernatchez, L., Glemet, H., Wilson, C. C., and Danzmann, R. G. 1995. Introgression and fixation of Arctic char (Salvelinus alpinus) mitochondrial genome in an allopatric population of brook trout (Salvelinus fontinalis). Can. J. Fish. Aquat. Sci. 52:179-185.
Blackhall, W. J. 1994. "A Molecular Study of Introgression between Westslope Cutthroat Trout and Rainbow Trout." M.S. dissertation, University of Alberta, Edmonton, Canada.
Brooks, D. R., and McLennan, D. A. 1991. "Phylogeny, Ecology, and Behavior." University of Chicago Press, Chicago, IL.
Campton, D. E., and Utter, F. M. 1985. Natural hybridization between steelhead trout (Salmo gairdneri) and coastal cutthroat trout (Salmo clarki clarki) in two Puget Sound streams. Can. J. Fish. Aquat. Sci. 42:110-119.
Cavender, T. M. 1978. Taxonomy and distribution of the bull trout Salvelinus confluentus (Suckley) from the American Northwest. Calif. Fish Game 3:139-174.
Cavender, T. M. 1980. Systematics of Salvelinus from the North Pacific Basin. In "Charrs: Salmonid Fishes of the Genus Salvelinus" (E. K. Balon, ed.), Vol. 1, pp. 295-322. Junk Publishers, The Hague.
Cavender, T. M. 1984. Cytotaxonomy of North American Salvelinus. In "Biology of the Arctic Charr" (L. Johnson and B. L. Burns, eds.), pp. 431-445. Manitoba Press, Winnipeg.
Cavender, T. M., and Kimura, S. 1989. Cytotaxonomy and interrelationships of Pacific basin Salvelinus. Physiol. Ecol. Japan Special Vol. 1:49-68.
Chereshnev, I. A. 1982. The taxonomic status of sympatric diadromous charrs of the genus Salvelinus (Salmonidae) from eastern Chukotka. J. Ichthyol. 22:22-38. [English translation of Voprosy Ikhtiologii]
Chereshnev, I. A., and Skopets, M. B. 1992. The biology of the charr fishes in the Elgygytgyn Lake. In "The Nature of the Elgygytgyn Hollow," pp. 105-127. FEB Russian Acad. Sci., Magadan.
Crane, P. A., Seeb, L. W., and Seeb, J. E. 1994. Genetic relationships among Salvelinus species inferred from allozyme data. Can. J. Fish. Aquat. Sci. 51(Suppl. 1):182-197.
Devlin, R. H. 1993. Sequence of sockeye type 1 and type 2 growth hormone genes and the relationships of rainbow trout with Atlantic and Pacific salmon. Can. J. Fish. Aquat. Sci. 50:1738-1748.
Domanico, M. J. 1994. "Phylogenetic Analysis of Pacific Salmon Using Nuclear and Mitochondrial DNA Sequences." Ph.D. thesis, University of Wisconsin-Milwaukee, Milwaukee, WI.
Domanico, M. J., and Phillips, R. B. 1995. Phylogenetic analysis of the Pacific salmon (genus Oncorhynchus) based on mitochondrial DNA sequence data. Mol. Phylogenet. Evol. 4:366-371.
Domanico, M. J., Phillips, R. B., and Oakley, T. H. 1997. Phylogenetic analysis of the Pacific salmon (genus Oncorhynchus) using nuclear and mitochondrial DNA sequences. Can. J. Fish. Aquat. Sci. in press.
Dorofeyeva, E. A. 1989. The basic principles of classification and phylogeny of the salmonid fishes (Salmoniformes: Salmonoidei: Salmonidae). In "Biology and Phylogeny of Fishes" (V. M. Korovinoi, ed.), pp. 5-16. USSR Academy of Sciences, Proceedings of the Zoological Institute 201, St. Petersburg. [In Russian]
Du, S. J., Devlin, R. H., and Hew, C. L. 1993. Genomic structure of growth hormone genes in chinook salmon (Oncorhynchus tshawytscha): Presence of two functional genes GH-1 and GH-2 and a male-specific pseudogene, GH-chi. DNA Cell Biol. 12:739-751.
Felsenstein, J. 1992. "PHYLIP Phylogeny Inference Package, Version 3.5." University of Washington, Seattle, WA.
Forbes, S. H., Knudsen, K. L., North, T. W., and Allendorf, F. W. 1994. One of the two growth hormone genes in coho salmon is sex-linked. Proc. Natl. Acad. Sci. USA 91:1628-1631.
Frolov, S. V. 1991. Karyotypes of Salvelinus malma and S. leucomaenis from North Primorye. Chrom. Inform. Serv. 52:11-14.
Frolov, S. V. 1992. Karyotype and chromosomal variability in smallmouth char Salvelinus elgyticus from Lake Elgygytgyn. J. Ichthyol. 32:61-66.
Frolov, S. V. 1993. An extraordinarily unique karyotype of an endemic salmonid fish Salvethymus svetovidovi. Tsitologia 329:363-64. [In Russian]
Gharrett, A. J., Goto, A., and Yamazaki, F. 1991. A note on the genetic contrast of sympatric Dolly Varden (Salvelinus malma) and Arctic charr (S. alpinus) in the Karluck River system, Alaska. In "Reproductive Biology and Population Genetics of Dolly Varden (Salmonidae)" (F. Yamazaki, ed.), pp. 37-48. Report of Overseas Work Supported by Grant-in-Aid for Overseas Scientific Survey of the Ministry of Education, Science and Culture of Japan, during 1987-1990. Available from Dr. Fumio Yamazaki, Faculty of Fisheries, Hokkaido University, Hakodate 041, Japan.
Ginatulina, L. K., Shedko, S. V., Miroshnichenko, I. L., and Ginatulin, A. A. 1988. Sequence divergence in mitochondrial DNA from the Pacific salmons. Zhurnal Evol. Biokh. Fiziol. 24(4):477-482. [In Russian]
Glubokovsky, M. K. 1976. Comparative osteology and systematics of charrs of the genus Salvelinus. In "Sbornik lososevidnye ryby (morphologia, sistematika, ekologia)," pp. 20-21. AN SSSR, Leningrad. [In Russian]
Glubokovsky, M. K., and Chereshnev, I. A. 1982. Unresolved problems concerning the phylogeny of chars (Salvelinus) of the Holarctic. I. Migrating chars of the East Siberian Sea Basin. J. Ichthyol. 21:1-15.
Glubokovsky, M. K., and Frolov, S. V. 1994. Phylogenetic relationships and classification of chars of Lake El'gygytgyn. J. Ichthyol. 34:128-147.
Graybeal, A. 1994. Evaluating the phylogenetic utility of genes: A search for genes informative about deep divergences among vertebrates. Syst. Biol. 43:174-194.
Grewe, P. M., Billington, N., and Hebert, P. D. N. 1990. Phylogenetic relationships among members of Salvelinus inferred from mitochondrial DNA divergence. Can. J. Fish. Aquat. Sci. 47:984-991.
Haas, G. R., and McPhail, J. D. 1991. Systematics and distributions of Dolly Varden (Salvelinus malma) and bull trout (Salvelinus confluentus) in North America. Can. J. Fish. Aquat. Sci. 48:2191-2211.
Hadzisce, S. 1961. Zur Kenntnis des Salmothymus orhidanus (Steindachner) (Pisces, Salmonidae). Intl. Vereinigung theoretische angewandte Limnol. Verhandlungen 14:785-791.
Hartley, S. E. 1987. Chromosomes of salmonid fishes. Biol. Rev. 62:197-214.
Hartley, S. E., McGowan, C., Greer, R. B., and Walker, A. F. 1992. The genetics of sympatric Arctic charr (Salvelinus alpinus (L.)). J. Fish. Biol. 41:1021-1031.
Hillis, D. M., Moritz, C., Porter, C. A., and Baker, R. J. 1991. Evidence for biased gene conversion in concerted evolution of ribosomal DNA. Science 251:308-310.
Holcik, J. 1982a. Review and evolution of Hucho (Salmonidae). Acta Sci. Nat. Brno 16:1-29.
Holcik, J. 1982b. Towards the characteristics of the genera Hucho Gunther, 1866 and Brachymystax Gunther, 1886 (Pisces: Salmonidae). Fol. Zool. 31:369-380.
Holcik, J. 1988. The Eurasian huchen, Hucho hucho. In "Perspectives in Vertebrate Science," Vol. 5. Junk Publishers, The Netherlands.
Hsieh, Chai-Yu, Shan-Wu-Huang, and Yun-Yu-Yuan. 1959. Lenok and taimen, their natural hybrids in the Amur river basin. Acta Hydrobiol. Sin. 2:215-220. [In Chinese with Russian summary]
Johansen, B., Johnsen, O. C., and Valla, S. 1989. The complete nucleotide sequence of the growth-hormone gene from Atlantic salmon (Salmo salar). Gene (Amsterdam) 77:317-324.
Johnson, L. 1980. The Arctic charr, Salvelinus alpinus. In "Charrs: Salmonid Fishes of the Genus Salvelinus" (E. K. Balon, ed.), pp. 15-98. Junk Publishers, The Hague.
Jordan, D. S., and Snyder, J. O. 1902. Review of the salmonid fishes of Japan. Proc. US Natl. Mus. 24:567-593.
Kang, Y. S., and Park, E. H. 1973. Somatic chromosomes of the Manchurian trout, Brachymystax lenok (Salmonidae). Chromosome Inf. Serv. 15:10-11.
Kendall, A. W., Jr., and Behnke, R. J. 1984. Salmonidae: Development and relationships. In "Ontogeny and Systematics of Fishes" (H. G. Moser, ed.), pp. 142-149. Am. Soc. Ichthyol. Herpetol. Spec. Publ. 1.
Kido, Y., Aono, M., Yamaki, T., Matsumoto, K., Murata, S., Saneyoshi, M., and Okada, N. 1991. Shaping and reshaping of salmonid genomes by amplification of tRNA-derived retroposons during evolution. Proc. Natl. Acad. Sci. USA 88:2326-2330.
Kimura, M. 1980. A simple method for estimating evolutionary rate of base substitutions through comparative studies of nucleotide sequences. J. Mol. Evol. 16:111-120.
Kumar, S., Tamura, K., and Nei, M. 1993. "MEGA: Molecular Evolutionary Genetics Analysis, Version 1.0." The Pennsylvania State University, University Park, PA.
Martin, A. P., Naylor, G. J. P., and Palumbi, S. R. 1992. Rates of mitochondrial DNA evolution in sharks are slow compared with mammals. Nature 357:153-155.
McCart, P. J. 1980. A review of the systematics and ecology of arctic char, Salvelinus alpinus, in the western Arctic. Can. Dept. Fish. Oceans Techn. Rep. 935-989.
McKay, S. J., Devlin, R. H., and Smith, M. J. 1996. Phylogeny of Pacific salmon and trout based on growth hormone type-2 (GH2) and mitochondrial NADH dehydrogenase subunit 3 (ND3) DNA sequences. Can. J. Fish. Aquat. Sci. 53:1165-1176.
McPhail, J. D. 1961. Study of Salvelinus alpinus complex in North America. J. Fish. Res. Board Can. Bull. 18:793-816.
McPhail, J. D., and Lindsey, C. C. 1970. Freshwater fishes of northwest Canada and Alaska. Fish. Res. Bd. Can. Bull. 173-381.
McPhail, J. D., and Taylor, E. B. 1995. Final report to Skagit Environmental Endowment Commission on the Skagit Char Project.
McVeigh, H. P., and Davidson, W. S. 1991. A salmonid phylogeny inferred from mitochondrial cytochrome b sequences. J. Fish Biol. 39S:277-282.
Morrow, J. E. 1980. Analysis of the Dolly Varden charr, Salvelinus malma, of northwestern North America and northeastern Siberia. In "Charrs: Salmonid Fishes of the Genus Salvelinus" (E. K. Balon, ed.), pp. 323-338. Junk Publishers, The Hague.
Murata, S., Takasaki, N., Saitoh, M., and Okada, N. 1993. Determination of the phylogenetic relationships among Pacific salmonids by using short interspersed elements (SINEs) as temporal landmarks of evolution. Proc. Natl. Acad. Sci. USA 90:6995-6999.
Murata, S., Takasaki, N., Saitoh, M., Tachida, H., and Okada, N. 1996. Details of retropositional genome dynamics that provide a rationale for a generic division: The distinct branching of all of the Pacific salmon and trout (Oncorhynchus) from Atlantic salmon and trout (Salmo). Genetics 142:915-926.
Norden, C. R. 1961. Comparative osteology of representative salmonid fishes, with particular reference to the grayling (Thymallus arcticus) and its phylogeny. J. Fish. Res. Bd. Can. 8:679-791.
Osinov, A. G. 1991. "Genetic Divergence and Phylogenetic Relationships between Lenoks of Genus Brachymystax and Huchens of Genera Hucho and Parahucho." International Symposium on the Biochemical Genetics and Taxonomy of Fish, Belfast, July 1991. [Abstract] Nordic J. Freshwtr. Res. 67:94-95 (1992).
Pavlov, D. A. 1980. Peculiarities of embryonic and larval development of Atlantic and Pacific salmons of the genus Salmo with respect to their evolution. Zool. Zhurnal 59:569-576.
Phillips, R. B., Manley, S. A., and Daniels, T. J. 1994. Systematics of the salmonid genus Salvelinus inferred from ribosomal DNA sequences. Can. J. Fish. Aquat. Sci. 51(Suppl. 1):198-204.
Phillips, R. B., Oakley, T. H., and Davis, E. L. 1995a. Support for the paraphyly of the genus Hucho based on ribosomal DNA restriction maps. J. Fish Biol. 47:956-961.
Phillips, R. B., and Pleyte, K. A. 1991. Nuclear DNA and salmonid phylogenetics. J. Fish Biol. 39(S):259-275.
Phillips, R. B., Pleyte, K. A., and Brown, M. R. 1992. Salmonid phylogeny inferred from ribosomal DNA restriction maps. Can. J. Fish. Aquat. Sci. 49:2345-2353.
Phillips, R. B., Sajdak, S. L., and Domanico, M. J. 1995b. Evolutionary relationships among charrs inferred from ribosomal DNA sequences. Nord. J. Frshwtr. Res. 71:378-391.
Phillips, R. B., Van Ert, L. M., and Pleyte, K. A. 1989. Evolution of nucleolar organizer regions (NORs) and rDNA in fishes of the genus Salvelinus. In "Biology of Charrs and Masu Salmon" (H. Kawanabe, F. Yamazaki, and D. L. G. Noakes, eds.), Physiol. Ecol. Japan, Spec. Vol. 1.
Pleyte, K. A., Duncan, S. D., and Phillips, R. B. 1992. Evolutionary relationships of the salmonid genus Salvelinus inferred from DNA sequences of the first internal transcribed spacer (ITS 1) of ribosomal DNA. Mol. Phylogenet. Evol. 1:223-230.
Qumsiyeh, M. B. 1994. Evolution of number and morphology of mammalian chromosomes. J. Hered. 85:455-465.
Rab, P., Slechta, V., and Flajshans, M. 1994. Cytogenetics, cytotaxonomy, and biochemical genetics of Huchonine salmonids. Folia Zool. 43:97-107.
Rab, P., and Liehman, P. 1982. Chromosome study of Danube salmon Hucho hucho (Pisces, Salmonidae). Folia Zool. 31:181-190.
Regan, C. T. 1914. The systematic arrangement of the fishes of the family Salmonidae. Ann. Mag. Nat. Hist. 13(Series 8):405-408.
Reist, J. D., Johnson, J. D., and Carmichael, T. J. 1997. Variation and specific identity of char from Northwestern Arctic Canada. Am. Fish. Symp. 19 in press.
Sanford, C. P. J. 1990. The phylogenetic relationships of salmonid fishes. Brit. Mus. Nat. Hist. (Zool.) 56:145-153.
Savvaitova, K. A. 1980. Taxonomy and biogeography of charrs in the Paleoarctic. In "Charrs: Salmonid Fishes of the Genus Salvelinus" (E. K. Balon, ed.), pp. 281-294. Junk Publishers, The Hague.
Savvaitova, K. A. 1989. Arctic chars (structure of population systems, perspectives of using reserves in economy). Moscow, Agropromizdat. [In Russian]
Savvaitova, K. A. 1995. Patterns of diversity and processes of speciation in Arctic chars. Nordic J. Frshwtr. Res. 71:81-91.
Shedko, S. V., and Ginatulina, L. K. 1993. Restriction endonuclease analysis of mitochondrial DNA of the two forms of lenok Brachymystax lenok (Pall.) and Hucho taimen (Pall.). Genetika 29:779-807.
Shedlock, A. M., Parker, J. D., Crispin, D. A., Pietsch, T. W., and Burmer, G. C. 1992. Evolution of the salmonid mitochondrial control region. Mol. Phylogenet. Evol. 1:179-192.
Simon, R. C. 1963. Chromosome morphology and species evolution in the five North American species of Pacific salmon (Oncorhynchus). J. Morphol. 112:77-97.
Smith, G. R. 1975. Fishes of the Pliocene Glenns Ferry Formation, Southwest Idaho. In "Claude W. Hibbard Memorial," Vol. 5. Papers on Paleontology No. 14, University of Michigan Museum of Paleontology.
Smith, G. R. 1992. Introgression in fishes: Significance for paleontology, cladistics, and evolutionary rates. Syst. Biol. 41:41-57.
Smith, G. R., and Stearley, R. F. 1989. The classification and scientific names of rainbow and cutthroat trouts. Fisheries 14:4-10.
Smith, G. R., Swirydczuk, K., Kimmel, P. G., and Wilkinson, B. H. 1982. Fish biostratigraphy of late Miocene to Pleistocene sediments of the western Snake river plain, Idaho. In "Cenozoic Geology of Idaho" (B. Bonnichsen and R. M. Breckenridge, eds.), pp. 519-541. Idaho Bureau of Mines and Geology Bulletin 26.
Stearley, R. F. 1992. Historical ecology of the Salmoninae, with special reference to Oncorhynchus. In "Systematic Historical Ecology and North American Freshwater Fishes" (R. L. Mayden, ed.). Stanford University Press, Stanford, CA.
Stearley, R. F., and Smith, G. R. 1993. Phylogeny of the Pacific trouts and salmons (Oncorhynchus) and genera of the family Salmonidae. Trans. Am. Fish. Soc. 122:1-33.
Svetovidov, A. 1975. Comparative osteological study of the Balkan endemic genus Salmothymus in relation to its classification. Zool. Zhurnal 54:1174-1190. [In Russian]
Swofford, D. 1993. "PAUP: Phylogenetic Analysis Using Parsimony, Version 3.1." Illinois Natural History Association, Champaign, IL.
Thomas, W. K., and Beckenbach, A. T. 1989. Variation in salmonid mitochondrial DNA: Evolutionary constraints and mechanisms of substitution. J. Mol. Evol. 29:233-245.
Thomas, W. K., Withler, R. E., and Beckenbach, A. T. 1986. Mitochondrial DNA analysis of Pacific salmonid evolution. Can. J. Zool. 64:1058-1064.
Thorgaard, G. H. 1983. Chromosomal differences among rainbow trout populations. Copeia 1983(3):650-662.
Tsuyuki, H., and Roberts, E. 1966. Interspecies relationships within the genus Oncorhynchus based on biochemical systematics. J. Fish. Res. Bd. Can. 23:101-107.
Utter, F. M., and Allendorf, F. W. 1994. Phylogenetic relationships among species of Oncorhynchus: A consensus view. Conserv. Biol. 8:864-867.
Utter, F. M., Allendorf, F. W., and Hodgins, H. O. 1973. Genetic variability and relationships in Pacific salmon and related trout based on protein variations. Syst. Zool. 22:257-270.
Viktorovsky, R. M. 1975. Karyotypes of the kunzha (Salvelinus leucomaenis) and the malma (S. malma) (Salmoniformes, Salmonidae). Zool. Ch. 54:787-789.
Viktorovsky, R. M., Makoedov, A. N., and Shevchishin, A. A. 1985. The chromosomal sets of Brachymystax lenok and Hucho taimen and the divergency of the salmonid genera. Tsitologia 27:703-709. [In Russian]
Vladykov, V. 1963. A review of salmonid genera and their broad geographical distribution. Trans. Roy. Soc. Can. 1(Series 4, Section 3):459-504.
Wendel, J. F., Schnabel, A., and Seelanan, T. 1995. An unusual ribosomal DNA sequence from Gossypium gossypioides reveals ancient, cryptic, intergenomic introgression. Mol. Phylogenet. Evol. 4:298-313.
Wilson, M. V. H. 1974. "Fossil Fishes of the Tertiary of British Columbia." Ph.D. thesis, University of Toronto, Toronto, Ontario.
Wilson, C. C., and Hebert, P. D. N. 1993. Natural hybridization between Arctic char (Salvelinus alpinus) and lake trout (S. namaycush) in the Canadian Arctic. Can. J. Fish. Aquat. Sci. 50:2652-2658.
CHAPTER 11

Combining Molecular and Morphological Data in Fish Systematics: Examples from the Cyprinodontiformes

ALEX PARKER
Department of Zoology
University of Maine
Orono, Maine 04469
I. Introduction

Ongoing debate has characterized the interaction of morphological and molecular systematics since the inception of the latter discipline. Some classical systematists have objected to analysis of characters that are acknowledged as merely likely to be homologous and, in general, to the ambiguous use of "homology" and related terms by molecular systematists (Patterson, 1987, 1988). Others have cited indiscriminate use of phenetic rather than cladistic means of phylogenetic inference (McKenna, 1987), including use of methods (e.g., UPGMA) whose assumptions are clearly violated by most molecular data sets (Swofford and Olson, 1990), or argued that character correlation makes sequence-based gene trees no better than single characters contributing to inference of species trees (Doyle, 1992). Molecular systematists have in turn emphasized examples of misleading morphological convergences (Sibley and Ahlquist, 1987), the absence of statistical rigor in many (especially earlier) morphological analyses, and the apparent circularity of inferences about phenotypic evolution based on phylogenies erected by analysis of phenotypes (Sytsma, 1990). With the advent of the polymerase chain reaction (PCR) and conserved primers with broad taxonomic utility (Kocher et al., 1989), molecular systematics has become a very high-profile pursuit, with numerous publications in highly visible fora (e.g., Wainright et al., 1993; Ruvolo et al., 1994) and several journals devoted in whole or part to molecular systematic studies. Arguments have been made, however, that objective systematists must consider all available data in formulating their hypotheses (Carnap, 1950; Donoghue and Sanderson, 1992; Kluge and Wolf, 1993), and in fact efforts to integrate molecular and morphological data have been made since the dawn of molecular systematics (e.g., Gould et al., 1974), including the systematics of fishes (Mickevich and Johnson, 1976). Such efforts have become more common in recent years (e.g., Marshall, 1992; Ernisse and Kluge, 1993; Vrana et al., 1994; Lafay et al., 1994). Curiously, fish systematists have been slow to adopt this approach; although countless publications cite contrasts or congruences between morphological and molecular studies, a literature review revealed only two papers including actual combined analyses of molecular and morphological data (Lydeard et al., 1995; Alves-Gomes et al., 1995).

This chapter has two objectives. The first is a review of current theoretical perspectives on the appropriate means of integrating molecular and morphological data. The second objective is application of some of these techniques to analysis of molecular and morphological data sets for two groups of killifishes (order Cyprinodontiformes, suborder Cyprinodontoidei, and family Cyprinodontidae) (Table I). For purposes of this chapter, the term "molecular" will be used to refer to characters derived, either directly (DNA sequences) or indirectly (restriction fragments and allozymes), from
TABLE I
Classifcation of Killifish Taxa a Pertinent to This Text
Cyprinodontiformes Aplocheiloidei Aplocheilidae Nothobranchius melanospilus Rivulidae Cynolebias whitei Rivulus harti Cyprinodontoidei Profundulidae Profundulus guatamatensis Fundulidae Fundulus heteroclitus Poecilioidea Anablepidae Anableps anableps Jenynsia lineata Poeciliidae Poeciliinae Cnesterodon decmmaculatus Poecilia caucana Tomeurus gracilis Xiphophorus maculatus, X. signum Fluviphylacinae Fluviphylax pygmaeus Aplocheilichthyinae Aptocheilichthys kassenjiensis, A. spilauchen Cyprinodontoidea Goodeidae Crenichthys baileyi Xenotoca eiseni Zoogonecticus quitzeoensis Cyprinodontidae Cubanichthyinae Cubanichthys pengetlyi Cyprinodontinae Orestini: Aphanius chantrei, A. dispar, A. fasciatus, A. mento Kosswigichthys asquamatus Orestias agassii, O. ispi, O. luteus Cyprinodontini: Cualac tesselatus Cyprinodon variegatus Floridichthys carpio Garmanella pulchra Jordaneltafloridae Megupsilon aporus aFrom Parenti (1981).
the fundamental genotype of an organism. Recently developed classes of markers such as microsatellite repeats or RAPDs will not be considered, as the phylogenetic potential of these data has not been firmly established nor have the appropriate means of their analysis in a phylogenetic context. Molecular characters stand in contrast to those derived from aspects of the organismal phenotype, potentially including morphology, ontogeny, behavior, and others. This heterogeneous group will be referred to as "morphological" characters in the interest of clarity and in keeping with the common semantics of the ongoing debate over "total evidence" issues. Numerous advantages and disadvantages are attributed to both kinds of characters by their various proponents; several reviews of these are available (Hillis, 1987; McKenna, 1987; Patterson, 1987; Donoghue and Sanderson, 1992).
II. Analysis of Combined Data: Justification
Several arguments have been offered by proponents of combined analyses. From a practical standpoint, it is suggested that even if several data sets individually contain little information, in aggregate their signal may rise above the associated noise to provide a well-supported phylogenetic hypothesis (Sneath and Sokal, 1973). Also, different data sets may contain information about relationships among different subsets of the taxa under study, such that their synthesis will provide greater resolution than would any one considered singly. Barrett et al. (1991) argued that the best phylogenetic hypothesis is the one which most parsimoniously explains all the available data, regardless of type, and that the necessity of choosing among the numerous available consensus methods must inevitably lead to arbitrariness of method. Epistemological arguments for data combination are treated extensively by Kluge and Wolf (1993).

Objections to combination of data have included concerns that large (usually molecular) data sets will swamp out the information contained in smaller ones (Hillis, 1987); although this can in principle be compensated for with differential weighting, the particular weighting scheme invoked itself requires justification. A related objection is that characters within each data set are less likely to be independent of each other than are characters in different data sets. In such a case, combined analysis may result in the acceptance of hypotheses not supported by the majority of data sets, if characters in one set are strongly intercorrelated and phylogenetically misleading (de Queiroz, 1993).
A series of simulations carried out by Bull et al. (1993) provide a number of important cautionary notes regarding combination of data sets. Contrary to common assumptions, they demonstrate that when only a small fraction of available characters are consistent with the "true" phylogeny, the addition of more such characters increases the certainty that a faulty result will be obtained. They further show that a combination of character sets that are congruent with the same tree, but have evolved at different rates, can produce a less accurate estimate of phylogeny than will either considered alone; this is more likely as the difference in rates of evolution increases. It should be pointed out that the significance of these results is in no way limited to combination of molecular with morphological data; they are equally relevant to combined analyses of rate-heterogeneous molecular data (e.g., nuclear and mitochondrial genes), and perhaps even simultaneous consideration of rate-heterogeneous characters within single data sets (e.g., first and second versus third codon positions). Rate and information content aside, Bull et al. (1993) argue that a combination of data sets that provide significantly different estimates of phylogeny is not scientifically justifiable. If two groups of characters strongly support incongruent phylogenetic hypotheses, it may be inferred that (at most) only one of them is correct; in this situation, combination of data is likely to lead only to a less accurate final result. When phylogenies suggested by different data sets are incongruent, this may be because their characters have evolved under different sets of functional "rules" (e.g., selection vs neutrality), or have otherwise different evolutionary histories (e.g., homology vs paralogy). Bull et al.
(1993) coined the term "process partition" to denote a group of characters that are presumed to share a single evolutionary history and suggested that different process partitions be identified and tested for phylogenetic congruence prior to combined analysis. Two tests appropriate to this purpose, Faith's (1991) topology-dependent phylogenetic tail probability (T-PTP) test and the bootstrap-based method of Rodrigo et al. (1993), are described in Section VII. If different data sets are found to provide incongruent estimates of phylogeny, there are several possible courses of action. The most conservative is simply to assume that at least one of them is incorrect and proceed to collect more data in hopes of identifying which one(s). Alternatively, one might revise the model(s) under which separate analyses are carried out in hopes of producing congruent results. Such revisions should really be justifiable a priori; approaching such a situation iteratively is almost certainly not justifiable by the same logic that supports successive-approximation weighting schemes for single data sets (e.g., Farris,
1969). Finally, one might inspect individual data sets for potential process partitions and test these against each other for congruence, seeking to identify some definable subset of characters that cannot be reconciled with the whole; this too is much more justifiable as an a priori exercise. The perspective of Bull et al. (1993) on combination of data sets was termed the "prior agreement" approach by Chippindale and Weins (1994), who critically discuss their findings and advocate a less conservative approach. They argue that many of the objections raised against combined analysis will prove equally confounding to the prior agreement approach, and emphasize the utility of differential character weighting (e.g., Farris, 1969; Wheeler, 1990) for resolving problems of character incongruence in combined analyses.
III. Analysis of Combined Data: Methods

Maximum parsimony appears to be the most straightforward means of analysis for combined data. Available software packages [e.g., PAUP (Swofford, 1990)] allow specification of any number of weights and models of evolution for different characters, such that no compromises in coding or inferred process should be required for simultaneous analysis of any combination of character state data. The way is far less clear for distance-based methods. If one wished to employ a distance method in analyzing combined molecular and morphological data, perhaps because of a suspicion that the molecular data violated some assumption(s) of parsimony (Felsenstein, 1978), it might be possible to do so using a genetic distance metric based on a simple, non-base-composition-dependent model of sequence evolution (e.g., Jukes and Cantor, 1969; Kimura, 1980). Binary morphological characters could then be appended to the sequence data, coded for instance as A and G rather than 1 and 0, and distances calculated from the combined data. An additional complication is introduced by more sophisticated metrics of evolutionary distance between DNA sequences (e.g., Tamura, 1992) or primary data in the form of distances, e.g., morphometric or DNA-DNA hybridization data. In such a case, one might calculate distances for each data set separately, normalize them, and then average the individual distance matrices prior to tree construction; this would be analogous to what is done when genetic distances are calculated for multiple loci (Nei, 1987). Theoretical justification for such procedures, however, has not yet been advanced. Currently, no maximum likelihood
method allowing for multiple models of character evolution within a single data set exists, and it might in fact be impossible to construct a single ML algorithm adaptable to a wide variety of character-type combinations. For the present, consensus approaches seem to be the only route available to those wishing to perform maximum likelihood analyses of molecular data in combination with morphological data.
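The distance-averaging idea mentioned in this section can be illustrated with a minimal sketch. It assumes hypothetical toy data (three taxa, ten aligned nucleotides, four binary morphological characters), uses the Jukes-Cantor correction for the sequences and a simple proportion of differing characters for morphology, and normalizes each matrix by its largest entry before averaging. The normalization choice, function names, and taxon labels are all assumptions for illustration; as noted above, no theoretical justification for the procedure has been advanced.

```python
import math
from itertools import combinations

def p_distance(a, b):
    """Proportion of aligned positions (or characters) at which two taxa differ."""
    if len(a) != len(b):
        raise ValueError("character strings must be of equal length")
    return sum(x != y for x, y in zip(a, b)) / len(a)

def jukes_cantor(p):
    """Jukes-Cantor (1969) distance for an observed proportion p of differing sites."""
    if p >= 0.75:
        raise ValueError("Jukes-Cantor distance is undefined for p >= 0.75")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)

def distance_matrix(data, metric):
    """Pairwise distances, keyed by alphabetically sorted taxon pairs."""
    return {tuple(sorted((s, t))): metric(data[s], data[t])
            for s, t in combinations(data, 2)}

def normalize(matrix):
    """Scale a matrix so that its largest distance equals 1 (an assumed convention)."""
    largest = max(matrix.values())
    return {pair: (d / largest if largest > 0 else 0.0) for pair, d in matrix.items()}

def average_matrices(matrices):
    """Unweighted mean of several distance matrices sharing the same taxon pairs."""
    return {pair: sum(m[pair] for m in matrices) / len(matrices)
            for pair in matrices[0]}

# Hypothetical toy data, not taken from any published data set.
dna = {"taxonA": "ACGTACGTAC", "taxonB": "ACGTACGTAT", "taxonC": "ACGAACCTAT"}
morph = {"taxonA": "0011", "taxonB": "0111", "taxonC": "1100"}

dna_d = distance_matrix(dna, lambda a, b: jukes_cantor(p_distance(a, b)))
morph_d = distance_matrix(morph, p_distance)
combined = average_matrices([normalize(dna_d), normalize(morph_d)])
```

The resulting `combined` dictionary could then be passed to any distance-based tree-building routine. Equal weighting of the two normalized matrices is itself a choice that would require justification.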
IV. Consensus Approaches: Justification

Arguments for the use of consensus methods commonly include the assertion that because they retain only those groups supported by most or all (depending on the method employed) of the trees from which the consensus is drawn [termed rival trees by Swofford (1991)], they provide a conservative estimate of phylogeny (Hillis, 1987). It is also suggested that by giving equal weight to all rival trees, consensus prevents large data sets from obscuring potentially important smaller ones. Miyamoto and Fitch (1995) assert that derivation of the same pattern of relationships among taxa from several independent data sets constitutes a particularly strong form of support, comparable to validation of an experimentally derived hypothesis using another, fundamentally different experiment. The potential for this form of support, they claim, is lost when data sets are combined.

In contrast, Kluge and Wolf (1993) argue that the fundamental flaw of consensus approaches is that categorization of data into different "types," such as "molecular" or "morphological," is an arbitrary process with no objective basis. From their perspective, all data derived from study of an organism are data relevant to that organism and should not be subdivided in any way. Other opponents of consensus methods note that giving equal weight to data sets of different size amounts to arbitrary weighting of individual characters, such that those in the smallest data sets receive the greatest weights. Also, some consensus methods are far from conservative in that they can in some instances support topologies that do not occur in any of the rival trees from which the consensus is derived (Barrett et al., 1991). Consensus trees may also fail to reflect the most parsimonious global pattern of character change (Miyamoto, 1985).
Additionally, combined analyses will always produce more fully resolved trees and may preserve unique information, present in only one of several data sets, that is likely to be lost in consensus analyses. This is of particular concern when separate data sets are analyzed using algorithms that tend to produce fully resolved (binary) trees, even in the absence of meaningful support for some nodes (e.g., most distance-based methods). In this case, spurious structure associated with parts of some rival trees may prevent well-supported relationships on another tree from appearing in the final consensus. This unfortunate phenomenon may in part be avoided by subjecting rival trees to analysis for strength of support (e.g., by bootstrapping or distance-ratio tests), collapsing poorly supported nodes prior to consensus procedures, and using less stringent methods of consensus construction. It should be noted that objections raised against consensus methods are generally assumed to pertain to their application to trees derived from different data sets and not to multiple maximum parsimony (MP) trees derived from a single data set (Barrett et al., 1991; but see Carpenter, 1988, for a different perspective).
V. Consensus Methods

Numerous means of consensus tree construction have been proposed; those most commonly employed are briefly described here, and some will be illustrated in the following section. The strict consensus of several rival trees retains only those topologies common to all. This procedure is viewed by many as overly conservative, as in some instances (illustrated in Swofford, 1991) the strict consensus of two fully resolved trees that are identical save for the placement of a single taxon will be a completely structureless polytomy. The majority rule consensus retains topologies common to more than a specified fraction of rival trees, usually one half. In the case of two rival trees, it is identical to the strict consensus. A related but more computationally complex approach is the median consensus of Barthélemy and Monjardet (1981). The combinable component or semistrict method (Bremer, 1990) provides a consensus wherein relationships resolved on one rival tree are preserved against polytomies on other trees. When applied to fully resolved rival trees, it produces their strict consensus. These three methods are viewed by some as the most conservative available, as they will never result in a consensus containing relationships that do not explicitly appear in any rival tree.

In contrast, the Adams (1972) consensus retains any pattern of relationship common to all rival trees, regardless of the placement of other taxa. It can thus result in consensus trees containing, for instance, sister pairs of taxa that are not sister to each other in any rival tree. A similar approach is the Nelson (1979) consensus. As formalized by Page (1989a), all cliques of combinable topological elements are counted for all rival trees, and the most frequent combinable cliques are retained to form the consensus. The semistrict, Adams, and Nelson methods have a greater capacity to preserve information available only in one or a few rival trees. For this reason they may be seen as providing more of a synthesis of phylogenetic hypotheses than the strict or majority rule methods, which serve primarily to make explicit points of agreement between rival trees.
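The strict, majority rule, and semistrict methods described in this section can be sketched in a few lines. The example below is a minimal illustration, assuming rooted trees written as nested tuples of hypothetical taxon labels and treating each tree as the set of its clades; it does not reproduce the implementation of any particular software package, and the Adams and Nelson methods are not covered.

```python
from collections import Counter

def clades(tree):
    """Set of clades (as frozensets of taxa) in a rooted nested-tuple tree,
    excluding single taxa and the all-inclusive root group."""
    found = set()

    def walk(node):
        if isinstance(node, str):
            return frozenset([node])
        members = frozenset().union(*(walk(child) for child in node))
        found.add(members)
        return members

    found.discard(walk(tree))
    return found

def compatible(a, b):
    """Two clades can coexist on one tree if they are nested or disjoint."""
    return a.isdisjoint(b) or a <= b or b <= a

def strict_consensus(trees):
    """Clades present in every rival tree."""
    return set.intersection(*(clades(t) for t in trees))

def majority_rule_consensus(trees, fraction=0.5):
    """Clades present in more than the given fraction of rival trees."""
    counts = Counter(c for t in trees for c in clades(t))
    return {c for c, n in counts.items() if n > fraction * len(trees)}

def semistrict_consensus(trees):
    """Clades found on at least one rival tree and contradicted by none."""
    every = [clades(t) for t in trees]
    pooled = set().union(*every)
    return {c for c in pooled
            if all(compatible(c, other) for group in every for other in group)}

# Hypothetical rival trees: one fully resolved, one with an A-B-C polytomy.
resolved = ((("A", "B"), "C"), ("D", "E"))
polytomy = (("A", "B", "C"), ("D", "E"))
rivals = [resolved, polytomy]

strict = strict_consensus(rivals)          # {A,B,C} and {D,E}
semistrict = semistrict_consensus(rivals)  # additionally keeps {A,B}
```

With these two rival trees the semistrict consensus retains the A+B grouping, which the polytomy does not contradict, whereas the strict consensus discards it. The majority rule consensus of two trees returns the same clades as the strict consensus, as stated in the text.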
VI. Analysis of Cyprinodontiform Data

This section explores methods of integration of data sets, using morphological characters from Parenti's (1981) revisions of cyprinodontiform fishes and the cyprinodontiform genus Orestias (Parenti, 1984), together with previously published DNA sequence data from studies expressly intended to test two novel hypotheses of relationship suggested therein. Meyer and Lydeard (1993) sought to test the hypothesis that viviparity has evolved only once in cyprinodontiform fishes, thus defining a monophyletic group including goodeids, Jenynsia, Anableps, and some poeciliids. This pattern of relationship was an uncritically accepted tenet of cyprinodontiform systematics until challenged by Parenti (1981). They analyzed 315 bp of coding sequence, comprising most of exons 8, 9, and 10 of the tyrosine kinase gene X-src, from 22 taxa belonging principally to suborder Cyprinodontoidei (GenBank Accession numbers U02343-U02366). Their results supported Parenti's (1981) hypothesis of multiple origins of viviparity, but conflicted with several other posited relationships.

Parker and Kornfield (1995) tested Parenti's (1981) assignment of Orestias, a genus endemic to the Altiplano of Peru and Bolivia, to Cyprinodontidae; Parenti hypothesized that Orestias was sister to the Turkish genus Kosswigichthys and assigned both to an apical position within the Anatolian (Mediterranean and Near East) genus Aphanius, implying paraphyly of Aphanius. This hypothesis has apparent and somewhat startling zoogeographic implications, which have been extensively discussed (Parenti, 1981; Parker and Kornfield, 1995). Parker and Kornfield (1995) analyzed a 378-bp sequence from the mitochondrial 16S rRNA gene (GenBank Accession numbers U05964-U05980), and found strong support for Parenti's assignment of Orestias to Cyprinodontidae, but in contrast found it to be sister to a monophyletic Aphanius. This pattern of relationship was suggested to be consistent with the late Cretaceous-early Paleocene breakup of the Gondwanan supercontinent. Their results also conflicted with hypothesized interrelationships of some North American and Caribbean cyprinodontids (Parenti, 1981).

The implications of various hypotheses of relationship within these groups have been discussed extensively elsewhere (see publications cited earlier and references within) and will not be elaborated further here. The following section will instead focus on the implementation of various means of combined analysis of morphological and molecular data discussed earlier, and their effects on the phylogenetic hypotheses ultimately obtained.
VII. Methods

Morphological data were taken from monographs of Parenti (1981, 1984) and coded as two-, three-, or four-state unordered characters (see Appendices). DNA sequences were subdivided into potential process partitions prior to analysis. X-src sequences (Meyer and Lydeard, 1993) were partitioned into first and second versus third codon positions; alignment of intron sequences was ambiguous, so this potential third process partition was excluded from further consideration. 16S rRNA sequences (Parker and Kornfield, 1995) were divided into helix- and loop-forming regions, largely by alignment to published structural models for Homo and Mus (Glotz et al., 1981). Two regions, approximately 80 bp each, were unalignable to published sequences (those regions exhibiting the greatest divergence between cyprinodontid taxa, as well as between human and mouse sequences). Secondary structures (see, e.g., Fig. 1; aligned sequences for all taxa, including assignment of nucleotides to helix- and loop-forming regions, may be obtained from the author) for these regions were inferred from minimum-energy folding patterns calculated using MULFOLD (Jaeger et al., 1989) and were consistent across all 16 cyprinodontid sequences. Maximum parsimony trees were inferred for each process partition using the heuristic search algorithm of PAUP v3.1.1 (Swofford, 1990), and the degree of support offered for different elements of these trees was estimated by a heuristic search of 200 bootstrapped data sets.¹

Both the T-PTP (Faith, 1991) and the bootstrap (Rodrigo et al., 1993) approaches to combinability testing (see later) involve tree topologies and are appropriately carried out using fully resolved (binary) trees. To satisfy this requirement for each process partition where multiple MP trees were found, the set of MP trees was filtered, using PAUP v3.1.1, to remove polytomous trees for which a more resolved compatible tree was present. When more than one tree was retained by the filter, the majority-rule consensus of remaining trees was computed and used as the partition tree. For purposes of the tests to be conducted only, the morphological partition trees were modified to make the genera Rivulus and Xiphophorus monophyletic in the Cyprinodontoidei tree and Parenti's nominal genera Aphanius and "Aphanius" (Parenti, 1981) monophyletic in the Cyprinodontidae tree.

¹ Time constraints imposed by the large number of taxa studied and searches performed made implementation of the number of replicates necessary for statistically accurate estimation of the bootstrap p value (2000 replications; Hedges, 1992) impractical.

FIGURE 1 Assignment of 16S rRNA nucleotide positions to helix- and loop-forming regions for Fundulus heteroclitus. Uppercase letters, helix-forming; lowercase letters, loop-forming. Bold type indicates positions identical to Homo sapiens sequence; italics indicate positions varying in a cladistically informative manner among cyprinodontid taxa included in this study. Figure after Glotz et al. (1981).
The methods of combinability testing advocated by the prior agreement approach (Bull et al., 1993) differ both in the null hypotheses they consider and in their overall stringency, i.e., the likelihood that they will contraindicate combination of data sets. The T-PTP test (Faith, 1991) is an outgrowth of the phylogenetic tail probability (PTP) test (Archie, 1989; Faith and Cranston, 1991). The PTP test seeks to determine whether there is a significant phylogenetic signal present in a data matrix by testing the null hypothesis that the MP tree for that data matrix is no shorter than would be expected for random data of the same character-state composition. The distribution of tree lengths for such random data is estimated by repeatedly permuting
data across states within characters and calculating the minimum tree length for each permuted data set. The T-PTP test compares the fit of a data matrix to two competing or alternative topologies. In the case of combinability testing, MP tree topologies are determined for each data set (call them A and B). The relative fit of data set A to MP topologies A and B is tested by repeatedly permuting data set A, calculating the number of steps required to fit permuted data to tree B, and subtracting the number of steps required to fit permuted data to tree A. The distribution of tree length differences so generated is the null expectation to which the tree length difference for unpermuted data is compared. The reverse procedure can also be carried out, testing the relative fit of data set B to trees A and B. The results of the reciprocal test need not be identical, as the difference in tree length between competing topologies may be much greater for highly structured data than for data exhibiting greater homoplasy, whereas tree length differences between permuted data sets depend only on the number and frequency of states present at each character. An important caveat associated with the T-PTP test is that the tree topologies compared should both be binary; because permuted data are of essentially random composition, the shortest tree compatible with a partially unresolved tree will almost always be shorter than the shortest tree compatible with a binary one. The test is thus biased toward rejecting combinability of data associated with the more resolved tree. Because character-state permutation tests are fundamentally tests for the presence of phylogenetic signal, the null hypothesis tested in a T-PTP combinability test (again of data sets A and B) is that the phylogenetic signal present in data set A does not offer significantly greater support for topology A than it does for topology B. Several computer utilities can be employed in T-PTP testing. 
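The T-PTP procedure just described can be sketched compactly. The following minimal illustration assumes unordered characters scored with Fitch parsimony on fixed, fully resolved rooted trees written as nested two-element tuples, a hypothetical four-taxon matrix, and a one-tailed probability computed as (hits + 1)/(replicates + 1); these conventions and all names are assumptions for illustration and are not the author's batch program.

```python
import random

def fitch_length(tree, states):
    """Minimum number of steps for one unordered character on a binary tree."""
    steps = 0

    def walk(node):
        nonlocal steps
        if isinstance(node, str):
            return {states[node]}
        left, right = (walk(child) for child in node)
        shared = left & right
        if shared:
            return shared
        steps += 1  # no state in common: one change is required here
        return left | right

    walk(tree)
    return steps

def tree_length(tree, matrix):
    """Total steps over all characters; matrix maps taxon -> string of states."""
    taxa = list(matrix)
    n_chars = len(matrix[taxa[0]])
    return sum(fitch_length(tree, {t: matrix[t][i] for t in taxa})
               for i in range(n_chars))

def permute_within_characters(matrix, rng):
    """Shuffle states among taxa separately for each character (column)."""
    taxa = list(matrix)
    columns = []
    for i in range(len(matrix[taxa[0]])):
        column = [matrix[t][i] for t in taxa]
        rng.shuffle(column)
        columns.append(column)
    return {t: "".join(col[j] for col in columns) for j, t in enumerate(taxa)}

def t_ptp(matrix, own_tree, rival_tree, replicates=999, seed=0):
    """Observed length difference (rival minus own) and its permutation tail probability."""
    rng = random.Random(seed)
    observed = tree_length(rival_tree, matrix) - tree_length(own_tree, matrix)
    hits = 0
    for _ in range(replicates):
        permuted = permute_within_characters(matrix, rng)
        diff = tree_length(rival_tree, permuted) - tree_length(own_tree, permuted)
        if diff >= observed:
            hits += 1
    return observed, (hits + 1) / (replicates + 1)

# Hypothetical data set A with its own tree and a rival tree from data set B.
matrix_a = {"A": "0000", "B": "0000", "C": "1111", "D": "1111"}
tree_a = (("A", "B"), ("C", "D"))
tree_b = (("A", "C"), ("B", "D"))
observed_diff, p_value = t_ptp(matrix_a, tree_a, tree_b)
```

A small tail probability would indicate that data set A fits its own topology significantly better than the rival topology. The reciprocal test is obtained by swapping the roles of the two data sets and their trees.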
Hennig86 (Farris, 1988) users can use the RANDOM CLADISTICS package (Siddall, 1993); permuted data sets for analysis by other programs can be generated using the SEQBOOT component of PHYLIP 3.5 (Felsenstein, 1992). For the analyses presented here, the author has written a brief program which embeds permuted data sets within files appropriate for batch processing by PAUP v3.1.1. T-PTP and other combinability tests will be implemented in the next release of PAUP (D. Swofford, personal communication). In contrast to the T-PTP test, the bootstrap-based combinability test developed by Rodrigo et al. (1993) views MP trees for different data sets as sample estimates, or topostatistics, of a "true" or parametric phylogenetic tree. From this perspective, one can test the null hypothesis that although data set A might support tree A more strongly than it does tree B, the difference in support is no greater than might be expected on the
basis of variation present in the two data sets being compared. The magnitude and distribution of sampling variance in the data matrices are estimated in the fashion most common to phylogenetic analyses, the bootstrap (Felsenstein, 1985). The difference between tree topologies is quantified using SDc, the symmetric difference of components (Hendy et al., 1984). Again it is important that the trees to be compared be binary, as presence of unresolved nodes in one of two trees compared inflates SDc between them. This test is carried out by first estimating the sampling variance associated with each data set by generating a number of bootstrapped data sets, finding the MP tree for each, and calculating SDc between each bootstrap tree and the original MP tree. The two distributions thus generated are then compared to SDc between the MP trees for the two original data sets, typically using p = 0.05 as a criterion for rejecting the null hypothesis (and thus rejecting combinability of the data sets). For the analyses presented here, 200 bootstrap trees were generated for each data set using PAUP v3.1.1, and SDc calculations were carried out using COMPONENT v1.5 (Page, 1989b). To ensure that all trees compared were binary, the "collapse zero length branches" option was not used in searching for bootstrap MP trees; to avoid pseudoreplication, only one MP tree was saved for each bootstrapped data set. As in the T-PTP test, results need not be symmetrical across data sets. In this case, the difference in SDc between the original MP tree and bootstrap trees is expected to be greater for data exhibiting greater homoplasy. This is because resampling of highly structured, self-consistent data will produce trees very similar to the MP tree for that data, whereas resampling homoplasious data, where many characters conflict with each other, produces a wider variety of tree topologies.
Thus the difference in SDc between two MP trees might fall outside the distribution of bootstrap trees for one data set, but inside that of the other.

Overall sequence divergence and the transition:transversion ratio were calculated for all possible pairwise comparisons within each DNA sequence partition using MEGA v1.0 (Kumar et al., 1993). DNA sequence partitions were tested for homogeneity of results under five to one weighting of transversions over transitions versus one to one weighting. Sequence partitions were then tested for combinability within each molecular data set, prior to testing combinability of molecular with morphological data. Combined analyses of molecular and morphological data were carried out by heuristic search, using PAUP v3.1.1; support for the MP tree derived from each combined analysis was assessed by a heuristic search of 200 bootstrapped data sets. Consensus trees were calculated from molecular and morphological MP trees using PAUP v3.1.1. Strict and Adams consensuses were derived from unmodified MP trees; for calculation of the semistrict consensus, nodes not appearing in over 50% of bootstrap trees were collapsed prior to the consensus procedure.
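The bootstrap-based combinability test described in this section reduces to comparing one SDc value with a distribution of SDc values. The sketch below is a minimal illustration under several assumptions: trees are rooted and written as nested tuples of hypothetical taxon labels, SDc is computed as the number of clades found on one tree but not the other, the bootstrap MP trees are taken as given (the tree searches themselves are not shown), and the tail probability is the simple proportion of bootstrap SDc values at least as large as the observed one.

```python
def clades(tree):
    """Set of clades (frozensets of taxa) in a rooted nested-tuple tree,
    excluding single taxa and the all-inclusive root group."""
    found = set()

    def walk(node):
        if isinstance(node, str):
            return frozenset([node])
        members = frozenset().union(*(walk(child) for child in node))
        found.add(members)
        return members

    found.discard(walk(tree))
    return found

def sdc(tree_a, tree_b):
    """Symmetric difference of components: clades present on exactly one tree."""
    return len(clades(tree_a) ^ clades(tree_b))

def tail_probability(observed, null_values):
    """Proportion of null values at least as large as the observed value."""
    return sum(v >= observed for v in null_values) / len(null_values)

def combinability_test(mp_tree_a, mp_tree_b, bootstrap_trees_a, alpha=0.05):
    """Compare SDc between two MP trees with the bootstrap SDc distribution of data set A.

    Returns the observed SDc, its tail probability, and whether combinability
    would be rejected at the chosen alpha."""
    observed = sdc(mp_tree_a, mp_tree_b)
    null = [sdc(mp_tree_a, t) for t in bootstrap_trees_a]
    p = tail_probability(observed, null)
    return observed, p, p < alpha

# Hypothetical MP trees and a tiny stand-in for the 200 bootstrap trees.
mp_a = ((("A", "B"), "C"), ("D", "E"))
mp_b = ((("A", "C"), "B"), ("D", "E"))
boot_a = [mp_a, mp_a, mp_a, mp_b]
observed_sdc, p_value, rejected = combinability_test(mp_a, mp_b, boot_a)
```

Because the result depends on which data set supplies the bootstrap distribution, the test would be run once for each data set, which is why the outcome need not be symmetrical.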
VIII. Results and Discussion
Partition trees for both data sets are shown in Figs. 2 and 3. The reality of these data subdivisions as process partitions is supported by striking differences in patterns of substitution and levels of interspecific divergence (Fig. 4). In the X-src sequences, first and second codon positions exhibit much less overall sequence divergence than do third positions, suggesting strong selective constraint on many of them; this is also reflected in the proportion of invariant first (79.04%) and second (87.62%) versus third (24.76%) positions. Also, transition:transversion (Ts:Tv) ratios are on average lower for first and second positions, even among sequences that differ very little overall (Fig. 4). A comparable difference in overall sequence divergence exists between helix- and loop-forming regions of the 16S rRNA sequences. The difference in the Ts:Tv ratio, however, is even more pronounced than in the X-src data partitions, and of opposite polarity, such that the less divergent helices exhibit higher Ts:Tv ratios than do the more variable loops. This may reflect the fact that because RNA structures incorporate G-U in addition to G-C and A-U base pairs, four of the six possible transitional changes to an RNA base pair maintain base pairing, and thus secondary structure, whereas any transversional change will disrupt base pairing. Although the maximum parsimony trees found under 5:1 and 1:1 weighting schemes differ for each DNA sequence partition, those relationships receiving bootstrap support in excess of 50% are generally unaffected by weighting. T-PTP tests of homogeneity (Figs. 5 and 6) reveal only one instance where significantly greater support is found under one weighting scheme versus the other: X-src third codon positions, weighted 1:1, provide significantly more support for the 1:1 MP tree than for the 5:1 MP tree (Fig. 5E). The large excess of transitions over transversions in this data set (Fig. 4) also supports the use of 1:1 weighting. 
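The base-pairing argument offered in this paragraph for the high Ts:Tv ratio of 16S rRNA helices can be checked by direct enumeration. The short sketch below assumes that only G-C, A-U, and G-U count as valid pairs and considers every single-base substitution to each of the three pair types; the function and variable names are illustrative.

```python
# Base pairs assumed to maintain RNA secondary structure (orientation ignored).
PAIRS = {frozenset("GC"), frozenset("AU"), frozenset("GU")}
PURINES = set("AG")

def is_transition(old, new):
    """True for purine<->purine or pyrimidine<->pyrimidine substitutions."""
    return old != new and ((old in PURINES) == (new in PURINES))

def tally_substitutions():
    """Count single-base changes to each pair type as [pair-preserving, total]."""
    counts = {"transition": [0, 0], "transversion": [0, 0]}
    for pair in ("GC", "AU", "GU"):
        for position in (0, 1):
            old, partner = pair[position], pair[1 - position]
            for new in "ACGU":
                if new == old:
                    continue
                kind = "transition" if is_transition(old, new) else "transversion"
                counts[kind][1] += 1
                if frozenset((new, partner)) in PAIRS:
                    counts[kind][0] += 1
    return counts

counts = tally_substitutions()
```

Under these assumptions `counts["transition"]` is `[4, 6]` and `counts["transversion"]` is `[0, 12]`, matching the statement that four of the six possible transitional changes maintain base pairing whereas no transversional change does.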
In light of this observation, and the lack of significant differences among weighting schemes for the other data partitions, all subsequent analyses employ 1:1 weighting.
Results of combinability testing of DNA sequence partitions differ greatly between data sets. Although T-PTP tests reveal significantly better support of each data partition for its own MP tree than for that of the complementary partition (Figs. 7A, B, D, and E), nonsignificance of the bootstrap combinability test suggests that, in the case of the X-src sequences, such differences are attributable to sampling error (Fig. 7C). In contrast, combinability testing decisively rejects the null hypothesis of difference due only to sampling error for the 16S rRNA helix- and loop-forming sequence partitions (Fig. 7F). Additionally, the MP tree for helix-forming nucleotide positions posits a sister group relationship between Kosswigichthys asquamatus of central Turkey and Garmanella pulchra of the Yucatan (Fig. 3B), which is at odds with strong support for monophyly of tribe Cyprinodontini found in mtDNA control region sequences (Parker and Kornfield, 1995), morphology (Parenti, 1981), and allozymes (Echelle and Echelle, 1993). Accordingly, the helix-forming data partition was excluded from further combinability analyses.
Combinability tests of molecular and morphological data partitions (Figs. 8 and 9) also differed greatly in their outcomes. Initial bootstrap tests rejected the combinability of Cyprinodontoidei data partitions based on the distribution of SDc for morphological characters, but not for molecular data (Fig. 8C). This difference in distribution of SDc for the two data partitions is presumably due to the greater degree of homoplasy present in the molecular data (see Section VII). In cases where combinability tests reveal significant but not profound conflicts between data partitions, proponents of the prior agreement approach to data combination have suggested pruning of especially problematic taxa (Rodrigo et al., 1993). Inspection of Cyprinodontoidei partition trees (Figs. 2E and F) reveals that molecular and morphological data partitions strongly disagree over the affinities of Profundulus guatemalensis: morphological characters place it as the basal cyprinodontoid lineage, whereas molecular data strongly indicate a sister group relationship to Goodeidae. Pruning P. guatemalensis greatly changes the outcome of the bootstrap combinability test, such that the null hypothesis of difference due only to sampling error is assigned a probability of 0.495 for the molecular data partition. Although combinability is still rejected for morphological data (p=0.020), SDc between morphological and molecular trees (10) now falls within the distribution of SDc between the morphological MP tree and its bootstrap trees (range 0-11), where previously it fell far outside (compare Figs. 8C and F). The outcome of T-PTP tests following pruning of P. guatemalensis, while still rejecting the null hypothesis of no difference in support, also reflects the greater agreement of data partitions (Figs. 8A and B vs 8D and E).
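The bootstrap combinability test described above compares one number, the symmetric difference of components (SDc) between two partition trees, with the spread of SDc values between a partition's MP tree and its own bootstrap trees. The following is a minimal sketch of that comparison, not the chapter's implementation. It assumes trees are represented as sets of clades and that the reported probability is the proportion of bootstrap SDc values at least as large as the observed value. The taxa and bootstrap values are hypothetical.

```python
from typing import FrozenSet, Iterable, List, Set

Clade = FrozenSet[str]


def components(clades: Iterable[Iterable[str]]) -> Set[Clade]:
    """Represent a rooted tree as the set of its clades with two or more taxa."""
    return {frozenset(c) for c in clades if len(set(c)) > 1}


def sdc(tree_a: Set[Clade], tree_b: Set[Clade]) -> int:
    """Symmetric difference of components: clades present in exactly one tree."""
    return len(tree_a ^ tree_b)


def combinability_p(observed: int, bootstrap_sdc: List[int]) -> float:
    """Assumed definition: share of bootstrap SDc values >= the observed SDc."""
    return sum(d >= observed for d in bootstrap_sdc) / len(bootstrap_sdc)


# Two hypothetical trees that agree on (A, B) but disagree on the next clade.
molecular = components([("A", "B"), ("A", "B", "C")])
morphology = components([("A", "B"), ("A", "B", "D")])
observed = sdc(molecular, morphology)

# Hypothetical SDc values between one partition's MP tree and its bootstrap trees.
bootstrap_values = [0, 0, 2, 2, 2, 4, 0, 2, 4, 2]
p_value = combinability_p(observed, bootstrap_values)
```

An observed SDc that lies inside the bootstrap distribution, as in this toy case, is what the text describes for the morphological partition after pruning P. guatemalensis.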
FIGURE 2 Partition maximum parsimony (MP) trees for Cyprinodontoidei data. Numbers above nodes reflect the frequency of their occurrence in a 50% majority-rule bootstrap consensus tree (200 replicates). Trees are labeled as follows: weight, Tv:Ts weighting; length, MP tree length; CI, consistency index; Char, number of parsimony-informative characters; trees, number of MP trees. For morphological data partition, all characters were unordered and equally weighted.
[Tree diagrams not reproduced. Panel labels: 1st and 2nd positions, weight 1:1, length 55, CI 0.727, Char 24, trees 1; 3rd positions, weight 1:1, length 184, CI 0.620, Char 60, trees 1; all positions, weight 1:1, length 243, CI 0.634, Char 84, trees 1; 1st and 2nd positions, weight 5:1, length 169, Char 24, trees 6; 3rd positions, weight 5:1, length 437, Char 60, trees 1; morphology, weight equal, length 117, CI 0.803, Char 74, trees 1. Terminal taxa: Rivulus harti, Rivulus sp., Cynolebias whitei, Nothobranchius melanospilus, Cubanichthys pengelleyi, Xenotoca eiseni, Zoogoneticus quitzeoensis, Crenichthys baileyi, Profundulus guatemalensis, Fundulus heteroclitus, Jordanella floridae, Cyprinodon rubrofluviatilis, Poecilia caucana, Xiphophorus maculatus, Xiphophorus signum, Tomeurus gracilis, Cnesterodon decemmaculatus, Jenynsia lineata, Anableps anableps, Fluviphylax pygmaeus, Aplocheilichthys kassenjiensis, Aplocheilichthys spilauchen.]
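The node values in Fig. 2 are frequencies from a 50% majority-rule bootstrap consensus. A minimal sketch of the two bookkeeping steps involved is given here: resampling characters with replacement, and counting how often each clade appears among the replicate trees. The tree search between those steps is omitted, and the data matrix, the replicate trees, and all function names are hypothetical illustrations rather than the software used in the chapter.

```python
import random
from collections import Counter


def resample_characters(matrix, rng):
    """Draw characters (columns) with replacement to build one pseudoreplicate."""
    n_chars = len(next(iter(matrix.values())))
    picks = [rng.randrange(n_chars) for _ in range(n_chars)]
    return {taxon: [states[i] for i in picks] for taxon, states in matrix.items()}


def majority_rule(bootstrap_trees, threshold=0.5):
    """Return {clade: percent} for clades found in more than `threshold` of trees."""
    counts = Counter(clade for tree in bootstrap_trees for clade in tree)
    n = len(bootstrap_trees)
    return {clade: 100.0 * k / n for clade, k in counts.items() if k / n > threshold}


# Hypothetical character matrix: one string of character states per taxon.
matrix = {"A": "0011", "B": "0010", "C": "1101", "D": "1100"}
replicate = resample_characters(matrix, random.Random(0))

# Hypothetical replicate trees, each written as a set of clades.
fs = frozenset
bootstrap_trees = [
    {fs("AB"), fs("ABC")},
    {fs("AB"), fs("ABC")},
    {fs("AB"), fs("ABD")},
    {fs("AC"), fs("ABC")},
]
support = majority_rule(bootstrap_trees)
```

In this toy case the clades (A, B) and (A, B, C) each occur in three of four replicate trees and are retained; clades at or below 50% are dropped, which is why poorly supported nodes carry no number in the figures.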
FIGURE 3 Partition MP trees for Cyprinodontidae data. Trees are labeled as in Fig. 2.
[Tree diagrams not reproduced. Panel labels: helices, weight 1:1, length 65, CI 0.769, Char 20, trees 1; loops, weight 1:1, length 275, CI 0.622, Char 69, trees 1; morphology, weight equal, length 37, CI 0.917, Char 31, trees 1. Terminal taxa: Fundulus heteroclitus, Cubanichthys pengelleyi, Orestias ispi, Orestias luteus, Orestias agassii, Aphanius dispar, Aphanius fasciatus, Aphanius mento, Aphanius chantrei, Kosswigichthys asquamatus, Cualac tesselatus, Floridichthys carpio, Garmanella pulchra, Jordanella floridae, Cyprinodon variegatus, Megupsilon aporus.]
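The helix and loop partitions of Fig. 3 differ in Ts:Tv ratio, which the text attributes to base pairing: four of the six possible transitional changes to an RNA base pair maintain pairing, whereas any transversional change disrupts it. That count can be checked by enumeration. The sketch below assumes that only G-C, A-U, and G-U pairs are allowed and that one base of the pair changes at a time; the function names are illustrative.

```python
BASES = "ACGU"
TRANSITION = {"A": "G", "G": "A", "C": "U", "U": "C"}
# Pairs allowed in RNA helices: Watson-Crick pairs plus the G-U pair.
PAIRS = {frozenset("GC"), frozenset("AU"), frozenset("GU")}


def is_paired(x, y):
    """True if the two bases form one of the allowed pairs."""
    return frozenset((x, y)) in PAIRS


def single_changes(pair, kind):
    """Yield pairs made by changing one base of `pair` by a 'ts' or 'tv' change."""
    for pos in (0, 1):
        old = pair[pos]
        for new in BASES:
            if new == old:
                continue
            is_transition = TRANSITION[old] == new
            if (kind == "ts") == is_transition:
                changed = list(pair)
                changed[pos] = new
                yield tuple(changed)


def count_preserving(kind):
    """Return (changes that keep pairing, all changes) over the three pair types."""
    starts = [("G", "C"), ("A", "U"), ("G", "U")]
    kept = [is_paired(*p) for start in starts for p in single_changes(start, kind)]
    return sum(kept), len(kept)


ts_kept, ts_total = count_preserving("ts")
tv_kept, tv_total = count_preserving("tv")
```

The enumeration gives four pairing-preserving changes out of six transitions and none out of twelve transversions, in agreement with the argument in the text.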
T-PTP and combinability tests of Cyprinodontidae data partitions reveal much greater incompatibility than in the previous example. In this case, the difference in tree length between molecular and morphological MP topologies is vastly larger than the null expectation (Figs. 9A and B), and the bootstrap test rejects combinability for both data partitions (Fig. 9C). Also, unlike P. guatemalensis in the Cyprinodontoidei
FIGURE 3 Continued.
[Tree diagrams not reproduced. Panel labels: helices, weight 5:1, length 146, Char 20, trees 4; loops, weight 5:1, length 789, Char 69, trees 1.]
FIGURE 4 Transition:transversion ratio versus total number of substitutions for DNA sequence data partitions. Nine comparisons in which no transversions were observed, and the ratio was therefore undefined, are omitted.
[Scatter plots not reproduced. Panels: first and second positions; third positions; helices; loops. Horizontal axis: percent sequence divergence. Symbols distinguish ingroup and outgroup comparisons.]
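Each point in Fig. 4 summarizes one pairwise sequence comparison. The sketch below shows how such a point could be computed, assuming aligned DNA sequences, uncorrected percent divergence, and exclusion of sites with gaps or ambiguous bases; these choices and the example sequences are assumptions, since the chapter section does not spell them out. The ratio is returned as None when no transversions are observed, matching the omission noted in the caption.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def compare(seq_a, seq_b):
    """Return (percent divergence, Ts:Tv ratio or None) for two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    ts = tv = sites = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguity codes
        sites += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            ts += 1  # purine <-> purine or pyrimidine <-> pyrimidine
        else:
            tv += 1  # purine <-> pyrimidine
    if sites == 0:
        raise ValueError("no comparable sites")
    divergence = 100.0 * (ts + tv) / sites
    ratio = ts / tv if tv else None  # undefined without transversions
    return divergence, ratio


# Hypothetical aligned fragments: two transitions and one transversion in ten sites.
divergence, ratio = compare("ACGTACGTAC", "ATGTGCGTCC")
# A comparison with transitions only has no defined ratio.
_, undefined_ratio = compare("ACGT", "ATGT")
```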
FIGURE 5 Tests of homogeneity of results for X-src sequence data partitions under different character-transformation weighting schemes. (A) T-PTP test for first and second codon positions, 5:1 weighting scheme. Frequency histogram indicates length of shortest tree compatible with first and second position MP tree for 1:1 weighting (Fig. 2) minus length of shortest tree compatible with MP tree for 5:1 weighting (Fig. 2), for 200 permuted data sets weighted 5:1. Arrow indicates length difference for unpermuted data. (B) T-PTP test for first and second codon positions, 1:1 weighting scheme. Frequency histogram indicates length of shortest tree compatible with MP tree for 5:1 weighting minus length of shortest tree compatible with MP tree for 1:1 weighting, for 200 permuted data sets weighted 1:1. Arrow indicates length difference for unpermuted data. (C) Bootstrap combinability test for 1:1 and 5:1 weighted first and second positions. White bars indicate distribution of symmetric difference of components (SDc) between the original 1:1 MP tree and 200 bootstrap trees for 1:1 weighted data. Gray bars indicate distribution of SDc between the original 5:1 MP tree and 200 bootstrap trees for 5:1 weighted data. Arrow indicates SDc between original 1:1 and 5:1 MP trees. (D-F) Identical tests of homogeneity of results for 1:1 and 5:1 weighting schemes applied to third codon positions.
[Histograms not reproduced. Horizontal axes: tree length difference (A, B, D, E) and SDc (C, F). Legend p-values: first and second codon positions, (A) 5:1 weighting, p=0.965; (B) 1:1 weighting, p=0.170; (C) 1:1 weighting, p=0.830, and 5:1 weighting, p=0.675; third codon positions, (D) 5:1 weighting, p=0.960; (E) 1:1 weighting, p<0.005; (F) 1:1 weighting, p=0.315, and 5:1 weighting, p=0.350.]
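The T-PTP procedure summarized in the Fig. 5 caption can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the test as implemented by Faith (1991): both topologies are fully resolved and scored directly with equally weighted Fitch parsimony (the actual test searches for the shortest tree compatible with each topology), states are permuted among taxa within each character, and the p-value is taken as the proportion of permuted data sets whose length difference is at least the observed one. The trees and data are hypothetical.

```python
import random


def fitch_length(tree, states):
    """Fitch parsimony length of one unordered character on a rooted binary tree."""
    def visit(node):
        if isinstance(node, str):  # leaf: a taxon name
            return {states[node]}, 0
        left, right = node
        left_set, left_cost = visit(left)
        right_set, right_cost = visit(right)
        common = left_set & right_set
        if common:
            return common, left_cost + right_cost
        return left_set | right_set, left_cost + right_cost + 1
    return visit(tree)[1]


def tree_length(tree, matrix):
    """Sum of Fitch lengths over all characters in the matrix."""
    n_chars = len(next(iter(matrix.values())))
    return sum(
        fitch_length(tree, {taxon: s[i] for taxon, s in matrix.items()})
        for i in range(n_chars)
    )


def permute(matrix, rng):
    """Shuffle states among taxa independently within each character."""
    taxa = list(matrix)
    n_chars = len(matrix[taxa[0]])
    columns = []
    for i in range(n_chars):
        column = [matrix[t][i] for t in taxa]
        rng.shuffle(column)
        columns.append(column)
    return {t: [columns[i][j] for i in range(n_chars)] for j, t in enumerate(taxa)}


def t_ptp(own_tree, rival_tree, matrix, replicates=200, seed=1):
    """Observed (rival - own) length difference and its permutation p-value."""
    rng = random.Random(seed)
    observed = tree_length(rival_tree, matrix) - tree_length(own_tree, matrix)
    null = []
    for _ in range(replicates):
        shuffled = permute(matrix, rng)
        null.append(tree_length(rival_tree, shuffled) - tree_length(own_tree, shuffled))
    p = sum(d >= observed for d in null) / replicates
    return observed, p


# Hypothetical data: five characters favour (A,B)(C,D), one favours (A,C)(B,D).
matrix = {"A": "000000", "B": "000001", "C": "111110", "D": "111111"}
own = (("A", "B"), ("C", "D"))
rival = (("A", "C"), ("B", "D"))
observed, p_value = t_ptp(own, rival, matrix)
```

A small p-value indicates that the data support their own MP tree over the rival topology more strongly than permuted data do, which is the sense in which Fig. 5E is significant.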
example, no one taxon differs dramatically in placement between the partition MP trees. Rather, the entire arrangement of taxa within tribe Orestini (Orestias, Aphanius, and Kosswigichthys) is at odds (Figs. 3C and E) so there seems little prospect for improvement in pruning one or a few taxa; to resolve this conflict, most of the taxa whose relationships initially motivated collection of data would have to be eliminated from consideration. Thus from the prior agreement perspective (Bull et al., 1993), two different situations obtain with these phylogenetic analyses: combined analysis seems reasonable for pruned Cyprinodontoidei data, whereas consensus appears to be the only appropriate approach
for Cyprinodontidae data. In the interest of comparing analytical methods and evaluating the impact of prior agreement recommendations on the outcome of these analyses, combined and consensus analyses of both data sets were performed. For Cyprinodontoidei data less P. guatemalensis, two MP trees were found. They differ only in the placement of Fluviphylax pygmaeus; one of the two topologies was consistent with bootstrap support for F. pygmaeus as a sister group to the two Aplocheilichthys species and is shown in Fig. 10D. This phylogenetic hypothesis is almost identical to that derived from molecular data alone (Fig. 2E); it differs only in sister group relationships of Anablepidae (Anableps anableps and Jenynsia lineata) to Poeciliidae and Cubanichthys pengelleyi to Cyprinodontinae. The affinities of these taxa are poorly supported in the molecular partition tree; their position in the combined analysis MP tree is clearly a contribution of the morphological data (see Fig. 2F). Overall, the combined analysis produces a highly resolved phylogenetic hypothesis, most elements of which receive strong bootstrap support. This topology clearly supports Parenti's (1981) hypothesis of convergent evolution of viviparity, as do molecular data alone (Meyer and Lydeard, 1993); the only poorly supported relationship of any great import is that between the clade composed of Cubanichthys plus Cyprinodontinae and the remaining Cyprinodontoidei. Molecular data suggest that it may be basal to the other Cyprinodontoidei, which would be consistent with the age of this group as inferred from zoogeography (Parker and Kornfield, 1995), but convincing resolution awaits the collection of further data.
FIGURE 6 Tests of homogeneity of results for 1:1 and 5:1 weighting schemes applied to 16S rRNA sequence data partitions. (A-C) Helix-forming regions. (D-F) Loop-forming regions. Tests are exactly analogous to those described in Figs. 5A-F.
[Histograms not reproduced. Horizontal axes: tree length difference (A, B, D, E) and SDc (C, F). Legend p-values: helix-forming regions, (A) 5:1 weighting, p=0.525; (B) 1:1 weighting, p=0.225; (C) 5:1 weighting, p=0.905, and 1:1 weighting, p=0.970; loop-forming regions, (D) 5:1 weighting, p=0.445; (E) 1:1 weighting, p=0.065; (F) 5:1 weighting, p=0.865, and 1:1 weighting, p=0.645.]
The most highly resolved consensus tree of the three constructed from Cyprinodontoidei partition trees (Fig. 10) is the semistrict or combinable component consensus. This tree and the strict consensus are compatible with the combined analysis MP tree. This is not the case for the Adams consensus; the shortest tree compatible with this topology requires eight more steps than the combined MP tree (Table II). All three consensus trees accomplish the objective for which molecular data were gathered: they confirm that viviparity has evolved repeatedly in cyprinodontoids. Only the combinable component consensus, however, makes clear that three separate evolutionary events have taken place, once each within Anablepidae, Poeciliidae, and Goodeidae. None of the consensus trees are as fully resolved as the combined MP tree, however, and none shed much light on relationships among the six families of cyprinodontoids represented. For combined Cyprinodontidae data, three MP trees were found; their majority-rule consensus is shown in
FIGURE 7 Tests of combinability for DNA sequence data partitions. (A and B) T-PTP tests of X-src first and second versus third codon positions. (C) Bootstrap combinability test of X-src first and second versus third codon positions. (D and E) T-PTP tests of 16S rRNA helix- versus loop-forming regions. (F) Bootstrap combinability test of 16S rRNA helix- versus loop-forming regions. Diagrams are analogous to those described for Figs. 5A-F.
[Histograms not reproduced. Panel groups: first and second vs. third codon positions (A-C); helix- vs. loop-forming regions (D-F). Horizontal axes: tree length difference (A, B, D, E) and SDc (C, F). Legend p-values for the codon position panels: (A and B) first and second positions, p<0.005; third positions, p<0.005; (C) third positions, p=0.405; first and second positions, p=0.370.]
Fig. 11D. Despite marked differences between the partition MP trees, this tree differs from the molecular MP tree (Fig. 3D) in one aspect only: the relationship between Aphanius dispar, A. fasciatus, and A. mento. Thus this combined analysis implies the same phylogenetic and zoogeographic conclusions derived from molecular data alone (Parker and Kornfield, 1995). Detractors of data combination might argue, however, that the similarity of combined and molecular-only results simply reflects the fact that there are over twice as many molecular as morphological data: 69 parsimony-informative molecular characters versus 31 morphological (the Cyprinodontoidei data are far less imbalanced, with 84 molecular and 74 morphological characters). This imbalance might be rectified in two ways: either by increasing the weight given to morphological characters in the combined analysis or by consensus methods, which weight each partition equally.
Reanalysis of combined data, with each morphological character given twice the weight assigned to molecular characters, results in an MP tree identical to the morphological partition tree save for the relationships among the three Orestias species (Fig. 3E). Bootstrap resampling of this reweighted data, however, reveals that equal weighting of these conflicting data partitions serves only to reduce the number of well-supported topological elements. In this case the only nodes appearing in over 50% of bootstrap trees are those indicating monophyly of Cyprinodontinae, Cyprinodontini, Orestini, and Orestias (Fig. 11D). This topology, while supporting Parenti's (1981) assignment of Orestias to Cyprinodontidae, fails to resolve not only the relationship of Orestias to Aphanius and Kosswigichthys, but any pattern of relationship at all among members of the two tribes. It is clear from this result that application of
FIGURE 8 Tests of combinability for X-src sequence data and morphological characters. (A and B) T-PTP tests of molecular and morphological characters for all taxa. (C) Bootstrap combinability test of molecular and morphological characters. (D and E) T-PTP tests of molecular and morphological characters excluding Profundulus guatemalensis. (F) Bootstrap combinability test for molecular and morphological characters excluding P. guatemalensis.
[Histograms not reproduced. Panel groups: X-src sequence vs. morphology, including Profundulus guatemalensis (A-C); X-src sequence vs. morphology, excluding Profundulus guatemalensis (D-F). Horizontal axes: tree length difference (A, B, D, E) and SDc (C, F). Legend p-values: (C) X-src sequences, p=0.180; morphology, p<0.005; (D and E) X-src sequence, p<0.005; morphology, p<0.005; (F) X-src sequences, p=0.495; morphology, p=0.020.]
FIGURE 9 Tests of combinability for 16S rRNA sequence data and morphological characters. (A and B) T-PTP tests. (C) Bootstrap combinability test.
[Histograms not reproduced. Panel group: 16S rRNA loop-forming regions vs. morphology. Horizontal axes: tree length difference (A, B) and SDc (C). Legend p-values: (A and B) 16S rRNA (loops), p<0.005; morphology, p<0.005; (C) loops, p=0.025; morphology, p<0.005.]
FIGURE 10 Hypotheses of cyprinodontoid relationships derived from X-src DNA sequence data and morphological characters, excluding P. guatemalensis. (A) Strict consensus of molecular (Fig. 2) and morphological (Fig. 2) trees. (B) Adams consensus of molecular and morphological trees. (C) Semistrict consensus. (D) Combined analysis MP tree, all character state transformations unordered and equally weighted; length = 361, CI = 0.681, 158 informative characters. Numbers above nodes indicate their frequency of occurrence in a 50% majority-rule bootstrap consensus tree (200 replicates). Reanalysis including data for P. guatemalensis results in its placement as indicated by the dashed line; tree topology is otherwise unaltered.
[Tree diagrams not reproduced. Panels: (A) strict consensus; (B) Adams consensus; (C) semistrict consensus; (D) combined analysis.]
cladogram-based, character-weighting schemes (e.g., Farris, 1969; Williams and Fitch, 1989), which iteratively increase the weight applied to characters exhibiting greater consistency with the cladogram derived prior to reweighting, would simply result in convergence on either the morphological or the molecular MP tree, depending on the weighting scheme initially employed. In the interest of more fully exploring the
TABLE II Maximum Parsimony Tree Lengths and Consistency Indices for Separate and Combined Analyses
Columns: (1) Molecular MP tree; (2) Morphology MP tree; (3) Sum of molecular and morphology; (4) Combined analysis MP tree; (5) Compatible with Adams consensus

Data set                                                  (1)     (2)     (3)    (4)     (5)
Cyprinodontidae                               Length:     275     37      312    327     332
                                              CI:         0.622   0.917          0.624   0.611
Cyprinodontoidei excluding P. guatemalensis   Length:     238     115     353    361     364
                                              CI:         0.643   0.809          0.681   0.676
Cyprinodontoidei including P. guatemalensis   Length:     243     117     360    371     379
                                              CI:         0.634   0.803          0.668   0.654
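The "Sum of molecular and morphology" column of Table II gives a floor for the combined tree length, because no partition can be shorter on the combined MP tree than on its own MP tree. The excess of the combined length over that sum is therefore one simple indication of how much the partitions disagree. The short calculation below works this out from the tree lengths in Table II; treating the excess as an incongruence measure is an interpretation offered here, not a statistic reported in the chapter.

```python
# (molecular MP length, morphology MP length, combined-analysis MP length),
# as read from Table II.
TABLE_II = {
    "Cyprinodontidae": (275, 37, 327),
    "Cyprinodontoidei excluding P. guatemalensis": (238, 115, 361),
    "Cyprinodontoidei including P. guatemalensis": (243, 117, 371),
}


def extra_steps(molecular, morphology, combined):
    """Steps the combined MP tree needs beyond the sum of separate MP tree lengths."""
    return combined - (molecular + morphology)


summary = {name: extra_steps(*lengths) for name, lengths in TABLE_II.items()}
for name, steps in summary.items():
    print(f"{name}: {steps} extra steps")
```

On these figures, pruning P. guatemalensis lowers the excess for the Cyprinodontoidei data from 11 to 8 steps, while the Cyprinodontidae data show an excess of 15 steps.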
FIGURE 11 Hypotheses of cyprinodontid relationships derived from 16S rRNA and morphological characters. (A) Strict consensus of molecular (Fig. 3) and morphological (Fig. 3) trees. (B) Adams consensus of molecular and morphological trees. (C) Semistrict consensus. (D) Combined analysis MP tree, all characters unordered and weighted equally; length = 327, CI = 0.624, 100 informative characters. Numbers above nodes indicate their frequency of occurrence in a 50% majority-rule bootstrap tree (200 replicates).
[Tree diagrams not reproduced. Panels: (A) strict consensus; (B) Adams consensus; (C) semistrict consensus; (D) combined analysis.]
TABLE III Step Matrix Character State Transformation Weights for 16S rRNA Loop-Forming Regions(a)

              To
From      A       C       G       T
A         -       0.994   0.744   0.978
C         1.155   -       1.465   0.914
G         1.094   1.650   -       1.501
T         1.050   0.826   1.224   -

(a) Calculated using the combinatorial weights procedure of Wheeler (1990).
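A step matrix such as Table III enters the analysis when tree length is computed: each change along a branch costs the matrix entry for that transformation. The following is a minimal sketch of this calculation using Sankoff-style dynamic programming, offered as an illustration rather than the software used in the chapter. To keep the arithmetic easy to follow it uses the simpler 5:1 and 1:1 Tv:Ts weighting schemes applied elsewhere in the chapter instead of the Table III values; the tree and character states are hypothetical.

```python
BASES = "ACGT"
INF = float("inf")
PURINES = {"A", "G"}


def tv_ts_matrix(tv_cost, ts_cost):
    """Step matrix with one cost for transversions and one for transitions."""
    cost = {}
    for x in BASES:
        for y in BASES:
            if x == y:
                cost[x, y] = 0.0
            elif (x in PURINES) == (y in PURINES):
                cost[x, y] = float(ts_cost)  # transition
            else:
                cost[x, y] = float(tv_cost)  # transversion
    return cost


def sankoff(tree, states, cost):
    """Minimum weighted length of one character on a rooted binary tree.

    cost[parent, child] is the cost of changing from the parent's state to
    the child's state along a branch.
    """
    def visit(node):
        if isinstance(node, str):  # leaf: only the observed state is allowed
            return {b: (0.0 if b == states[node] else INF) for b in BASES}
        tables = [visit(child) for child in node]
        return {
            parent: sum(min(cost[parent, c] + table[c] for c in BASES) for table in tables)
            for parent in BASES
        }
    return min(visit(tree).values())


# Hypothetical four-taxon tree and one nucleotide position.
tree = (("t1", "t2"), ("t3", "t4"))
states = {"t1": "A", "t2": "G", "t3": "C", "t4": "C"}
length_5_to_1 = sankoff(tree, states, tv_ts_matrix(5, 1))
length_1_to_1 = sankoff(tree, states, tv_ts_matrix(1, 1))
```

The character needs one transition and one transversion, so its length is 6 under 5:1 weighting and 2 under 1:1 weighting. With an asymmetric matrix such as Table III, the result also depends on where the tree is rooted, since a change and its reverse carry different costs.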
suggestion of Chippindale and Wiens (1994) that character weighting might be employed to resolve incompatibility of data sets, character state transformation weights for 16S rRNA loop-forming nucleotide positions were also calculated in the form of a step matrix (Table III), using the combinatorial weighting procedure of Wheeler (1990). Reanalysis of combined data using this weighting scheme, however, again results in convergence on the molecular tree when morphological characters are assigned weights equal to the mean weight given to molecular characters. Doubling the weight assigned to morphological characters results, as described earlier, in reconstruction of the morphological MP tree, except for relationships among Orestias species. Similar ambiguities arise using consensus approaches; the strict consensus (Fig. 11A) in fact retains exactly those relationships supported by bootstrap resampling of combined, reweighted data. Both Adams and semistrict consensus procedures produce slightly more resolved topologies (Fig. 11), but neither topology allows determination of whether Orestias is a component of a paraphyletic Aphanius and Kosswigichthys or sister to a monophyletic group of Anatolian cyprinodonts. As was the case in the Cyprinodontoidei analysis, the strict and semistrict consensus trees are compatible with the combined MP tree, whereas the most parsimonious tree compatible with the Adams consensus is three steps longer (Table II).
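The difference in resolution between strict and semistrict (combinable component) consensus trees follows directly from their definitions. The sketch below illustrates both for two trees written as sets of clades: the strict consensus keeps only clades present in both trees, whereas the semistrict consensus also keeps any clade that is not contradicted by the other tree. The representation, function names, and example trees are assumptions made for illustration; the Adams consensus is not covered.

```python
def compatible(clade_1, clade_2):
    """Two clades can occur in the same tree if they are nested or disjoint."""
    return clade_1 <= clade_2 or clade_2 <= clade_1 or not (clade_1 & clade_2)


def strict_consensus(tree_a, tree_b):
    """Clades found in both trees."""
    return tree_a & tree_b


def semistrict_consensus(tree_a, tree_b):
    """Clades of either tree that conflict with no clade of the other tree."""
    keep_a = {c for c in tree_a if all(compatible(c, d) for d in tree_b)}
    keep_b = {c for c in tree_b if all(compatible(c, d) for d in tree_a)}
    return keep_a | keep_b


fs = frozenset
# Hypothetical trees on taxa A-E: each resolves a part the other leaves open.
tree_a = {fs("AB"), fs("ABC")}
tree_b = {fs("ABC"), fs("DE")}
strict = strict_consensus(tree_a, tree_b)
semistrict = semistrict_consensus(tree_a, tree_b)

# A third hypothetical tree that contradicts tree_a over (A, B) versus (B, C).
tree_c = {fs("BC"), fs("ABC")}
semistrict_conflict = semistrict_consensus(tree_a, tree_c)
```

Where the trees merely differ in resolution, the semistrict consensus combines their clades; where they conflict, it collapses to the strict result, which is why it is the most resolved of the consensus trees without ever contradicting either source tree.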
IX. Conclusions
The two sets of molecular and morphological characters assembled and reanalyzed here illustrate two very different outcomes of data combination. Both analyses, however, provide useful perspectives on methods of data combination and the justifications offered for their use. The Cyprinodontoidei data are an excellent example of the desired outcome: each data set provides some information largely lacking in the other (relationships within Poeciliinae in the case of molecular data and the relationship of Cubanichthys to Cyprinodontinae in morphological data are examples), and the final result is a highly resolved and well-supported phylogenetic hypothesis. Here, analysis of combined data preserved information that was lost in the construction of consensus cladograms, including relationships that were strongly supported in analysis of combined data. Was exclusion of P. guatemalensis, as required to satisfy the tenets of the prior agreement approach, necessary to obtain these results? Reanalysis of combined data, including this taxon, results in a majority-rule bootstrap consensus tree essentially identical to the previous combined analysis, placing P. guatemalensis in an unresolved trichotomy with F. heteroclitus and Goodeidae (dashed line in Fig. 10D). In the case of the Cyprinodontidae data, neither combination nor consensus proved particularly satisfactory, as the combined MP tree changed dramatically depending on whether weighting was applied to equally weight characters or data sets, and consensus trees failed to resolve many relationships of interest. Ignoring for the moment the incompatibility between loop-forming and morphological data partitions indicated by application of prior agreement methods (Fig. 9), did exclusion of the even more incompatible helix-forming data partition (Figs. 7D-F) have a significant impact on this combined analysis? Reinclusion of these characters, weighted equally, alters the MP tree topology only in the relationship among the previously described three Aphanius species; bootstrap values are also slightly altered, but the same nodes receive, or do not receive, support in excess of 50% of replications (data not shown).
In both cases, the semistrict consensus tree illustrates those relationships which, due to their occurrence in the MP trees for both data sets, receive "independent confirmation." From the perspective of Miyamoto and Fitch (1995), such confirmation is obscured by data combination. Especially in the Cyprinodontoidei analysis, however, these relationships receive the greatest degree of bootstrap support in combined analysis (Fig. 10). Thus even if data were to be treated only in combination, these are the relationships in which we would place the greatest confidence. It is also noteworthy that in both cases the Adams consensus procedure produced a cladogram incompatible with the most parsimonious explanation of all available data (the combined MP tree). Combinability testing, as advocated by the "prior agreement" approach of Bull et al. (1993), proved useful in identifying data partitions that conflicted with each other, clearly illustrating a profound and intriguing incompatibility between helix- and loop-forming regions in 16S rRNA data. Adherence to its guidelines, however, did not greatly alter the outcome of either analysis. Reinclusion of excluded taxa and data partitions did not significantly alter the phylogenetic hypotheses that were arrived at, and despite unambiguous rejection of combinability for the unpruned Cyprinodontoidei data partitions, information would clearly have been lost by refraining from combined analysis. In particular, the T-PTP test seems to be far too stringent for the purpose of combinability testing (see Figs. 5-9), although this observation clearly does not compromise its utility in the analyses for which it was developed (Faith, 1991). Additionally, it appears that when data sets are truly incongruent, results of both combined and consensus analyses will reveal this. In cases of profound incompatibility, as in the Cyprinodontidae data reanalyzed here, it is apparent that neither method of analysis can produce a robust result, regardless of application of differential character-weighting schemes (Chippindale and Wiens, 1994). In such cases collection of further data seems to be the only viable course of action.
Acknowledgements
The author thanks Irv Kornfield for comments on this manuscript and Lynne Parenti for expert advice on killifish systematics. This work was supported by the National Science Foundation (DEB 9311727, to I. Kornfield) and the University of Maine Center for Marine Studies.
References

Adams, E. N., III. 1972. Consensus techniques and the comparison of taxonomic trees. Syst. Zool. 21:390-397.
Alves-Gomes, J. A., Ortí, G., Haygood, M., Heiligenberg, W., and Meyer, A. 1995. Phylogenetic analysis of the South American electric fishes (order Gymnotiformes) and the evolution of their electrogenic system: A synthesis based on morphology, electrophysiology, and mitochondrial sequence data. Mol. Biol. Evol. 12:298-318.
Archie, J. W. 1989. A randomization test for phylogenetic information in systematic data. Syst. Zool. 38:239-252.
Barrett, M. M., Donoghue, M. J., and Sober, E. 1991. Against consensus. Syst. Zool. 40:486-493.
Barthélemy, J. P., and Monjardet, B. 1981. The median procedure in cluster analysis and social choice theory. Math. Soc. Sci. 1:235-267.
Bremer, K. 1990. Combinable component consensus. Cladistics 6:369-372.
Bull, J. J., Huelsenbeck, J. P., Cunningham, C. W., Swofford, D. L., and Waddell, P. J. 1993. Partitioning and combining data in phylogenetic analysis. Syst. Biol. 42:384-397.
Carnap, R. 1950. "Logical Foundations of Probability." University of Chicago Press, Chicago.
Carpenter, J. M. 1988. Choosing among multiple equally parsimonious cladograms. Cladistics 4:291-296.
Chippindale, P. T., and Wiens, J. J. 1994. Weighting, partitioning, and combining data in phylogenetic analysis. Syst. Biol. 43:278-287.
de Queiroz, A. 1993. For consensus (sometimes). Syst. Biol. 42:368-372.
Donoghue, M. J., and Sanderson, M. J. 1992. The suitability of molecular and morphological evidence in reconstructing plant phylogeny. In "Molecular Systematics of Plants" (P. S. Soltis, D. E. Soltis, and J. J. Doyle, eds.), pp. 340-368. Chapman and Hall, New York.
Doyle, J. J. 1992. Gene trees and species trees: Molecular systematics as one-character taxonomy. Syst. Bot. 17:144-163.
Echelle, A. A., and Echelle, A. F. 1993. Allozyme variation and systematics of the New World cyprinodontines (Teleostei: Cyprinodontidae). Biochem. Syst. Ecol. 21:583-590.
Eernisse, D. J., and Kluge, A. G. 1993. Taxonomic congruence versus total evidence, and amniote phylogeny inferred from fossils, molecules, and morphology. Mol. Biol. Evol. 10:1170-1195.
Faith, D. P. 1991. Cladistic permutation tests for monophyly and nonmonophyly. Syst. Zool. 40:366-375.
Faith, D. P., and Cranston, P. S. 1991. Could a cladogram this short have arisen by chance alone? Cladistics 7:1-28.
Farris, J. S. 1969. A successive approximations approach to character weighting. Syst. Zool. 18:374-385.
Farris, J. S. 1988. Hennig86. Computer software.
Felsenstein, J. F. 1978. Cases in which parsimony or compatibility will be positively misleading. Syst. Zool. 27:401-410.
Felsenstein, J. F. 1985. Confidence limits on phylogenies: An approach using the bootstrap. Evolution 39:783-791.
Felsenstein, J. F. 1992. "PHYLIP (Phylogenetic Inference Package, computer software), Version 3.5." University of Washington, Seattle, WA.
Glotz, C., Zwieb, C., and Brimacombe, R. 1981. Secondary structure of the large subunit ribosomal RNA from Escherichia coli, Zea mays chloroplast, and human and mouse mitochondrial ribosomes. Nucleic Acids Res. 9:3287-3306.
Gould, S. J., Woodruff, D. S., and Martin, J. P. 1974. Genetics and morphometrics of Cerion at Pongo Carpet: A new systematic approach to this enigmatic land snail. Syst. Zool. 23:518-535.
Hedges, S. B. 1992. The number of replications required for accurate estimation of the bootstrap P value in phylogenetic studies. Mol. Biol. Evol. 9:366-369.
Hendy, M. D., Little, C. H. C., and Penny, D. 1984. Comparing trees with pendant vertices labelled. SIAM J. Appl. Math. 44:1054-1065.
Hillis, D. M. 1987. Molecular versus morphological approaches to systematics. Annu. Rev. Ecol. Syst. 18:23-42.
Jaeger, J. A., Turner, D. H., and Zuker, M. 1989. Predicting optimal and suboptimal secondary structure for RNA. Meth. Enzymol. 183:281-306.
Jukes, T. H., and Cantor, C. R. 1969. Evolution of protein molecules. In "Mammalian Protein Metabolism" (H. H. Munro, ed.), pp. 21-132. Academic Press, New York.
Kimura, M. 1980. A simple method for estimating evolutionary rate of base substitutions through comparative studies of nucleotide sequences. J. Mol. Evol. 16:111-120.
Kluge, A. G., and Wolf, A. J. 1993. Cladistics: What's in a word? Cladistics 9:183-199.
Kocher, T. D., Thomas, W. K., Meyer, A., Edwards, S. V., Pääbo, S., Villablanca, F. X., and Wilson, A. C. 1989. Dynamics of mitochondrial DNA evolution in animals: Amplification and sequencing with conserved primers. Proc. Natl. Acad. Sci. USA 86:6196-6200.
Kumar, S., Tamura, K., and Nei, M. 1993. MEGA: Molecular Evolutionary Genetics Analysis. The Pennsylvania State University, University Park.
Lafay, B., Smith, A. B., and Christen, R. 1994. A combined morphological and molecular approach to the phylogeny of asteroids (Asteroidea: Echinodermata). Syst. Biol. 44:190-208.
Lydeard, C., Wooten, M. C., and Meyer, A. 1995. Molecules, morphology, and area cladograms: A cladistic and biogeographic analysis of Gambusia (Teleostei: Poeciliidae). Syst. Biol. 44:221-236.
Marshall, C. R. 1992. Character analysis and the integration of molecular and morphological data in an understanding of sand dollar phylogeny. Mol. Biol. Evol. 9:309-322.
McKenna, M. C. 1987. Molecular and morphological analysis of high-level mammalian interrelationships. In "Molecules and Morphology in Evolution, Conflict or Compromise?" (C. Patterson, ed.), pp. 55-94. Cambridge University Press, Cambridge.
Meyer, A., and Lydeard, C. 1993. The evolution of copulatory organs, internal fertilization, placentas, and viviparity in killifishes (Cyprinodontiformes), as inferred from a DNA phylogeny of the tyrosine kinase gene X-src. Proc. R. Soc. Lond. B 254:153-162.
Mickevich, M. F., and Johnson, M. S. 1976. Congruence between morphological and allozyme data in evolutionary inference and character evolution. Syst. Zool. 25:260-270.
Mickevich, M. F., and Farris, J. S. 1981. The implications of congruence in Menidia. Syst. Zool. 27:143-158.
Miyamoto, M. M. 1985. Consensus cladograms and general classifications. Cladistics 1:186-189.
Miyamoto, M. M., and Fitch, W. M. 1995. Testing species phylogenies and phylogenetic methods with congruence. Syst. Biol. 44:64-76.
Nei, M. 1987. "Molecular Evolutionary Genetics." Columbia University Press, New York.
Nelson, G. J. 1979. Cladistic analysis and synthesis: Principles and definitions, with a historical note on Adanson's Familles des Plantes (1763-1764). Syst. Zool. 28:1-21.
Page, R. D. M. 1989a. Comments on component-compatibility in historical biogeography. Cladistics 5:167-182.
Page, R. D. M. 1989b. "COMPONENT ver. 1.5." Department of Zoology, University of Auckland, New Zealand.
Parenti, L. R. 1981. A phylogenetic and biogeographic analysis of cyprinodontiform fishes. Bull. Am. Mus. Nat. Hist. 168:341-557.
Parenti, L. R. 1984. A taxonomic revision of the Andean killifish genus Orestias. Bull. Am. Mus. Nat. Hist. 178:110-214.
Parker, A., and Kornfield, I. 1995. A molecular perspective on evolution and zoogeography of cyprinodontid killifishes. Copeia 1995:8-21.
Patterson, C. 1987. Introduction. In "Molecules and Morphology in Evolution, Conflict or Compromise?" (C. Patterson, ed.), pp. 1-22. Cambridge University Press, Cambridge.
Patterson, C. 1988. Homology in classical and molecular biology. Mol. Biol. Evol. 5:603-625.
Rodrigo, A. G., Kelly-Borges, M., Bergquist, P. R., and Bergquist, P. L. 1993. A randomisation test of the null hypothesis that two cladograms are sample estimates of a parametric phylogenetic tree. N. Zeal. J. Bot. 31:257-268.
Ruvolo, M., Pan, D., and Von Dornum, M. 1994. Gene trees and hominid phylogeny. Proc. Natl. Acad. Sci. USA 91:8900-8911.
Sibley, C. G., and Ahlquist, J. E. 1987. Avian phylogeny reconstructed from comparisons of the genetic material, DNA. In "Molecules and Morphology in Evolution, Conflict or Compromise?" (C. Patterson, ed.), pp. 95-122. Cambridge University Press, Cambridge.
Siddall, M. E. 1993. "RANDOM CLADISTICS ver. 2.1.1." University of Toronto, Toronto, Ontario.
Sneath, P. H. A., and Sokal, R. R. 1973. "Numerical Taxonomy." Freeman, San Francisco.
Swofford, D. L. 1990. "PAUP: Phylogenetic analysis using parsimony, v3.0." Illinois Natural History Survey, Champaign, IL.
Swofford, D. L. 1991. When are phylogeny estimates from molecular and morphological data incongruent? In "Phylogenetic Analysis of DNA Sequences" (M. M. Miyamoto and J. Cracraft, eds.), pp. 295-333. Oxford University Press, New York.
Swofford, D. L., and Olsen, G. J. 1990. Phylogeny reconstruction. In "Molecular Systematics" (D. M. Hillis and C. Moritz, eds.), pp. 411-501. Sinauer, Sunderland, MA.
Sytsma, K. J. 1990. DNA and morphology: Inference of plant phylogeny. Trends Ecol. Evol. 5:104-110.
Sytsma, K. J., Smith, J. F., and Berry, P. E. 1991. Biogeography and evolution of morphology, breeding systems, flavonoids, and chloroplast DNA in the four Old World species of Fuchsia (Onagraceae). Syst. Bot. 16:257-269.
Tamura, K. 1992. Estimation of the number of nucleotide substitutions when there are strong transition-transversion and G+C content biases. Mol. Biol. Evol. 9:678-687.
Turbeville, J. M., Schulz, J. R., and Raff, R. A. 1994. Deuterostome phylogeny and the sister group of the chordates: Evidence from molecules and morphology. Mol. Biol. Evol. 11:648-655.
Vrana, P. B., Milinkovitch, M. C., Powell, J. R., and Wheeler, W. C. 1994. Higher-level relationships of arctoid Carnivora based on sequence data and "total evidence." Mol. Phyl. Evol. 3:47-58.
Wainright, P. O., Hinkle, G., and Sogin, M. L. 1993. Monophyletic origins of the Metazoa: An evolutionary link with fungi. Science 260:340-341.
Wheeler, W. C. 1990. Combinatorial weights and phylogenetic analysis: A statistical parsimony procedure. Cladistics 6:269-275.
Williams, P. L., and Fitch, W. M. 1989. Finding the minimal change on a given tree. In "The Hierarchy of Life" (B. Fernholm et al., eds.), pp. 453-470. Elsevier, Amsterdam.
APPENDIX I: Morphological characters

Characters and character states are taken from Parenti (1981, 1984). Each entry gives the character number, the character, and its coded states.

1. Orbital rim: 0, free; 1, attached
2. Mesethmoid: 0, ossified; 1, cartilaginous
3. Pelvic fin supports: 0, nominal; 1, close set
4. Lacrimal: 0, wide and flat; 1, narrow and twisted
5. Anterior basihyal: 0, slender; 1, broad
6. Anterior naris: 0, small; 1, tubular
7. Cephalic sensory pores: 0, neuromasts; 1, pores; 2, absent
8. Basibranchials: 0, three ossified; 1, two ossified
9. Dorsal hypohyal: 0, absent; 1, present
10. Interarcual cartilage: 0, large; 1, reduced
11. Autopalatine: 0, nominal; 1, anteroventral extension
12. Head of autopalatine: 0, straight; 1, offset and flanged; 2, reduced
13. Metapterygoid: 0, absent; 1, present
14. Alveolar arm of premaxilla: 0, straight; 1, posterior indentation
15. Dentary: 0, thin; 1, medial expansion; 2, medial extension
16. First dorsal fin ray: 0, true first ray; 1, true second ray; 2, spine-like true second ray
17. Maxillary-rostral ligament: 0, present; 1, absent
18. Ethmo-maxillary ligament: 0, present; 1, absent
19. Maxillary-premaxillary meniscus: 0, present; 1, absent
20. Premaxillary process: 0, flat and broad; 1, narrow/reduced
21. Rostral cartilage: 0, reduced; 1, large and rectangular; 2, dumbbell shaped
22. Inner arm of maxillary: 0, does not contact rostral cartilage; 1, abuts rostral cartilage
23. Articulation of lateral ethmoid with autopalatine: 0, present; 1, reduced; 2, absent
24. Dorsal process of maxillary: 0, absent; 1, rounded/reduced; 2, lateral indentation; 3, medial expansion
25. Nasal: 0, nominal; 1, medial expansion
26. Lateral ethmoid: 0, medial expansion/perpendicular to frontal; 1, lateral facet articulates with head of autopalatine; 2, expanded; 3, nominal
27. Autopterotic fossa: 0, nominal; 1, reduced; 2, wide
28. Anal inclinators: 0, nominal; 1, enlarged; 2, fan-shaped
29. Distal arm of maxillary: 0, reduced; 1, nominal; 2, enlarged
30. Anterior arm of parasphenoid: 0, nominal; 1, expanded
31. Retroarticular: 0, nominal; 1, elongate; 2, extremely elongate
32. Urogenital pouch: 0, absent; 1, present
33. Fourth pharyngiobranchial toothplate: 0, nominal; 1, reduced; 2, fused to third
34. Outer tooth arrangement: 0, multiserial; 1, uniserial
35. Second pharyngiobranchial: 0, not offset; 1, offset to third
36. Parietal: 0, absent; 1, present
37. Meckel's cartilage: 0, narrow; 1, posterior expansion
38. Transverse processes of vertebrae: 0, nominal; 1, reduced
39. First postcleithrum: 0, absent; 1, present
40. Branchiostegal rays: 0, covered; 1, exposed
41. Dermosphenotic: 0, nominal; 1, reduced
42. Preopercular: 0, nominal; 1, reduced
43. Vomer: 0, present; 1, posteriorly triangular; 2, absent
44. Anterior process of maxilla: 0, absent; 1, present
45. Radials: 0, nominal; 1, dorsally placed
46. Ventral hypohyal: 0, nominal; 1, expanded
47. Anterior ceratohyal: 0, nominal; 1, no ventral extension; 2, separated from posterior ceratohyal
48. First several hemal spines: 0, without pleural ribs; 1, with pleural ribs
49. Pelvic fins: 0, anterior; 1, posterior; 2, absent
50. Supraorbital pores: 0, 2b-4a recessed; 1, nominal; 2, reduced
51. Anal rays 1 and 2: 0, nominal; 1, reduced
52. Anal ray 3: 0, reduced; 1, nominal; 2, gonopodium
53. Anal ray 4: 0, reduced; 1, nominal; 2, gonopodium
54. Anal ray 5: 0, reduced; 1, nominal; 2, gonopodium
55. Anal ray 6: 0, reduced; 1, nominal; 2, gonopodium
56. Anal ray 7: 0, reduced; 1, nominal; 2, gonopodium
57. Hemal arches: 0, unmodified; 1, modified to support gonopodium
58. Fertilization: 0, external; 1, internal
59. Fourth epibranchial: 0, unmodified; 1, modified to support dorsal gill arch elements
60. Exoccipital condyles: 0, absent; 1, present
61. Neural arches: 0, open; 1, closed; 2, first pair angled anteriorly
62. Supraoccipital processes: 0, nominal; 1, enlarged; 2, enlarged and single
63. Epiotic processes: 0, nominal; 1, enlarged
64. Sexual laterality: 0, absent; 1, present
65. Inner teeth: 0, lateral cusps; 1, without lateral cusps; 2, absent
66. Middle anal radials: 0, present; 1, first 2-5 reduced, absent or fused to proximals; 2, cartilaginous
67. Distal arm of premaxilla: 0, S-shaped; 1, straight
68. Articular: 0, nominal; 1, reduced
69. Reproductive system: 0, oviparous; 1, viviparous
70. Sperm transfer organ: 0, absent; 1, gonopodium; 2, tubular gonopodium; 3, muscular intromittent organ
71. Embryonic trophotaeniae: 0, absent; 1, present
72. Ovaries: 0, paired; 1, without ovigerous tissue in walls; 2, fused
73. Proximal anal radials: 0, nominal; 1, some elongate
74. Supraoccipital: 0, free; 1, fused to foramen magnum
75. Pharyngiobranchial teeth: 0, not in rows; 1, in discrete rows
76. Lower limb of posttemporal: 0, ossified; 1, cartilaginous
77. Number of vertebrae: 0, mode ≤28; 1, mode >28
78. Interhyal: 0, ossified; 1, cartilaginous
79. Urohyal: 0, free; 1, embedded, leading to 90° angle of lower mandible
80. Number of dorsal fin rays: 0, mode ≤15; 1, mode >15
81. Midlateral blotch: 0, absent; 1, present
82. Suborbital bar: 0, absent; 1, present
83. First pharyngiobranchial: 0, absent; 1, present
84. Angularticular: 0, without ventral extension; 1, with ventral extension
85. Dorsal scales of head: 0, nominal; 1, enlarged and weakly striated
86. Lateral scales of head and forebody: 0, nominal; 1, enlarged and weakly striated
87. Outer tooth form: 0, unicuspid; 1, tricuspid; 2, bicuspid; 3, absent
APPENDIX II: Morphological data employed in analysis of cyprinodontoid relationships

Taxa, numbered in the order in which their character states are listed:

1. Cynolebias whitei
2. Rivulus sp.
3. R. harti
4. Nothobranchius melanospilus
5. Profundulus guatemalensis
6. Fundulus heteroclitus
7. Poecilia caucana
8. Xiphophorus maculatus
9. X. signum
10. Cnesterodon decemmaculatus
11. Tomeurus gracilis
12. Aplocheilichthys kassenjiensis
13. A. spilauchen
14. Fluviphylax pygmaeus
15. Anableps anableps
16. Jenynsia lineata
17. Crenichthys baileyi
18. Zoogoneticus quitzeoensis
19. Xenotoca eiseni
20. Cubanichthys pengellyi
21. Cyprinodon rubrofluviatilis
22. Jordanella floridae

Characters 1-30 (one row per character; states listed for taxa 1-22 in order):

 1   1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 2   1 1 1 1 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0
 3   1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 4   1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 5   1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 6   1 1 1 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 1 0 0
 7   0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 8   0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 9   1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
10   0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
11   0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
12   0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 1 1 1
13   1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
14   0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
15   0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
16   0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
17   0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
18   0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
19   0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
20   0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
21   1 1 1 1 1 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0
22   1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
23   0 0 0 0 0 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
24   0 0 0 0 0 0 2 2 2 2 2 2 2 2 2 2 1 1 1 3 3 3
25   0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
26   0 0 0 1 1 1 3 3 3 3 3 2 2 3 3 3 0 0 0 0 0 0
27   0 0 0 0 2 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1
28   0 0 0 0 0 0 0 2 2 2 2 2 0 0 0 0 1 1 1 1 1 1
29   0 0 0 0 0 0 2 2 2 2 2 2 2 2 2 2 1 1 1 2 2 2
30   0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0
3 3 3 3 3 3 3 3 3 4 4 4 4 4 4 4 4 4 4 5 5 5 5 5 5 5 5 5 5 6 6 6 6 6 6 6 6 6 6 7 7 7 7 8 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 8 0 0 0 0 0 0 0 0 0 o 1 2 2
0 0 0 0 0 0 1 1 1 i 1 1 1
0 0 0 0 0 0 0 0 0 o 1 0 0
0 0 0 0 0 0 0 0 0 ~ 0 0 0
0 0 0 0 0 0 0 0 0 o 0 0 0
1 1 1 1 1 1 1 1 1 i 1 0 0
0 0 0 0 0 0 0 0 0 o 0 0 0
0 0 0 0 0 0 0 0 0 o 0 0 0
0 0 0 1 1 1 1 1 1 i 1 1 1
1 1 1 1 1 0 0 0 0 1 1 1 1 1 1 0 0 0 0 1 1 1 1 1 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 1 1 1 1 0 0 0 0 0 0 1 1 1 1 0 0 0 0 0 0 1 1 1 1 0 o o o o o i i i i 0 0 0 0 0 1 1 1 1 0 0 0 0 0 0 1 1 1 1 0 0 0 0 0 0 1 1 1 1 0
1 1 1 1 1 1 0 0 0 o 0 0 0
1 1 1 1 1 1 1 1 1 o 1 1 1
1 1 1 1 1 1 2 2 2 i 2 1 1
1 1 1 1 1 1 2 2 2 2 2 1 1
1 1 1 1 1 1 2 2 2 2 2 1 1
1 1 1 1 1 1 1 1 1 2 1 1 1
1 1 1 1 1 1 1 1 1 i 1 1 1
0 0 0 0 0 0 1 1 1 1 1 0 0
0 0 0 0 0 0 1 1 1 i 0 0 0
0 0 0 0 0 0 1 1 1 i 1 0 0
1 1 1 1 1 1 0 0 0 i 0 1 1
1 1 1 1 0 0 0 0 0 o 0 1 1
0 0 0 0 0 0 0 0 0 ~ 0 0 0
0 0 0 0 0 0 0 0 0 o 0 0 0
0 0 0 0 0 0 0 0 0 o 0 0 0
1 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 1 0 0 1 0 0 0 1 1 0 1 1 0 1 0 0 0 1 1 0 1 1 0 1 0 0 0 1 1 0 1 1 0 o i o o o i 1 o i i o 1 0 0 0 0 1 0 1 1 0 1 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 1 0 0
0 1 0 0 0 0 0 0 1 0 0 0 0 0 1 1 1 ? 0 0 1 1 1 1 1 1 0 0 0 1 1 0 0 0 1 0 0 0 0 0 0 1 0 0
2 2 0 0 0 0 0 0
1 1 0 0 0 0 0 0
0 0 0 0 0 1 1 1
0 0 0 0 0 0 1 1
0 0 0 0 0 0 1 1
1 1 1 1 1 1 0 0
0 0 0 0 0 0 1 1
0 0 0 0 0 0 1 1
1 1 1 1 1 1 1 1
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
1 1 1 1 1 1 1 1
2 1 1 1 1 1 1 1
1 1 1 0 0 1 1 1
2 2 1 0 0 1 1 1
2 1 1 0 0 1 1 1
2 1 1 0 0 1 1 1
2 2 1 1 1 1 1 1
2 2 1 1 1 1 1 1
0 0 0 0 0 0 0 0
1 1 0 1 1 0 0 0
0 0 0 0 0 0 0 0
1 1 1 1 1 1 0 0
1 1 0 0 0 1 1 1
1 1 0 0 2 0 0 0
1 1 0 0 0 0 0 0
1 1 0 0 0 0 0 0
0 0 1 1 1 1 2 2
0 1 1 1 1 0 0 0
0 0 1 1 1 0 0 0
0 0 1 1 1 0 0 0
1 1 0 1 1 0 0 0
2 2 0 3 3 0 0 0
0 0 0 1 1 0 0 0
1 1 1 0 0 1 1 1
1 1 0 0 0 0 0 0
1 1 2 0 0 0 1 1
APPENDIX III: Morphological data employed in analysis of cyprinodontid relationships

Taxa, numbered in the order in which their character states are listed:

1. Aphanius fasciatus
2. A. dispar
3. A. mento
4. A. chantrei
5. Kosswigichthys asquamatus
6. Orestias ispi
7. O. luteus
8. O. agassii
9. Cyprinodon variegatus
10. Megupsilon aporus
11. Jordanella floridae
12. Garmanella pulchra
13. Cualac tesselatus
14. Floridichthys carpio
15. Cubanichthys pengellyi
16. Fundulus heteroclitus

Character states (one row per character; states listed for taxa 1-16 in order):

 7   1 1 0 0 0 0 0 0 1 2 1 1 1 1 1 1
15   2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1
16   1 1 1 1 1 1 1 1 2 1 2 2 1 1 1 1
33   1 1 1 1 1 2 2 2 1 1 1 1 1 1 1 0
34   1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0
35   1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0
36   0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1
37   1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0
38   1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0
39   1 1 1 1 1 0 0 0 1 1 1 1 1 1 1 1
43   0 0 0 0 0 2 2 2 0 0 0 0 0 0 0 0
47   0 0 0 0 0 2 2 2 0 0 0 0 0 0 0 0
49   1 1 1 1 1 2 2 2 1 2 1 1 1 1 1 1
60   1 1 1 1 1 1 1 1 0 0 0 0 0 0 1 1
61   1 1 1 1 1 1 1 1 2 2 2 2 2 2 1 1
66   0 0 0 0 0 2 2 2 0 0 0 0 0 0 0 0
72   1 1 1 1 1 2 2 2 1 1 1 1 1 1 1 1
74   0 0 0 0 0 0 0 0 1 1 1 1 1 1 0 0
75   0 0 0 0 0 0 0 0 1 1 1 1 1 1 0 0
76   0 0 0 0 1 1 1 1 0 0 0 0 0 0 0 0
77   0 0 0 0 1 1 1 1 0 0 0 0 0 0 0 0
78   0 0 1 1 1 1 1 1 0 0 0 0 0 0 0 0
79   0 0 1 1 1 1 1 1 0 0 0 0 0 0 0 0
80   0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0
81   0 0 0 0 0 0 0 0 0 1 1 1 0 0 0 0
82   0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0
83   0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0
84   1 1 1 1 1 0 0 0 1 1 1 1 1 1 1 1
85   0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0
86   0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0
87   1 1 1 1 0 0 0 0 1 1 1 1 1 1 0 0
CHAPTER 12

Molecular Phylogeny of the Fundulidae (Teleostei, Cyprinodontiformes) Based on the Cytochrome b Gene

GIACOMO BERNARDI
Department of Biology
University of California at Santa Cruz
Santa Cruz, California 95064

I. Introduction

The family Fundulidae has a long and complex taxonomic history. After being included in the cyprinodontid subfamily Fundulinae by Myers (1931), the genera Adinia, Fundulus, Lucania, Leptolucania, and Plancterus were elevated to family status (Fundulidae) by Parenti (1981) in her major revision of the order Cyprinodontiformes. In addition to these extant genera, a few fossil forms, generally attributed to either Fundulus or Parafundulus, are also placed in the family (Eastman, 1917; Miller, 1945; Parenti, 1981). A Central American family, Profundulidae, which includes one genus, Profundulus, with five species, is generally considered a sister clade to fundulids and other cyprinodontoids (Fig. 1). The fundulid genera themselves have been the subject of extensive taxonomic work, with a special emphasis on the most speciose genus of the family, Fundulus. Fundulus systematics dates as far back as Linnaeus. The genus was revised several times by researchers including Garman (1895), Jordan and Evermann (1896), Jordan et al. (1930), Hubbs (1931), Miller (1955), Farris (1968), Parenti (1981), and Wiley (1986). Allozymic (Cashner et al., 1992, for a review) and DNA (Bernardi and Powers, 1995) data have also been added to the list of characters used to unravel the phylogenetic relationships among Fundulidae. This chapter gives a general overview of the major phylogenetic issues relevant to the family and presents molecular data that will address some of these issues.

Fundulidae is a relatively large group of cyprinodontiform fishes that live in fresh, brackish, and coastal marine waters. They are distributed over Central and North America, and their tolerance for high salinity probably explains their presence on Cuba and Bermuda (Fig. 1). An introduced population of F. heteroclitus is also found in southern Spain (Bernardi et al., 1995). Two species, F. parvipinnis and F. lima, are isolated on the western part of the North American continent, in California and Baja California (Mexico). Fundulids are oviparous, and their reproduction and egg development have been thoroughly studied (on earth as well as in space!) (Hubbs and Burnside, 1972; Koenig and Livingston, 1976; Taylor et al., 1977; Hoffman et al., 1977). Other aspects of fundulid biology have also been studied, such as hybridization (Hubbs and Drewry, 1959; Setzer, 1970), behavior (Foster, 1967), and karyology (Chen, 1971; Chen and Ruddle, 1970). F. heteroclitus is probably the best-studied fish model for enzyme kinetics and expression. Overall, this group has been thoroughly studied in almost every possible aspect; however, its phylogenetic relationships, so essential for comparative studies, are still poorly understood.

FIGURE 1 Distributional limits of the family Fundulidae (redrawn after Parenti, 1981).

II. Morphology

Parenti (1981) defines the family using two morphological synapomorphies: "(1) inner arms of the maxillaries directed anteriorly, and often pronounced hooks; and (2) snout pointed and drawn anteriorly with the autopalatine projecting and not articulating with the lateral ethmoid." Wiley (1986) agrees with this definition but questions the validity of the second character. He proposes, however, another morphological character to support the family: "in all fundulids, the epipleural ribs overlap the pleural ribs and are either directly connected to the parapophysis (Adinia, Leptolucania) or to the parapophysis via connective tissue (Fundulus, Lucania, "Plancterus")" (Wiley, 1986). Once the boundaries of the family are defined, the major issues concerning this family are the interrelationships of the different genera and the monophyletic status of Fundulus. Few attempts have been made to establish phylogenetic relationships among fundulid genera, the most precise ones being presented by Parenti (1981) and Wiley (1986) (Fig. 2). Wiley (1986) questions several morphological characters used by Parenti (1981) to derive phylogenetic relationships within the family and concludes "that the placement of nominal genera within the family is problematical and a solution must await additional characters."

FIGURE 2 Relationships of the major genera among Fundulidae according to Parenti (1981) (top) and Wiley (1986) (bottom). (Top) Numbers in parentheses correspond to the number of species within each genus. [Tree diagrams not reproduced. Top panel: Profundulidae, Profundulus (5); Fundulidae, Plancterus (1), Fundulus (40), Lucania (2), Leptolucania (1), Adinia (1); other cyprinodontoids. Bottom panel: Profundulidae, Profundulus; Fundulidae, "Plancterus," Lucania, "Fundulus," Leptolucania, Adinia; other cyprinodontoids.]

The phylogenetic relationships among Fundulus species, however, have been studied in great detail. By removing Plancterus from Fundulus, Parenti was able to find a single character in support of Fundulus monophyly, a broad articular surface on the second pharyngobranchial. This character is questioned by Wiley (1986), but no alternate character is proposed. In any case, both Parenti and Wiley have doubts about Fundulus monophyly. Indeed, Parenti (1981) says that "a more parsimonious interpretation would place some species of Fundulus as more closely related to Lucania, Leptolucania, or Adinia," and Wiley cannot show Fundulus "to be monophyletic and cannot exclude the possibility that it might be para- or polyphyletic" (Wiley, 1986).

Fundulus is the most speciose genus of the family. Whereas Adinia, Leptolucania, Lucania, and Plancterus comprise 5 or 6 species altogether, Fundulus alone includes more than 35 species. Studies on Fundulus relationships were first attempted by Miller (1955), who ranked 27 species in a tentative phylogenetic sequence, and by Brown (1957), who placed, without explanation, these taxa into five subgenera: Fontinus, Fundulus, Plancterus, Xenisma, and Zygonectes. Griffith (1972, 1974) established evolutionary relationships among the different taxa based on 70 characters, and Farris (1968), using morphological characters, placed Fundulus taxa into four monophyletic subgenera: Fundulus, Plancterus, Xenisma, and Zygonectes. Lastly, Wiley (1986) provided a phylogenetic analysis of the genus using morphological characters. Wiley recognized the five subgenera described by Brown (1957) but did not find a place for Plancterus and the West Coast Fundulus (i.e., F. parvipinnis and F. lima), which were assigned to the "other species" category.

III. Allozymes and DNA

Allozyme data have been used to study Fundulus phylogenetic relationships at the population (Powers and Place, 1978) and species level (Fleming et al., 1962; Duggins et al., 1989). More extensive investigations at the subgeneric and generic level were presented by Cashner and co-workers (Rogers and Cashner, 1987; Cashner et al., 1988; Grady et al., 1990; Cashner et al., 1992). Allozyme work not only provided support for the monophyletic status of subgenera Xenisma and Zygonectes, as well as a clarification of the relationships of taxa within these subgenera, but also provided a framework to better understand the biogeographical implications of Fundulus distributions (Cashner et al., 1992).

At the DNA level, nuclear and mitochondrial markers have been used. The nuclear lactate dehydrogenase-B gene has been studied extensively by Powers and co-workers (1993, for a review), mostly in F. heteroclitus populations. Mitochondrial DNA (mtDNA) restriction fragment length polymorphisms (RFLPs) and sequences were also studied for the same populations (Gonzales-Villasenor and Powers, 1990; Bernardi et al., 1993). At a higher taxonomic level, mtDNA gene sequences were determined for the genera Crenichthys and Empetrichthys, which were confirmed as nonfundulids (Grant and Riddle, 1995), for nine species of Fundulus, and for Plancterus zebrinus (Bernardi and Powers, 1995). West Coast Fundulus were found to be very divergent, but because sequences from only two genera, Fundulus and Plancterus, were analyzed, and Plancterus was used as an outgroup, the monophyletic status of Fundulus could not be addressed. This chapter presents sequence data from all fundulid genera and one outgroup, Profundulus, and discusses the phylogenetic implications derived from these results.

IV. Fish Samples

Samples were obtained from all the extant fundulid genera. Adinia xenica, Fundulus olivaceus, F. chrysotus, and F. dispar were collected in Louisiana by B. J. Granier; Leptolucania ommata was collected in Alabama by R. Harper; F. notatus and F. catenatus were collected in Texas by A. Stock and D. W. Stock; F. lima was collected in San Ignacio, Baja California Sur, Mexico, by C. H. Stowell; and F. parvipinnis was collected in Santa Barbara, California, by S. Anderson. DNA sequences from Plancterus zebrinus and DNA from Profundulus punctatus were made available by C. Grant. DNA was extracted from liver tissue following Bernardi and Bernardi (1990).

V. DNA Sequences
The polymerase chain reaction (PCR) (Saiki et al., 1988) was used to amplify a 270-bp region of the cytochrome b gene, beginning at amino acid position 34 of the human sequence. Primers and PCR protocols followed Kocher et al. (1989) and Palumbi et al. (1991). Sequencing and PCR primers used were CB2-H, CB1-L, and GLUDG-L (Palumbi et al., 1991). Approximately 100 ng of DNA was used as template for 100-μl PCR reactions containing 10 mM Tris-HCl (pH 8.3), 50 mM KCl, 1.5 mM MgCl2, 0.01% (w/v) gelatin, 200 mM each dNTP, 2.5 units of Taq DNA polymerase (Perkin-Elmer Cetus), and 1 μM each amplification primer. PCR products were used for Taq DyeDeoxy Terminator cycle-sequencing reactions (Applied Biosystems Inc.) and loaded on an automated sequencer (Applied Biosystems 373A). Cytochrome b sequences were aligned using the Navigator program (Applied Biosystems Inc.).

Phylogenetic analyses employed maximum parsimony (MP) using the Heuristic option of the PAUP program (phylogenetic analysis using parsimony; Swofford, 1993). The degree of confidence assigned to nodes in trees obtained by MP was determined by bootstrapping (Felsenstein, 1985) with 2000 replicates (Hedges, 1992). The topology-dependent cladistic permutation tail probability analysis (T-PTP) (Faith, 1991) was performed by randomly shuffling the data sets 99 times (after removing the outgroup sequence) using the RANDOMIZER package (Trueman, 1994) and using these permuted data sets as input files in PAUP. Actual tree topologies were considered significantly better than random ones when fewer than 5% of the random sets produced shorter trees than the actual data. The maximum likelihood test of Kishino and Hasegawa (1989)
TABLE I

                                              1    2    3    4    5    6    7    8    9   10   11   12   13   14   15
 1  Fundulus heteroclitus heteroclitus        -    4   18   45   41   44   49   37   50   51   47   37   37   49   59
 2  Fundulus heteroclitus macrolepidotus    inf    -   20   45   41   46   51   39   50   49   47   39   39   50   55
 3  Fundulus grandis                        9.0   10    -   46   45   49   48   34   53   55   46   37   35   49   56
 4  Fundulus notatus                        3.0  3.0  3.5    -   15   36   43   46   54   55   44   52   39   44   54
 5  Fundulus olivaceus                      2.9  2.9  3.8  5.0    -   35   39   47   55   54   47   48   39   47   58
 6  Fundulus dispar                         2.8  2.9  3.1  4.0  3.5    -   47   50   51   58   47   48   41   49   58
 7  Fundulus chrysotus                      4.1  4.2  4.8  4.8  4.9  3.9    -   45   62   59   51   41   43   50   54
 8  Fundulus catenatus                      3.7  3.9  4.2  4.2  3.9  3.6  4.5    -   50   49   45   43   41   53   57
 9  Fundulus lima                           2.6  2.6  2.8  3.0  3.2  2.4  3.0  2.6    -   20   53   53   58   63   59
10  Fundulus parvipinnis                    3.2  3.1  3.4  3.7  3.4  2.9  3.3  3.1  4.0    -   60   58   62   66   59
11  Plancterus zebrinus                     4.7  4.7  4.6  2.6  2.9  3.4  4.2  3.2  2.3  2.5    -   47   44   59   56
12  Lucania parva                           2.8  3.0  2.8  3.2  3.7  3.7  3.7  3.9  2.7  2.8  4.3    -   48   50   62
13  Adinia xenica                           3.1  3.2  2.9  3.5  3.9  4.1  7.2  3.4  2.8  3.4  3.7  4.4    -   47   57
14  Leptolucania ommata                     2.5  2.5  2.5  2.9  3.4  2.5  3.6  2.7  2.7  3.3  2.7  2.6  2.6    -   62
15  Profundulus punctatus                   4.2  3.9  4.0  3.2  3.6  3.2  3.4  2.8  2.2  2.3  3.1  3.0  2.8  2.6    -

a The number of substitutions between taxa is shown above the diagonal. Transition/transversion ratios are shown below the diagonal.
12. Fundulidae
was also used to test significance between different trees. The test was performed using the corresponding option in the PHYLIP package (Felsenstein, 1989). The 270-bp portion of the cytochrome b was analyzed (contact author for raw data). Of 263 aligned positions, 111 were variable and 92 were phylogenetically informative. The number of transitions was higher than the number of transversions (Table I), with an average ratio of 3.5 (thus weighting ratios, when used, corresponded to 3 for transversions and 1 for transitions). This result indicates that data are likely to be close to the multiple-hit zone (Brown et al., 1982; Meyer and Wilson, 1990) and corroborates the idea that cytochrome b genes are often not ideal molecular markers for this level of phylogenetic analysis (Meyer, 1994). In the case reported here, cytochrome b data do not allow us to completely resolve the phylogenetic relationships among all taxa that were studied, but do allow us to statistically test some phylogenetic hypotheses.
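The pairwise transition and transversion counts behind a table such as Table I can be sketched as follows. This is only an illustration of uncorrected counting between two aligned sequences; the function names and the toy sequences are hypothetical and are not taken from the study, which used PAUP.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def count_ts_tv(seq_a, seq_b):
    """Count transitions and transversions between two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        # Skip identical sites, gaps, and ambiguity codes.
        if a == b or a not in "ACGT" or b not in "ACGT":
            continue
        # A transition stays within purines or within pyrimidines.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    return transitions, transversions


def ts_tv_ratio(seq_a, seq_b):
    """Transition/transversion ratio; infinite when only transitions occur."""
    ts, tv = count_ts_tv(seq_a, seq_b)
    if tv == 0:
        return float("inf") if ts > 0 else float("nan")
    return ts / tv
```

A pair that differs only by transitions yields an infinite ratio under this convention, which is one way to read an "inf" entry in a pairwise table.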
VI. Phylogenetic Relationships
A single most parsimonious tree was obtained (unweighted tree length = 342 steps, consistency index = 0.49; weighted tree length = 538 steps). When transition/transversion weights were changed or removed, the topology of the tree remained mostly unchanged with the exception of the unstable position of F. chrysotus. Although the overall topology of the tree was stable, only a few clades showed high bootstrap support. A consensus tree (50% majority-rule consensus) of 2000 bootstrap replicates is shown in Fig. 3 (only bootstrap values higher than 50% are shown). Three questions may be addressed from these results: (1) Are the West Coast Fundulus a sister clade to all other fundulids? (2) Is Fundulus monophyletic? (3) Are phylogenetic relationships among fundulids based on morphological and DNA characters concordant?
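The resampling step that underlies bootstrap support values can be sketched as below. This is a minimal illustration, not the PAUP procedure: the tree search is replaced by a caller-supplied function, here given the hypothetical name clade_found, which is assumed to return True when the clade of interest appears in the tree built from a replicate.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Return one replicate: alignment columns resampled with replacement."""
    n_sites = len(next(iter(alignment.values())))
    columns = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {taxon: "".join(seq[i] for i in columns)
            for taxon, seq in alignment.items()}


def bootstrap_support(alignment, clade_found, replicates=2000, seed=0):
    """Percentage of replicates in which clade_found(replicate) is true.

    clade_found is a hypothetical stand-in for a tree search followed by
    a check that the clade of interest is present.
    """
    rng = random.Random(seed)
    hits = sum(bool(clade_found(bootstrap_alignment(alignment, rng)))
               for _ in range(replicates))
    return 100.0 * hits / replicates
```

Every taxon is resampled with the same column indices, so each replicate column is an intact column of the original alignment.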
FIGURE 3 Phylogenetic tree of the family Fundulidae obtained using 270 bp of the cytochrome b gene. Consensus tree (50% majority-rule consensus) resulting from 2000 bootstrap replicates. The numbers by each branch indicate the result of a bootstrap analysis (2000 replicates) using maximum parsimony (heuristic search) when the bootstrap value was greater than 50%. Vertical bars correspond to the subgenera recognized by Farris (1968) whereas italic letters correspond to the subgenera described by Wiley (1986) (F, Fundulus; Z, Zygonectes; X, Xenisma; and O.S., other species).
GIACOMO BERNARDI
As mentioned earlier, both Parenti (1981) and Wiley (1986) have questioned the monophyletic status of Fundulus. Indeed, only a single character was found by Parenti to support Fundulus monophyly. It is also worth noting that within the family, the other genera include only one or two species, whereas Fundulus comprises more than 35 species (Fig. 2). When using cytochrome b sequence data, 12 steps would have to be added to the most parsimonious tree to obtain a monophyletic Fundulus. In order to determine if these 12 steps are statistically significant, a topology-dependent cladistic permutation tail probability (T-PTP) test was performed (Faith, 1991; Halanych et al., 1995). Data reported here did not support a monophyletic Fundulus (data not shown). If, as shown in Fig. 3, West Coast Fundulus are the sister clade of all other fundulids, then by definition the genus Fundulus is not monophyletic.
A. Are West Coast Fundulus a Sister Clade to All Other Fundulids?
Two species of West Coast Fundulus, F. lima and F. parvipinnis, live in an isolated area of the West Coast of the United States and Mexico. Although F. parvipinnis can live in fresh, brackish, or salt water, generally preferring brackish estuaries and sloughs along the coasts of California and Baja California (Mexico), F. lima lives in freshwater lagoons in the Baja California desert close to San Ignacio. These species have been isolated from the rest of the group since the beginning of the Pliocene, 5.3 million years ago (Griffith, 1972). F. lima and F. parvipinnis may have migrated to the western part of the continent from the East Coast using a southern route before the closing of the Isthmus of Panama (Griffith, 1972). Although these species have tentatively been assigned to subgenus Xenisma by Farris (1968), Wiley could not place them in any subgenus and preferred to include them in an undefined "other species" group (Wiley, 1986). Data show that the two West Coast species form a robust clade (supported in 100% of the bootstrap replicates) and that they are the sister group to all other species examined. Four supplementary steps would be necessary to disrupt this sister-group relationship. A T-PTP test showed that this result was highly significant (Fig. 4a). The maximum likelihood test of Kishino and Hasegawa (1989) also indicates that the phylogenetic trees have significantly different topologies. Two important implications can be derived from these results: (1) the West Coast Fundulus are shown to be the sister clade of all other fundulids and (2) Fundulus is not monophyletic.
B. Is the Genus Fundulus Monophyletic?
Because data show that West Coast Fundulus are not to be included in the genus, the next question is whether other Fundulus representatives form a monophyletic assemblage. The author analyzed data constraining the genus Fundulus (after removing the West Coast species) to be monophyletic. However, these data were unable to provide statistically significant evidence for either hypothesis (Fig. 4b).
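The permutation logic behind a T-PTP test can be sketched as follows. This is a simplified illustration under stated assumptions, not the RANDOMIZER/PAUP workflow used in the study: the tree searches that produce the length differences are left out, the function names are hypothetical, and the observed data set is counted as one member of the reference distribution (a common convention for permutation tests, assumed here).

```python
import random


def permute_characters(alignment, rng):
    """Shuffle the states of each character (column) among taxa.

    Column composition is preserved; covariation among characters,
    and hence hierarchical structure, is destroyed.
    """
    taxa = list(alignment)
    n_sites = len(alignment[taxa[0]])
    shuffled = {t: [] for t in taxa}
    for i in range(n_sites):
        column = [alignment[t][i] for t in taxa]
        rng.shuffle(column)
        for t, state in zip(taxa, column):
            shuffled[t].append(state)
    return {t: "".join(states) for t, states in shuffled.items()}


def tail_probability(observed_diff, randomized_diffs):
    """Share of data sets whose tree length difference is at least as
    large as the observed one, counting the observed data set itself."""
    as_extreme = sum(1 for d in randomized_diffs if d >= observed_diff)
    return (as_extreme + 1) / (len(randomized_diffs) + 1)
```

With 99 permuted data sets, an observed difference larger than every randomized one gives a tail probability of 1/100, below the 5% threshold described in the methods.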
C. Are Phylogenetic Relationships among Fundulids Based on Morphological and DNA Characters Concordant?
1. Fundulidae
Fundulid relationships have been proposed by Parenti (1981) and Wiley (1986) (Fig. 2); however, neither
FIGURE 4 Distributions of tree length differences (in steps) between constrained and unconstrained topologies. Randomized data (white bars) are compared to actual data (black bar). (a) A T-PTP test for West Coast Fundulus (i.e., F. parvipinnis and F. lima) being the sister clade of all other fundulids. (b) A T-PTP test for Fundulus (after removing West Coast representatives) monophyly.
are supported by the author's data. Indeed, the phylogenetic relationships suggested by Parenti and Wiley require, respectively, 16 and 19 more steps than the relationships based on cytochrome b sequences (as shown in Fig. 3). Although the author's data do not support these relationships, no statistically supported alternative emerges from these data (most of the clades have low bootstrap support).
2. Fundulus
Fundulus has been divided into three subgenera by most authors, Fundulus, Xenisma, and Zygonectes; two other subgenera have also been proposed, Fontinus and Plancterus. Figure 3 compares the author's molecular results with previous subgeneric assignments based on morphological characters (Farris, 1968; Wiley, 1986). Representatives of subgenera Fundulus, Xenisma, and Zygonectes were included in the analysis. The subgenus Fundulus, which is the least controversial of the groupings, is consistent for the three studies presented in Fig. 3. Molecular data support this group with high bootstrap values (94% of bootstrap replicates); however, data for more taxa are needed to confirm these results. Within Zygonectes, the striped species F. notatus and F. olivaceus are found to be sister taxa (95% bootstrap). This result is not surprising and is generally accepted (Farris, 1968; Wiley, 1986; Cashner et al., 1992). Another Zygonectes representative, F. chrysotus, does not cluster with the remaining Zygonectes. However, as mentioned earlier, the branch leading to F. chrysotus is unstable and data are not incompatible with a monophyletic Zygonectes. F. catenatus, a Xenisma representative, is found to be the sister clade of subgenus Fundulus. This result is in disagreement with Farris (1968), who considers Fundulus to be closely related to Zygonectes. The author's results are also in disagreement with the placement of the West Coast species in the subgenus Xenisma (Farris, 1968). As mentioned earlier, F. parvipinnis and F. lima are found to be the sister clade of the rest of the fundulids.
VII. Conclusion
Fundulids have been the subject of several conflicting phylogenetic analyses making them a system of choice for molecular studies. Hypotheses based on morphology, behavior, and allozymic studies can be compared with molecular data, and the differences can be statistically tested. Our results are mostly in agreement with subgeneric assignments of different Fundulus species. The subgenera Fundulus and Zygonectes are concordant between the different studies; only Xenisma
exhibits important differences among the analyses. At the other end of the hierarchical scale, the generic positions within the family are different between the two morphological studies and sequence data presented here. More taxa and more characters will be needed to clearly define relationships at the intrafamilial level. The West Coast Fundulus species, previously assigned to the "other species" group by Wiley (1986), seem to form a monophyletic sister clade to all other fundulids (or at least all other fundulids studied here). This finding could be the result of long time and geographical isolation of F. lima and F. parvipinnis from the rest of the group, which would produce long branches that might artificially group the two clades. However, for both species the branch length is less than the average branch length of other taxa making this possibility unlikely. If West Coast Fundulus are a sister clade to all other Fundulidae, as data suggest, some taxonomic revisions concerning these two species may have to be considered. Furthermore, it has been shown that F. lima and F. parvipinnis occupy a basal position, making them good indicators for the time of divergence of the family. The family Fundulidae would have diverged before the divergence of the West Coast fundulids from the rest of the family, approximately 5 million years ago.
Acknowledgments
This study would not have been possible without the samples or sequences provided by Chris Grant, B. G. Granier, Chris Stowell, Rodney Harper, Shane Anderson, Albert Stock, and David Stock. Chris Grant, Robert Cashner, and Dennis Powers provided useful comments and discussion. Thanks to John Trueman (Australian National University) for providing the Randomizer program and his expertise on T-PTP tests. This research was partly supported by faculty research funds granted by the University of California, Santa Cruz.
References
Bernardi, G., and Bernardi, G. 1990. Compositional patterns in the nuclear genome of cold-blooded vertebrates. J. Mol. Evol. 31:265-281.
Bernardi, G., Fernandez-Delgado, C., Gomez-Chiarri, M., and Powers, D. A. 1995. Origin of a Spanish population of Fundulus heteroclitus inferred by cytochrome b sequence analysis. J. Fish Biol. 47:737-740.
Bernardi, G., and Powers, D. A. 1995. Phylogenetic relationships among nine species from the genus Fundulus (Cyprinodontiformes, Fundulidae) inferred from sequences of the cytochrome b gene. Copeia 469-471.
Bernardi, G., Sordino, P., and Powers, D. A. 1993. Concordant mitochondrial and nuclear DNA phylogenies for populations of the teleost fish Fundulus heteroclitus. Proc. Natl. Acad. Sci. USA 90:9271-9274.
Brown, J. L. 1957. A key to the species and subspecies of the cyprinodont genus Fundulus in the United States and Canada east of the continental divide. J. Wash. Acad. Sci. 47:69-77.
Brown, W. M., Prager, E. M., Wang, A., and Wilson, A. C. 1982. Mitochondrial DNA sequences of primates: Tempo and mode of evolution. J. Mol. Evol. 18:225-239.
Cashner, R. C., Rogers, J. S., and Grady, J. M. 1988. Fundulus bifax, a new species of the subgenus Xenisma from the Tallapoosa and Coosa river systems of Alabama and Georgia. Copeia 674-683.
Cashner, R. C., Rogers, J. S., and Grady, J. M. 1992. Phylogenetic studies of the genus Fundulus. In "Systematics, Historical Ecology, and North American Freshwater Fishes" (Richard L. Mayden, ed.). Stanford University Press, Stanford, CA.
Chen, T. R. 1971. A comparative chromosome study of twenty killifish species of the genus Fundulus (Teleostei: Cyprinodontidae). Chromosoma 32:436-453.
Chen, T. R., and Ruddle, F. H. 1970. A chromosome study of four species and a hybrid of the killifish genus Fundulus (Cyprinodontidae). Chromosoma 29:255-267.
Duggins, C. F. J., Relyea, K. G., and Karlin, A. A. 1989. Biochemical systematics in southeastern populations of Fundulus heteroclitus and Fundulus grandis. Northeast Gulf Sci. 10:95-102.
Eastman, C. R. 1917. Fossil fishes in the collection of the United States National Museum. Proc. U.S. Natl. Mus. 52:235-304.
Faith, D. P. 1991. Cladistic permutation tests for monophyly and nonmonophyly. Syst. Zool. 40:366-375.
Farris, J. S. 1968. "The Evolutionary Relationships between the Species of the Killifish Genera Fundulus and Profundulus (Teleostei: Cyprinodontidae)." Unpublished Ph.D. dissertation, University of Michigan, Ann Arbor, MI.
Felsenstein, J. 1985. Confidence limits on phylogenies: An approach using the bootstrap. Evolution 39:783-791.
Felsenstein, J. 1989. PHYLIP, manual version 3.4 (University Herbarium, University of California, Berkeley, 1989).
Fleming, W. R., Scheffel, K. O., and Linton, J. R. 1962. Studies on the gill cholinesterase activity of several cyprinodontid species. Comp. Biochem. Physiol. 6:205-213.
Foster, N. R. 1967. "Comparative Studies of the Biology of Killifishes (Pisces: Cyprinodontidae)." Unpublished Ph.D. dissertation, Cornell University, Ithaca, NY.
Garman, S. 1895. The cyprinodonts. Mem. Mus. Comp. Zool. 19:1-179.
Gonzales-Villasenor, L. I., and Powers, D. A. 1990. Mitochondrial-DNA restriction-site polymorphisms in the teleost Fundulus heteroclitus support secondary intergradation. Evolution 44:27-37.
Grady, J. M., Cashner, R. C., and Rogers, J. S. 1990. Evolutionary and biogeographic relationships of Fundulus catenatus (Fundulidae). Copeia 315-323.
Grant, E. C., and Riddle, B. R. 1995. Are the endangered springfish (Crenichthys Hubbs) and poolfish (Empetrichthys Gilbert) Fundulines or Goodeids? A mitochondrial DNA assessment. Copeia 209-212.
Griffith, R. W. 1972. "Studies on the Physiology and Evolution of Killifishes of the Genus Fundulus." Unpublished Ph.D. dissertation, Yale University, New Haven, CT.
Griffith, R. W. 1974. Environment and salinity tolerance in the genus Fundulus. Copeia 319-331.
Halanych, K. M., Bacheller, J. D., Aguinaldo, A. M. A., Liva, S. M., Hillis, D. M., and Lake, J. A. 1995. Evidence from 18S ribosomal DNA that the lophophorates are protostome animals. Science 267:1641-1643.
Hedges, S. B. 1992. The number of replications needed for accurate estimation of the bootstrap P value in phylogenetic studies. Mol. Biol. Evol. 9:366-369.
Hoffman, R. B., Salinas, G. A., and Baky, A. A. 1977. Behavioral analysis of killifish exposed to weightlessness in the Apollo-Soyuz test project. Aviat. Space Environ. Med. 48:712-717.
Hubbs, C. 1931. Studies of the fishes of the order Cyprinodontes. X. Four nominal species of Fundulus placed in synonymy. Occas. Pap. Mus. Zool. Univ. Mich. 16:1-86.
Hubbs, C., and Burnside, D. F. 1972. Developmental sequences of Zygonectes notatus at several temperatures. Copeia 862-865.
Hubbs, C., and Drewry, G. E. 1959. Survival of F1 hybrids between cyprinodont fishes, with a discussion of the correlation between hybridization and phylogenetic relationships. Publ. Inst. Mar. Sci. 6:81-91.
Jordan, D. S., and Evermann, R. W. 1896. The fishes of North and Middle America. Bull. U.S. Nat. Mus. 47:1-3313.
Jordan, D. S., Evermann, R. W., and Clark, H. W. 1930. Checklist of the fishes and fishlike vertebrates of North and Middle America north of the northern boundary of Venezuela and Colombia. Rept. U.S. Comm. Fish. 1928:1-670.
Kishino, H., and Hasegawa, M. 1989. Evaluation of the maximum likelihood estimate of the evolutionary tree topologies from DNA sequence data, and the branching order of the Hominoidea. J. Mol. Evol. 29:170-179.
Koenig, C. C., and Livingston, R. J. 1976. The embryological development of the diamond killifish (Adinia xenica). Copeia 435-445.
Kocher, T. D., Thomas, W. K., Meyer, A., Edwards, S. V., Paabo, S., Villablanca, F. X., and Wilson, A. C. 1989. Dynamics of mitochondrial DNA evolution in animals: Amplification and sequencing with conserved primers. Proc. Natl. Acad. Sci. USA 86:6196-6200.
Maddison, W. P., and Maddison, D. R. 1989. Interactive analysis of phylogeny and character evolution using the computer program MacClade. Folia Primatol. 53:190-202.
Meyer, A. 1994. Shortcomings of the cytochrome b gene as a molecular marker. Trends Ecol. Evol. 9:278-280.
Meyer, A., and Wilson, A. C. 1990. Origin of tetrapods inferred from their mitochondrial DNA affiliation to lungfish. J. Mol. Evol. 31:359-364.
Miller, R. R. 1945. Four species of fossil cyprinodont fishes from eastern California. J. Wash. Acad. Sci. 5:315-321.
Miller, R. R. 1955. An annotated list of the American cyprinodontid fishes of the genus Fundulus, with the description of Fundulus persimilis from Yucatan. Occas. Pap. Mus. Zool. Univ. Mich. 568:1-25.
Myers, G. S. 1931. The primary groups of oviparous cyprinodont fishes. Stanford Univ. Publ. 6:1-14.
Palumbi, S. R., Martin, A., Romano, S., McMillan, W. O., Stice, L., and Grabowski, G. 1991. "The Simple Fool's Guide to PCR." University of Hawaii, Honolulu.
Parenti, L. R. 1981. A phylogenetic and biogeographic analysis of cyprinodontiform fishes (Teleostei, Atherinomorpha). Bull. Am. Mus. Nat. Hist. 168:335-557.
Poss, S. G., and Miller, R. R. 1983. Taxonomic status of the plains killifish, Fundulus zebrinus. Copeia 55-67.
Powers, D. A., and Place, A. R. 1978. Biochemical genetics of Fundulus heteroclitus (L.). I. Temporal and spatial variation in gene frequencies of Ldh-B, Mdh-A, Gpi-B, and Pgm-A. Biochem. Genet. 16:593-607.
Powers, D. A., Smith, M., Gonzalez-Villasenor, I., DiMichele, L., Crawford, D., Bernardi, G., and Lauerman, T. 1993. A multidisciplinary approach to the selectionist/neutralist controversy using the model teleost Fundulus heteroclitus. In "Oxford Surveys in Evolutionary Biology" (D. Futuyama and J. Antonovics, eds). Volume 9, 43-107.
Rogers, J. S., and Cashner, R. C. 1987. Genetic variation, divergence, and relationships in the subgenus Xenisma of the genus Fundulus. In "Community and Evolutionary Ecology of North American Stream Fishes" (W. J. Matthews and D. C. Heins, eds), pp. 251-264. University of Oklahoma Press, Norman, OK.
Saiki, R., Gelfand, D., Stoffel, S., Sharf, S., Higuchi, R., Horn, G., Mullis, K., and Erlich, H. A. 1988. Primer-directed enzymatic amplification of DNA with a thermostable DNA polymerase. Science 239:487-491.
Setzer, P. Y. 1970. An analysis of a natural hybrid swarm by means of chromosome morphology. Trans. Am. Fish. Soc. 99:139-146.
Swofford, D. L. 1993. PAUP: Phylogenetic Analysis Using Parsimony, Version 3.1 (Illinois Natural History Survey, Champaign, 1991).
Taylor, M. H., DiMichael, L., and Leach, G. J. 1977. Egg stranding in the life cycle of the mummichog, Fundulus heteroclitus. Copeia 397-399.
Trueman, J. W. H. 1994. RANDOMISER program package, version 11/94.
Wiley, E. O. 1986. A study of evolutionary relationships of Fundulus topminnows (Teleostei: Fundulidae). Am. Zool. 26:121-130.
CHAPTER 13
Interrelationships of Lamniform Sharks: Testing Phylogenetic Hypotheses with Sequence Data
GAVIN J. P. NAYLOR
ANDREW P. MARTIN
ERIK G. MATTISON and WESLEY M. BROWN
Department of Biology Osborn Memorial Laboratory Yale University New Haven, Connecticut 06520
Biological Sciences University of Nevada Las Vegas Las Vegas, Nevada 89154
Department of Biology University of Michigan Ann Arbor, Michigan 48109
I. Introduction
Phylogenetic reconstruction involves estimating relationships from patterns of character-state covariation seen among taxa. The endeavor would be straightforward were each evolutionary lineage to acquire its own set of unique traits at birth and then pass them on immutably to all descendants. If this were the case, phylogenetic reconstruction would require no more than a search for the evolutionary tree which accounted for the distribution of traits as a perfectly nested set. Unfortunately, evolution is not so simple. Character-state changes occur with markedly different probabilities across both characters and taxa, traits frequently revert to previous conditions, lineages occasionally coalesce, and identical character states arise in multiple lineages by parallel or convergent evolution. In many cases these vagaries conspire to confound or bias inferences about evolutionary history. It is important, when estimating phylogeny, to explore the strengths and limitations of data in light of these potentially confounding influences.
In the past decade, much has been made of the power of molecular sequences for inferring evolutionary history (Avise, 1994). Arguments often promote a notion that molecular data are intrinsically better templates than are morphological or behavioral data for recording the tell-tale imprint of evolutionary history. The authors believe that DNA sequence data offer advantages for phylogenetic reconstruction, but do not subscribe to the view that they are intrinsically "better" than other forms of data. The advantages seen for DNA sequence data are: (1) A large number of potentially informative, heritable, and discrete characters can be obtained. This can be useful when the group under investigation is conservative, is characterized by a scarcity of good morphological characters, or has been subjected to repeated bouts of evolutionary parallelism of phenotypic characters. (2) Protein-encoding DNA sequence data can be broken down into different constraint categories based on a knowledge of the genetic code. First, second, and third codon positions can be recognized as can two- and four-fold degenerate sites. These categories can be treated as distinct classes of data and analyzed separately. Morphological and behavioral traits cannot be broken down in this way. (3) Because distinct classes of data can be recognized, differences in the evolutionary dynamics among classes can be investigated. Observations in a given class can be pooled across a number of sites and substitutional changes contrasted. This capacity to pool observations across sites provides the sample size necessary to detect subtle patterns of character change that might otherwise go undetected if sites had to be analyzed individually (Collins et al., 1994). (4) Because a large number of characters with similar evolutionary dynamics can be assembled, it is possible to estimate relative branch lengths among lineages for a particular class of trait. This can provide a relative (albeit approximate) temporal scale to the pattern of inferred relationships. While the same might be attempted for morphological data, the diverse range of evolutionary dynamics spanned by a collection of morphological traits likely increases the error term of any estimate and thus diminishes reliability.
The advantages ascribed to molecular data are thus predominantly associated with the ability to describe their evolutionary dynamics (i.e., to specify a model of character-state change). There are, however, serious shortcomings to DNA data. First, there are only four character states (G, A, T, and C). This limited number of states increases the likelihood of reversion caused by multiple substitutions at a site. Such reversion, termed "site saturation," is a general problem for any set of characters and depends on intrinsic rates of character-state change and the character-state space available. Higher rates and fewer states promote rapid saturation. Problems associated with saturation can sometimes be sidestepped by focusing on slowly evolving sites or on characters that contain a greater number of character states [e.g., focusing on the codon as a character rather than on single nucleotide sites (Goldman and Yang, 1994)]. Second, DNA sites are probably not independent of one another. Most phylogenetic inference methods assume character independence (although many appear quite resilient to violation of this assumption). When sites are not independent or linked, undue weight can be assigned unknowingly to certain character-state distribution patterns at the expense of others. This has the potential to confound phylogenetic inference. Third, there is often considerable variation in the rates at which sites evolve, either across sites within a lineage (Gillespie, 1986a) or for homologous sites across lineages (e.g., Martin et al., 1992). In certain circumstances, rate variation can seriously compromise phylogenetic inferences (Huelsenbeck and Hillis, 1993). Fourth, the nucleotide bases G, A, T, and C are often found in unequal proportions. Such bias in base composition diminishes the character-state space available for recording evolutionary change and effectively lowers the saturation ceiling (De Salle et al., 1987). If bias is similar in different lineages, all lineages should be similarly affected. However, if bias differs across lineages (a condition known as "deviation from stationarity"), there is a tendency for phylogenetic methods to group taxa by the similarity of base composition, regardless of their historical relatedness (Lockhart et al., 1992, 1994). Finally, the various processes and constraints that act on DNA sequences can interact to increase the variance in the evolutionary rate seen for a given class of site, making the task of fitting a model of evolutionary change to the data particularly difficult. For example, redundancy of the genetic code renders third codon positions more free to vary than other codon positions. However, sites free to vary also accumulate compositional biases. The bias, in turn, restricts the amount of evolutionary change that can be recorded. Third position sites can thus range from appearing to be highly variable, fast-evolving sites that record a large number of evolutionary changes to highly constrained, slowly evolving sites that seldom record an event, depending on the evenness of base composition. Phylogenetic inferences derived from molecular data should be critically evaluated in light of these shortcomings so that character-state covariations due to site saturation, rate variation, and base compositional effects are not mistaken for evolutionary signal.
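One way to inspect base compositional bias by codon position is sketched below. The function names are hypothetical, and the sketch assumes a protein-coding sequence whose reading frame is known (frame_offset gives the index of the first base of the first complete codon); it is an illustration, not a procedure described in the text.

```python
from collections import Counter


def codon_position_sites(sequence, position, frame_offset=0):
    """Return the bases found at codon position 1, 2, or 3."""
    if position not in (1, 2, 3):
        raise ValueError("position must be 1, 2, or 3")
    return sequence[frame_offset + position - 1::3]


def base_composition(sequence):
    """Relative frequencies of A, C, G, and T, ignoring other symbols."""
    counts = Counter(b for b in sequence.upper() if b in "ACGT")
    total = sum(counts.values())
    return {b: (counts[b] / total if total else 0.0) for b in "ACGT"}
```

Comparing base_composition(codon_position_sites(seq, 3)) across taxa gives a first impression of whether third-position composition differs among lineages.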
A. Assumptions
In phylogenetic inference, a model of evolutionary transformation between character states is applied to a distribution of character states for a group of taxa to yield a tree that best explains the data (Sober, 1988). Different tree-building algorithms invoke different models. Some models are very specific and restrictive (e.g., distance cluster analysis of corrected genetic distances among taxa). Others are more general and have fewer restrictions (e.g., cladistic parsimony). There is generally a trade-off: the more restrictive the model the more explanatory power is reaped; however, as restrictions increase so does the likelihood that assumptions of the model will be violated (Huelsenbeck and Hillis, 1993). Because of the heterogeneous nature of evolutionary change both among characters and among lineages, it is perhaps best to rely on models that can be (1) applied across different types of characters, and (2) modified to include restrictive assumptions when suggested by the data. Cladistic parsimony has been championed as a method that requires few restrictive assumptions (Farris, 1983) and, therefore, as a method that should be widely applicable across a broad range of evolutionary dynamics, such as those seen at the different positions of a codon. The authors subscribe to this view, but feel it important to outline the assumptions that are implicit in parsimony analyses. This is done to underscore that the conclusions are inferences
contingent on data fitting the implied model. Parsimony requires that homoplasy be randomly distributed among taxa. When this is the case, the true historical "signal" (if it is the most influential source of character-state covariance among taxa) should overshadow any "noise" due to homoplasy. In keeping with this view, it is assumed (1) that incorrect inferences are due to stochastic error associated with a small sample of characters, and (2) that erroneous inferences should disappear as more data are collected. This argument is appealing. However, highly structured, nonhistorical sources of character-state covariation among taxa can often dilute or eradicate the signal due to shared history. In some cases these can even swamp any phylogenetic signal with a positively misleading signal. For example, as previously alluded to, nucleotide base compositional frequencies can vary across taxa in such a way that distantly related organisms have more similar base compositions than do close relatives. In such situations, parsimony will be predisposed to incorrectly group the distantly related taxa together because of their base compositional similarity (Loomis and Smith, 1990; Penny et al., 1990; Sidow and Wilson, 1990, 1991; Lockhart et al., 1992; Hasegawa and Hashimoto, 1993; Steel et al., 1993). However, such nonrandom distributions of homoplasy do not necessarily preclude the effective use of parsimony. A careful inspection of data can help identify classes of characters that have the potential to be misleading. Problems so identified can sometimes be ameliorated by judicious use of differential weighting schemes (Hillis et al., 1994; Huelsenbeck et al., 1994). For example, in a data set of six taxa where the two most divergent forms share a base composition profile comprising 45% G, 45% C, 5% T, and 5% A, while the remainder share a profile of 5% G, 5% C, 45% T, and 45% A, parsimony will be predisposed to link the two most divergent forms together as sister taxa. 
However, if the nucleotides are recoded as either purines or pyrimidines, all six taxa are rendered an unbiased 50:50 purine:pyrimidine base composition. In essence, the transformation of data results in an amplification of the phylogenetic signal to noise ratio by bringing the data more into line with parsimony's requirement for the random distribution of homoplasy.
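The purine/pyrimidine recoding in the six-taxon example can be checked with a short sketch. The function names are hypothetical, and mapping gaps or ambiguity codes to "?" is an assumption made here for illustration.

```python
RY_CODE = {"A": "R", "G": "R", "C": "Y", "T": "Y"}


def recode_ry(sequence):
    """Recode nucleotides as purines (R) or pyrimidines (Y)."""
    # Anything that is not A, C, G, or T becomes "?" (an assumption).
    return "".join(RY_CODE.get(base, "?") for base in sequence.upper())


def ry_frequencies(base_freqs):
    """Collapse A/C/G/T frequencies into purine and pyrimidine frequencies."""
    return {"R": base_freqs["A"] + base_freqs["G"],
            "Y": base_freqs["C"] + base_freqs["T"]}


# The two base composition profiles from the example in the text.
gc_rich = {"G": 0.45, "C": 0.45, "T": 0.05, "A": 0.05}
at_rich = {"G": 0.05, "C": 0.05, "T": 0.45, "A": 0.45}
```

Both profiles collapse to 50% purines and 50% pyrimidines (0.45 + 0.05 in each class), which is why the recoding removes the compositional difference between the two groups of taxa.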
B. Fossils and Phylogenies
Investigation of the sources of character-state covariation in molecular data sets is best accomplished for groups in which the phylogeny is known [e.g., bacteriophage (Hillis et al., 1992), mice (Sage et al., 1993), corn (Kellog and Birchler, 1993)] or can be corroborated by independent means (e.g., higher-order vertebrate classes). In most cases these "model groups" may not lead to predictive results that can be widely applied to different groups of organisms because they either focus on unusual genomes in contrived conditions (e.g., bacteriophage) or address issues that arise when highly divergent lineages are compared (e.g., vertebrate classes). Successes and pitfalls encountered in the analysis of higher taxa may have little relevance for analyses at lower taxonomic ranks because higher taxa generally differ in so many ways that it is impossible to attribute patterns of character-state covariation to any specific subset of biological influences. Ideally, character-state covariation in molecular data sets is best explored by describing patterns of molecular evolution for a group whose phylogeny can be corroborated by independent means and then expanding the taxonomic scope to include taxa that are biologically similar. The success of these studies often depends on information about the history of the group derived from the fossil record. Although first appearances of fossil lineages that lead to extant forms do not provide information about phylogeny, positive correlation between the age of lineages estimated from fossils and the age of lineages determined from phylogenetic analysis of DNA sequences can provide a gauge for the accuracy of phylogenetic inference. Although a lack of significant correlation between age and clade ranks may be indicative of a poor fossil record (Norell and Novacek, 1992) or of a grossly inaccurate reconstruction of phylogeny, a significant positive correlation between age and clade ranks most likely reflects correspondence between phylogenetic pattern and evolutionary history recorded in the paleontological record (Norell and Novacek, 1992).
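A correlation between age ranks and clade ranks of the kind described above can be sketched as follows. Spearman's rank correlation is used here as one reasonable choice; the text does not prescribe a statistic, so that choice is an assumption, and the clade ranks and first-appearance ages are invented for illustration.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5

# Hypothetical lineages: clade rank 1 is the earliest-branching lineage.
clade_rank = [1, 2, 3, 4, 5]
first_appearance_mya = [130, 100, 105, 60, 20]

# Negate ages so that the oldest lineage receives age rank 1.
rho = spearman_rho(clade_rank, [-age for age in first_appearance_mya])
print(round(rho, 3))
```

A value near 1 would indicate that lineages inferred to branch early also appear early in the fossil record. The sketch does not compute a significance level and would fail if every value in one list were tied.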
Times of first appearance in the fossil record have been documented for diversifying lineages in a number of groups [e.g., bryozoans (Jackson and Cheetham, 1994); catfishes (Lundberg, 1992); sharks (Maisey, 1984; Cappetta, 1987)]. When the fossil record for such groups is dense and continuous, the first appearance times of different lineages can be used to calibrate rates of molecular evolution and to investigate rate heterogeneity within and among taxonomic groups (Martin et al., 1992). This can be important for testing alternative hypotheses of molecular evolution (Gillespie, 1986b; Kimura, 1983) and for establishing the phylogenetic utility of specific genes at various levels of taxonomic differentiation (Graybeal, 1994; Friedlander et al., 1994).
C. Sharks and the Order Lamniformes
The fossil record of sharks is dense, relatively continuous, and consists almost entirely of teeth (Maisey, 1984; Cappetta, 1987). Many of these teeth are distinctive enough to allow identification of the fossil lineages that gave rise to extant forms, making sharks a model group for the type of fossil-calibrated molecular systematics study described earlier. In general, sharks are a morphologically conservative group. Phylogenetic hypotheses based on morphology have been hampered by a scarcity of shared derived character states (Fechhelm and McEachran, 1984; Compagno, 1988). Molecular sequences may provide a much needed source of shared derived character information with which to infer phylogenetic hypotheses for different shark groups (e.g., Martin, 1993).
One group of sharks that is particularly well represented in the fossil record is the order Lamniformes, which originated 124-140 million years ago (Maisey, 1984; Cappetta, 1987). Paleontological work indicates that the order was at its most diverse in the middle and late Cretaceous, but subsequently suffered repeated bouts of extinction. Extant lamniform sharks thus constitute a relictual assemblage of highly divergent lineages. The differentiation among contemporary species is reflected in their classification. There are 16 recognized species classified in 10 genera and seven families. Five of the genera and four of the families are monotypic. The order comprises the relatively well-known, endothermic superpredators (Lamnidae), i.e., the great white, the two makos, the porbeagle, and the salmon shark; the three species of thresher sharks (Alopiidae) with their extremely long caudal fins, which they use like whips to stun and kill schooling fishes (Compagno, 1984); the whale-like, filter-feeding basking shark (Cetorhinidae), which can attain lengths of up to 30 feet; the sluggish, benthic sand tiger sharks (Odontaspididae); the deep-water crocodile and goblin sharks (Pseudocarchariidae and Mitsukurinidae, respectively); and the recently discovered megamouth shark (Megachasmidae).
Attempts to estimate the evolutionary relationships among these extant taxa based on morphological characters have yielded conflicting hypotheses (Maisey, 1985, Fig. 1A; Compagno, 1990, Fig. 1B). In order to evaluate these alternate hypotheses and to investigate the correspondence between phylogenetic inference and paleontological information, the sequences of the mitochondrial protein-encoding cytochrome b and NADH 2 genes have been determined for all but 2 of the 16 extant lamniform species and the data have been subjected to phylogenetic analysis. Particular attention has been paid to issues that might confound covariation of character states due to shared ancestry. Results based solely on these sequence data suggest a new phylogenetic hypothesis for the order. When inferred branch lengths based on sequence data are contrasted with first appearance information from the fossil record, the new hypothesis shows a better fit to the fossil record than do the hypotheses of either Compagno (1990) or Maisey (1985).
II. Materials and Methods
Fresh tissue samples were obtained for all but 2 of the 16 lamniform species; tissues from Odontaspis noronhai and Carcharias tricuspidatus were not obtained. Where possible, multiple specimens were sequenced for each species. A list of specimens sequenced is presented in the Appendix with corresponding locality data. Sequences for the mitochondrial NADH 2 and cytochrome b genes were obtained. Both were amplified using the polymerase chain reaction (PCR), with a different protocol for each. The NADH 2 gene was amplified in two steps. The double-stranded DNA product was made by subjecting a total DNA preparation to 30 cycles of PCR amplification in a 100-µl reaction using Perkin-Elmer Taq polymerase and two universal NADH 2 primers (Kocher et al., 1995). The product was chloroform extracted, precipitated with ammonium acetate and ethanol, washed in a 70% ethanol/10 mM Tris, 1 µM EDTA (TE) solution, air dried, resuspended in TE, and run out on a low melting point gel. The band, visualized under low intensity ultraviolet light, was excised, purified (Gene-Clean; United States Biochemical), and stored in 80 µl of 0.1 TE. A single-stranded DNA product was then made by using the double-stranded product as the template in a second 20-cycle PCR reaction, to which only one of the two original amplification primers was added. The resulting single-stranded product was cleaned and concentrated using four flushes of 0.1 TE solution through ultrafree microconcentration tubes (Millipore Corporation), then sequenced using the Sequenase protocol (USB) employing dideoxy-NTP termination reactions (Sanger et al., 1977, 1980) in conjunction with a series of sequencing primers spaced at approximately 150-bp intervals along the fragment. The cytochrome b sequence was amplified in a 25-µl reaction using 12.5 pmol of the primers GluDG and Cb1211H (Palumbi et al., 1991), Perkin-Elmer buffer, 200 µM of each nucleotide, and 1 µl of Perkin-Elmer Taq polymerase.
An initial 30 amplification cycles were carried out at 94°C for 30 sec, 52°C for 15 sec, and 72°C for 60 sec. One microliter of the amplified product was used to seed a second, 150-µl reaction containing 75 pmol of the GluDG primer and 7.5 pmol of biotin-labeled Cb1211H. After another 35 cycles of amplification, the DNA was precipitated with 7.5 M ammonium acetate and 50% ethanol, pelleted by centrifugation at high speed for 10 min, washed once with ethanol, air dried, and resuspended in 40 µl of water. For each sample, 20 µl of Dynal streptavidin beads was washed with 50 µl of binding and washing (BW) buffer (4 M NaCl, 10 mM Tris, 1 mM EDTA, 0.1% Nonidet P-40). The beads were resuspended in 40 µl of BW buffer, combined with the DNA, and the solution was incubated for 1 hr with slow rotation at 45°C to allow the biotin-labeled DNA to bind to the streptavidin beads. The beads were then washed once with 50 µl of BW buffer, twice with 50 µl of sterile, distilled water, and resuspended in 12 µl of water. The sample was boiled for 15 sec and quickly put on the magnet to remove the beads from solution. The solution containing the nonbiotin-labeled DNA was collected, diluted with 28 µl of water, labeled, and stored at 4°C. Following heat denaturation, the beads were incubated at room temperature for 10 min in 0.1 N NaOH, washed twice with 50 µl of sterile, distilled water, and resuspended in 40 µl of water. Both strands were sequenced using a battery of primers and the Sequenase protocol (USB).
FIGURE 1 Hypotheses of lamniform relationships forwarded by (A) Maisey (1985) and (B) Compagno (1990).
In most cases, both genes were sequenced for each individual. However, there were instances where NADH 2 was sequenced from one individual whereas cytochrome b was sequenced from a different individual of the same species (see Appendix). In these cases, the genes from the different individuals were combined to represent that species. Such use could pose a problem if within-species polymorphism were so great as to render a species paraphyletic with respect to other taxa. However, in all cases in which multiple individuals were sequenced there was very little within-species sequence variation (<0.5%). In fact, in most cases, sequences from replicate individuals of the same species were identical.
A data set comprising NADH 2 and cytochrome b sequences for the 14 lamniform taxa, seven carcharhiniform outgroup taxa (GenBank accession numbers U91417-U91447, L08031-L08034, and L08036-L08042), and corresponding GenBank sequences for carp (Cyprinus carpio), loach (Crossostoma lacustre), and trout (Oncorhynchus mykiss) was assembled. Sequences were aligned with Clustal W (Thompson et al., 1994), checked visually with the codon coloring feature of Aligner (Eernisse, 1992), and subjected to a number of different analyses. The authors relied predominantly on cladistic parsimony for phylogenetic reconstruction and shied away from methods with more restrictive assumptions. A number of analyses were carried out to evaluate the overall signal and the signal at different codon positions of each gene. These data subsets were assessed for internal concordance and combinability. The reliability of inferred clades was evaluated using both bootstrap resampling (Felsenstein, 1985) and Bremer (1988) support heuristics. Several previously suggested hypotheses were fitted to the data and evaluated. All of the phylogenetic analyses were carried out using the computer program PAUP* 4.0 (Swofford, 1996).
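The resampling step behind bootstrap support (Felsenstein, 1985) can be sketched briefly. This is an illustration only: the analyses in the chapter were run in PAUP*, the toy alignment and taxon names below are hypothetical, and the tree search that would normally follow each pseudoreplicate is omitted.

```python
import random

def bootstrap_alignment(alignment, rng):
    """Build one pseudoreplicate by drawing sites (columns) with replacement.

    `alignment` maps taxon name to an aligned sequence; all sequences are
    assumed to have the same length.
    """
    n_sites = len(next(iter(alignment.values())))
    cols = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {taxon: "".join(seq[c] for c in cols) for taxon, seq in alignment.items()}

# Hypothetical four-taxon alignment of 12 sites.
toy = {
    "taxon_a": "ATGGCATTACGA",
    "taxon_b": "ATGGCGTTACGA",
    "taxon_c": "ATGACATTGCGT",
    "taxon_d": "ATAACATTGCGT",
}

rng = random.Random(42)  # fixed seed so the example is repeatable
replicates = [bootstrap_alignment(toy, rng) for _ in range(100)]
print(replicates[0])
```

In a full analysis a tree would be inferred from each pseudoreplicate, and the support for a clade would be the percentage of replicates in which that clade is recovered.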
III. Results and Discussion
A. Combinability
Although mitochondrial genes are inherited as a single unit and are generally not subject to recombination, it is possible that differential selection among genes over time could render their respective hierarchical signals incongruent or noncombinable. In order to evaluate this aspect, combinability tests were conducted (Farris et al., 1995) among the three codon positions of NADH 2 and cytochrome b. Sequence data were sorted by gene and codon position (six data subsets). Tests were then conducted to assess whether the partitions between pairs of subsets were statistically different from random expectation. Parsimony analyses were carried out for each data subset separately. The length of the most parsimonious tree (MPT) for the first data subset was added to that of the second. This combined tree length was then contrasted with a distribution of comparable tree lengths generated by randomly partitioning the original data. Results (Fig. 2) suggest that the signals from all pairs of subsets are highly combinable. The lowest p value for combinability was 0.31 between the third position sites of the two genes.
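The logic of the combinability test can be sketched on a deliberately small case. The sketch below assumes only four taxa, so that all three unrooted topologies can be scored exhaustively with Fitch parsimony; the chapter's analyses used PAUP* on 24 taxa, and the site patterns, function names, and exact p value convention here are assumptions made for illustration.

```python
import random

# The three unrooted topologies for four taxa, written as pairs of sister tips.
TOPOLOGIES = [((0, 1), (2, 3)), ((0, 2), (1, 3)), ((0, 3), (1, 2))]

def fitch_steps(site, topology):
    """Fitch step count for one site (a string of four states) on one topology."""
    steps = 0
    cherries = []
    for i, j in topology:
        a, b = {site[i]}, {site[j]}
        if a & b:
            cherries.append(a & b)
        else:
            cherries.append(a | b)
            steps += 1
    if not (cherries[0] & cherries[1]):
        steps += 1
    return steps

def mpt_length(sites):
    """Length of the shortest of the three topologies for a set of sites."""
    return min(sum(fitch_steps(s, t) for s in sites) for t in TOPOLOGIES)

def combinability_p(part_a, part_b, n_reps=199, seed=1):
    """Compare the summed MPT lengths of two partitions with random partitions."""
    rng = random.Random(seed)
    observed = mpt_length(part_a) + mpt_length(part_b)
    pooled = list(part_a) + list(part_b)
    as_short = 0
    for _ in range(n_reps):
        rng.shuffle(pooled)
        a, b = pooled[:len(part_a)], pooled[len(part_a):]
        if mpt_length(a) + mpt_length(b) <= observed:
            as_short += 1
    # The observed partition is counted as one member of the distribution.
    return (as_short + 1) / (n_reps + 1)

# Hypothetical subsets: "AAGG" groups taxa 1+2, "AGAG" groups taxa 1+3.
congruent = combinability_p(["AAGG"] * 15, ["AAGG"] * 15)
conflicting = combinability_p(["AAGG"] * 15, ["AGAG"] * 15)
print(congruent, conflicting)
```

Under this convention a small p value indicates that the two subsets favor different trees, whereas values such as the 0.31 reported above give no grounds for rejecting combinability.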
FIGURE 2 Combinability results for contrasts between the different codon positions. In order to assess whether two data sets were combinable, the authors evaluated whether the partition between the two data sets was statistically different from random expectation. Parsimony analyses were carried out for each data subset separately. The length of the most parsimonious tree (MPT) for the first data subset was added to that of the second. This combined tree length was then contrasted with a distribution of comparable tree lengths generated by randomly partitioning the original data (Farris et al., 1995). p values reflecting the placement of the original data partition in the distribution were computed and are shown in bold type, whereas the numbers of parsimony informative sites are shown in regular text. (A) Contrasts among codon positions between the two genes. Bootstrap consensus trees are presented for each of the six gene/codon subsets to provide a sense of the phylogenetic signal intrinsic to each data subset. The retention index for the MPT for each data subset is also shown. I.oxy, Isurus oxyrinchus; I.pau, Isurus paucus; C.car, Carcharodon carcharias; L.nas, Lamna nasus; L.dit, Lamna ditropis; C.max, Cetorhinus maximus; M.pel, Megachasma pelagios; P.kam, Pseudocarcharias kamoharai; O.fer, Odontaspis ferox; A.sup, Alopias superciliosus; A.vul, A. vulpinus; A.pel, A. pelagicus; C.tau, Carcharias taurus; M.ows, Mitsukurina owstoni; G.cuv, Galeocerdo cuvier; N.bre, Negaprion brevirostris; P.gla, Prionace glauca; C.por, Carcharhinus porosus; C.plu, C. plumbeus; S.tib, Sphyrna tiburo; S.lew, S. lewini; carp, Cyprinus carpio; loach, Crossostoma lacustre; trout, Oncorhynchus mykiss. (B) Contrasts among codon positions within genes. Parsimony informative sites given in the figure: cytochrome b, 116 (first positions), 39 (second positions), and 327 (third positions); NADH 2, 151 (first positions), 66 (second positions), and 302 (third positions).
B. Nucleotide Composition
Base composition was assessed for variable sites at first, second, and third codon positions for each gene separately (Table I). Noticeable differences were seen, both among codon positions and between genes. Base compositional stationarity was evaluated for all shark taxa (note that carp, loach, and trout, having clearly different base compositions, were excluded from this analysis). Base compositions at first and second codon positions did not differ significantly among the taxa surveyed. Third position sites of both genes were highly significantly different among taxa (p < 0.00000001). However, when these third position sites were recoded as either purines or pyrimidines, deviations from stationarity were ameliorated (see Table I). Note that the χ² tests presented are intended only as coarse descriptors of base compositional stationarity; structured patterns of base compositional difference are not always reflected by such tests. For example, the lamnid taxa have a lower percentage of adenine and a higher percentage of guanine at first position sites (of both genes) than do any of the other sharks surveyed (Table I), yet p values suggest that deviations from stationarity are not significant.
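A χ² test of compositional homogeneity of the kind summarized above can be sketched as a Pearson statistic on a taxa-by-state table of counts. The counts below are hypothetical (Table I reports proportions and site totals, not this table), the text does not say which program computed its χ² values, and the sketch stops at the statistic and degrees of freedom rather than a p value.

```python
def chi_square_homogeneity(counts):
    """Pearson chi-square for a taxa-by-state table of counts.

    Returns (statistic, degrees_of_freedom). Every state is assumed to occur
    at least once overall, so no expected count is zero.
    """
    row_totals = [sum(row) for row in counts]
    col_totals = [sum(col) for col in zip(*counts)]
    grand = sum(row_totals)
    stat = 0.0
    for row, row_total in zip(counts, row_totals):
        for observed, col_total in zip(row, col_totals):
            expected = row_total * col_total / grand
            stat += (observed - expected) ** 2 / expected
    df = (len(counts) - 1) * (len(col_totals) - 1)
    return stat, df

def collapse_to_ry(counts):
    """Collapse A, C, G, T counts (in that order) to purine and pyrimidine counts."""
    return [[a + g, c + t] for a, c, g, t in counts]

# Hypothetical third-position counts (A, C, G, T) for two taxa.
acgt = [[45, 5, 5, 45],
        [5, 45, 45, 5]]

acgt_stat, acgt_df = chi_square_homogeneity(acgt)
ry_stat, ry_df = chi_square_homogeneity(collapse_to_ry(acgt))
print(acgt_stat, acgt_df, ry_stat, ry_df)
```

In this contrived case the four-state table is strongly heterogeneous, whereas the purine/pyrimidine table is perfectly homogeneous, mirroring the direction of the effect reported for third positions.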
TABLE I Base Composition Profiles for Cytochrome b and NADH 2 (a)
Cytochrome b
        First position sites            Second position sites           Third position sites            Third position R/Y
Taxon   A     C     G     T     Sites   A     C     G     T     Sites   A     C     G     T     Sites   R     Y
Carp    0.25  0.32  0.27  0.15  142     0.14  0.37  0.09  0.40  65      0.41  0.40  0.04  0.14  362     0.45  0.55
Loach   0.21  0.29  0.31  0.19  142     0.15  0.35  0.11  0.38  65      0.29  0.45  0.09  0.17  362     0.38  0.62
Trout   0.19  0.31  0.33  0.17  142     0.11  0.45  0.14  0.31  65      0.28  0.42  0.06  0.25  362     0.33  0.67
G.cuv   0.35  0.26  0.15  0.24  143     0.17  0.23  0.09  0.51  65      0.33  0.34  0.01  0.32  364     0.34  0.66
N.bre   0.31  0.29  0.17  0.24  143     0.17  0.23  0.09  0.51  65      0.35  0.39  0.01  0.25  364     0.35  0.65
P.gla   0.30  0.29  0.18  0.22  143     0.17  0.22  0.09  0.52  65      0.34  0.34  0.02  0.31  364     0.35  0.65
C.por   0.29  0.24  0.19  0.27  143     0.18  0.23  0.09  0.49  65      0.34  0.42  0.01  0.23  364     0.35  0.65
C.plu   0.30  0.29  0.18  0.22  143     0.17  0.22  0.09  0.52  65      0.32  0.41  0.03  0.24  364     0.35  0.65
S.lew   0.29  0.32  0.18  0.21  143     0.17  0.22  0.09  0.52  65      0.35  0.40  0.01  0.24  364     0.36  0.64
S.tib   0.29  0.27  0.18  0.26  143     0.17  0.23  0.09  0.51  65      0.33  0.40  0.02  0.25  364     0.35  0.65
I.oxy   0.27  0.36  0.21  0.17  143     0.18  0.23  0.11  0.48  65      0.26  0.46  0.03  0.25  364     0.29  0.71
I.pau   0.28  0.36  0.21  0.15  143     0.17  0.25  0.12  0.46  65      0.26  0.50  0.05  0.19  364     0.31  0.69
C.car   0.32  0.31  0.19  0.17  143     0.17  0.20  0.14  0.49  65      0.28  0.46  0.02  0.24  364     0.30  0.70
L.nas   0.31  0.32  0.18  0.19  143     0.17  0.18  0.12  0.52  65      0.28  0.44  0.03  0.25  364     0.31  0.69
L.dit   0.29  0.33  0.20  0.18  133     0.12  0.19  0.11  0.58  57      0.26  0.47  0.04  0.23  344     0.30  0.70
C.max   0.33  0.33  0.15  0.20  131     0.15  0.24  0.11  0.51  55      0.28  0.48  0.04  0.21  341     0.32  0.68
M.pel   0.36  0.26  0.13  0.25  132     0.14  0.21  0.09  0.55  56      0.33  0.32  0.02  0.33  342     0.35  0.65
A.pel   0.33  0.33  0.17  0.17  132     0.21  0.24  0.12  0.43  58      0.33  0.41  0.01  0.25  342     0.35  0.65
A.vul   0.35  0.30  0.14  0.20  132     0.19  0.26  0.09  0.46  57      0.34  0.37  0.01  0.28  340     0.35  0.65
A.sup   0.30  0.34  0.18  0.18  131     0.18  0.25  0.11  0.47  57      0.32  0.41  0.02  0.24  345     0.34  0.66
M.ows   0.35  0.30  0.13  0.21  141     0.14  0.27  0.10  0.49  63      0.32  0.42  0.02  0.24  360     0.34  0.66
O.fer   0.36  0.29  0.15  0.21  136     0.15  0.20  0.12  0.53  59      0.35  0.30  0.01  0.34  351     0.36  0.64
P.kam   0.34  0.27  0.16  0.23  139     0.15  0.23  0.13  0.50  62      0.33  0.35  0.01  0.32  353     0.34  0.66
C.tau   0.31  0.33  0.17  0.19  138     0.20  0.20  0.10  0.51  61      0.31  0.42  0.02  0.25  356     0.33  0.67
Mean    0.30  0.31  0.19  0.20  139     0.16  0.25  0.11  0.48  62.3    0.32  0.41  0.03  0.25  357     0.34  0.66
χ²      39.6, p = 0.98                  10.8, p = 1.0                   163.8, p < 0.00000001           14.7, 0.9 > p > 0.5
NADH 2
        First position sites            Second position sites           Third position sites            Third position R/Y
Taxon   A     C     G     T     Sites   A     C     G     T     Sites   A     C     G     T     Sites   R     Y
Carp    0.35  0.34  0.20  0.12  204     0.17  0.48  0.11  0.24  118     0.48  0.32  0.06  0.14  332     0.54  0.46
Loach   0.31  0.32  0.22  0.15  204     0.19  0.48  0.07  0.25  118     0.45  0.34  0.06  0.14  332     0.52  0.48
Trout   0.26  0.38  0.21  0.15  204     0.15  0.47  0.07  0.31  118     0.38  0.34  0.06  0.22  332     0.44  0.56
G.cuv   0.36  0.31  0.09  0.24  204     0.21  0.39  0.03  0.36  117     0.40  0.27  0.02  0.31  330     0.42  0.58
N.bre   0.37  0.32  0.08  0.22  204     0.16  0.40  0.07  0.37  118     0.43  0.34  0.00  0.22  329     0.43  0.57
P.gla   0.35  0.28  0.10  0.26  204     0.19  0.37  0.06  0.38  118     0.44  0.25  0.02  0.29  332     0.45  0.55
C.por   0.35  0.33  0.09  0.23  203     0.18  0.39  0.06  0.37  118     0.40  0.36  0.02  0.22  332     0.42  0.58
C.plu   0.34  0.34  0.11  0.22  204     0.18  0.39  0.07  0.36  118     0.43  0.34  0.00  0.23  332     0.43  0.57
S.lew   0.34  0.34  0.11  0.20  203     0.20  0.38  0.04  0.37  118     0.42  0.36  0.01  0.22  332     0.42  0.58
S.tib   0.34  0.34  0.11  0.21  204     0.19  0.41  0.05  0.35  118     0.38  0.36  0.02  0.24  332     0.40  0.60
I.oxy   0.28  0.34  0.15  0.23  200     0.19  0.35  0.06  0.40  113     0.31  0.42  0.07  0.20  323     0.38  0.62
I.pau   0.29  0.33  0.15  0.22  201     0.18  0.37  0.07  0.37  115     0.31  0.42  0.06  0.21  328     0.37  0.63
C.car   0.33  0.33  0.12  0.22  201     0.19  0.39  0.05  0.37  114     0.35  0.37  0.03  0.24  325     0.38  0.62
L.nas   0.31  0.33  0.14  0.22  202     0.21  0.35  0.05  0.39  116     0.32  0.44  0.05  0.18  328     0.38  0.62
L.dit   0.30  0.33  0.15  0.22  203     0.20  0.37  0.06  0.38  117     0.33  0.42  0.05  0.20  327     0.38  0.62
C.max   0.34  0.32  0.11  0.22  203     0.19  0.41  0.06  0.34  117     0.40  0.36  0.02  0.21  329     0.43  0.57
M.pel   0.37  0.25  0.09  0.29  199     0.20  0.37  0.06  0.37  114     0.40  0.28  0.03  0.29  324     0.43  0.57
A.pel   0.35  0.27  0.12  0.26  203     0.19  0.38  0.05  0.38  117     0.44  0.29  0.02  0.25  331     0.46  0.54
A.vul   0.34  0.29  0.11  0.25  202     0.20  0.37  0.05  0.38  117     0.41  0.31  0.02  0.25  331     0.44  0.56
A.sup   0.33  0.29  0.13  0.26  203     0.20  0.40  0.05  0.35  117     0.40  0.37  0.02  0.20  331     0.43  0.57
M.ows   0.36  0.32  0.11  0.21  203     0.17  0.40  0.06  0.37  117     0.39  0.34  0.03  0.25  331     0.41  0.59
O.fer   0.36  0.26  0.10  0.28  200     0.20  0.37  0.04  0.39  115     0.41  0.29  0.02  0.28  328     0.42  0.58
P.kam   0.33  0.26  0.11  0.30  203     0.20  0.38  0.05  0.38  117     0.42  0.28  0.01  0.29  330     0.43  0.57
C.tau   0.34  0.30  0.11  0.25  202     0.20  0.39  0.04  0.37  115     0.40  0.35  0.02  0.23  329     0.42  0.58
Mean    0.33  0.31  0.13  0.23  203     0.19  0.39  0.06  0.36  117     0.40  0.34  0.03  0.23  330     0.43  0.57
χ²      46.4, p = 0.90                  8.4, p = 1.0                    204, p < 0.00000001             18.2, 0.9 > p > 0.5
(a) Frequencies are presented for each codon position for both genes separately. χ² values, together with corresponding p values, are presented to summarize the extent of deviation from base compositional stationarity. Note that deviations from stationarity are highly significantly different in third codon positions of both genes. However, a recoding of third position character states as either "purine" (R) or "pyrimidine" (Y) effectively eliminates the deviation from stationarity at this position. Taxon labels as in Fig. 2.
C. Saturation
Transitions and transversions for each codon position were plotted separately for each gene (Fig. 3) against the average number of nucleotide differences between pairs of taxa (mean raw distance). Although there is an element of nonindependence in such plots (part of the information is the same for both the ordinate and the abscissa), differences in trend are detectable. In both genes, third positions are more variable than first positions, which are, in turn, more variable than second positions. This is expected based on the relative redundancy of the genetic code. The NADH 2 gene registers more changes than does the cytochrome b gene, reflecting perhaps a diminished level of constraint in NADH 2 relative to cytochrome b. There is clear evidence for signal saturation in transitions at third positions of both genes. Slight saturation is seen for the first position transitions of NADH 2; however, this is restricted to the distant comparisons between sharks and the teleost outgroups. Second positions show a considerable amount of scatter, particularly in cytochrome b, possibly reflecting a relatively constrained character-state space at this position (Irwin et al., 1991). It is possible that a distance which accommodated multiple substitutions might be a more sensitive indicator of saturation. This was tested by plotting various such distance measures against the average number of nucleotide differences used herein. All distances examined exhibited a linear relationship (Fig. 4) with the "uncorrected" distance used, suggesting that the presented estimates of saturation would not be greatly affected.
FIGURE 4 Relationship between mean absolute distance (used in Fig. 3) and some commonly used distance measures: (A) Jukes and Cantor (1969), (B) Felsenstein (1981), (C) Kimura (1980), (D) Kimura (1981), (E) Hasegawa et al. (1985), and (F) Steel (1994). Mean absolute distance is plotted on the abscissa for each plot.
Transition:transversion ratios derived from pairwise comparisons among taxa vary according to taxonomic depth of the comparison and to codon position. For pairwise contrasts within the Lamniformes, the mean observed transition:transversion ratio was approximately 5:1 for first position sites and 3:1 for second and third position sites. Somewhat unexpectedly, given the apparent differences in substitution rate between the two genes, the transition:transversion ratios at respective codon positions were similar for both genes.
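The quantities used in this section (transitions, transversions, the uncorrected distance, and a corrected distance) can be sketched for a single pair of sequences. The two sequences are hypothetical, the function names are illustrative, and the Jukes and Cantor (1969) correction is shown only as the simplest of the distance measures listed for Fig. 4.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def codon_position_sites(seq, position):
    """Return the sites at codon position 1, 2, or 3 of an in-frame sequence."""
    return seq[position - 1::3]

def count_ts_tv(seq1, seq2):
    """Count transitions, transversions, and compared sites for two sequences."""
    ts = tv = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguity codes
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            ts += 1  # purine to purine or pyrimidine to pyrimidine
        else:
            tv += 1  # purine to pyrimidine or the reverse
    return ts, tv, compared

def p_distance(seq1, seq2):
    """Uncorrected proportion of differing sites."""
    ts, tv, compared = count_ts_tv(seq1, seq2)
    return (ts + tv) / compared

def jukes_cantor(p):
    """Jukes-Cantor corrected distance for an uncorrected distance p < 0.75."""
    return -0.75 * math.log(1 - 4 * p / 3)

# Hypothetical pair of aligned sequences.
seq_x = "ACGTACGTAC"
seq_y = "GCGTACTTAT"

print(count_ts_tv(seq_x, seq_y))
print(p_distance(seq_x, seq_y), jukes_cantor(p_distance(seq_x, seq_y)))
```

Plotting transition and transversion counts for each codon position against the uncorrected distance over all pairs of taxa would give saturation plots of the kind described; a plateau in transitions with increasing distance is the pattern interpreted as saturation.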
D. Phylogenetic Analysis The most parsimonious tree resulting from the analysis in which all sites were weighted equally is shown in Fig. 5A. The number of steps required to break down each inferred clade (Bremer, 1988) is depicted on the most parsimonious tree, whereas the corresponding bootstrap consensus tree is presented to show the core hierarchical signal reflected by the data. Equivalent transversion parsimony analyses are shown in Fig. 5B. In order to expose any latent phylogenetic bias that might be caused by structured deviations from base compositional stationarity, data were subjected to two types of analysis designed to minimize such influences. In the first analysis, third position sites were recoded according to their status as purine or pyrimidine to ameliorate deviations from stationarity (see earlier discussion) and then the resultant data matrix was subjected to parsimony analysis (Fig. 6A). In the second analysis, data were subjected to LogDet transformation and subsequent neighbor-joining (Fig. 6B). This ap-
proach has been shown to be resilient to the influence of base compositional differences among taxa when sites evolve independently (Lake, 1994; Steel, 1994). Both approaches consistently show the Lamniformes to be monophyletic, the Lamnidae to constitute a monophyletic group within the order, the basking shark (Cetorhinus maximus) to be the sister taxon to the Lamnidae, each of the genera Lamna and Isurus to constitute monophyletic groups within the Lamnidae, and the great white shark C. carcharias to be the sister taxon to the genus Isurus. Both analyses also suggest that Alopias pelagicus and A. vulpinus are sister taxa and that Pseudocarcharias kamoharai, Odontaspis ferox, Megachasma pelagios, and the three thresher sharks (Alopiidae) constitute a monophyletic group. When trees resulting from these two analyses are contrasted with the bootstrap consensus tree based on equal weighting of all sites (Fig. 5A), there is considerable congruence. The bootstrap consensus is somewhat less resolved (as expected) and does not show sister group relationships between C. carcharias and the genus Isurus or between the thresher sharks A. pelagicus and A. vulpinus. From these results, it would seem that the bootstrap consensus based on equal weighting of all sites is not adversely skewed by deviations from stationarity and can be viewed (for these data, at least) as a conserva-
GAVINJ. P. NAYLORet at.
210
PARSIMONY
A
BOOTSTRAP TREES
MPT
, 35
L
~
AGCT
MPT length = 4941 RI = 0.48
|
~ 36
G.cuvieri N.brevirostris P.glauca C.plumbeus C.porosus 2 5 r - - S.lewini L__ S.tiburo
I.oxyrinchus
I.paucus
~
L.nasus
L.ditropis C.carcharias C.maximus M.pelagios P.kamoharai A.vulpinus A.superciliosus O.ferox A.pelagicus C.taurus M.owstoni carp loach trout
~
G.cuvieri N.brevirostris P.glauca C.porosus C.plumbeus 99 ~ ' S.lewini i S.tiburo 97 I.oxyrinchus 100 "--'! I.paucus 1oo C.carcharias 51 ~ L.nasus L.ditropis C.maximus . ~ M.pelagios P.kamoharai 100 74 A.pelagicus A.vulpinus i A.supercifiosus O.ferox M.owstoni ........ C.taurus ....... carp loach I trout l
1O0 I , ' ~
Strict consensus of 2 MPTs
B
24i r ~ 2
RY
MPT length = 1419
,
G.cuvieri N.brevirostris P.glauca
orosus
I--, 5 6
RI = 0 . 6 9
4
20
9 ,,,
d o
I
I
....1,r ,,, [_~
C.plumbeus S.lewini S.tiburo I.oxyrinchus I.paucus C.carcharias L.nasus L.ditropis C.maximus C.taurus A.supercifiosus M.pelagios A.pelagicus A.vulpinus P.kamoharai O.ferox M.owstoni carp loach trout
lOO !
G.cuvieri N.brevirostriSp.glauca
r~591
~"-'l~~C.porosus C.plumbeus | 100~ S.lewini L__ S.tiburo
8 7 ~ , ~ l.oxyrinchus
10~ ! 92r~ | lOO ,r
100 i
, 73
i~
,
~
I.paucus C.carcharias L.nasus L.ditropis C.maximus C.taurus M.pelagios A.pelagicus A. vulpinus A.superciliosus O.ferox P.kamoharai .......M.owstoni carp loach trout
FIGURE 5
Phylogenetic inferences based on parsimony analysis of the sequence data. Trees in the left column correspond to the most parsimonious trees (MPTs), or a strict consensus when multiple MPTs resulted from analysis. The number of steps required to break down each inferred clade (Bremer, 1988) is depicted directly on the trees. Trees in the right column correspond to the consensus of 100 bootstrap replicates. Numbers on the consensus correspond to the percentage occurrence of the clade among replicates. The bootstrap consensus trees are presented to provide a sense of the core hierarchical signal in each data set. (A) Analyses conducted on data comprising both transitions and transversions. (B) Analyses using transversional differences only.
tive estimate of the historical signal in the presented data set. Somewhat surprisingly, there is little evidence in any of the analyses to support a monophyletic Alopiidae. Nevertheless, there is extensive morphological support for such a grouping. In light of this, the authors propose the phylogenetic hypothesis presented in Fig. 7. This hypothesis is based on the bootstrap consensus tree seen in Fig. 5B, but has been modified to incorporate a monophyletic Alopiidae and a sister group relationship between C. carcharias and the genus Isurus. These relationships are supported by morphological data and cost very few extra steps when fitted to sequence data in a parsimony framework.
FIGURE 6 Trees resulting from analyses designed to minimize the effects of deviation from base compositional stationarity. (A) Parsimony analysis of data set in which third position sites were recoded to reflect transversions only (first and second positions include both transitions and transversions). (B) LogDet transformation (Steel, 1994) of the raw data set subjected to neighbor-joining.
The modified bootstrap consensus presented in Fig. 7 comprises two trichotomies and a tetrachotomy. Although 135 hypotheses are combinatorially consistent with this consensus (3 × 3 × 15 = 135), certain arrangements within each of the polytomies require fewer steps than do their counterparts (Fig. 8). A. vulpinus and A. pelagicus are favored to be sister taxa in the apical polytomy (within Alopias). Alopias and Odontaspis are favored to be sister taxa in the midlevel polytomy, and Mitsukurina is favored to be the sister taxon to the rest of the Lamniformes in the basal polytomy. Although these tendencies may foreshadow a more fully resolved future hypothesis, the presented sequence data do not support more resolution than that shown (Fig. 7).
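The count of 135 follows from a standard result: a polytomy with n descendant branches can be resolved into (2n - 3)!! distinct rooted, fully bifurcating arrangements, which gives 3 for a trichotomy and 15 for a tetrachotomy. The following Python sketch reproduces that arithmetic; the function and variable names are illustrative and do not come from the chapter.

```python
from math import prod


def rooted_resolutions(n_branches: int) -> int:
    """Number of rooted, fully bifurcating resolutions of a polytomy
    with n_branches descendant branches: (2n - 3)!! = 1 * 3 * 5 * ... * (2n - 3)."""
    if n_branches < 2:
        raise ValueError("a node needs at least two descendant branches")
    count = 1
    for odd in range(3, 2 * n_branches - 2, 2):  # 3, 5, ..., 2n - 3
        count *= odd
    return count


# The consensus in Fig. 7 contains two trichotomies and one tetrachotomy.
polytomy_sizes = [3, 3, 4]
total = prod(rooted_resolutions(n) for n in polytomy_sizes)
print(total)  # 135
```

Because the polytomies are resolved independently of one another, the counts multiply, which is why the total is a product rather than a sum.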
E. Testing Hypotheses with DNA Sequence Data
Although the data do not provide robust support for any one fully resolved cladogram, they can be used to evaluate competing hypotheses concerning lamniform relationships. Each of the hypotheses was fitted to the data set separately, using the constraints option of PAUP. A heuristic search was carried out with the hypothesis under test implemented as a constraint. The number of steps required by each hypothesis was contrasted with the number of steps required by the most parsimonious tree for the data set. The fewer "extra" steps required, the better the fit of the hypothesis to the data and, therefore, the more credible the hypothesis. The following 10 hypotheses were evaluated.
Hypothesis 1: Monophyly of Alopiidae
Hypothesis 2: Monophyly of the filter-feeding trait (Maisey, 1985)
Hypothesis 3: Monophyly of Odontaspididae
Hypothesis 4: Sister group relationship between Lamnidae and Cetorhinidae (Compagno, 1990)
Hypothesis 5: Sister group relationship between a monophyletic Alopiidae and a clade comprising the Lamnidae and the Cetorhinidae (Compagno, 1990)
Hypothesis 6: New hypothesis proposed herein (Fig. 7)
Hypothesis 7: Compagno's (1990) hypothesis of relationships for all members within the Lamniformes (Fig. 1A)
Hypothesis 8: LogDet tree (Fig. 6B)
Hypothesis 9: MPT derived from equally weighted transitions and transversions (Fig. 5A)
Hypothesis 10: MPT derived from transversion parsimony analysis (Fig. 5B)
FIGURE 7 New hypothesis for phylogenetic relationships among lamniform sharks. This tree is based on the bootstrap consensus resulting from equal weighting of all sites (Fig. 5B), but has been modified to include a sister group relationship between Isurus (makos) and Carcharodon (great white shark) and a monophyletic Alopiidae (thresher sharks). See text for further details.
Figure 9 presents the fit of these hypotheses to both the raw data set and to the data set reflecting transversional differences only. Of all the hypotheses tested, the sister group relationship between the Lamnidae and the Cetorhinidae (H4) requires the fewest extra steps. Indeed, this relationship requires no extra steps when applied to the raw data and only one extra step when applied to the transversion data. The monophyly of the thresher sharks (Alopiidae) was the next most tenable hypothesis, requiring 16 extra steps when fitted to the raw data and 2 extra steps when applied to the transversion data. Given that the cost of invoking monophyly for the Alopiidae is low relative to other hypotheses, and given that there is strong morphological evidence to support monophyly, it is likely that the molecular-based inference suggesting a nonmonophyletic Alopiidae is erroneous.
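The "steps" counted in this section are parsimony steps: the minimum number of character-state changes a tree requires to explain the aligned sequences. As a rough illustration only, the Python sketch below scores two fixed trees with Fitch's algorithm on a toy four-taxon alignment and reports the extra steps of the second tree relative to the first. It is not the PAUP procedure used by the authors, which searches heuristically over many trees compatible with each constraint; the taxon names, sequences, and function names here are all hypothetical.

```python
def fitch_steps(tree, states):
    """Minimum number of state changes for one site on a rooted binary tree.

    tree   -- nested 2-tuples whose leaves are taxon names (strings)
    states -- dict mapping taxon name -> character state at this site
    """
    steps = 0

    def visit(node):
        nonlocal steps
        if isinstance(node, str):          # leaf: its observed state
            return {states[node]}
        left, right = (visit(child) for child in node)
        shared = left & right
        if shared:                          # children agree: no change needed
            return shared
        steps += 1                          # children disagree: one change
        return left | right

    visit(tree)
    return steps


def tree_length(tree, alignment):
    """Sum of Fitch steps over all sites of an alignment (dict taxon -> sequence)."""
    n_sites = len(next(iter(alignment.values())))
    return sum(
        fitch_steps(tree, {taxon: seq[i] for taxon, seq in alignment.items()})
        for i in range(n_sites)
    )


# Hypothetical toy data, not taken from the chapter.
alignment = {"A": "AAGT", "B": "AAGC", "C": "GCGC", "D": "GCTT"}
shortest_tree = (("A", "B"), ("C", "D"))      # stands in for the MPT
constrained_tree = (("A", "C"), ("B", "D"))   # stands in for a hypothesis under test

extra = tree_length(constrained_tree, alignment) - tree_length(shortest_tree, alignment)
print(tree_length(shortest_tree, alignment), tree_length(constrained_tree, alignment), extra)
```

On this toy alignment the first tree needs 5 steps and the second 7, so the second tree costs 2 extra steps. The same subtraction underlies every "extra steps" value quoted in the text.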
FIGURE 8 Distribution of tree lengths for all 135 fully resolved trees that are consistent with the incompletely resolved hypothesis shown in Fig. 7. Although combinatorially consistent, many of the arrangements cost several extra steps and are likely incorrect. Those requiring the fewest steps are: (1) In the apical polytomy, A. vulpinus and A. pelagicus as sister taxa, requiring an average of 4972.2 steps (averaged over those of the 135 arrangements that are consistent with this resolution); (2) in the midlevel polytomy, Alopias and Odontaspis as sister taxa, requiring an average of 4975.5 steps; and (3) in the basal polytomy, Mitsukurina as the sister taxon to the rest of the Lamniformes, requiring an average of 4969.2 steps. The arrow indicates where the hypothesis of Compagno (1990) would fall if included in this distribution.
DATA SET
HYPOTHESIS | Tis + Tvs | Tvs only
Length of MPT | 4941 steps | 1419 steps
H1: monophyly of Alopiidae | 4957 (16) | 1421 (2)
H2: monophyly of filter-feeding | 4982 (41) | 1435 (16)
H3: monophyly of Odontaspididae | 4975 (34) | 1433 (14)
H4: sister relationship between Lamnidae and Cetorhinidae | 4941 (0) | 1420 (1)
H5: (Alopiidae (Lamnidae + Cetorhinidae)) | 4978 (37) | 1433 (14)
H6: new hypothesis | 4960 (19) | 1421 (2)
H7: Compagno 1990 hypothesis | 4997 (56) | 1437 (18)
H8: LogDet tree | 4955 (14) | 1440 (21)
H9: (eq. wt. parsimony) 1 MPT | - | 1435 (16)
H10: (transversion parsimony) 2 MPTs | 4997 / 5001 (56) (60) | -
FIGURE 9 The number of steps required when different hypotheses are fitted to data. There are 10 hypotheses (H1 to H10) and two versions of the data set (one comprising transitions and transversions, the other transversions only). The number of steps required for each hypothesis is shown in its corresponding cell. The number of "extra" steps beyond the number required by the MPT is shown in parentheses. Hypotheses H1 to H5 propose relationships for subsets of the taxa rather than for all 14 taxa as is the case for hypotheses H6 to H10. Each hypothesis was fitted to data as a constraint. A heuristic search was implemented to determine the shortest tree length consistent with the constraint.
Maisey's hypothesis (H2) for the sister group relationship between the two filter feeders, Cetorhinus and Megachasma, is not supported; neither is the monophyly of the family Odontaspididae (H3). Compagno's proposed sister group relationship between the Alopiidae and a clade containing the Lamnidae and Cetorhinidae (H5) is also not supported. It should be noted that the more restrictive a hypothesis of relationship becomes (the more relationships it "explains"), the more extra steps it is prone to cost. Thus, for example, it would be inappropriate to assert that Compagno's (1990) hypothesis (H7) for the relationships among members of the entire order is less well supported than Maisey's hypothesis for the monophyly of the filter-feeding trait (H2) because Compagno's hypothesis forwards relationships for all 14 taxa while Maisey's forwards a relationship for just one pair. A more appropriate comparison to evaluate Compagno's hypothesis might be made by contrasting the extra steps required to fit Compagno's tree with those required to fit the LogDet hypothesis (H8) to the data. Both of these hypotheses are of the same "explanatory magnitude." When this is carried out, the cost of fitting Compagno's tree is 56 extra steps for the raw data and 18 extra steps for the transversion data. In contrast, the LogDet hypothesis requires 14 extra steps for the raw data and 21 extra steps for the transversion data. Thus, Compagno's hypothesis has a better fit to the transversion data, but a worse fit to the raw data. This highlights the fact that hypotheses can vary in their fit according to the type of analysis employed. Indeed, when the two most parsimonious trees resulting from transversion parsimony analysis are fitted to the raw data set, they require more extra steps than do any of the other hypotheses proposed (56 and 60 extra steps, respectively).
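The reversal in fit between the two data sets can be checked with simple subtraction. The Python sketch below takes the tree lengths reported in Fig. 9 for Compagno's tree (H7) and the LogDet tree (H8), subtracts the length of the most parsimonious tree for each data set, and reports which hypothesis fits better in each case. The dictionary layout and function names are illustrative assumptions, not part of the original analysis.

```python
# Lengths of the most parsimonious trees (MPTs) for each version of the data set.
MPT_LENGTH = {"Tis + Tvs": 4941, "Tvs only": 1419}

# Shortest tree lengths found under each constraint, as reported in Fig. 9.
FITTED_LENGTH = {
    "H7 (Compagno, 1990)": {"Tis + Tvs": 4997, "Tvs only": 1437},
    "H8 (LogDet tree)": {"Tis + Tvs": 4955, "Tvs only": 1440},
}


def extra_steps(hypothesis: str, data_set: str) -> int:
    """Steps required by a hypothesis beyond those required by the MPT."""
    return FITTED_LENGTH[hypothesis][data_set] - MPT_LENGTH[data_set]


def best_fit(data_set: str) -> str:
    """Hypothesis requiring the fewest extra steps for the given data set."""
    return min(FITTED_LENGTH, key=lambda h: extra_steps(h, data_set))


for data_set in MPT_LENGTH:
    costs = {h: extra_steps(h, data_set) for h in FITTED_LENGTH}
    print(data_set, costs, "best:", best_fit(data_set))
```

The LogDet tree is the better fit to the combined transition and transversion data (14 versus 56 extra steps), while Compagno's tree is the better fit to the transversion data (18 versus 21 extra steps), as stated in the text.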
F. Correspondence to the Fossil Record
Because the fossil record of lamniform teeth is dense and continuous, it is reasonable to expect a correspondence between branch lengths inferred from sequence data and first appearances of corresponding lineages in the fossil record. A meaningful correspondence requires (1) that the phylogenetic inference based on molecular data be correct, (2) that the fossil record be sufficiently well sampled to ensure that segments of lineages are not "missed," and (3) that fossil teeth be correctly assigned to their corresponding lineages. There is no way of evaluating conditions 1 and 2 a priori. However, it is known that the task of correctly assigning teeth to lineages (condition 3) has not been straightforward. Extant lamniforms represent the "pruned down" remnants of a greater lamniform diversity which has, for the most part, become extinct. Most of the tooth lineages in the fossil record represent evolutionary "cul-de-sacs" that have no extant descendants. Many of these "cul-de-sac" lineages show evolutionary parallelism with lineages that have led to extant forms (Cappetta, 1987). This has made the task of identifying and tracking anagenetic change particularly difficult and requires careful evaluation of within- and among-species variation at successive increments along lineages (Espinosa-Arrubarrena, 1987). As more of these detailed stratigraphic studies are completed, a more accurate picture of lamniform evolution should emerge from the fossil record.
The authors contrasted inferred amount of molecular change (i.e., branch length) and first appearance estimates derived from the fossil record for the new hypothesis, for Compagno's (1990) hypothesis, and for Maisey's (1985) hypothesis. Sequence data were fitted to each of the hypotheses separately using the "enforced molecular clock" option of the maximum likelihood platform of PAUP* 4.0. Inferred branch lengths were contrasted with corresponding first appearance estimates derived from the fossil record (Fig. 10). Considerable correspondence is seen for the new hypothesis (r² = 0.63) and for Compagno's (1990) hypothesis (r² = 0.62). Note that while the data appear to fit a "molecular clock," considerable leverage is exerted by taxa at the extremes of the distribution. Less correspondence is seen for Maisey's hypothesis (r² = 0.5). This may be due to the fact that Maisey's hypothesis is substantially less resolved. In the plots corresponding to the new hypothesis and to Compagno's hypothesis, the basking shark lineage Cetorhinus (g) falls conspicuously above the regression line. Indeed, r² increases to 0.88 for the new hypothesis and 0.68 for Compagno's (1990) hypothesis if data for Cetorhinus are excluded.
The aberrant placement of Cetorhinus could reflect an accelerated rate of molecular evolution, an incorrect first appearance estimate in the fossil record, or an incorrect placement of the taxon in both phylogenies. Because the teeth of Cetorhinus are so different from those of any other extant lamniform, we suspect that transitional tooth forms leading to the extant Cetorhinus tooth type have not yet been recognized as such in the fossil record. If this is indeed the case, then an earlier first appearance date for the Cetorhinus lineage is expected. We note, in support of this, that reference is made to an unpublished record of Cetorhinus from the Eocene of North America (Cappetta, 1987; Cappetta et al., 1993; D. Ward, personal communication).
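The effect of a single lineage on the coefficient of determination can be illustrated with a short calculation. The Python sketch below computes r² for a simple linear regression of branch length on first appearance, first with all points and then with one point that lies well above the trend removed. The numbers are invented for illustration and are not the values plotted in Fig. 10; the lineage labels and function name are likewise hypothetical.

```python
def r_squared(x, y):
    """Coefficient of determination for a simple linear regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy ** 2 / (sxx * syy)


# Hypothetical (label, first appearance in MYBP, inferred branch length) triples.
lineages = [
    ("lineage_1", 10, 0.02),
    ("lineage_2", 30, 0.04),
    ("outlier", 40, 0.11),   # lies well above the trend of the other points
    ("lineage_3", 55, 0.06),
    ("lineage_4", 85, 0.08),
    ("lineage_5", 120, 0.12),
]

ages = [age for _, age, _ in lineages]
lengths = [length for _, _, length in lineages]
r2_all = r_squared(ages, lengths)

kept = [(age, length) for label, age, length in lineages if label != "outlier"]
r2_without = r_squared([a for a, _ in kept], [b for _, b in kept])

print(f"r2 with all lineages: {r2_all:.2f}")
print(f"r2 without outlier:   {r2_without:.2f}")
```

With so few points, one lineage far from the line lowers r² substantially, which is why the text reports the statistic both with and without Cetorhinus.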
G. Implications of New Hypothesis
Perhaps the most striking implication of our presented hypothesis (Fig. 7) is that filter feeding has evolved twice within extant Lamniformes and that the similarities in jaw articulation seen between Megachasma and Cetorhinus (Maisey, 1985) are the consequence of convergence. This hypothesis (independent origins of filter feeding) is more in keeping with Compagno's (1990) view of lamniform phylogeny than it is with Maisey's (1985). However, our tree differs from Compagno's. Megachasma clusters with Odontaspis, Pseudocarcharias, and the Alopiidae in our presented tree (Fig. 7), whereas Megachasma is the sister group to a clade containing Lamnidae, Cetorhinus, and the Alopiidae in Compagno's tree (Fig. 1B).
Morphological data show support for some aspects of the presented hypothesis (Fig. 7). Cetorhinus and the Lamnidae are united by a number of derived traits [i.e., lunate tail, peduncle depressed with lateral keels, enlarged gill slits, presence of ectethmoid processes on the chondrocranium to limit jaw protrusion, and suborbital shelves with prominent lateral wings behind orbital notches (Compagno, 1990)]. The implied sister group relationship between Megachasma and Pseudocarcharias has less morphological support. However, possible synapomorphies include similar intestinal valve counts, dorsal fin shapes, and a diel vertical migration habit. Compagno (1990) has suggested that Megachasma may have evolved its distinctive feeding apparatus from odontaspidid-like features by jaw-size exaggeration, acquisition of papillose gill rakers, and modification of jaw protrusion mechanisms for suction feeding. Such a scenario is loosely consistent with the shared derived placement of Megachasma in a clade with Odontaspis, Pseudocarcharias, and Alopias.
The presented sequence data do not support a monophyletic Odontaspididae. Although O. noronhai has not yet been sampled, it is likely that Carcharias taurus and O. ferox do not form a monophyletic group. A number of authors have already suggested this (Compagno, 1990; Cappetta, 1987).
Indeed, Compagno (1990) presents the Odontaspididae as two basally adjacent paraphyletic lineages (Fig. 1B). We recommend a reexamination of morphological characteristics for the group. Based on molecular data and the past failure to identify compelling shared-derived features for Odontaspis and Carcharias, we anticipate the splitting of the Odontaspididae into two families to better reflect their phylogenetic distinctness.
Our new hypothesis asserts that the three species of thresher shark (Alopiidae), Pseudocarcharias, Megachasma, and Odontaspis constitute an ancient monophyletic assemblage. This grouping is found in 74% of bootstrap replicates, implying considerable reliability. However, supporting morphological evidence for this relationship has yet to be discovered. We note that this inference is at odds with prevailing views (Maisey, 1985; Compagno, 1990), which suggest a relatively recent shared ancestry between the Alopiidae and the Lamnidae with respect to Odontaspis, Pseudocarcharias, and Megachasma. Shared derived features linking the Alopiidae and the Lamnidae include an erect first dorsal fin, the partial extension of stiffening cartilaginous elements into the fin web (semiplesodic fins), and intestinal valve counts increasing to a range of 33-55 (Compagno, 1990). In addition, both the lamnids (Carey et al., 1985) and the alopiids exhibit adaptations for endothermy [A. superciliosus has a conspicuous vascular heat exchanger behind the eye (G. J. P. Naylor, personal observation)]. If the presented DNA-based hypothesis is correct and the two families do not constitute a monophyletic group (together with Cetorhinus), then these features are likely convergent.
The presented phylogram (Fig. 10B) is made up of long terminal branches connected by short internodal segments. This pattern suggests a history of early diversification followed by directional selection along divergent trajectories leading to highly autapomorphic extant taxa, a pattern reflected by both morphological and molecular characters. Such patterns of diversification are known to be problematic for phylogenetic analysis. Thus, although we stand by the proposed hypothesis (Fig. 7) as the best estimate of phylogeny for the data at hand, we acknowledge that no single fully resolved tree is decisively supported by the data and that the best-fitting phylogenetic hypothesis for the group may change as more data are collected.
A
lineage | age (MYBP) | stage | fossil representative | source
Mitsukurina | 124-112 | Aptian | Anomotodon principalis | Cappetta, 1975
Carcharias | 112-97 | Albian | C. striatula | Dalinkevicius, 1935
Odontaspis | 83-74 | Campanian | O. aculeata | Cappetta & Case, 1975
Alopias | 55 | Ypresian | A. crochardi | Ward, 1978
Megachasma | 42 | Bartonian | unpub. material | Ward Collection (pers. com.)
Lamna | 42-38 | Bartonian | unpub. material | Ward Collection (pers. com.)
Cetorhinus | 35-29 | Rupelian* | C. parvus | Leriche, 1908
Carcharodon | | Danian | Cretolamna | Applegate & Espinosa, 1996
Isurus | 55 | Ypresian | I. praecursor | Leriche, 1905
L. nasus | 5-15 | Pliocene | L. nasus | Herman, 1974
I. oxyrinchus | 29-25 | Arikareean | I. sp. A | Espinosa, 1987
B Phylograms: NEW HYPOTHESIS, COMPAGNO 1990, MAISEY 1985
C Plots of inferred branch length against first appearance (mybp)
FIGURE 10 A comparison of inferred branch length and first appearance times in the fossil record. (A) First appearance estimates in millions of years before present (MYBP) for lamniform lineages seen in the fossil record. The geological stage and the fossil taxon representing each lineage are also shown. Note that the first appearance time for the Carcharodon lineage is controversial. Whereas Purdy (1996) and Applegate and Espinosa-Arrubarrena (1996) suggest an early Paleocene origination for the lineage, other workers suggest a later origination for the lineage (D. Ward, personal communication). (B) Phylograms inferred for hypotheses when fitted to sequence data under an enforced molecular clock. Relative branch lengths are depicted directly on the phylograms. Taxon labels are as in Fig. 2. Letters in lowercase, bold type correspond to similarly labeled lineages in A. (C) Relationship between first appearance times (from A) and inferred branch length (from B). Letters in lowercase, bold type correspond to those in A and B. See text for details.
Acknowledgements
It has taken 10 years to acquire the tissue samples required for this project. We are grateful to all those who have either donated samples directly or assisted Gavin Naylor collecting samples in the field: John Stevens (Australia); M. Miya and K. Yano (Japan); Janine Caira, Jack Casey, José Castro, Don De Maria, Ken Goldman, Nancy Kohler, John Morissey, Lisa Natanson, Heidi Robeck, Wes Pratt, and Greg Skomal (North America); Geremy Cliff and the shark net inspection crews of the Natal Sharks Board (South Africa); Shoou-Jeng Joung, Che-Tsung Chen and "Mr. Chen" (at the Nan Fan Ao fish landing, Taiwan); and David Sims (United Kingdom). Most of the field work for this project was supported by NSF Grant BSR-87-08121 to G. Vermeij and G.J.P.N. The laboratory work was supported by a Sloan Postdoctoral Fellowship (to G.J.P.N.), NSF grant DEB-92-20640 to W. Brown, and the University of Nevada-Las Vegas (to A.P.M.). G.J.P.N. acknowledges access to laboratory equipment and chemicals generously made available by Tom Dowling to facilitate completion of the project while at Arizona State University. Thanks to Tom Kocher and a particularly attentive anonymous reviewer for critical comments which led to the improvement of the original manuscript.
References
Applegate, S. P., and Espinosa-Arrubarrena, L. 1996. The fossil history of Carcharodon and its possible ancestor, Cretolamna: A study in tooth identification. In "The Biology of the White Shark" (P. Klimley and Ainsley, eds.). Academic Press, New York.
Avise, J. C. 1994. "Molecular Markers, Natural History and Evolution." Chapman and Hall, New York.
Bremer, K. 1988. The limits of amino acid sequence data in angiosperm phylogenetic reconstruction. Evolution 42:795-803.
Cappetta, H. 1975. Sélaciens et Holocéphale du Gargasien de la région de Gargas (Vaucluse). Géol. Médit. 2(3):115-134.
Cappetta, H. 1987. Chondrichthyes. II. Mesozoic and Cenozoic Elasmobranchii. In "Handbook of Paleoichthyology," Vol. 3b. Gustav Fischer Verlag, Stuttgart.
Cappetta, H., and Case, G. R. 1975. Contribution à l'étude des sélaciens du groupe Monmouth (Campanien-Maestrichtien) du New Jersey. Palaeontographica Abt. A, (1-3) 1-46.
Cappetta, H., Duffin, C., and Zidek, J. 1993. Chondrichthyes. In "The Fossil Record 2" (M. J. Benton, ed.). Chapman and Hall, London.
Carey, F. G., Casey, J. G., Pratt, H. L., Urquhart, D., and McCosker, J. E. 1985. Temperature, heat production and heat exchange in lamnid sharks. Memoirs of the Southern California Academy of Sciences, Vol. 9, 24 May, pp. 92-108.
Collins, T. M., Wimberger, P. H., and Naylor, G. J. P. 1994. Compositional bias, character state bias, and character state reconstruction using parsimony. Syst. Biol. 43(4):482-496.
Compagno, L. J. V. 1984. "FAO Species Catalogue." Vol. 4, parts 1 and 2: Sharks of the World. An annotated and illustrated catalogue of shark species known to date. FAO Fish. Synop. 125.
Compagno, L. J. V. 1988. "Sharks of the Order Carcharhiniformes." Princeton University Press, Princeton, NJ.
Compagno, L. J. V. 1990. Relationships of the Megamouth shark, Megachasma pelagios (Lamniformes: Megachasmidae), with comments on its feeding habits. In "Elasmobranchs as Living Resources: Advances in the Biology, Ecology, Systematics, and the Status of the Fisheries" (H. L. Pratt Jr., S. H. Gruber, and T. Taniuchi, eds.), pp. 357-379. NOAA Technical Report 90.
Dalinkevicius, J. A. 1935. On the fossil fishes of the Lithuanian Chalk. I. Selachii. Mém. Fac. Sci. Univ. Vytautas le Grand IX:243-305.
De Salle, R., Freedman, T., Prager, E. M., and Wilson, A. C. 1987. Tempo and mode of sequence evolution in mitochondrial DNA of Hawaiian Drosophila. J. Mol. Evol. 26:157-164.
Eernisse, D. J. 1992. DNA translator and aligner: HyperCard utilities to aid phylogenetic analysis of molecules. Comput. Appl. Biosci. (CABIOS) 8:177-184.
Eitner, B. J. 1995. Systematics of the genus Alopias (Lamniformes: Alopiidae) with evidence for the existence of an unrecognized species. Copeia (3):562-571.
Espinosa-Arrubarrena, L. 1987. "Neogene Species of the Genus Isurus (Elasmobranchii, Lamnidae) in Southern California, U.S.A. and Baja California Sur, Mexico." Unpublished Masters thesis.
Farris, J. S. 1983. The logical basis of phylogenetic analysis. In "Advances in Cladistics" (N. I. Platnick and V. A. Funk, eds.), Vol. 2, pp. 1-36. Columbia Press, New York.
Farris, J. S., Kallersjo, M., Kluge, A. G., and Bult, C. 1995. Testing significance of incongruence. Cladistics 10:315-319.
Fechhelm, J. D., and McEachran, J. D. 1984. A revision of the electric ray genus Diplobatis with notes on the interrelationships of Narcinae (Chondrichthyes, Torpediniformes). Bull. Flor. State. Mus. Biol. Sci. 29:171-209.
Felsenstein, J. 1981. Evolutionary trees from DNA sequences: A maximum likelihood approach. J. Mol. Evol. 17:368-376.
Felsenstein, J. 1985. Confidence limits on phylogenies: An approach using the bootstrap. Evolution 39:783-791.
Friedlander, T. P., Reiger, J. C., and Mitter, C. 1994. Phylogenetic information content of five nuclear gene sequences in animals: Initial assessment of character sets from concordance and divergence studies. Syst. Biol. 43(4):511-525.
Gillespie, J. H. 1986a. Variability of evolutionary rates of DNA. Genetics 113:1077-1091.
Gillespie, J. H. 1986b. Natural selection and the molecular clock. Mol. Biol. Evol. 3:138-155.
Goldman, N., and Yang, Z. 1994. A codon-based model of nucleotide substitution for protein-coding DNA sequences. Mol. Biol. Evol. 11:725-736.
Graybeal, A. 1994. Evaluating the phylogenetic utility of genes: A search for genes informative about deep divergences among vertebrates. Syst. Biol. 43:174-193.
Hasegawa, M., Kishino, H., and Yano, T. 1985. Dating of the human-ape splitting by a molecular clock of mitochondrial DNA. J. Mol. Evol. 22:160-174.
Hasegawa, M., and Hashimoto, T. 1993. Ribosomal RNA trees misleading? Nature 361:23.
Herman, J. 1974. Quelques restes de Sélaciens récoltés dans les sables du Kattendijk à Kallo. I. Selachii-Euselachii. Bull. Soc. Belge Géol. 83(1):15-31.
Hillis, D. M., Bull, J. J., White, M. E., Badgett, M. R., and Molyneux, I. J. 1992. Experimental phylogenetics: Generation of a known phylogeny. Science 255:589.
Hillis, D. M., Huelsenbeck, J. P., and Cunningham, C. W. 1994. Application and accuracy of molecular phylogenies. Science 264:671-677.
Huelsenbeck, J. P., and Hillis, D. M. 1993. Success of phylogenetic methods in the four-taxon case. Syst. Biol. 42:247-264.
Huelsenbeck, J. P., Swofford, D. L., Cunningham, C. W., Bull, J. J., and Waddell, P. J. 1994. Is character weighting a panacea for the problem of data heterogeneity in phylogenetic analysis? Syst. Biol. 43:288-291.
Irwin, D. M., Kocher, T. D., and Wilson, A. C. 1991. Evolution of the cytochrome b gene. J. Mol. Evol. 32(2):128-144.
Jackson, J. B. C., and Cheetham, A. H. 1994. Phylogeny reconstruction and the tempo of speciation in cheilostome Bryozoa. Paleobiology 20:407-423.
Jukes, T. H., and Cantor, C. R. 1969. Evolution of protein molecules. In "Mammalian Protein Metabolism" (H. N. Munro, ed.), pp. 21-132. Academic Press, New York.
Kellog, E. A., and Birchler, J. A. 1993. Linking phylogeny and genetics: Zea mays as a tool for phylogenetic studies. Syst. Biol. 42(4):415-439.
Kimura, M. 1980. A simple method for estimating evolutionary rate of base substitutions through comparative studies of nucleotide sequences. J. Mol. Evol. 16:111-120.
Kimura, M. 1981. Estimation of evolutionary distances between homologous nucleotide sequences. Proc. Natl. Acad. Sci. USA 78:454-458.
Kimura, M. 1983. "The Neutral Theory of Molecular Evolution." Cambridge University Press, Cambridge.
Kocher, T. D., Conroy, J. A., McKaye, K. R., Stauffer, J. R., and Lockwood, S. F. 1995. Evolution of NADH dehydrogenase subunit 2 in East African cichlid fish. Mol. Phylog. Evol. 4:420-432.
Lake, J. A. 1994. Reconstructing evolutionary trees from DNA and protein sequences: Paralinear distances. Proc. Natl. Acad. Sci. USA 91:1455-1459.
Leriche, M. 1905. Les poissons éocènes de la Belgique. Mém. Musée Roy. Hist. Nat. Belgique 33:49-228.
Lockhart, P. J., Howe, C. J., Bryant, D. A., Beanland, T. J., and Larkum, W. D. 1992. Substitutional bias confounds inference of cyanelle origins from sequence data. J. Mol. Evol. 34:153-162.
Lockhart, P. J., Steel, M. A., Hendy, M. D., and Penny, D. 1994. Recovering evolutionary trees under a more general model of sequence evolution. Mol. Biol. Evol. 11:605-612.
Loomis, W. F., and Smith, D. W. 1990. Molecular phylogeny of Dictyostelium discoideum by protein sequence comparison. Proc. Natl. Acad. Sci. USA 87:9093-9097.
Lundberg, J. G. 1992. The phylogeny of Ictalurid catfishes: A synthesis of recent work. In "Systematics, Historical Ecology, and North American Freshwater Fishes" (R. L. Mayden, ed.), pp. 392-420. Stanford University Press, Palo Alto, CA.
Maisey, J. G. 1984. Higher elasmobranch phylogeny and biostratigraphy. Zool. J. Linn. Soc. 82:33-54.
Maisey, J. G. 1985. Relationships of the megamouth shark, Megachasma. Copeia 228-231.
Martin, A. P. 1993. Hammerhead shark origins. Nature 364:494.
Martin, A. P., Naylor, G. J. P., and Palumbi, S. R. 1992. Rates of mitochondrial DNA evolution in sharks are slow compared with mammals. Nature 357:153-155.
Norell, M. A., and Novacek, M. J. 1992. The fossil record and evolution: Comparing cladistic and paleontologic evidence for vertebrate history. Science 255:1690-1693.
Palumbi, S. R., Martin, A., Romano, S., McMillan, W. O., Stice, L., and Grabowski, G. 1991. "The Simple Fool's Guide to PCR," version 2. University of Hawaii Zoology Department, Honolulu, HI.
Penny, D., Hendy, M. D., Zimmer, E. A., and Hamby, R. K. 1990. Trees from sequences: Panacea or Pandora's box? Aust. Syst. Bot. 3:21-38.
Purdy, R. W. 1996. Paleoecology of fossil white sharks. In "The Biology of the White Shark" (P. Klimley and Ainsley, eds.). Academic Press, New York.
Sage, R. D., Atchley, W. R., and Campanna, E. 1993. House mice as models in systematic biology. Syst. Biol. 42(4):523-561.
Sanger, F., Nicklen, S., and Coulson, A. R. 1977. DNA sequencing with chain-terminating inhibitors. Proc. Natl. Acad. Sci. USA 74:5463-5467.
Sanger, F., Coulson, A. R., Barrell, B. G., Smith, A. J., and Roe, B. A. 1980. Cloning in single-stranded bacteriophage as an aid to rapid DNA sequencing. J. Mol. Biol. 143:161-178.
Sidow, A., and Wilson, A. C. 1990. Compositional statistics: An improvement of evolutionary parsimony and its application to deep branches in the tree of life. J. Mol. Evol. 31:51-68.
Sidow, A., and Wilson, A. C. 1991. Compositional statistics evaluated by computer simulations. In "Phylogenetic Analysis of DNA Sequences" (M. M. Miyamoto and J. Cracraft, eds.), pp. 129-146. Oxford University Press, New York.
Sober, E. 1988. "Reconstructing the Past: Parsimony, Evolution, and Inference." MIT Press, Cambridge, MA.
Steel, M. A. 1994. Recovering a tree from the leaf colorations it generates under a Markov model. Appl. Math. Lett. 7:19-23.
Steel, M. A., Lockhart, P. J., and Penny, D. 1993. Confidence in evolutionary trees from biological sequence data. Nature 364:440-442.
Swofford, D. L. 1996. "PAUP* 4.0: Phylogenetic Analysis Using Parsimony," version d42. Laboratory of Molecular Systematics, Smithsonian Institution, Washington, D.C.
Thompson, J. D., Higgins, D. G., and Gibson, T. J. 1994. CLUSTAL W: Improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position specific gap penalties and weight matrix choice. Nucleic Acids Res. 22:4673-4680.
Ward, D. J. 1978. Additions to the fish fauna of the English Palaeogene. 1. Two new species of Alopias (thresher shark) from the English Eocene. Terti. Res. 2:23-28.
218
GAVIN J. P. NAYLOR et aI. APPENDIX h
Collection Data for Specimens Sequenced Gene
Taxon
Alopias pelagicus Alopias pelagicus Alopias pelagicus Alopias pelagicus Alopias superciliosus Alopias superciliosus Alopias superciliosus Alopias superciliosus Alopias superciliosus Alopias vulpinus Alopias vulpinus Carcharodon carcharias Carcharodon carcharias Carcharodon carcharias Carcharodon carcharias Cetorhinus maximus Cetorhinus maximus Carcharhinus plumbeus Carcharhinus plumbeus Carcharhinus porosus Carcharhinus porosus Carcharias taurus Carcharias taurus Carcharias taurus Carcharias taurus Galeocerdo cuvier Galeocerdo cuvier Isurus oxyrinchus Isurus oxyrinchus Isurus oxyrinchus Isurus paucus Isurus paucus Isurus paucus Lamna ditropis Lamna ditropis Lamna nasus Mitsukurina owstoni Mitsukurina owstoni Megachasma pelagios Negaprion brevirostris Negaprion brevirostris Odontaspis ferox Prionace glauca Prionace glauca Pseudocarcharias kamoharai Pseudocarcharias kamoharai Sphyrna lewini Sphyrna lewini Sphyrna tiburo Sphyrna tiburo
Locality
Tissue sampled by
Suao, fish landing, Taiwan Suao, fish landing, Taiwan Taiwan Taiwan 37 ~ 13N 74 ~ 31W, off Virginia coast 35 ~ 31N 74 ~ 46W, off North Carolina coast Suao, fish landing, Taiwan Taiwan 28 ~ 10N 79 ~ 55W off Florida coast Bayshore, Long Island Japan South shore, Long Island Pacifica, California Farallon Islands, California Off Natal Coast, South Africa Barbican, Plymouth, UK Australia Off M a k a p u point, Oahu, Hawaii 35 ~ 25N 74 ~ 53W, off Cape Hatteras Manzanilla Bay, Trinidad Fish landing, Port of Spain, Trinidad 36 ~ 26N 75 ~ 41W, off North Carolina coast Steinhardt a q u a r i u m 31 ~ 29N 80 ~ 52W, off Georgia coast Off Natal Coast, South Africa 30 ~ 02N 80 ~ 38W, off Florida coast Off M a k a p u point, Oahu, Hawaii Ocean City, M a r y l a n d 35 ~ 31N 74 ~ 46W, off Virginia coast Point Pleasant, N e w Jersey Florida, Keys Off South Florida coast Off South Florida coast Off North California coast Japan Gulf of Maine Off coast of Tasmania Japan Japan 25 ~ 19N 81 ~ 56W, SW coast of Florida Off Cosgrove, Florida Keys South Cat Caye, Bahamas 220 miles S.S.W off Oahu, Hawaii 38 ~ 48N 72 ~ 58W, off N e w Jersey coast Suao, fish landing, Taiwan Suao, fish landing, Taiwan Hawaii 30 ~ 10N 80 ~ 11W, off Florida coast Manzanilla Bay, Trinidad Manzanilla Bay, Trinidad
G.J.P.N. G.J.P.N. C. Chen C. Chen G.J.P.N. G.J.P.N. G.J.P.N. C. Chen G.J.P.N. G.J.P.N. M. Miya G.J.P.N. Cal Academy K. Goldman G. Cliff D. Sims J. Stevens G.J.P.N. G.J.P.N. G.J.P.N. G.J.P.N. G.J.P.N. Cal Academy G.J.P.N. G. Cliff G.J.P.N. G.J.P.N. G.J.P.N. G.J.P.N. G.J.P.N. D. de Maria J.I. Castro J.I. Castro Cal Academy M. Miya G.J.P.N. J. Stevens M. Miya K. Yano, M. Miya G.J.P.N. D. de Maria J.I. Castro G.J.P.N. G.J.P.N. S.J. Joung S.J. Joung A.P.M. G.J.P.N. G.J.P.N. G.J.P.N.
ID a 954 925 AlsuTai81 Alpe 871 874 1001 AlsuTai 622 429 Alvu MM13 864 1060 1061 677 1058 CemaAJSI 556 912 495 477 916 1064 627 SA1 888 553 237 873 412
614 JIC#10 JIC#8 1062 Ladi MM12 633 1057 Miow JI Mepe
617 558 JIC#7 540 920 1033 1034 Sple 886 500 501
^a Assigned field number in G.J.P.N. or A.P.M. database.
NADH Cyto2 chrome b x x x m x x x m -x x x x x ~ x x ~ x m x x x ~ ~ x ~ x x ~ x x x x x x x x x ~ x x m x x x ~ x ~ x
m x x
x x x x
x x x x x x m x x x
x x
x x x x x x x x x x x
CHAPTER 14
Radiation of Characiform Fishes: Evidence from Mitochondrial and Nuclear DNA Sequences
Guillermo Orti
Department of Genetics, University of Georgia, Athens, Georgia 30602
I. Introduction
The order Characiformes includes ecologically and morphologically diverse fishes that live in rivers and lakes in Africa and the Neotropics. They are divided into 14 or 16 families (Géry, 1977, and Greenwood et al., 1966, respectively; see Table I), 4 of which are endemic to Africa (ca. 200 species), and the rest are found only in South America (more than 1200 species). They include well-known forms like "piranhas" and "tetras." The vast array of trophic specializations found among characiforms is comparable to that of cichlids. The range of feeding modes varies from detritivory (mud-eating prochilodontids and curimatids), herbivory (plant-eating citharinids and anostomids), and planktivory (plankton-filtering Anodus and Clupeocharax), to predation (Hepsetus, Hoplias, Hydrocynus, and Salminus), fin eating, and scale eating (some distichodontids and characids), and to voracious group predation (piranhas). Peculiar morphological and physiological adaptations permit the survival of some groups in extreme hypoxic conditions, typical of flood-plain environments (e.g., air-breathing in erythrinids and swollen lips of "pacus"). Almost unique among teleosts, "hatchetfishes" (gasteropelecids) have adaptations for "flight" up to several centimeters above water. Some groups exhibit parental care (erythrinids, Hepsetus, piranhas, and lebiasinids). Size range is also remarkable: the largest characiforms are predatory forms like Hydrocynus and Salminus, known to reach 130 and 100 cm in length and a weight of 38 and 17 kg, respectively (Géry, 1977; Sverlij and Espinach Ros, 1986). The smallest characiforms are known as "miniature" forms (Weitzman and Vari, 1988) whose adults do not exceed 26 mm standard length (e.g., "tetras," glandulocaudines, and lebiasinids). A few representative characiform taxa discussed in this chapter are illustrated in Fig. 1. The only comprehensive account of the order published to date is that of Géry (1977).
The restriction of characiforms to exclusively freshwater habitats links their distribution closely to the dynamics of geological history, qualifying them as an important model group for biogeographic studies (e.g., Myers, 1938, 1949).
The establishment of phylogenetic relationships among characiform lineages has been difficult and controversial, and so far only addressed using morphological characters (e.g., Weitzman and Fink, 1983; Buckup, 1991; Lucena, 1993; Uj, 1990; Vari, 1995). None of these studies included representatives from all extant families and considerable disagreement exists among them (compare hypotheses A-D in Fig. 2). Although relationships among the African groups and between these and the Neotropical taxa remain equivocal, all studies suggest that neither African nor Neotropical taxa form reciprocally monophyletic groups (Fig. 2). Based on this information, Lundberg (1993)
Copyright © 1997 by Academic Press. All rights of reproduction in any form reserved.
TABLE I Characiform Classifications, with Some Common Names of Representative Species^a
Greenwood et al. (1966) | Géry (1977)
1. Characidae (includes "African Characidae,"* which equal Géry's Alestidae): African and American tetras, piranhas, bloodfins, silver dollars, dientudos, etc. | Alestidae,* Crenuchidae, Characidiidae, Characidae, Serrasalmidae
2. Erythrinidae: trahiras | Erythrinidae
3. Ctenoluciidae: South American pike-like characiforms | Ctenoluciidae
4. Hepsetidae*: African pike-like characiforms | Hepsetidae*
5. Cynodontidae | Raphiodontinae (within Characidae)
6. Lebiasinidae: pencilfishes and pyrrhulins | Lebiasinidae
7. Parodontidae | Parodontinae (within Hemiodidae)
8. Gasteropelecidae: hatchetfishes | Gasteropelecidae
9. Prochilodontidae: bocachicos, curimbatas, sabalos | Prochilodinae (within Curimatidae)
10. Curimatidae: curimatas | Curimatidae (including Prochilodinae, Chilodinae, and Anodinae)
11. Anostomidae: headstanders and leporins | Anostomidae
12. Hemiodontidae | Hemiodidae (including Parodontinae)
13. Chilodontidae: headstanders | Chilodinae (within Curimatidae)
14. Distichodontidae* | Distichodinae* (within Citharinidae*)
15. Citharinidae* | Citharinidae* (including Distichodinae* and Ichthyborinae*)
16. Ichthyboridae*: scale eaters | Ichthyborinae* (within Citharinidae*)
^a For the present study, the scheme of Greenwood et al. (1966) is adopted, with the following changes: (1) the family Cynodontidae is included as a tribe in the family Characidae, as suggested by Howes (1976); (2) the family Ichthyboridae is included as a subfamily of Distichodontidae, following Vari (1979); and (3) the characid subfamilies Characidiinae and Crenuchinae are grouped in the family Crenuchidae (Buckup, 1991). As a consequence, 15 characiform families are recognized (see the Appendix). Asterisks denote African taxa.
FIGURE 1 Representative characiform taxa. (Top row) African species: Citharinus (Citharinidae), Hepsetus (Hepsetidae), and Alestes (Characidae, Alestinae). (Lower two rows) Neotropical species: Acestrorhynchus (Characidae, Characinae), Gymnocorymbus (Characidae, Tetragonopterinae), Hoplias (Erythrinidae), Pygopristis (Characidae, Serrasalminae), Carnegiella (Gasteropelecidae), and Ctenolucius (Ctenoluciidae). Drawings are not to scale.
suggested that the origin of most lineages of Characiformes must have preceded the breakup of Gondwanaland. As a consequence, their current distribution can only be explained by a remarkably disproportionate extinction of characiform lineages in Africa (barring unlikely marine dispersal events). The African and South American continents drifted apart approximately 84-106 million years ago (Pitman et al., 1993; Parrish, 1993), thus setting a time frame for the major cladogenetic events among Characiformes.
This chapter summarizes molecular phylogenetic studies of characiform fishes, based on mitochondrial and nuclear DNA sequence data (Orti, 1995; Orti and Meyer, 1996; Orti et al., 1996; Orti and Meyer, 1997). Mitochondrial ribosomal genes (fragments of the 12S and 16S subunits) were surveyed at different levels of taxonomic inclusiveness to investigate the dynamics of nucleotide substitution. Knowledge of the evolutionary behavior of molecular characters is essential to inform phylogenetic analyses and to assess the validity and the limits of resolution of the data. This chapter compares data sets comprising increasing levels of taxonomic divergence: (1) among closely related species and genera of "piranhas" and "pacus" within the subfamily Serrasalminae (family Characidae); (2) among subfamilies within the family Characidae; (3) among families of Characiformes; and (4) among orders of Ostariophysans. The first part of the chapter presents phylogenetic hypotheses derived from each data set and discusses their validity based on comparative analyses of base substitution patterns. The second part presents phylogenies obtained from analysis of a nuclear protein-coding gene: ependymin. This gene codes for a major glycoprotein component of the extracellular fluid in the brain of fishes (reviewed in Shashoua, 1991; Hoffmann, 1994). DNA sequences from 13 representative characiform taxa and several outgroups are analyzed to gauge the utility of this nuclear marker for characiform systematics. Finally, a short discussion of the systematic and biogeographic implications of the phylogenetic results is included.
FIGURE 2 Phylogenetic hypothesis among characiform and basal euteleost taxa, based on morphological evidence. (A) Family Characidae and four outgroups, modified from Lucena (1993); (B) order Characiformes, after Uj (1990); (C) order Characiformes (in part), modified from Buckup (1991); (D) four characiform family assemblage (see outgroup taxa in A), after Vari (1995); and (E) superorder Ostariophysi and Protacanthopterygii, after Fink and Fink (1981) and Nelson (1994). African characiform taxa are enclosed in black boxes.
II. Materials and Methods
Fish tissues preserved in 70% ethanol were obtained from natural populations and commercial sources. The Appendix contains the taxonomic arrangement and GenBank (and Museum) accession numbers for specimens and DNA sequences used in this study. Orti (1995), Orti and Meyer (1996), Orti et al. (1996), and Orti and Meyer (1997) should be consulted for more information on laboratory and analytical procedures, which will only be outlined here. Briefly, genomic DNA was extracted from muscle tissue by proteinase K digestion and phenolic purification, and fragments of the mitochondrial rRNA genes were amplified by the polymerase chain reaction (PCR) using universal primers (Saiki et al., 1988; Kocher et al., 1989; Palumbi et al., 1991). Direct sequencing by the dideoxy method (Sanger et al., 1977) was performed on single-stranded DNA from asymmetric PCR amplifications (Gyllensten and Erlich, 1988). Fragments approximately 345 and 530 bp long were sequenced from the 12S and 16S rRNA genes, respectively. Clustal W (Thompson et al., 1994) was used to align these sequences simultaneously (both fragments combined in a single file), and alignments were refined using a piranha rRNA secondary structure model (Orti et al., 1996). To study nucleotide substitution patterns, stems (complementary/paired sequences) and loops (unpaired) were identified for all sequences based on the structural model. Alignment-ambiguous sites (Gatesy et al., 1994) were determined to assess the effects of the alignment algorithm on phylogenetic inferences. They were identified by performing additional alignments with different gap:change ratios (opening/extending gap cost values used in Clustal W were 10/5, 5/4, and 20/8, while keeping substitution cost = 1). A site was considered ambiguous when gap assignments were unstable, differing among alignments performed with alternative parameter values.
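The stability criterion for alignment-ambiguous sites can be illustrated with a minimal Python sketch. It assumes that each alignment is held as a dictionary of equal-length gapped strings and that a column of the reference alignment counts as stable only if exactly the same residues are placed together in every alternative alignment. The function names and the toy sequences are hypothetical; the alignments themselves would come from separate Clustal W runs, which are not reproduced here.

```python
def column_signatures(alignment):
    """Describe each column by the residue index of every sequence (None for a gap)."""
    names = sorted(alignment)
    length = len(alignment[names[0]])
    next_index = {name: 0 for name in names}
    columns = []
    for i in range(length):
        signature = []
        for name in names:
            if alignment[name][i] == "-":
                signature.append(None)
            else:
                signature.append(next_index[name])
                next_index[name] += 1
        columns.append(tuple(signature))
    return columns


def ambiguous_columns(reference, alternatives):
    """Return reference column indices that are not reproduced in every alternative."""
    reference_columns = column_signatures(reference)
    alternative_sets = [set(column_signatures(alt)) for alt in alternatives]
    return [
        i
        for i, signature in enumerate(reference_columns)
        if any(signature not in columns for columns in alternative_sets)
    ]


# Toy example: the same two sequences aligned under two hypothetical gap costs.
reference = {"taxonA": "ACGT-A", "taxonB": "AC-TTA"}
alternative = {"taxonA": "ACG-TA", "taxonB": "AC-TTA"}
print(ambiguous_columns(reference, [alternative]))  # [3, 4]
```

Columns flagged in this way could then be excluded before tree searches, in the spirit of the "alignment-ambiguous sites excluded" analyses reported in this chapter.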
Inferred nucleotide substitution patterns and compensatory mutations in stems are described for three data sets, representing the taxonomic divergence sampled in this study. Nucleotide substitutions among sequences were estimated by tracing inferred changes on the best tree, rather than by pairwise comparisons (Fitch and Markowitz, 1970; but see Collins et al., 1994). This approach provides an estimate of the changes that occurred across the phylogeny, under the assumption of maximum parsimony (Swofford and Maddison, 1992). Average changes of character state were computed using MacClade (Maddison and Maddison, 1992). Based on secondary structure models, nucleotide composition and substitution rates for stem and loop regions were calculated separately, following the method outlined in Vawter and Brown (1993).
The ependymin gene was partially sequenced using the same approach as described earlier except that cDNA from brain tissue, rather than total genomic DNA, was used as a template for PCR. Translationally active RNA was extracted from freshly obtained fish brains and reverse transcribed to cDNA with an oligo d(T)16 primer using the GeneAmp RNA PCR kit (Perkin Elmer). Primers for PCR were based on published ependymin sequences for cypriniforms, salmoniforms and the herring (e.g., Müller-Schmid et al., 1993). Using cDNA rather than genomic DNA as a template for PCR presented two advantages: (1) increased specificity (because of higher relative frequency of target DNA to total DNA during amplification) and (2) only coding regions were amplified for sequencing (avoiding intron sequences which are presumably uninformative for this level of inquiry).
Phylogenetic analyses were based on parsimony, maximum likelihood (ML; Felsenstein, 1981), and distance-matrix methods, using PAUP (Swofford, 1993), fastDNAml (Olsen et al., 1994), MOLPHY (Adachi and Hasegawa, 1994), and MEGA (Kumar et al., 1993), respectively. Bootstrap analysis (Felsenstein, 1985), maximum likelihood ratio tests (Kishino and Hasegawa, 1989), successive approximations by character reweighting (Farris, 1969; Carpenter, 1988), and decay analysis (Bremer, 1988) were used to assess confidence in the resulting topologies and to test alternative hypotheses.
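Counting changes on a tree rather than between pairs of sequences can be sketched with the classical Fitch parsimony pass. The example below is only an illustration under stated assumptions: the tree is a rooted binary tree written as nested tuples, the taxon names are hypothetical, and the function returns the minimum number of changes per site. It is not a reimplementation of MacClade, whose averaging over equally parsimonious reconstructions is not attempted here.

```python
def fitch_changes(tree, states):
    """Minimum number of state changes for one site on a rooted binary tree.

    tree   -- nested 2-tuples with leaf names (strings) at the tips
    states -- dict mapping each leaf name to its nucleotide at this site
    """
    def walk(node):
        if isinstance(node, str):
            return {states[node]}, 0
        left_set, left_cost = walk(node[0])
        right_set, right_cost = walk(node[1])
        shared = left_set & right_set
        if shared:
            # Children agree on at least one state: no extra change needed.
            return shared, left_cost + right_cost
        # Children disagree: one change is required on this part of the tree.
        return left_set | right_set, left_cost + right_cost + 1

    return walk(tree)[1]


def changes_per_site(tree, alignment):
    """Apply fitch_changes to every column of an ungapped alignment."""
    length = len(next(iter(alignment.values())))
    return [
        fitch_changes(tree, {name: seq[i] for name, seq in alignment.items()})
        for i in range(length)
    ]


# Hypothetical four-taxon example.
tree = (("a", "b"), ("c", "d"))
alignment = {"a": "ACA", "b": "ACG", "c": "ATA", "d": "ATG"}
print(changes_per_site(tree, alignment))  # [0, 1, 2]
```

Summing such counts within stems and within loops, and dividing by the number of sites in each category, gives the kind of "changes per site" quantity discussed later in the chapter.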
III. Results and Discussion
A. Evidence from Mitochondrial DNA Sequences
1. Mitochondrial Data Sets
Analyses of nucleotide variation were based on three data sets. A fourth data set, with representatives of all characiform families (38 taxa), was used only for phylogenetic analysis. The "serrasalmin" data set (Orti et al., 1996) contained 33 taxa, representing most genera in the subfamily (except Pygopristis, Ossubtus, and Utiaritichthys), and three characid outgroups. Several species per genus and more than one individual per species were included to give a good representation of closely related individuals. The "characiform" data set contained 27 taxa, mostly within the family Characidae, and followed Lucena (1993) in the selection of taxa (see Fig. 2A). It represents an intermediate level of taxonomic diversity. The "ostariophysan" data set included 22 taxa from all five orders of ostariophysans, representing the most divergent set of taxa sampled in this study. Table II contains a summary of the three data sets and also shows the amount of variation and the number of phylogenetically informative sites in stem and loop regions of the 12S and 16S fragments. Among the 867 to 887 aligned nucleotide sites, variation was detected in 261 sites among serrasalmin sequences and up to 425 sites among ostariophysans. The aligned DNA sequences for all data sets are presented in Orti (1995).
TABLE II Summary of Three Mitochondrial DNA Data Sets (12S and 16S rRNA Genes)^a
Data | Total No. of aligned sites | No. of variable sites | No. of phylogenetically informative sites
12S stems | 163-155-155 | 19-47-51 | 15-31-35
12S loops | 173-130-142 | 62-70-81 | 46-53-62
12S all | 348-319-325 | 84-123-142 | 63-89-104
16S stems | 216-174-171 | 41-49-49 | 18-31-33
16S loops | 245-273-258 | 111-157-160 | 66-127-129
16S all | 539-548-545 | 177-270-283 | 101-191-189
Total (12S + 16S) | 887-867-870 | 261-393-425 | 164-280-293
Alignment-ambiguous sites excluded | ND-729-747 | ND-270-320 | ND-169-200
^a For each data set, variable and phylogenetically informative sites are given. In each column, values for the serrasalmin data set (33 taxa) are followed by values for characiform (27 taxa) and ostariophysan (22 taxa) data sets. Structural categories are based on the models presented in Figs. 5 and 6. ND, no data.
2. Compensatory Mutations
The number of variable sites observed in each structural category and the transition/transversion ratios for the three data sets are plotted in Fig. 3. Transition/transversion ratios were considerably higher in stems than in loops for both genes, presumably reflecting structural constraints (Vawter and Brown, 1993). Because the G-U pair has a stable conformation in stems, most transitions do not disrupt pairing (e.g., G-U may change to G-C or A-U by a single transition, without disrupting the stem structure). Strong selection for maintaining secondary structure was also suggested by a high proportion of compensatory changes. Most base changes in stems did not break base-pairing interactions: there were significantly more compensatory mutations than expected by chance alone (cf. Dixon and Hillis, 1993). For example, for the characiform data set, a total of 192 out of 243 substitutions in stems did not disrupt base pairing (and of 243, only 50 nondisrupting mutations are expected by chance alone). The percentage of all potential compensatory mutations observed was close to 70% for the three data sets, but as the level of divergence among sequences increased, a decrease in the frequency of single substitutions in stems was observed (i.e., a nucleotide changing in one strand only). For the serrasalmin data set 61% of all changes in stems were single substitutions, but this value dropped to 36% among ostariophysan sequences. Therefore, increasing sequence divergence was paralleled by a higher frequency of double substitutions in stems (i.e., a nucleotide change accompanied by a change in the corresponding base on the opposite strand), and with an increase in the proportion of transversions (Fig. 3). These observations suggest that although the fraction of mutations in stems resulting in compensatory changes is not affected by the level of divergence among sequences, the kinds of substitutions involved (single or double) are different.
FIGURE 3 Number of variable sites (top) and transition/transversion ratios (bottom) for changes reconstructed on the most parsimonious tree for each of three mitochondrial DNA data sets (12S and 16S). Values for stem and loop regions for each gene fragment are shown separately.
3. Nucleotide Variation and Saturation
An increase in the number of variable sites and a drop in transition/transversion ratios were both correlated with increasing taxonomic diversity (Fig. 3). However, differences in these two parameters between the characiform and the ostariophysan data sets were relatively small compared to differences between the characiform and the serrasalmin data sets. The same pattern can be observed in a comparison of the number of changes per site (Fig. 4, right). The small differences between the characiform and the ostariophysan data sets in both of these parameters suggest that nucleotide changes may have reached saturation beyond the interfamilial level (especially in loop regions, where the reconstructed number of changes per site is about 2.5; Fig. 4). Among genes and structural categories, the 16S and 12S fragments were overall equally variable (when the number of variable sites was corrected for category size), but loop regions (in both genes) were more variable than stems. The pattern of relative frequency of change per category was not affected by the level of taxonomic diversity (Fig. 4, left).
A dramatic view of why saturation at the nucleotide level might occur among sequence comparisons beyond the interfamilial level can be seen by identifying the variable sites in the secondary structural models (Figs. 5 and 6). Structural constraints limit variation to well-defined regions of the molecules (shown by lower-case letters in Figs. 5 and 6). These regions already become apparent even among closely related serrasalmin sequences, and only a few more sites are seen to vary when taxonomically more distant sequences are compared. Almost identical profiles in the sliding-window analyses of the characiform and ostariophysan data sets suggest that the variation recorded among characiform families might be close to the maximum "allowed" by structural constraints. Indeed, the most divergent characiform sequences were 21.3% different (Gnathocharax and Hoplias), nearly the same as the maximum sequence divergence observed between otophysan orders (21.9% between Crossostoma and Boulengerella), and only slightly lower than the maximum sequence divergence (24%) between gonorhynchiforms and characiforms (Fig. 7, bottom). In contrast, the more rapidly evolving ependymin gene does not exhibit saturation and its sequence divergence can be seen to increase sharply along the same taxonomic axis (Fig. 7, top). Clearly, a greater insight into the pattern of character evolution can be gained by using more than a single molecular marker (see Section III, B).
FIGURE 4 Distribution of variation among structural categories in the 12S and 16S sequences. From top to bottom: changes among genera within the subfamily Serrasalminae (33 taxa), among families within the order Characiformes (27 taxa), and among orders within the superorder Ostariophysi (22 taxa). Histograms on the left show the relative frequency of change in the different genes and in stems and loops (number of changes observed in the category divided by total number of changes). Histograms on the right show the amount of change per site in each category (number of changes in category divided by category size). All changes were reconstructed on the most parsimonious trees with MacClade.
FIGURE 5 Secondary structure proposed for the 12S fragment of Pygocentrus nattereri (from Orti et al., 1996; reproduced with permission). All sequences presented in this study could be easily adapted to this model. The three panels under the model show a sliding window analysis of variation (window size = 7, overlap = 1) for the serrasalmin, characiform, and ostariophysan data sets (from top to bottom, respectively). The number of variable sites in the window (Y axis) is plotted at the position along the sequence (X axis). Lowercase letters indicate regions where variation among sequences is greatest.
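The uncorrected percentage differences and the transition/transversion tallies referred to in this section can be illustrated with a short Python sketch. It works on pairs of aligned sequences, which is a simplification: the ratios plotted in Fig. 3 were reconstructed on trees, not from pairwise comparisons. The function name and the toy sequences are hypothetical, and columns containing gaps or ambiguity codes are simply skipped (an assumption of this sketch).

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def compare_pair(seq1, seq2):
    """Return (percent difference, transitions, transversions) for two aligned sequences."""
    transitions = transversions = compared = 0
    for x, y in zip(seq1.upper(), seq2.upper()):
        if x not in "ACGT" or y not in "ACGT":
            continue  # skip gaps and ambiguity codes
        compared += 1
        if x == y:
            continue
        if {x, y} <= PURINES or {x, y} <= PYRIMIDINES:
            transitions += 1  # purine <-> purine or pyrimidine <-> pyrimidine
        else:
            transversions += 1  # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    percent = 100.0 * (transitions + transversions) / compared
    return percent, transitions, transversions


# Toy example: 9 comparable sites, 2 transitions and 1 transversion.
percent, ts, tv = compare_pair("ACGTACGT-A", "ATGTGCGA-A")
print(round(percent, 2), ts, tv)  # 33.33 2 1
```

When divergence is plotted against taxonomic level, a plateau in such uncorrected differences together with a falling transition/transversion ratio is the pattern interpreted above as saturation.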
FIGURE 6 Secondary structure proposed for the 16S fragment of Pygocentrus nattereri and sliding window analysis of variation for three data sets of increasing taxonomic divergence from top to bottom (see legend to Fig. 5).
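The sliding-window profiles of Figs. 5 and 6 can be approximated with the following Python sketch. It assumes that "window size = 7, overlap = 1" means a seven-site window advanced one site at a time, which is an interpretation of the legend and not a documented setting. A site is treated as variable when more than one nucleotide occurs in its column, gaps being ignored. All names and the toy alignment are hypothetical.

```python
def variable_sites(alignment):
    """Return one boolean per column: True if more than one nucleotide is present."""
    flags = []
    for column in zip(*alignment.values()):
        bases = {base for base in column if base in "ACGTU"}
        flags.append(len(bases) > 1)
    return flags


def sliding_window_counts(flags, window=7, step=1):
    """Count variable sites in each window along the alignment."""
    return [
        sum(flags[start:start + window])
        for start in range(0, len(flags) - window + 1, step)
    ]


# Toy alignment of three sequences, 10 sites long.
alignment = {
    "seq1": "AAAAAAAAAA",
    "seq2": "AAACAAAAAG",
    "seq3": "AAATAAAA-A",
}
flags = variable_sites(alignment)
print(sliding_window_counts(flags))  # [1, 1, 1, 2]
```

Plotting the resulting counts against window position for each data set would give profiles comparable in kind to the three panels under each structural model.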
4. Phylogenetic Relationships among Serrasalmin Genera
All three methods of phylogenetic inference used resulted in highly resolved, largely congruent topologies for this data set. The best maximum likelihood tree obtained using fastDNAml (Olsen et al., 1994) with default parameters, global rearrangements and the jumble option (five times out of five runs) is shown in Fig. 8B. The transition/transversion ratio used was 2.0,
FIGURE 7 Average and range percentage DNA sequence difference (uncorrected) for pairwise comparisons between taxa, for ependymin (top) and for 12S and 16S (bottom). Serra, only comparisons among genera within the Serrasalminae (12S and 16S) or comparisons among closely related species in the family Characidae (ependymin); Chara, only comparisons among taxa from different characiform families; Otoph, only comparisons among taxa from different otophysan orders (Characiformes, Gymnotiformes, Siluriformes, and Cypriniformes); and Ostario, only comparisons between gonorhynchiforms and taxa from all otophysan orders (12S and 16S) or comparisons between salmoniforms and otophysans (ependymin). For ependymin sequences, the most and least divergent pairs of taxa in each taxonomic assemblage are indicated by letters: (a) Alestes-Phenacogrammus (subfamily Alestinae), (b) Paracheirodon-Gymnocorymbus (in subfamilies Cheirodontinae and Tetragonopterinae, respectively), (c) Phenacogrammus-Distichodus, (d) Metynnis-Nannobrycon, (e) Cyprinus-Schilbe, (f) Paracheirodon-Pimelodus, (g) Schilbe-Salmo, (h) Boulengerella-Esox. For the mitochondrial DNA data sets, the most and least divergent pairs are indicated by numbers: (1) Acnodon-Metynnis, (2) Myleus-Mylesinus, (3) Gnathocharax-Hoplias, (4) Prochilodus-Cyphocharax, (5) Crossostoma-Boulengerella, (6) Hypostomus-Distichodus, (7) Kneria-Nannostomus, and (8) Kneria-Citharinus.
but a value of 4.0 gave identical results. 12S and 16S data fit this tree significantly better than they fit the topology suggested by Machado-Allison (1982, 1983), according to the method of Kishino and Hasegawa (1989). The difference in ln likelihood (± SE) between topologies is -119 ± 26. Neighbor-joining analysis (Saitou and Nei, 1987) using Kimura's (1981) distances placed Catoprion as the sister group of the Pygocentrus + Serrasalmus + Pristobrycon clade instead of with P. striolatus, as shown in Fig. 8B. All other branches were as shown, with high bootstrap support for most nodes (values above branches in Fig. 8B). Parsimony analysis resulted in three equally short trees (L = 642, CI = 0.51 excluding uninformative characters), when all characters were uniformly weighted. These trees differed from each other only in the placement of the Piaractus
FIGURE 8 (A) Serrasalmin phylogeny based on morphology (Machado-Allison, 1982, 1983) showing the "a" and "b" clades; the broken line represents uncertainty on the position of Myleus. (B) Phylogenetic tree obtained by maximum likelihood analysis of 12S and 16S mitochondrial DNA sequences. Bootstrap values for neighbor-joining and parsimony analyses are shown above and below each branch, respectively. Branches without numbers received bootstrap values <50 or were not supported by neighbor joining or parsimony.
species, but differed from the one shown in Fig. 8B by (1) the position of Acnodon as the sister group of the "Myleus" clade, (2) some relationships within the "Myleus" clade, and (3) the reversed positions of Metynnis and the Catoprion + Pristobrycon striolatus clades. However, the 50% majority-rule consensus tree resulting from 100 bootstrap replications (with heuristic search with three replications using random addition of taxa) is congruent with the tree shown in Fig. 8B. Resulting bootstrap values are also shown in Fig. 8B (below the branches). A posteriori reweighting based on the RC index (Farris, 1969) of the uniformly weighted data set also resulted in three trees that differed by the positions of the Piaractus species (with respect to each other), but in which Acnodon is placed as shown in Fig. 8B. Weighting schemes in which transversions were counted two and four times as much as transitions also gave congruent results and bootstrap values similar to those shown in Fig. 8B. The topology shown in Fig. 8A required 705 steps (63 extra steps) when enforced on the molecular data set.
When only stem characters (33 phylogenetically informative sites) were used, a total of 842 equally parsimonious trees were recovered, a strict consensus of which still showed serrasalmin monophyly, and the basal Piaractus + Colossoma + Mylossoma clade. The small number of informative characters in stem regions is clearly not sufficient to resolve relationships among most taxa, but is still able to recover the most basal nodes of the tree shown in Fig. 8B. This suggests that the more slowly evolving stem sites did not accumulate enough phylogenetic information for estimating recent divergences. Because most of the resolution is provided by the loop and "other" sites in the rRNA sequences, downweighting stem changes because of nonindependence is irrelevant in the context of the present phylogenetic analysis. Loop sites alone (418 sites) result in 18 most parsimonious trees, a strict consensus of which has most of the structure shown in Fig. 8B. Loop changes alone, however, fail to recover the monophyly of the Piaractus + Colossoma + Mylossoma clade and result in a trichotomous resolution among Acnodon, the "Myleus" clade, and the "piranha" clade.
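The comparison of stem and loop characters rests on counting phylogenetically informative sites within each structural category. A minimal Python sketch of that tally is given below. It uses the usual parsimony definition (at least two states, each present in at least two sequences) and assumes a hypothetical per-site mask string in which "S" marks stem positions and "L" marks loop positions. The mask, the function names, and the toy sequences are illustrative only and are not taken from the secondary structure models of Figs. 5 and 6.

```python
from collections import Counter


def is_informative(column):
    """True if at least two nucleotides each occur in at least two sequences."""
    counts = Counter(base for base in column if base in "ACGTU")
    return sum(1 for n in counts.values() if n >= 2) >= 2


def informative_by_partition(alignment, partition):
    """Tally parsimony-informative sites per structural category.

    partition -- string with one label per alignment column (e.g., 'S' or 'L')
    """
    tallies = Counter()
    for label, column in zip(partition, zip(*alignment.values())):
        if is_informative(column):
            tallies[label] += 1
    return dict(tallies)


# Toy example: five sites, the first two assigned to a stem, the rest to a loop.
alignment = {
    "t1": "AACGA",
    "t2": "AATGA",
    "t3": "GATCA",
    "t4": "GACCT",
}
print(informative_by_partition(alignment, "SSLLL"))  # {'S': 1, 'L': 2}
```

Restricting a tree search to the columns of one category, as in the stem-only and loop-only analyses above, amounts to filtering the alignment with the same mask.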
A combination of the slowly evolving stem regions and the fast evolving loop regions results in the complete resolution of relationships, as shown in Fig. 8B. The 12S and 16S molecular data set provides a robust estimate of serrasalmin relationships (Fig. 8B) that does not support a previous hypothesis based on morphology (Fig. 8A), but that agrees with cytogenetic and parasitological data (Orti et al., 1996). Division of the subfamily Serrasalminae into two groups (lineages a and b, Fig. 8A) is not supported by mitochondrial DNA sequences. Instead, three major groups are defined (the "pacu," "myleus," and "piranha" clades). The earliest divergence within the serrasalmins gave rise to the well-defined group of plant-eating fishes (Piaractus, Colossoma, and Mylossoma). The genus Acnodon forms the sister group to the other two clades. The genera Myleus, Mylesinus, and "N. gen. A" form a group that receives good support from molecular data. The third group contains the "true piranhas," predominantly carnivorous fish with one row of sharp tricuspid teeth on each jaw (Serrasalmus, Pygocentrus, Pristobrycon, Catoprion, and Pygopristis, the last missing from this study), and the plant-eating Metynnis. Relationships among the piranha genera in this clade remain somewhat unresolved because relatively few informative characters are available. A molecular marker with a higher rate of evolution than the mitochondrial rRNA genes might be more appropriate in investigating this problem further.
5. Phylogenetic Relationships among Characiform Families
Two data sets were analyzed to assess relationships at this level. The "characiform" data set with 27 taxa, mostly from the large family Characidae, was originally conceived as a comparison to the phylogenetic results of Lucena (1993). The second, 38-taxon data set includes representatives from all 15 characiform families. For both of them, the clade formed by the African characiform families Citharinidae and Distichodontidae (Vari, 1979) was used as an outgroup. Morphological (Fink and Fink, 1981) and molecular evidence (see later) suggests that this clade is the putative sister group to all other characiforms. Alignment-ambiguous sites (Gatesy et al., 1994) were detected in loop regions and comprised 138 and 140 sites for the characiform 27 and 38 taxon data sets, respectively. Phylogenetic analyses including and excluding these sites are presented first for the 27 taxon data set. Parsimony analysis, using all sites with equal weight and treating alignment gaps as "missing," resulted in three equally shortest trees (L = 1564, CI = 0.34; RI = 0.36). With a posteriori reweighting (maximizing the RC index), one of the shortest trees was obtained (Fig. 9). However, bootstrap analysis only supported eight clades with values above 50: (1) Nannostomus + Pyrrhulina (family Lebiasinidae), (2) Hoplias + Hepsetus, (3) Boulengerella + Ctenolucius (family Ctenoluciidae), (4) Phenacogrammus + Hydrocynus + Alestes (African subfamily Alestinae) + Acestrorhynchus, (5) Poptella + Oligosarcus + Astyanax, (6) Brycon + Salminus (subfamily Bryconinae, in part), (7) Pygocentrus + Colossoma (subfamily Serrasalminae), and (8) Citharinus + Distichodus. Neighbor-joining analyses also supported groups 1-8 with high bootstrap values, but relationships among components differed from those obtained with parsimony and were not supported by bootstrap analysis (Fig. 9).
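The bootstrap values cited above are percentages of pseudoreplicate data sets whose trees contain a given clade. The following is a minimal sketch of the two bookkeeping steps, resampling alignment columns with replacement and tallying clade frequencies. Tree inference itself is omitted; the sequences and the per-replicate clade lists are invented, and the function names are hypothetical.

```python
import random
from collections import Counter


def bootstrap_columns(alignment, rng):
    """Build one pseudoreplicate by resampling columns with replacement."""
    n_sites = len(next(iter(alignment.values())))
    picks = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {name: "".join(seq[i] for i in picks) for name, seq in alignment.items()}


def clade_support(replicate_clades):
    """Percentage of pseudoreplicate trees that contain each clade."""
    counts = Counter(c for clades in replicate_clades for c in set(clades))
    n = len(replicate_clades)
    return {c: 100.0 * k / n for c, k in counts.items()}


# Toy alignment (invented sequences, real genus names used only as labels).
alignment = {
    "Hoplias": "ACGTAC",
    "Hepsetus": "ACGTTC",
    "Brycon": "TCGAAC",
    "Salminus": "TCGAAT",
}
rep = bootstrap_columns(alignment, random.Random(1))

# Assumption: clades found in four hypothetical pseudoreplicate trees.
ab = frozenset({"Hoplias", "Hepsetus"})
cd = frozenset({"Hepsetus", "Brycon"})
replicates = [{ab}, {ab}, {cd}, {ab}]
support = clade_support(replicates)
print(support[ab], support[cd])
```

In a real analysis each pseudoreplicate would be passed to the tree-building method (parsimony or neighbor-joining), and only clades above a chosen threshold, 50 in this chapter, would be reported.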
Maximum likelihood analyses were performed with fastDNAml using empirical base frequencies (A = 0.288, C = 0.248, G = 0.248, T = 0.216) and a transition/transversion
FIGURE 9 Shortest tree resulting from a posteriori character reweighting for the characiform data set with 27 taxa. The original three equally parsimonious trees used as the starting point for successive approximations were obtained when all sites were uniformly weighted (all characters used, gaps coded as missing). The reweighted tree had a rescaled consistency index (RC = 0.40). Bootstrap values for neighbor-joining and parsimony analyses are shown above and below each branch, respectively. Branches without numbers had bootstrap values <50 in both analyses. Numbered braces highlight clades supported by bootstrap analyses (see text). African taxa are enclosed in black boxes.
ratio of 2. The best tree, obtained 6 times out of 10 "jumbled" replications, also contained all eight clades mentioned earlier and had a topology very similar to the tree shown in Fig. 9. When the 138 alignment-ambiguous sites were excluded from the analyses, parsimony yielded four equally shortest trees (L = 835; CI = 0.36; RI = 0.39). A strict consensus of these trees was mostly unresolved, but contained the earlier mentioned clades plus a clade formed by Tetragonopterus + Chalceus. The strict consensus tree also grouped serrasalmins (clade 7) with Hepsetus + Hoplias (clade 2) as sister groups, and this group with lebiasinids (clade 1) + Rhaphiodon. A posteriori reweighting resulted in a single, completely resolved tree, with all of the previously mentioned clades, but different in several respects from the one shown in Fig. 9. However, only components 1-8 were supported by bootstrap analysis with values above 50 (and similar to those shown in Fig. 9). All other relationships received poor support.
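A strict consensus, as used throughout this section, retains only the clades found in every one of the equally parsimonious trees. A minimal sketch follows, representing each tree as a set of clades (each clade a frozenset of taxon labels); the three toy trees and the single-letter labels are invented for illustration.

```python
def clade(*taxa):
    """Convenience constructor for a clade as an immutable set of taxon labels."""
    return frozenset(taxa)


def strict_consensus(trees):
    """Return the clades shared by every tree (each tree is a set of clades)."""
    trees = [set(t) for t in trees]
    return set.intersection(*trees) if trees else set()


# Three hypothetical rooted trees on taxa A-E that agree only on (A, B).
tree_1 = {clade("A", "B"), clade("A", "B", "C"), clade("D", "E")}
tree_2 = {clade("A", "B"), clade("A", "B", "D"), clade("C", "E")}
tree_3 = {clade("A", "B"), clade("A", "B", "C"), clade("D", "E")}

consensus = strict_consensus([tree_1, tree_2, tree_3])
print(consensus)
```

Because a single conflicting tree removes a clade from the result, a strict consensus of many equally short trees is often poorly resolved, as reported for several of the analyses here.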
The most conservative strategy for obtaining reliable phylogenetic information, according to the substitution pattern analysis described earlier, would be to eliminate saturated transitions and alignment-ambiguous sites (which are in the fastest-evolving regions within loops). Using this strategy, parsimony analysis resulted in 302 equally shortest trees. A posteriori reweighting reduced this number to 15 equally parsimonious trees, a strict consensus of which is shown in Fig. 10. Most of the eight clades discussed earlier are present in the strict consensus tree. However, Lebiasinidae (component 1), Serrasalminae (component 7), and Hoplias + Hepsetus (component 2) are not supported in this analysis. Neighbor-joining analyses using only transversions gave a similar result, except that Serrasalminae (7) was supported, but Lebiasinidae (1) and Hoplias + Hepsetus (2) were not. Parsimony and neighbor-joining analyses were also congruent in defining relationships among Hepsetus, Hoplias, and Ctenoluciidae. However, bootstrap support for these groupings was very low (values not shown). Although no single set of relationships was firmly supported by this data set (other than the components discussed earlier), data were used to test alternative topologies using the likelihood ratio test of Kishino
FIGURE 10 This strict consensus of 15 trees was obtained by a posteriori reweighting, excluding alignment-ambiguous sites, and considering only transversions. Numbered braces identify clades also obtained in Fig. 9; dashed lines, labeled with the corresponding clade numbers from Fig. 9, mark clades not supported in this strict consensus. African taxa are shown in black boxes.
and Hasegawa (1989). In addition to testing different results obtained with different phylogenetic methods and inclusion sets, the main purpose of these comparisons was to evaluate a previous hypothesis (Lucena, 1993; Fig. 2A) and the phylogeny implied by a single vicariant event separating African and Neotropical lineages (i.e., monophyly of the Neotropical taxa). These alternative topologies were obtained by optimizing (with parsimony or with ML) while enforcing topological constraints. For example, the African taxa (Hepsetus and Alestinae) were forced to a basal position, joining the Citharinus + Distichodus clade in the "Neotropical monophyly" constraint tree, and the best solution for a fully resolved tree was searched for using PAUP or NUCML. Similarly, the topology used to test Lucena's hypothesis was the best (ML or parsimony) tree among the 945 fully resolved trees that agree with the partially resolved topology shown in Fig. 2A. The following tree topologies were compared: (1) best parsimony "reweighted" tree, using all sites (Fig. 9); (2) best ML tree from fastDNAml, using all sites; (3) best ML tree from fastDNAml, excluding alignment-ambiguous sites; (4) best parsimony tree using all data, constraining monophyly of all Neotropical taxa; (5) best ML tree using all data, constraining monophyly of all Neotropical taxa; (6) best parsimony tree (equal to best ML tree) excluding alignment-ambiguous sites, constraining monophyly of all Neotropical taxa; (7) best ML tree using all sites, constraining the topology to satisfy Lucena's hypothesis (see Fig. 2A); (8) best parsimony "reweighted" tree, excluding alignment-ambiguous sites and transitions (the best ML tree of the 15 trees summarized in Fig. 10); and (9) best ML tree excluding alignment-ambiguous sites, constraining the topology to satisfy Lucena's hypothesis.
When all sites were considered, the best parsimony tree was not significantly different from the best ML tree, but all other alternative topologies tested were significantly worse (data not shown, see Orti and Meyer, 1997). Hypotheses of Neotropical monophyly and Lucena's (1993) hypothesis were not supported by the 12S and 16S data sets. Of all alternative topologies tested, only topology 6 (just described) was not rejected, but was close to being significantly worse than the best ML tree (topology 3) when tested with the data set excluding alignment-ambiguous sites (Δl_6 = -44.2 ± 27). Parsimony analysis of the 38 taxon characiform data set, using all sites with equal weight and treating alignment gaps as "missing," resulted in eight equally shortest trees (L = 2059, CI = 0.28, RI = 0.37). All eight components described earlier are also present in the consensus, plus a few others shown in Fig. 11 and discussed later. For simplicity, only results obtained when excluding alignment-ambiguous sites will be presented because none of the major differences between inclusion sets is strongly supported.
Parsimony analysis excluding 140 alignment-ambiguous sites resulted in 14 shortest trees (with gaps treated as "missing," L = 1052, CI = 0.29, RI = 0.39). A posteriori reweighting (Farris, 1969) resulted in a single tree, shown in Fig. 11. The same pattern as with the 27 taxon data set emerges, but in addition to the previously described components, four more clades are supported. Clade 9 (family Gasteropelecidae), clade 10 (Prochilodus + family Curimatidae), and clade 11 (characid subfamily Glandulocaudinae) are strongly supported by bootstrap analysis (both neighbor-joining and parsimony) and each constitutes a well-supported unit using morphological characters. The best tree from ML searches (log likelihood -6604.3) had the same topology as one of the 14 shortest parsimony trees. In fact, all 14 shortest trees from parsimony had very similar log likelihoods and did not differ significantly according to the test of Kishino and Hasegawa (1989). Neighbor-joining analysis (using Kimura's distances) supported the same components as parsimony (see neighbor-joining bootstrap values in Fig. 11) but resulted in a globally different topology, which required 28 extra steps for parsimony and had a significantly worse likelihood than the best tree. Forcing the monophyly of Neotropical taxa resulted in 13 extra steps for parsimony and significantly worse log likelihood values (not shown).
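A posteriori reweighting of the kind described above assigns each character a weight derived from how well it fits the current trees. The sketch below computes the consistency index (CI), retention index (RI), and rescaled consistency index (RC = CI × RI) for individual characters from their minimum possible, observed, and maximum possible step counts. The step counts are invented, and treating RC directly as the new weight is a simplifying assumption; programs such as PAUP apply their own scaling.

```python
def fit_indices(min_steps, obs_steps, max_steps):
    """Return (CI, RI, RC) for one character.

    min_steps: fewest steps possible on any tree
    obs_steps: steps required on the tree being evaluated
    max_steps: most steps possible on any tree
    """
    ci = min_steps / obs_steps
    if max_steps > min_steps:
        ri = (max_steps - obs_steps) / (max_steps - min_steps)
    else:
        ri = 0.0  # uninformative character: RI is undefined, treated as 0 here
    return ci, ri, ci * ri


# Hypothetical characters as (min, observed, max) step counts.
characters = [(1, 1, 3), (1, 2, 3), (1, 3, 3)]
weights = [fit_indices(*c)[2] for c in characters]
print(weights)  # characters with more homoplasy receive lower weight
```

In successive approximations the data are reanalyzed with these weights, the indices are recomputed on the resulting trees, and the cycle is repeated until the weights and trees stop changing.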
6. Phylogenetic Relationships among Ostariophysan Orders
Gonorhynchiforms (family Kneriidae) were used as outgroup taxa for the otophysans (Fink and Fink, 1981). Parsimony analysis of the 22 taxon data set resulted in a single most parsimonious tree (length = 1460) when all characters were equally weighted and gaps were coded as missing data (Fig. 12). Monophyly of all orders except Characiformes was well supported; the characiform clade formed by Citharinus and Distichodus grouped with catfishes instead of with the other characiforms. Forcing characiform monophyly required only three extra steps (L = 1463). There were nine suboptimal trees a single step longer (L = 1461), a strict consensus of which showed poor resolution among orders. Bootstrap values lower than 50 for all relationships among orders also indicate lack of support for the basal branches of the tree. Low consistency and retention indices (CI = 0.42, RI = 0.38) indicate that the level of homoplasy in the data set is high. The tree resulting
FIGURE 11 Single shortest tree for 38 characiform taxa obtained with a posteriori reweighting on 14 equally parsimonious trees, when alignment-ambiguous sites were excluded. Numbers above and below branches are bootstrap values from neighbor-joining and parsimony analyses, respectively (only values above 50 are shown). Numbered braces (9-11) identify clades supported by bootstrap analyses, other than clades 1-8 shown in Fig. 9. Family names are given next to each taxon, except for the family Characidae. For the Characidae, only subfamilial groupings supported by bootstrap analyses are indicated (Serrasalminae and Glandulocaudinae). Numbered arrows mark putative drift-vicariant events between African and Neotropical groups, and crosses identify clades postulated to have gone extinct in Africa (see text). African taxa are shown in black boxes.
from neighbor-joining analysis also grouped Distichodus and Citharinus with catfishes and placed Gymnotiformes as the sister group of Characiformes + Siluriformes. The same result was obtained with maximum likelihood analysis (ML), using the fastDNAml program. When the 123 alignment-ambiguous sites were excluded from the analysis, three equally parsimonious trees resulted. A strict consensus of these trees showed highly unlikely relationships (e.g., with cyprinids
nested within the catfish, and distichodontids as the sister group of all other otophysans). Similarly, including these alignment-ambiguous sites but using only transversions resulted in six equally parsimonious trees, leaving the deeper nodes unresolved and characiform monophyly unsupported. As suggested by the substitution pattern analysis, the sequences contained in this data set do not provide reliable information to resolve relationships among these orders with confidence.
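Tree lengths such as L = 1460, and the "extra steps" needed to force a constraint such as characiform monophyly, are sums of per-site step counts. A minimal sketch of that arithmetic using Fitch's algorithm for unordered, equally weighted characters is given below. The four-taxon alignment and both topologies are invented; real analyses search over many trees rather than comparing two fixed ones.

```python
def fitch_steps(tree, site):
    """Fitch step count for one site; tree is nested 2-tuples of tip names."""
    steps = 0

    def state_set(node):
        nonlocal steps
        if isinstance(node, str):
            return {site[node]}
        left, right = (state_set(child) for child in node)
        if left & right:
            return left & right
        steps += 1  # no shared state: one change is required at this node
        return left | right

    state_set(tree)
    return steps


def tree_length(tree, alignment):
    """Total steps over all aligned sites."""
    n_sites = len(next(iter(alignment.values())))
    return sum(
        fitch_steps(tree, {name: seq[i] for name, seq in alignment.items()})
        for i in range(n_sites)
    )


# Hypothetical data: two sites favor (t1, t2), one site favors (t1, t3).
alignment = {"t1": "AAC", "t2": "AAT", "t3": "GGC", "t4": "GGT"}
best = (("t1", "t2"), ("t3", "t4"))
constrained = (("t1", "t3"), ("t2", "t4"))

extra_steps = tree_length(constrained, alignment) - tree_length(best, alignment)
print(tree_length(best, alignment), tree_length(constrained, alignment), extra_steps)
```

A small number of extra steps, as reported for the ostariophysan data set, means the data discriminate only weakly between the unconstrained and constrained topologies.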
FIGURE 12 Shortest tree for the 22 taxon ostariophysan data set (all characters equally weighted, gaps treated as "missing"). Gonorhynchiforms (Kneria and Parakneria) were treated as the outgroup. Tree length L = 1460, CI = 0.429 (excluding uninformative characters), and RI = 0.384. Bootstrap support is shown only for those branches with values >50 (neighbor-joining bootstrap values above branches, parsimony values below branches). African characiform taxa are shown in black boxes.
B. Evidence from Ependymin Sequences
1. Variation among Ependymin Sequences
About 600 bp of the cDNA sequence was obtained for 13 characiforms, two gymnotiforms, and four catfish taxa. The inferred amino acid sequences, aligned with published sequences from cyprinids, salmoniforms, and a herring, are shown in Fig. 13. Percentage sequence differences among species were large (Fig. 7), confirming previous observations that ependymin is a rapidly evolving gene (Müller-Schmid et al., 1993). Even cysteine residues are not fully conserved, but the pattern of variation seems to agree with major taxonomic divisions. For example, cysteine residues are found at position 20 for catfish taxa only and at positions 154 and 155 for most (but not all) catfish, gymnotids, and characiforms (Fig. 13). Length variation among sequences, comprising 1-8 amino acid residues, is most noteworthy in comparisons involving catfish taxa; Distichodus and Nannobrycon share a deletion at position 48. Conserved features in the sequences, including potential N-glycosylation sites and cysteine and tryptophan residues, are also shown in Figure 13. The most conserved region is located around the potential N-glycosylation site at position 80. This site is presumably necessary for binding crucial oligosaccharide units and conferring calcium-binding capacity to the molecule (Müller-Schmid et al., 1993). The additional potential N-glycosylation site at position 139 is shared by characiform sequences (except Distichodus) and is not found in any of the other species. Orti and Meyer (1996) provide a detailed analysis of nucleotide substitution patterns, base composition, codon usage, and phylogenetic information content of these ependymin sequences. This chapter only reviews the most significant phylogenetic results.
2. Phylogenies from Ependymin
Data sets including all 25 taxa and only catfish, electric fish, and characiforms (19 taxa) were analyzed separately. Figure 14 shows parsimony trees obtained with these data sets using different weighting strategies. For 25 taxa, a total of 588 bp were aligned of which 442 were variable and 359 phylogenetically informative. When third codon positions were excluded, only 258 sites were variable and 193 were phylogenetically informative. A herring (Clupea harengus) was used as the outgroup. Currently accepted relationships among these orders, based on morphology, are shown in Fig. 2E. Most parsimonious trees obtained by excluding transitions in third codon positions and by eliminating third positions completely were mostly congruent with each other, but differed somewhat from the ones using all characters with equal weight (Fig. 14A and B). The most basal branches on the trees resulting from all weighting strategies are congruent and receive very high bootstrap support. Protacanthopterygii (Esox + Salmo), Otophysi (cyprinids + characiforms + siluriforms + gymnotiforms), Characiphysi (characiforms + siluriforms + gymnotiforms), Gymnotiformes, and Cyprinidae are all strongly supported. In contrast, characiform monophyly is not well supported as the African Distichodus either tends to branch off before electric fish (Fig. 14B) or forms a trichotomy with siluriforms and gymnotiforms (Fig. 14A). Although the topology within characiforms is congruent for trees A and B, it receives very low bootstrap support, except for the grouping of Alestes + Phenacogrammus (African subfamily Alestinae, family Characidae) and Gymnocorymbus + Paracheirodon (Neotropical family Characidae). Note that Chalceus and Metynnis, traditionally included in the family Characidae, come out in separate branches while Gasteropelecus (family Gasteropelecidae) groups with Gymnocorymbus + Paracheirodon. 
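The counts of variable and phylogenetically informative sites quoted above follow from a simple classification of alignment columns: a site is variable if it shows more than one state, and parsimony-informative if at least two states each occur in at least two taxa. The sketch below applies that rule and also shows how third codon positions can be dropped. The sequences are invented, and the code assumes the alignment starts at codon position 1 and treats "-", "?", and "N" as missing.

```python
from collections import Counter


def classify_sites(sequences):
    """Return (number of variable sites, number of parsimony-informative sites)."""
    n_variable = n_informative = 0
    for column in zip(*sequences):
        counts = Counter(c for c in column if c not in "-?N")
        if len(counts) > 1:
            n_variable += 1
            # informative: at least two states, each present in >= 2 taxa
            if sum(1 for k in counts.values() if k >= 2) >= 2:
                n_informative += 1
    return n_variable, n_informative


def drop_third_positions(seq):
    """Remove every third base (assumes the sequence starts in frame)."""
    return "".join(base for i, base in enumerate(seq) if i % 3 != 2)


# Hypothetical in-frame alignment of four taxa, two codons long.
seqs = ["ACGTAA", "ACGTAC", "ATGCAG", "ATGCAT"]
all_sites = classify_sites(seqs)
no_third = classify_sites([drop_third_positions(s) for s in seqs])
print(all_sites, no_third)
```

In this toy case the only variable site lost by dropping third positions is one that was uninformative anyway, which mirrors the trade-off discussed in the text: excluding fast-evolving positions removes many variable sites but proportionally fewer reliable ones.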
The most important difference between the trees is the relationship among electric fish, catfish, and characiforms. Whereas tree A (Fig. 14) groups catfish with electric fish (the "traditional" hypothesis also shown in Fig. 2), tree B suggests a closer
FIGURE 13
relationship between electric fish and characiforms. Both of these topologies are stable to a posteriori reweighting (Farris, 1969; Carpenter, 1988). For example,
the tree obtained when this procedure is applied to the data set with all characters equally weighted shows Distichodus branching off before Siluriformes +
FIGURE 14 Parsimony trees from ependymin cDNA sequences. (A) Strict consensus tree obtained using all taxa and all characters with equal weight. (B) Strict consensus tree using all taxa and excluding transitions in third codon positions. For A and B, boldface type, thicker branches, and a solid bar identify characiform taxa, whereas boxes with horizontal lines, cross-hatched, and open identify gymnotiform, siluriform, and cyprinid taxa, respectively. (C) Shortest tree using first and second codon positions only. (D) Strict consensus tree excluding transitions in third codon positions. For C and D, characiform taxa in boldface type belong to families other than the Characidae; branch lengths are proportional to the number of changes (scale corresponding to five changes is shown). For all trees, bootstrap values are shown above the branches only when those branches were recovered in the bootstrap majority-rule consensus tree. L, tree length; CI, consistency index (excluding uninformative characters); and RI, retention index. African taxa are enclosed in black boxes. Panel annotations: (A) all positions, equal weights: 2 trees, L = 1542, CI = 0.53, RI = 0.59; (B) all positions, no transitions in third positions: 2 trees, L = 1029, CI = 0.63, RI = 0.69; (C) first and second positions only: 1 tree, L = 377, CI = 0.69, RI = 0.68; (D) all positions, no transitions in third positions: 2 trees, L = 581, CI = 0.69, RI = 0.68.
Gymnotiformes and is equal to one of the shortest trees. Of these two alternative hypotheses, tree A (Fig. 14) is less well resolved, has lower bootstrap values, and a lower consistency index (CI) than tree B as a likely consequence of considering "noisy" third codon positions. Furthermore, forcing the topology shown in Fig. 14B on the data set with all characters equally weighted required only four additional steps (L = 1546), in contrast to eight additional steps required by the topology shown in Fig. 14A (L = 1037) on data excluding transitions in third positions. Excluding the fast-evolving third codon positions also results in higher bootstrap support for grouping the electric fish with characiforms (Fig. 14B) instead of with catfish (Fig. 14A). An alternative approach to test for how well particular clades are supported by data is by inspection of suboptimal trees ("decay analysis or Bremer support," Bremer, 1988), counting how many extra steps are required to collapse the clade of interest. For the clade grouping electric fish with catfish (Fig. 14A) two extra steps are required (with all characters, equal weights), whereas for the clade grouping electric fish with characiforms (Fig. 14B) three additional steps are required to break the group up (with no transitions in third positions). Although no statistical value can be attached to these decay indices, they also suggest that the grouping of electric fish with characiforms receives slightly better support than its alternative. Neighbor-joining analyses, with or without third codon positions included, always grouped electric fish with characiforms. Bootstrap support (500 pseudoreplicates) was very high for Protacanthopterygii, Otophysi, Cyprinidae, Gymnotiformes, and Siluriformes (values > 90) when all positions were included in the analysis. The main difference between trees including or excluding third codon positions was the placement of cyprinids and of Distichodus. 
When all positions were considered, characiform monophyly was supported with a bootstrap value of 63, and electric fish and characiforms were grouped together with a bootstrap value of 42. When third positions were excluded, Distichodus grouped with electric fish and this clade grouped with characiforms, supported by bootstrap values of 29 and 67, respectively. Excluding third positions also had the effect of placing cyprinids as the sister group of characiforms + electric fish, to the exclusion of catfish. Protein Poisson-corrected distances and Kimura (1981) distances excluding third positions resulted in the same topology. Relationships among characiform lineages were poorly supported in the neighbor-joining trees, but agreed with parsimony analyses in placing Distichodus at the base of characiforms and in grouping Alestes + Phenacogrammus and Paracheirodon + Gymnocorymbus + Gasteropelecus with high bootstrap support.
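The Poisson correction mentioned above converts an observed proportion of differing amino acid sites, p, into an estimated number of substitutions per site, d = -ln(1 - p), on the assumption that substitutions at a site follow a Poisson process with equal rates across sites. A minimal sketch follows; the function name is hypothetical and the input proportions are illustrative rather than taken from the data.

```python
import math


def poisson_distance(p):
    """Poisson-corrected distance for an observed proportion p of differing sites."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must satisfy 0 <= p < 1")
    return -math.log(1.0 - p)


# Illustrative proportions only: the correction grows faster than p itself.
for p in (0.05, 0.25, 0.50):
    print(p, round(poisson_distance(p), 4))
```

The correction is negligible for closely related sequences and increasingly large for divergent ones, which is why corrected and uncorrected distances can support different deep branching orders.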
Maximum likelihood analysis was used to compare alternative hypotheses. The rate of change at each codon position was estimated by counting the number of changes reconstructed over the shortest tree (tree B in Fig. 14) using the program MacClade. These values were 373, 270, and 860 for first, second, and third positions, respectively. They were used as auxiliary information with the input to the fastDNAml program to activate the "categories and rates" option (Olsen et al., 1994). Five runs of the program using the jumble input option (27,249 trees examined) resulted in the same best tree every time (identical to tree B in Fig. 14), with a log likelihood of -6906.79. The alternative topology (Fig. 14A) had a log likelihood of -6929.36. The same best tree (Fig. 14B) was obtained in 3 out of 10 "jumbled" runs of fastDNAml with only first and second positions in the data set. To evaluate the extent to which the best tree is significantly better than its alternatives, the standard errors (SE) of the differences between log likelihoods (Δl_i; Kishino and Hasegawa, 1989) were computed using the program NUCML 2.2 (Adachi and Hasegawa, 1994; Hasegawa et al., 1985) for trees A and B (and alternative topologies, not shown), using data sets including either all positions or only first and second positions (NUCML does not allow rate categories in the input). The differences in log likelihood between trees are not statistically significant because all upper bounds of the 95% confidence intervals are greater than zero. According to Kishino and Hasegawa (1989), this means that none of the best trees is significantly better than the alternative hypotheses. However, the data set including only first and second codon positions provides somewhat better resolution among alternative trees than the one including all positions. First and second codon positions seem to be less "noisy" over the whole data set.
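The change counts reported above (373, 270, and 860 for first, second, and third codon positions) can be turned into relative rates for the three site categories. The sketch below scales the counts so that the slowest category has rate 1. That scaling is an illustrative choice, not necessarily the one supplied to fastDNAml, and it assumes each codon position contributes the same number of sites so that raw counts are proportional to per-site rates.

```python
def relative_rates(changes):
    """Scale change counts so that the slowest category has rate 1.0."""
    slowest = min(changes.values())
    return {position: count / slowest for position, count in changes.items()}


# Counts of reconstructed changes per codon position, as reported in the text.
changes = {"first": 373, "second": 270, "third": 860}
rates = relative_rates(changes)

for position, rate in rates.items():
    print(position, round(rate, 2))
```

On these figures third positions change roughly three times as fast as second positions, which is consistent with the text's treatment of third positions as the "noisy" partition.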
For the comparison between tree A and tree B, (Δl ± SE) is -6.8 ± 9.0 (all data) and -10.1 ± 7.2 (first and second only), the SE being larger than the difference in the first case, smaller and closer to being significant in the second case (even though it used only two-thirds of all sites). Using protein sequences, the best tree from maximum likelihood analysis (PROTML 2.2, Adachi and Hasegawa, 1994) is the tree shown in Figure 14B, but differences between the log likelihood of this tree and alternative topologies were not statistically significant, according to the test of Kishino and Hasegawa (1989). Although maximum likelihood analyses also favor the grouping of electric fish with characiforms, more data are obviously necessary to determine with confidence the best phylogenetic hypothesis. In order to test for the effect of the choice of taxa (see Lecointre et al., 1993) on the resolution of characiform relationships, the more distant taxa were excluded from the analysis and only catfish and electric fish
were used as outgroups. Although different results were obtained for different character weighting and reconstruction methods used, some elements were common to all results. The basal position of Distichodus and the grouping of Alestes and Phenacogrammus (Alestinae) and of Paracheirodon, Gymnocorymbus, and Gasteropelecus were found in all trees obtained and were supported by relatively high bootstrap values (Fig. 14C and D). These relationships were stable to outgroup choice because they were also retrieved when all 25 taxa were used (Fig. 14A and B). The position of Chalceus and Metynnis remained uncertain, but they never grouped together with the other taxa in the Characidae. A close relationship between Leporinus and Hemiodus, only weakly suggested in trees A and B (Fig. 14), seems to receive better support with a closer outgroup and downweighting third codon positions (trees C and D, Fig. 14). The major discrepancy among trees A - D involves the position of Hoplias and Boulengerella. When third codon positions (or only third position transitions) were excluded from the analysis, these taxa are no longer placed with Alestes + Phenacogrammus as a derived group within the Characiformes, but rather branch out next from Distichodus, at the base of the characiform clade. The same pattern is observed when amino acid sequences are used for parsimony analysis. Although no firm set of relationships can be established among characiform lineages other than those mentioned earlier, the monophyly of Neotropical taxa seems a very unlikely hypothesis. Under all alternative weighting strategies, Distichodus comes out as the sister group of all other characiforms, and the Alestinae always groups among the Neotropical taxa. Forcing monophyly of Neotropical taxa results in 7, 8, and 10 extra steps when all characters were equally weighted, when transitions in third positions were excluded, and when third positions were excluded from the analysis, respectively. 
Mitochondrial DNA sequence evidence (see earlier discussion) also suggests that the African and Neotropical lineages do not form reciprocally monophyletic groups. Neighbor-joining analysis of the 19 taxon data set (with catfish as the outgroup) always resulted in a monophyletic Characiformes with Distichodus branching out at the base. As in parsimony analysis, by excluding third codon positions (or using protein distances) the placement of Hoplias and Boulengerella in the tree changed from being close to the Alestinae to a more basal position in the characiform clade. The grouping of Leporinus and Hemiodus was also supported, but the monophyly of neither Characidae nor characiforms was supported by neighbor-joining bootstrap analyses. The topology of the best tree from fastDNAml (with the categories and rates options) is the same as that shown in Fig. 14D.
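As a worked illustration of the tree A versus tree B likelihood comparison reported earlier in this section, the sketch below expresses each log-likelihood difference as a multiple of its standard error. It is only a minimal sketch, not the procedure implemented in PROTML or fastDNAml: the function names are hypothetical, and the 1.96 threshold assumes the usual two-tailed normal approximation at the 5% level.

```python
def se_ratio(delta_lnl, se):
    """Log-likelihood difference expressed in units of its standard error."""
    if se <= 0:
        raise ValueError("standard error must be positive")
    return delta_lnl / se


def is_significant(delta_lnl, se, z_crit=1.96):
    """True when |difference| exceeds z_crit standard errors.

    z_crit=1.96 is an assumption (two-tailed 5% level, normal approximation).
    """
    return abs(se_ratio(delta_lnl, se)) > z_crit


# Differences and standard errors quoted in the text (tree A versus tree B).
comparisons = {
    "all sites": (-6.8, 9.0),
    "first and second codon positions": (-10.1, 7.2),
}

for label, (delta, se) in comparisons.items():
    print(f"{label}: ratio = {se_ratio(delta, se):.2f}, "
          f"significant = {is_significant(delta, se)}")
```

Under these assumptions the ratios are about -0.76 and -1.40, so neither comparison crosses the threshold, in line with the statement that the second analysis is only closer to significance.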
C. Systematic and Biogeographic Implications
1. Sequence Variation and the Limits of Phylogenetic Resolution
Comparisons of 12S and 16S sequences among characiform families showed a slightly lower level of mean sequence divergence (14.9%) than comparisons among orders of otophysans (17.3%) (see Fig. 7). Assuming rate constancy across all lineages, this observation could be taken as evidence for dating the origination of the major lineages of Characiformes very close to the origin of the otophysan orders (cypriniforms, catfishes, electric fishes). Alternatively, similar values of sequence divergence among lineages may reflect saturation at the DNA level, given the structural constraints on sequence variation discussed earlier. As pointed out, transition/transversion ratios (Fig. 3), the amount of change per site in different data sets (Fig. 4), and sliding window analyses of variation (Figs. 5 and 6) all indicate that beyond the family level, multiple changes per site are to be expected in the 12S and 16S DNA sequences. Furthermore, even though average divergence between gonorhynchiforms and otophysans (21.1%) suggests that divergence values among otophysans (17.3%) might be close to but have not yet reached complete saturation, maximum divergence values among characiform families, otophysan orders, and ostariophysan orders were essentially all the same (21.3, 21.9, and 24%, respectively; Fig. 7), indicating that, indeed, saturation is a problem beyond the family level. Comparison of ependymin DNA and amino acid sequence divergences (Fig. 7) clearly shows that the mitochondrial rRNA genes have reached saturation. For ependymin, amino acid sequence divergence between Distichodus and the other characiforms (close to 22%) was slightly smaller than the divergence between characiforms and electric fish (25%) or between characiforms and cyprinids (27%).
But ependymin amino acid sequence divergence between characiforms and catfishes and between cyprinids and electric fishes was above 34%. Furthermore, distances among characiform taxa other than Distichodus were lower than 15%. In the 12S and 16S sequences no such difference in sequence divergence among characiform taxa including or excluding the distichodontid-citharinid lineage was found.
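The divergence and transition/transversion comparisons discussed above rest on simple pairwise counts, which can be sketched as follows. This is a minimal illustration under stated assumptions, not the method used to produce Figs. 3 and 7: it computes an uncorrected proportion of differing sites (the chapter's values may include distance corrections), the function names are hypothetical, and the two short sequences are invented toy data rather than real 12S or 16S sequences.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def count_changes(seq1, seq2):
    """Return (compared_sites, transitions, transversions) for two aligned sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    sites = transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguity codes
        sites += 1
        if a == b:
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine<->pyrimidine
    return sites, transitions, transversions


def p_distance(seq1, seq2):
    """Uncorrected proportion of compared sites that differ."""
    sites, ts, tv = count_changes(seq1, seq2)
    if sites == 0:
        raise ValueError("no comparable sites")
    return (ts + tv) / sites


# Invented toy alignment (hypothetical data, for illustration only).
toy_a = "ACGTACGTAC"
toy_b = "ACATACGTCC"
print(count_changes(toy_a, toy_b), p_distance(toy_a, toy_b))
```

Because each site is counted at most once, an uncorrected distance of this kind levels off as multiple changes accumulate at the same site, which is the saturation pattern the text describes for comparisons beyond the family level.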
2. Relationships among Orders
12S and 16S data did not contain appropriate information to establish relationships at this level (Fig. 12). But, given that ependymin sequences show nonsaturating levels of divergence even among the most divergent taxa, can we expect well-supported phylogenies for high-order relationships? One of the most significant results obtained from the phylogenetic analysis of ependymin is the highly supported sister group relationship of Esox and Salmo (Fig. 14), corroborating, in part, the notion of Protacanthopterygii (sensu Rosen 1973, 1974) also adopted by Nelson (1994, see Fig. 2E). Although this result was previously reported by Müller-Schmid et al. (1993), its implication for lower euteleostean systematics remained unnoticed. The superorder Protacanthopterygii, containing a diverse assemblage of basal "Division III" fishes, was advanced in the seminal paper by Greenwood et al. (1966), but shortly after its inception all groups except Salmoniformes were removed (Rosen, 1973). The monophyly of Salmoniformes, which included Esocoidei (pikes, mudminnows, and Lepidogalaxias), Argentinoidei plus Osmeroidei (smelts and their relatives), and Salmonoidei (salmonids), was proposed based on gill arch anatomy (Rosen, 1974). But esocoids were later removed from the Salmoniformes and were regarded as the primitive sister group of euteleosts (Fink and Weitzman, 1982; Lauder and Liem, 1983; Fink, 1984). Salmoniformes became coextensive with Salmonidae, and much controversy clouded the relationships among salmonids, pikes, and the other euteleosts (for a review see Fink, 1984; Begle, 1991, 1992; Nelson, 1994). Morphological analyses have been complicated because a high proportion of characters show evolutionary losses and reductions or mosaic evolution, or exhibit a primitive condition for the euteleosts (Begle, 1992; Nelson, 1994). Ependymin DNA sequences have established the first molecular evidence for the monophyly of a group containing salmonids and esociforms, and hold great promise for the resolution of higher order relationships of fishes (Fig. 2E). The sister group relationship of electric fish (Gymnotiformes) and Characiformes suggested by ependymin sequences (Fig.
14B) constitutes a significant departure from the currently accepted hypothesis of otophysan relationships (Fig. 2E; Fink and Fink, 1981), but had been considered the "traditional" hypothesis before 1981 (e.g., Regan, 1922; Weitzman, 1962; Greenwood et al., 1966; Rosen and Greenwood, 1970). Gymnotiforms were then thought to be highly modified characins, albeit only based on circumstantial evidence (e.g., Mago-Leccia and Zaret, 1978). The first explicit cladistic analysis of morphological characters published by Fink and Fink (1981) proposed 20 synapomorphies for the clade formed by catfish + electric fish. More recently, Dimmick and Larson (1996) presented molecular data (1200 bp of mitochondrial DNA sequences encompassing most of the 12S and 16S genes and the intervening valine tRNA gene, and 1200 bp from the small and large subunit nuclear-encoded rRNA genes) that support the alternative hypothesis suggested by ependymin sequences. Analyzed separately and combined, nuclear and mitochondrial sequence data independently support the grouping of Gymnotiformes and Characiformes (Dimmick and Larson, 1996). In agreement with the morphological evidence (Fink and Fink, 1981), ependymin (and the nuclear and mitochondrial sequences of Dimmick and Larson) support the basal position of cypriniforms among otophysan lineages (Figs. 12 and 14A and B).
3. Relationships among Characiform Families
Whether saturation plagues the 12S and 16S data sets at the family level is less apparent, but it might be suggested by the differences in sequence divergence discussed earlier. Low consistency indices of the phylogenetic trees obtained for the different data sets indicate a high degree of homoplasy at every level. For example, the consistency index was 0.50, 0.34, and 0.42 for the serrasalmin (33 taxa), characiform (27 taxa), and ostariophysan (22 taxa) data sets, respectively. Mindell and Honeycutt (1990) and Hillis and Dixon (1991) suggested that mitochondrial ribosomal genes could resolve phylogenetic relationships among taxa that had diverged as long as 300 or 65 million years ago, respectively. The oldest unequivocal gonorhynchiform fossils date from the early Cretaceous (Patterson, 1975, 1984), and the earliest otophysan fossils are late Cretaceous catfishes and characiforms (reviewed by Lundberg, 1993, 1996). This suggests that the otophysan stem group had originated before the separation of Africa and South America (Lundberg, 1993), dated at 84-106 million years ago (Pitman et al., 1993; Parrish, 1993). Fossils do not provide detailed evidence on the sequence of origins of the main otophysan and characiform lineages, but suggest a window of application for the 12S and 16S molecular markers closer to 100 than to 300 million years. Given these limitations of the ribosomal DNA sequences for comparisons among characiform families, only a few hypotheses of relationships among Characiformes could be established with confidence. These were the clades numbered 1-12 (Figs. 9-11), of which only three propose interfamilial (or subfamilial) sister group relationships, in addition to the citharinid-distichodontid clade already discussed. A close relationship of Prochilodontidae and Curimatidae was proposed by Vari (1983) and Buckup (1991) and was supported by molecular data (Fig. 2C and component 10, Fig. 11).
Within the Characidae, the systematic position of Oligosarcus (subfamily Acestrorhynchinae) close to Astyanax (subfamily Tetragonopterinae) and Poptella (subfamily Stethaprioninae) was strongly supported by molecular data (component 5, Figs. 9-11), but a close relationship of Astyanax with Tetragonopterus, both tetragonopterines, was not supported. Oligosarcus was traditionally placed with Acestrorhynchus,
but Buckup (1991), Lucena (1993), and P. Petry (personal communication) found evidence for a closer relationship of Oligosarcus with tetragonopterines (Fig. 2C) than with Acestrorhynchus. Lucena (1993) proposed a close relationship of Poptella with Tetragonopterus, but not with Astyanax (Fig. 2A). The third component supported by molecular data is formed by Hepsetus and Hoplias (number 2, Figs. 9-11), members of African and South American families Hepsetidae and Erythrinidae, respectively. Its relevance for biogeography and systematics of characiform fishes is discussed later. Ependymin sequences also failed to provide robust phylogeny estimates for characiform families (Fig. 14A-D). However, the position of Distichodus as a primitive taxon among characiforms is well established (Fig. 14), corroborating the mitochondrial DNA results (Fig. 12) and previous morphological evidence (Fink and Fink, 1981; Buckup, 1991). Distichodus forms part of a well-defined monophyletic lineage of African characiforms composed of the families Distichodontidae and Citharinidae (Vari, 1979). Among the South American Characidae, a close relationship between Paracheirodon ("neon tetra," subfamily Cheirodontinae) and Gymnocorymbus ("black tetra," subfamily Tetragonopterinae) is strongly suggested by ependymin (Fig. 14). Tetragonopterines and cheirodontines were also suggested by Lucena (1993) to be closely related (Fig. 2A). The genera Metynnis ("silver dollar," subfamily Serrasalminae) and Chalceus (subfamily Bryconinae), usually included in the Characidae, are not shown here to form a monophyletic group with the other characids (Fig. 14). The placement of serrasalmins (represented by Metynnis, Colossoma, and Pygocentrus in various trees) among the other putative characid taxa remained equivocal (Figs. 9-12 and 14).
In an extensive survey of morphological characters, Machado-Allison (1983) presented convincing evidence for monophyly of the subfamily Serrasalminae but also failed to find the sister group of this unit among characids. More recently, Lucena (1993) proposed a monophyletic group including (in addition to other taxa) serrasalmins, Chalceus, Brycon, and Alestinae (Fig. 2A). Gasteropelecus (family Gasteropelecidae) is shown here to have a close relationship with Gymnocorymbus + Paracheirodon to the exclusion of Chalceus and Metynnis, based on ependymin (Fig. 14). Based on 12S and 16S sequences, gasteropelecids come out as the sister group of a clade containing anostomids, Chilodus and Characidium, in a clade which also includes Raphiodon and Apareiodon (Fig. 11) or of Boulengerella in the most inclusive ostariophysan data set (Fig. 12). The selection of taxa clearly has a major impact on inferences about the phylogenetic position of gasteropelecids. This effect was illustrated by Lecointre et al. (1993) using a gnathostome 28S rRNA data set. The gasteropelecids were considered a subfamily of the family Characidae (Weitzman, 1960) but were later elevated to the rank of family by Greenwood et al. (1966). The suggestion that the family Characidae (sensu Greenwood et al., 1966) will undergo major taxonomic changes as phylogenetic relationships among the major lineages are established has been mentioned repeatedly (e.g., Weitzman and Fink, 1983; Buckup, 1991; Lucena, 1993) and seems to be supported by molecular data discussed herein.
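The consistency indices quoted earlier in this subsection (0.50, 0.34, and 0.42) summarize homoplasy as the ratio of the minimum possible number of character-state changes to the number actually required on a tree. The sketch below illustrates that ratio for unordered characters. It is a hedged illustration, not the PAUP implementation: the function names are hypothetical, and both the toy characters and their step counts are invented inputs of the kind a parsimony program would report.

```python
MISSING = {"-", "?", "N"}


def min_steps(states):
    """Minimum changes for one unordered character: distinct observed states minus one."""
    observed = {s for s in states if s not in MISSING}
    return max(len(observed) - 1, 0)


def ensemble_ci(characters, steps_on_tree):
    """Ensemble consistency index: summed minimum steps over summed observed steps.

    characters    : list of per-character state lists (one state per taxon)
    steps_on_tree : observed steps for each character on the chosen tree
                    (supplied by the user; not computed here)
    """
    if len(characters) != len(steps_on_tree):
        raise ValueError("need one step count per character")
    total_min = sum(min_steps(c) for c in characters)
    total_obs = sum(steps_on_tree)
    if total_obs == 0:
        raise ValueError("tree requires no steps; index is undefined")
    return total_min / total_obs


# Invented toy data: three characters scored for four taxa (hypothetical).
toy_characters = [
    ["A", "A", "G", "G"],
    ["C", "T", "C", "T"],
    ["A", "A", "A", "C"],
]
toy_steps = [2, 2, 1]  # hypothetical tree lengths per character
print(ensemble_ci(toy_characters, toy_steps))
```

A value of 1.0 means no character changes more often than its minimum, so lower values, such as those reported for the three data sets, indicate more homoplasy.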
4. African-South American Relationships and Biogeography
A close relationship of Distichodus + Citharinus with the African subfamily Alestinae is not supported by ependymin, mitochondrial DNA sequences, or morphological evidence (Buckup, 1991). Hypotheses of the monophyly of Neotropical taxa were rejected by the mitochondrial DNA sequences (see earlier discussion). Therefore, at least three levels of Afro-South American sister group relationship have been suggested (Fig. 11, arrows 1-3; Fig. 14): (1) between the distichodontids (plus citharinids) and the rest of the characiforms (discussed earlier), (2) between Hoplias and Hepsetus, and (3) between the alestins and a group of undetermined South American characiforms. The sister group relationship of the African pike-characiform Hepsetus and the Neotropical family Erythrinidae, genus Hoplias (Figs. 9-11), was also suggested by Uj (1990). Although this hypothesis seems well supported by molecular data (but see Fig. 10), ctenolucids and erythrinids (both Neotropical groups) or ctenolucids alone were proposed as the sister group of Hepsetus, based on morphology (Fig. 2; Buckup, 1991; Lucena, 1993; Vari, 1995). The third clade with a trans-Atlantic sister group relationship included the African subfamily Alestinae and some Neotropical lineages (mitochondrial DNA data suggest Acestrorhynchus to be the closest Neotropical taxon to alestins, see Figs. 9-11). However, relationships of Alestinae and Acestrorhynchus with Neotropical characids are controversial (Fig. 2), and no agreement may be reached regarding the systematic position of these two groups based on morphology (Uj, 1990; Buckup, 1991; Lucena, 1993) and molecular data. Mean percentage sequence divergences (12S and 16S genes) between the African taxa and their corresponding Neotropical sister group were 16.2% for Distichodus + Citharinus, 11.2% for Hepsetus, and 15.1% for the Alestinae, respectively.
Divergence between Hepsetus and ctenolucids (putative sister groups according to morphological studies) was 16.6%. These values are within the same range of divergence values recorded among the other families of Characiformes (and below the 21-24% saturation value shown in Fig. 7), suggesting that most lineages (families) of characiform fishes had originated before the vicariant event separating African and Neotropical taxa, approximately 100 million years ago. If Characiformes experienced a rapid evolutionary radiation, comparable to that of cichlid fishes in East African lakes (e.g., Greenwood, 1984; Meyer, 1993), but 100 million years ago, resolution of phylogenetic relationships among the major lineages is not expected to be easily obtained. Poor resolution of relationships among characiform taxa using phylogenetic analyses of ependymin and mitochondrial DNA sequences and conflicting phylogenetic hypotheses from morphological data seem to agree with this prediction. Analyzing the phylogenetic hypothesis of Buckup (1991) in a biogeographic context, Lundberg (1993) also arrived at the conclusion that the major groups of characiforms had originated before the African-South American vicariant event (although the proposed African-South American sister group relationships differed). He then raised the important question of why most of the characiform groups now endemic to the Neotropics do not have close relatives in the African fauna. Assuming a strict vicariant view and no dispersals of characiforms across the widening Atlantic ocean, the present biogeographic distribution implies a remarkably high rate of extinction among African characiforms (Lundberg, 1993). For example, if the cladogram shown in Fig. 11 is taken at face value, then all six lineages enclosed in boxes and indicated by a cross must have gone extinct in Africa after the continental break. Although the fossil record of Characiformes is not very informative to test this hypothesis, intriguing fossils described by Greenwood and Howes (1975) and Stewart (1994) merit discussion. These are teeth and skulls of Miocene to lower Pleistocene age that were assigned to now extinct characiform fishes (Sindacharax lepersonnei and S. deserti), apparently widespread in northern and eastern Africa.
They show greater similarity with the teeth of modern serrasalmins like Colossoma and Piaractus than with any African characiform fish (Greenwood and Howes, 1975; Stewart, 1994). Serrasalmins form a well-supported monophyletic taxon endemic to South America (Machado-Allison, 1982; Fig. 8, and clade number 7, Figs. 9-11) that includes herbivorous forms like Colossoma and Piaractus, considered the primitive sister group to the more derived predatory piranhas (e.g., Pygocentrus; Fig. 8). The systematic position of serrasalmins within Characiformes could not be resolved with confidence in the present study (Figs. 9-11, and 14), but no close relationship of serrasalmins with other Neotropical characids was suggested. South American serrasalmin fossils indicate that forms similar to Colossoma had differentiated by at least 13 million years ago (Lundberg et al., 1986; Lundberg, 1996). Considering that serrasalmins are exclusively freshwater fishes, if Sindacharax really belongs to the serrasalmin clade, the origin of serrasalmins would have to be unequivocally placed before the African-South American continental split (84 million years ago), in agreement with conclusions from DNA sequence divergences discussed earlier. Sindacharax would also provide an example of extirpation in Africa of one trans-South Atlantic clade (Lundberg, 1993). Fossil serrasalmins from Miocene Amazonian-Orinocoan faunas discovered in the present Magdalena River basin in Colombia provide a good example for extirpation of a clade from a formerly diverse fauna (Lundberg et al., 1986). The depauperate fauna of the present Magdalena River does not include Colossoma and piranha species, and local extinction due to tectonism and climatic changes during the Cenozoic was suggested to explain the loss of diversity (Lundberg et al., 1986; Lundberg and Chernoff, 1992). Similar geological and climatic processes might have affected a previously characiform-rich African fauna and may be invoked to explain why only three lineages of characiforms are found there at present. Paleocene tectonic movements of the African plate and post-Miocene aridification affected the African continent more severely than South America and might have caused the well-known paucity of the tropical African flora (Goldblatt, 1993). Two alternative hypotheses are also plausible. Extinction of characiform lineages in Africa could also have resulted from competition with other fish groups that invaded that continent after the Gondwanian fracture. For example, knerids, notopterids, mormyriforms, and cypriniforms are freshwater fishes present in Africa but not in South America. Cyprinids such as Barbus and Labeo have been suggested to enter Africa from Asia during the late Miocene (Stewart, 1994). No evidence for this kind of competitive exclusion process is available.
Another alternative scenario assumes ad hoc geographic distributions to minimize the number of extinctions: members of a clade, or single species that later gave rise to the clade, could have been restricted to a small part of the Gondwanian land mass and carried off in toto when the continent broke up. This assumption would reduce the number of necessary extinction events of characiform lineages needed to explain their modern geographic distribution.
Acknowledgments
This work was supported by Doctoral Dissertation Improvement Grant BSR9112367 to G. Orti and grants to A. Meyer (BSR9119867, BSR9107838) and M. A. Bell (INT9117104) from the U.S. National Science Foundation. All the molecular work reported here was conducted in A. Meyer's laboratory. The author thanks numerous colleagues who contributed valuable specimens. A. Meyer, M. A. Bell, D. Futuyma, W. Eanes, and R. Vari provided helpful comments on earlier versions of the manuscript. This paper was prepared in partial fulfillment of requirements for the Ph.D. in Ecology and Evolution by G. Orti. This is contribution 960 from the Graduate Program in Ecology and Evolution at SUNY at Stony Brook.
References
Adachi, J., and Hasegawa, M. 1994. "MOLPHY: A Program Package for Molecular Phylogenetics, V. 2.2." The Institute of Statistical Mathematics, Tokyo.
Alves-Gomes, J. A., Ortí, G., Haygood, M., Heiligenberg, W., and Meyer, A. 1995. Phylogenetic analysis of the South American electric fishes (order Gymnotiformes) and the evolution of their electrogenic system: A synthesis based on morphology, electrophysiology, and mitochondrial sequence data. Mol. Biol. Evol. 12:298-318.
Begle, D. P. 1991. Relationships of the osmeroid fishes and the use of reductive characters in phylogenetic analysis. Syst. Zool. 40:33-53.
Begle, D. P. 1992. Monophyly and relationships of the argentinoid fishes. Copeia 350-366.
Bremer, K. 1988. The limits of amino acid sequence data in Angiosperm phylogenetic reconstruction. Evolution 42:795-803.
Buckup, P. A. 1991. "The Characidiinae: A Phylogenetic Study of the South American Darters and Their Relationships with Other Characiform Fishes." Ph.D. dissertation, The University of Michigan, Ann Arbor, MI.
Carpenter, J. 1988. Choosing among equally parsimonious cladograms. Cladistics 4:291-296.
Collins, T. M., Wimberger, P. H., and Naylor, G. J. P. 1994. Compositional bias, character-state bias, and character-state reconstruction using parsimony. Syst. Biol. 43:482-496.
Dimmick, W. W., and Larson, A. 1996. A molecular and morphological perspective on the phylogenetic relationships of the otophysan fishes. Mol. Phylo. Evol. 6:120-133.
Dixon, M. T., and Hillis, D. M. 1993. Ribosomal RNA secondary structure: compensatory mutations and implications for phylogenetic analysis. Mol. Biol. Evol. 10:256-267.
Farris, J. S. 1969. A successive approximations approach to character weighting. Syst. Zool. 18:374-385.
Felsenstein, J. 1981. Evolutionary trees from DNA sequences: A maximum likelihood approach. J. Mol. Evol. 17:368-376.
Felsenstein, J. 1985. Confidence limits on phylogenies: An approach using the bootstrap. Evolution 39:783-791.
Fink, S. V., and Fink, W. L. 1981. Interrelationships of the Ostariophysan fishes (Teleostei). Zool. J. Linn. Soc. 72:297-353.
Fink, W. L. 1984. Basal euteleosts: Relationships. In "Ontogeny and Systematics of Fishes" (H. G. Moser, eds.). American Society of Ichthyologists and Herpetologists Special Publication 1.
Fink, W. L., and Weitzman, S. H. 1982. Relationships of the stomiiform fishes (Teleostei), with a description of Diplophos. Bull. Mus. Comp. Zool. 150:31-93.
Fitch, W. M., and Markowitz, E. 1970. An improved method for determining codon variability in a gene and its application to the rate of fixation of mutations in evolution. Biochem. Genet. 4:579-593.
Gatesy, J., DeSalle, R., and Wheeler, W. C. 1994. Alignment-ambiguous nucleotide sites and the exclusion of data. Mol. Phylo. Evol. 2:152-157.
Géry, J. 1977. "Characoids of the World." Tropical Fish Hobbyist Publications, Neptune City, NJ.
Goldblatt, P. 1993. Biological relationships between Africa and South America: An overview. In "Biological Relationships between Africa and South America" (P. Goldblatt, ed.), pp. 3-14. Yale University Press, New Haven, CT.
Greenwood, P. H. 1984. African cichlids and evolutionary theories. In "Evolution of Fish Species Flocks" (A. A. Echelle and I. Kornfield, eds.), pp. 13-19. University of Maine Press, Orono, ME.
Greenwood, P. H., and Howes, G. J. 1975. Neogene fossil fishes from the Lake Albert-Lake Edward rift (Zaire). Bull. Brit. Mus. (Nat. Hist.) Geol. 26:69-127.
Greenwood, P. H., Rosen, D. E., Weitzman, S. H., and Myers, G. S. 1966. Phyletic studies of teleostean fishes, with a provisional classification of living forms. Bull. Am. Mus. Nat. Hist. 131:339-455.
Gyllensten, U. B., and Erlich, H. A. 1988. Generation of single-stranded DNA by the polymerase chain reaction and its application to direct sequencing of the HLA-DQa locus. Proc. Natl. Acad. Sci. USA 85:7652-7656.
Hasegawa, M., Kishino, H., and Yano, T. 1985. Dating of the human-ape splitting by a molecular clock of mitochondrial DNA. J. Mol. Evol. 22:160-174.
Hillis, D. M., and Dixon, M. T. 1991. Ribosomal DNA: Molecular evolution and phylogenetic inference. Q. Rev. Biol. 66:411-453.
Hoffmann, W. 1994. Ependymins and their potential role in neuroplasticity and regeneration: Calcium binding meningeal glycoproteins of the cerebrospinal fluid and extracellular matrix. Int. J. Biochem. 26:607-619.
Kimura, M. 1981. Estimation of evolutionary distances between homologous nucleotide sequences. Proc. Natl. Acad. Sci. USA 78:454-458.
Kishino, H., and Hasegawa, M. 1989. Evaluation of the maximum likelihood estimate of the evolutionary tree topologies from DNA sequence data, and the branching order in Hominoidea. J. Mol. Evol. 29:170-179.
Kocher, T. D., Thomas, W. K., Meyer, A., Edwards, S. V., Pääbo, S., Villablanca, F. X., and Wilson, A. C. 1989. Dynamics of mitochondrial DNA evolution in animals. Proc. Natl. Acad. Sci. USA 86:6196-6200.
Kumar, S., Tamura, K., and Nei, M. 1993. "MEGA: Molecular Evolutionary Genetics Analysis, V.
1.0." The Pennsylvania State University, University Park, PA.
Lauder, G. V., and Liem, K. F. 1983. The evolution and interrelationships of the Actinopterygian fishes. Bull. Mus. Comp. Zool. 150:95-197.
Lecointre, G., Philippe, H., Lê, H. L. V., and Le Guyader, H. 1993. Species sampling has a major impact on phylogenetic inference. Mol. Phylo. Evol. 2:205-224.
Lucena, C. A. S. D. 1993. "Estudo filogenético da família Characidae com uma discussão dos grupos naturais propostos (Teleostei, Ostariophysi, Characiformes)." Doutoramento diss., Universidade de São Paulo, Brazil.
Lundberg, J. G. 1993. African-South American freshwater fish clades and continental drift: Problems with a paradigm. In "Biological Relationships between Africa and South America" (P. Goldblatt, eds.), pp. 156-199. Yale University Press, New Haven, CT.
Lundberg, J. G. 1996. Fishes of the La Venta Fauna: Additional taxa, biotic and paleoenvironmental implications. In "Vertebrate Paleontology in the Neotropics: The Miocene Fauna of La Venta Colombia" (R. F. Kay et al., eds.), pp. 67-91. Smithsonian Institution Press, Washington, DC.
Lundberg, J. G., and Chernoff, B. 1992. A Miocene fossil of the Amazonian fish Arapaima (Teleostei, Arapaimidae) from the Magdalena River region of Colombia: Biogeographic and evolutionary implications. Biotropica 24:2-14.
Lundberg, J. G., Machado-Allison, A., and Kay, R. F. 1986. Miocene characid fishes from Colombia: Evolutionary stasis and extirpation. Science 234:208-209.
Machado-Allison, A. 1982. "Studies on the Systematics of the Subfamily Serrasalminae (Pisces-Characidae)." Ph.D. dissertation, The George Washington University.
Machado-Allison, A. 1983. Estudios sobre la sistemática de la subfamilia Serrasalminae (Teleostei, Characidae). II. Discusión sobre la condición monofilética de la subfamilia. Acta Biol. Venez. 11:145-195.
Maddison, W. P., and Maddison, D. R. 1992. "MacClade: Analysis of Phylogeny and Character Evolution, V. 3.0." Sinauer Associates, Sunderland, MA.
Mago-Leccia, F., and Zaret, T. M. 1978. The taxonomic status of Rhabdolichops troscheli (Kaup, 1856) and speculations on gymnotiform evolution. Environ. Biol. Fish. 3:379-384.
Meyer, A. 1993. Phylogenetic relationships and evolutionary processes in East African cichlid fishes. Trends Ecol. Evol. 8:279-284.
Mindell, D. P., and Honeycutt, R. L. 1990. Ribosomal RNA in vertebrates: Evolution and phylogenetic implications. Annu. Rev. Ecol. Syst. 21:541-566.
Müller-Schmid, A., Ganss, B., Gorr, T., and Hoffmann, W. 1993. Molecular analysis of ependymins from the cerebrospinal fluid of the orders Clupeiformes and Salmoniformes: No indication for the existence of an euteleost infradivision. J. Mol. Evol. 36:578-585.
Myers, G. S. 1938. Freshwater fishes and West Indian zoogeography. Annu. Rep. Smith. Inst. 1937:339-364.
Myers, G. S. 1949. Salt-tolerance of freshwater fish groups in relation to zoogeographical problems. Bijdragen tot de Dierkunde 28:315-322.
Nelson, J. S. 1994. "Fishes of the World." Wiley, New York.
Olsen, G. J., Matsuda, H., Hagstrom, R., and Overbeek, R. 1994. fastDNAml: A tool for construction of phylogenetic trees of DNA sequences using maximum likelihood. Comput. Appl. Biosci. 10:41-48.
Ortí, G. 1995. "The Evolutionary Radiation of Characiform Fishes: A Molecular Phylogenetic Perspective." Ph.D. dissertation, State University of New York at Stony Brook.
Ortí, G., and Meyer, A. 1996. Molecular evolution of ependymin and the phylogenetic resolution of early divergences among euteleost fishes. Mol. Biol. Evol.
13:556-573.
Ortí, G., and Meyer, A. 1997. The radiation of characiform fishes and the limits of resolution of mitochondrial ribosomal DNA sequences. Syst. Biol. 46:75-100.
Ortí, G., Petry, P., Porto, J. I. R., Jégu, M., and Meyer, A. 1996. Patterns of nucleotide change in mitochondrial ribosomal RNA genes and the phylogeny of piranhas. J. Mol. Evol. 42:169-182.
Palumbi, S., Martin, A., Romano, A., McMillan, W. O., Stice, L., and Grabowski, G. 1991. "The Simple Fool's Guide to PCR." Department of Zoology and Kewalo Marine Laboratory, University of Hawaii, Honolulu, HI.
Parrish, J. T. 1993. The palaeogeography of the opening South Atlantic. In "The Africa-South America connection" (W. George and R. Lavocat, eds.), pp. 8-27. Clarendon Press, Oxford.
Patterson, C. 1975. The distribution of Mesozoic freshwater fishes. Mém. Mus. Natl. Hist. Nat. Paris A Zool. 88:156-174.
Patterson, C. 1984. Chanoides, a marine Eocene otophysan fish (Teleostei: Ostariophysi). J. Vertebr. Paleontol. 4:430-456.
Pitman, W. C. I., Cande, S., LaBrecque, J., and Pindell, J. 1993. Fragmentation of Gondwana: The separation of Africa from South America. In "Biological Relationships between Africa and South America" (P. Goldblatt, ed.), pp. 15-34. Yale University Press, New Haven, CT.
Regan, C. T. 1922. The distribution of the fishes of the order Ostariophysi. Bijdragen tot de Dierkunde, Amsterdam 22:203-208.
Rosen, D. E. 1973. Interrelationships of higher euteleostean fishes. In "Interrelationships of Fishes" (P. H. Greenwood, R. S. Miles, and C. Patterson, eds.), pp. 397-513. Academic Press, London.
Rosen, D. E. 1974. Phylogeny and zoogeography of salmoniform fishes and relationships of Lepidogalaxias salamandroides. Bull. Am. Mus. Nat. Hist. 153:265-326.
Rosen, D. E., and Greenwood, P. H. 1970. Origin of the Weberian apparatus and the relationships of ostariophysan and gonorhynchiform fishes. Am. Mus. Novitat. 2428:1-25.
Saiki, R. K., Gelfand, D. H., Stoffel, S., Scharf, S., Higuchi, R., Horn, G. T., Mullis, K. B., and Erlich, H. A. 1988. Primer-directed enzymatic amplification of DNA with a thermostable DNA polymerase. Science 239:487-491.
Saitou, N., and Nei, M. 1987. The neighbor-joining method: A new method for reconstructing phylogenetic trees. Mol. Biol. Evol. 4:406-425.
Sanger, F., Nicklen, S., and Coulson, A. R. 1977. DNA sequencing with chain terminator inhibitors. Proc. Natl. Acad. Sci. USA 74:5463-5467.
Shashoua, V. E. 1991. Ependymin, a brain extracellular glycoprotein, and CNS plasticity. Ann. N.Y. Acad. Sci. 627:94-114.
Stewart, K. M. 1994. A late Miocene fish fauna from Lothagam, Kenya. J. Vertebr. Paleontol. 14:592-594.
Sverlij, S. B., and Espinach Ros, A. 1986. El Dorado, Salminus maxillosus (Pisces, Characiformes) en el Rio de la Plata y Rio Uruguay inferior. Rev. Invest. Desarrollo Pesquero 6:57-75.
Swofford, D. L. 1993. "PAUP: Phylogenetic Analysis Using Parsimony, V. 3.1.1." Illinois Natural History Survey, Champaign, IL.
Swofford, D. L., and Maddison, W. P. 1992. Parsimony, character-state reconstructions, and evolutionary inferences. In "Systematics, Historical Ecology, and North American Freshwater Fishes" (R. L. Mayden, ed.), pp. 186-223. Stanford University Press, Stanford, CA.
Thompson, J. D., Higgins, D. G., and Gibson, T. J. 1994. CLUSTAL W: Improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position specific gap penalties and weight matrix choice. Nucleic Acids Res. 22:4673-4680.
Uj, A. 1990. "Etude comparative de l'osteologie cranienne des poissons de la famille des Characidae et son importance phylogenetique." Ph.D. dissertation, Université de Geneva.
Vari, R. P. 1979. Anatomy, relationships, and classification of the families Citharinidae and Distichodontidae (Pisces, Characoidea). Bull. Brit. Mus. (Nat. Hist.) Zool. 36:261-344.
Vari, R. P. 1983. Phylogenetic relationships of the families Curimatidae, Prochilodontidae, Anostomidae, and Chilodontidae. Smith. Contrib. Zool. 378:1-60.
Vari, R. P. 1995. The Neotropical fish family Ctenoluciidae (Teleostei: Ostariophysi: Characiformes): Supra and intrafamilial phylogenetic relationships, with a revisionary study. Smith. Contrib. Zool. 564:1-97.
Vawter, L., and Brown, W. M. 1993. Rates and patterns of base change in the small subunit ribosomal RNA gene. Genetics 134:597-608.
Weitzman, S. H. 1960. Further notes on the relationships and classification of the South American characid fishes of the subfamily Gasteropelecinae. Stanford Ichthyol. Bull. 7:217-239.
Weitzman, S. H. 1962. The osteology of Brycon meeki, a generalized characid fish, with an osteological definition of the family. Stanford Ichthyol. Bull. 8:1-77.
Weitzman, S. H., and Fink, W. L. 1983. Relationships of the neon tetras, a group of South American freshwater fishes (Teleostei, Characidae), with comments on the phylogeny of New World Characiformes. Bull. Mus. Comp. Zool. 150:339-395.
Weitzman, S. H., and Vari, R. P. 1988. Miniaturization in South American freshwater fishes: An overview and discussion. Proc. Biol. Soc. Wash. 101:444-465.
Appendix

Below is a classification of fish taxa discussed in this chapter, with the GenBank accession numbers (GB) of their DNA sequences (12S, 16S, and ependymin indicated by "epy"). African taxa are indicated by "AFR." Serrasalmin specimens have been numbered from 1 to 34 and are referred to by these numbers in Orti et al. (1996). When voucher specimens were deposited in museum collections, their accession numbers are preceded by INPA for the specimens deposited at the Instituto Nacional de Pesquisas da Amazonia, Manaus, Brazil, and by USNM for those at the U.S. National Museum of Natural History (Washington, DC).

Order Characiformes
1. Family Hepsetidae (AFR)
   Hepsetus odoe. GB: U33852, U33992.
2. Family Citharinidae (AFR)
   Citharinus congicus. GB: U33826, U33993.
3. Family Distichodontidae (AFR)
   Distichodus sp. GB: U33827, U33994, epy: U33477.
4. Family Crenuchidae
   Characidium sp. (USNM 318101). GB: U33828, U34030.
5. Family Characidae
   Subfamily Alestinae (AFR)
      Alestes sp. GB: U33829, U33995, epy: U33475.
      Phenacogrammus sp. GB: U33830, U33996, epy: U33476.
      Hydrocyon sp. GB: U33960, U33997.
   Subfamily Characinae
      Tribe Characini
         Cynopotamus sp. (USNM 325689). GB: U33961, U33998.
         Gnathocharax steindachneri. GB: U33589, U33624.
      Tribe Acestrorhynchini
         Acestrorhynchus sp. GB: U33962, U33999.
         Oligosarcus sp. (USNM 235690). GB: U33963, U34000.
   Subfamily Raphiodontinae
      Rhaphiodon vulpinus. GB: U33964, U34001.
   Subfamily Bryconinae
      Tribe Salminini
         Salminus sp. GB: U33965, U34002.
      Tribe Bryconini
         Brycon sp. (USNM 326005). GB: U33966, U34003.
         Chalceus macrolepidotus. GB: U33587, U33622, epy: U33478.
      Tribe Triportheini
         Triportheus paranensis. GB: U33588, U33623.
   Subfamily Aphyocharacinae
      Aphyocharax sp. GB: U33968, U34005.
   Subfamily Glandulocaudinae
      Corynopoma riisei. GB: U33969, U34006.
      Gephyrocharax sp. GB: U33970, U34007.
   Subfamily Stethaprioninae
      Poptella sp. GB: U33971, U34008.
   Subfamily Tetragonopterinae
      Astyanax fasciatus. GB: U33972, U34009.
      Tetragonopterus sp. GB: U33973, U34010.
      Gymnocorymbus ternetzi. GB: epy: U33480.
   Subfamily Cheirodontinae
      Cheirodon sp. (USNM 325676). GB: U33974, U34011.
      Paracheirodon innesi. GB: U33975, U34012, epy: U33479.
   Subfamily Serrasalminae
      genus Pygocentrus
         1. P. nattereri. GB: U33558, U33590.
         2. P. nattereri. GB: U33558, U33590.
         3. P. nattereri (INPA 10143). GB: U33558, U33590.
         4. P. nattereri (USNM 325686). GB: U33559, U33591.
      genus Serrasalmus
         5. S. spilopleura (USNM 325683). GB: U33560, U33592.
         6. S. n.sp. 2n = 58. GB: U33561, U33593.
         7. S. compressus (cf. altuvei? 2n = 60). GB: U33562, U33594.
      genus Pristobrycon
         8. P. sp. GB: U33563, U33595.
         9. P. striolatus. GB: U33597, U33596.
         10. P. striolatus. GB: U33564, U33598.
      genus Catoprion
         11. C. mento. GB: U33565, U33599.
         12. C. mento (INPA 10145). GB: U33565, U33599.
      genus Metynnis
         13. M. sp. GB: U33566, U33600, epy: U33481.
         14. M. cf. mola (INPA 10146). GB: U33567, U33601.
      genus Myleus
         15. M. Myloplus rubripinnis. GB: U33568, U33602.
         16. M. Myloplus asterias. GB: U33569, U33603.
         17. M. Myloplus tiete (INPA 10147). GB: U33570, U33604.
         18. M. Prosomyleus schomburgkii. GB: U33571, U33605.
         19. M. Myleus pacu. GB: U33572, U33606.
         20. M. Myleus pacu. GB: U33573, U33607.
      genus Mylesinus
         21. M. paraschomburgkii. GB: U33574, U33608.
         22. M. paraschomburgkii. GB: U33574, U33609.
      genus 'N. gen. A'
         23. N. gen. A n.sp. (R. Xingu, Pará, Brazil). This specimen could not be assigned to any valid genus of the Serrasalminae, but is similar in many respects to Utiaritichthys and Myleus (Jégu, unpublished data). GB: U33575, U33610.
      genus Acnodon
         24. A. normani. GB: U33576, U33611.
         25. A. normani. GB: U33577, U33612.
      genus Mylossoma
         26. M. duriventri (INPA 10154). GB: U33578, U33613.
         27. M. paraguayensis (INPA 10152). GB: U33579, U33614.
         28. M. aureum (INPA 10153). GB: U33580, U33615.
      genus Colossoma
         29. C. macropomum (INPA 10149). GB: U33581, U33616.
         30. C. macropomum (INPA 10150). GB: U33582, U33617.
      genus Piaractus
         31. P. mesopotamicus (INPA 10151). GB: U33583, U33618.
         32. P. brachipomus (INPA 10148). GB: U33584, U33619.
         33. P. mesopotamicus. GB: U33585, U33620.
         34. P. brachipomus. GB: U33586, U33621.
6. Family Erythrinidae
   Hoplias malabaricus. GB: U33976, U34013, epy: U33485.
7. Family Ctenoluciidae
   Ctenolucius sp. GB: U33977, U34014.
   Boulengerella maculata. GB: U33978, U34015.
   Boulengerella sp. GB: epy: U33486.
8. Family Lebiasinidae
   Nannostomus sp. GB: U33979, U34016.
   Pyrrhulina sp. (USNM 325675). GB: U33980, U34017.
   Nannobrycon sp. GB: epy: U33487.
9. Family Hemiodontidae
   Hemiodus sp. GB: U33981, U34018, epy: U33484.
10. Family Parodontidae
   Apareiodon affinis. GB: U33982, U34019.
11. Family Gasteropelecidae
   Carnegiella sp. GB: U33983, U34020.
   Gasteropelecus sp. GB: U33984, U34021, epy: U334482.
12. Family Curimatidae
   Cyphocharax gilberti (USNM 318079). GB: U33985, U34022.
   Steindachnerina sp. (USNM 325691). GB: U33986, U34023.
13. Family Prochilodontidae
   Prochilodus lineatus. GB: U33987, U34034.
14. Family Anostomidae
   Abramites sp. GB: U33988, U34025.
   Leporinus obtusidens. GB: U34031, U34026.
   Leporinus sp. GB: epy: U33483.
15. Family Chilodontidae
   Chilodus sp. GB: 33989, U34027.

Order Gymnotiformes
Family Eigenmanniidae
   Eigenmannia sp. GB: U15269, U15245 (from Alves-Gomes et al., 1995).
   Eigenmannia sp. GB: epy: U33492.
Family Rhamphichthyidae
   Rhamphichthys sp. GB: U15257, U15233 (Alves-Gomes et al., 1995).
   Rhamphichthys sp. GB: epy: U33493.
Family Apteronotidae
   Apteronotus albifrons. GB: U15275, U15226 (from Alves-Gomes et al., 1995).

Order Siluriformes
Family Loricariidae
   Hypostomus sp. GB: epy: U33488.
   Hypostomus sp. GB: U15263, U15239 (from Alves-Gomes et al., 1995).
Family Cetopsidae
   Cetopsis sp. GB: U15272, U15248 (from Alves-Gomes et al., 1995).
Family Trichomycteridae
   Trichomycterus sp. GB: U15251, U15227 (from Alves-Gomes et al., 1995).
Family Malapteruridae
   Malapterurus sp. GB: U15261, U15237 (from Alves-Gomes et al., 1995).
Family Pimelodidae
   Pimelodus sp. GB: epy: U33489.
Family Schilbeidae
   Schilbe sp. GB: epy: U33490.
Family Mochokidae
   Synodontis sp. GB: epy: U33491.

Order Cypriniformes
Family Cyprinidae
   Cyprinus carpio. GB: X61010, epy: U00432.
   Carassius auratus. GB: epy: U00433, X14134.
   Danio rerio. GB: epy: M89643.
Family Gastromyzontidae
   Crossostoma lacustre. GB: M91245.

Order Gonorhynchiformes
Family Kneriidae
   Kneria sp. GB: U33990, U34028.
   Parakneria sp. GB: U33991, U34029.

Order Salmoniformes
Family Salmonidae
   Salmo salar. GB: epy: M93699.

Order Esociformes
Family Esocidae
   Esox lucius. GB: epy: L09066.

Order Clupeiformes
Family Clupeidae
   Clupea harengus. GB: epy: L09065.
CHAPTER 15

The Evolution of Blennioid Fishes Based on an Analysis of Mitochondrial 12S rDNA

CAROL A. STEPIEN, ALISON K. DILLON, MERIEL J. BROOKS, KRISTEN L. CHASE, and ALLYSON N. HUBERS
Department of Biology, Case Western Reserve University, Cleveland, Ohio 44106
I. Introduction
Blennioids are a suborder of perciform teleost fishes comprising approximately 732 species, 127 genera, and six families (Table I; Nelson, 1994). They are present in most temperate and tropical nearshore marine habitats, with a few species in brackish and fresh water (summarized in Springer, 1993; Nelson, 1994). They are among the most common demersal fishes (Springer, 1993), but may be overlooked due to their relatively small sizes and cryptic color patterns (Stepien, 1986a,b, 1987; Stepien et al., 1988). Their distinguishing characteristics include elongate dorsal and anal fins and jugular pelvic fins (see Fig. 1). Springer (1993) defined the Blennioidei by the following combination of characters (some of which may be plesiomorphies): anal fin with one or two spines and all simple soft rays; pelvic fins with one spine, two to four simple soft rays, and insertion ahead of the pectorals; paired nostrils; cirri often present on the head; a single bone representing infrapharyngobranchials 2-4; no autogenous parhypural (absent or fused to hypurals); hypurals 3 and 4 fused to each other and to the urostylar centrum; and pelvic bones shaped in a nut-like pod. Johnson (1993) added the synapomorphy of the first vertebra lacking a neural spine, and Mooi and Gill (1995) described a synapomorphy of the epaxial musculature (which is absent in the family Labrisomidae). Six families are presently recognized in the Blennioidei: Clinidae (clinid kelpfish), Labrisomidae (labrisomid kelpfish), Chaenopsidae (tube blennies), Tripterygiidae (triplefin blennies), Blenniidae (combtooth blennies), and Dactyloscopidae (sand stargazers; Fig. 1 and Table I; Springer, 1993). Blennioid groups have generated considerable systematic interest, including the following contemporary studies of the phylogenetic relationships of some component taxa: Fukao and Okazaki (1987), Acero (1987), Williams (1990), Stepien and Rosenblatt (1991), Hastings (1991), Stepien (1992), Stepien et al. (1993), Springer (1993), Fricke (1994), and Hastings and Springer (1994).

Historically, relationships among blennioid taxa and related groups have been controversial (Springer, 1993; Johnson, 1993; Stepien et al., 1993). Studies based on morphological data have not resolved higher-level relationships among blennioid families, tribes, and other suborders (see summary by Springer, 1993). Earlier work illustrated the utility of molecular data from allozyme studies (Stepien and Rosenblatt, 1991; Stepien, 1992; Stepien et al., 1993) and nuclear ribosomal DNA internal transcribed spacer (ITS) sequences (Stepien et al., 1993) to address evolutionary
Copyright © 1997 by Academic Press. All rights of reproduction in any form reserved.
CAROL A. STEPIEN et al.
TABLE I
Summary of Taxonomy of the Suborder Blennioidei: Number of Taxa, Distribution, Primary Morphological Characters, and Genera Sequenced^a

Taxon (genera sequenced) | N taxa | Distribution | Characters^b
1. Family Clinidae (Clinid kelpfish) | 3 tribes, 26 genera, 89 species | Marine; mostly temperate | Ceratohyal connected to dentary symphysis; scales small and embedded and radii in all fields
   a. Tribe Ophiclinini (Snake blennies): Ophiclinus, Sticharium | 4 genera, 12 species | Southern Australia | Dorsal and anal fins united to caudal fin; cirri and lateral line reduced; male intromittent organ; ovoviviparous
   b. Tribe Clinini (Klipfish): Heteroclinus | 17 genera, 68 species | Indo West-Pacific and New Zealand; mostly temperate; 4 tropical species | Male intromittent organ; ovoviviparous
   c. Tribe Myxodini (Kelpfish): Clinitrachus (Mediterranean), Gibbonsia (northeastern Pacific), Heterostichus (northeastern Pacific), Myxodes (southeastern Pacific) | 5 genera, 9 species | Temperate New World and Mediterranean | Oviparous; often sexually dimorphic in size; females larger; males guard nests
2. Family Labrisomidae (Labrisomid kelpfish) | 6 tribes, 16 genera, 106 species | | Scales with radii confined to anterior margin; scales sometimes absent, but never small and embedded
   a. Tribe Cryptotremini: Auchenionchus | 4 genera, 7 species | Temperate eastern Pacific | Branched caudal fin rays
   b. Tribe Neoclinini: Neoclinus | 1 genus, 9 species | Temperate eastern Pacific and western Pacific | Tube dwellers
   c. Tribe Mnierpini (Rock skippers): Mnierpes | 2 genera, 2 species | Tropical eastern Pacific | Thickened corneas; divided eyes; thickened anal fin rays; amphibious
   d. Tribe Labrisomini: Labrisomus, Malacoctenus | 2 genera, 35 species | Tropical New World and Africa | No known morphological synapomorphies
   e. Tribe Starksiini: Starksia | 2 genera, 24 species | Tropical New World | Male intromittent organ; ovoviviparous or oviparous
   f. Tribe Paraclinini: Exerpes, Paraclinus | 2 genera, 21 species | New World; tropical and warm temperate | Spine on opercle
   g. Unknown placement (may be Chaenopsidae; Hastings and Springer, 1994): Stathmonotus | 6 species | New World; tropical | Unique testis lobe arrangement
3. Family Chaenopsidae (Tube blennies): Acanthemblemaria, Chaenopsis, Emblemaria | 11 genera, 64 species | New World; Pacific and Atlantic; most tropical and some warm temperate | Tube dwellers; lack scales; no lateral line; single epural; two infraorbital bones
4. Family Tripterygiidae (Triplefin blennies) | 2 tribes, 28 genera, 103 species | Atlantic, Indian, and Pacific; tropical and warm temperate; greatest diversity in New Zealand | Dorsal fin divided in three parts; no spine on first segmented dorsal ray
   a. Tribe Lepidoblenninae: Axoclinus, Karalepis | 9 genera, 31 species | |
   b. Tribe Tripterygiinae: Rosenblatella, Notoclinus, Tripterygion | 19 genera, 72 species | |
5. Family Blenniidae (Combtooth blennies) | 6 tribes, 55 genera, 346 species | Circumglobal; mostly marine tropical, some temperate; many estuarine | Comb-like teeth; scales absent; coracoid ankylosed to cleithrum
   a. Tribe Salariini: Ecsenius, Entomacrodus, Ophioblennius, Rhabdoblennius | 26 genera, 198 species | Primarily Indo-West Pacific | Two, four, or five circumorbitals; three or four segmented pelvic rays
   b. Tribe Parablenniini: Hypsoblennius, Parablennius | 14 genera, 70 species | Circumglobal; mostly marine; tropical to temperate | Branched caudal fin rays; five circumorbital bones
   c. Tribe Omobranchini: Omobranchus | 7 genera, 30 species | Indo-Pacific and one Caribbean spp. (introduced); marine, some fresh water | Unbranched caudal fin rays; two segmented pelvic fin rays
   d. Tribe Nemophini (Saber-toothed blennies): Petroscirtes | 5 genera, 48 species | Indian and Pacific Oceans; marine; one brackish and fresh water | Unbranched fin rays; swim bladder present; basisphenoid absent
6. Family Dactyloscopidae (Sand stargazers): Myxodagnus | 9 genera, 41 species | New World in Pacific and Atlantic Oceans; tropical and warm temperate | Fringed upper gill cover; gill membranes separate and free from isthmus; no endopterygoid

a Based on Springer (1993) and Nelson (1994).
b Caution: Some of these characters are probably plesiomorphies.
relationships among blennioid taxa. The objective of the present study was to use mitochondrial 12S rDNA sequences in order to test the monophyly of the six blennioid families and many of their component tribes, analyze the evolutionary relationships among them, and examine their possible relationships to outgroups. Blennioid higher taxa have distinctive distributional patterns in several marine provinces (see Table I; Springer, 1982, 1993). Although the majority of blennioids are primarily tropical groups, the family Clinidae and the labrisomid tribes Neoclinini and Cryptotremini are primarily temperate and antitropically distributed (Fig. 2; Hubbs, 1952; Stephens and Springer, 1973; Springer, 1993). Evolutionary relationships among the families and tribes analyzed in this study (Table I) offer a biogeographic framework to address hypotheses of the relative ages of tropical versus temperate groups, relationships among Old and New World taxa, and questions of dispersal and distributional history of nearshore fishes.

FIGURE 1 Drawings of representative taxa [reprinted from "Fishes of the World," 3rd edition by J. S. Nelson (1994). Reprinted with permission of John Wiley & Sons, Inc.]. Blennioidei: (A) Clinidae, (B) Labrisomidae, (C) Chaenopsidae, (D) Tripterygiidae, (E) Blenniidae, (F) Dactyloscopidae. Zoarcoidei: (G) Stichaeidae, (H) Pholidae [reproduced with permission of Miller and Lea (1972), California Department of Fish and Game], (I) Zoarcidae. Notothenioidei: (J) Nototheniidae, (K) Bathydraconidae.

A. Hypotheses Tested
Some of the primary evolutionary and biogeographic questions that may be addressed with a resolved phylogeny for these groups include: (1) Are the temperate members of the Clinidae and Labrisomidae ancestral to the tropical labrisomid and chaenopsid clades? (2) What is the relationship of the Mediterranean myxodin clinid Clinitrachus to the New World clinids? (3) What are the relationships between the Blenniidae and Tripterygiidae? How are they related to the other blennioids? (4) Are the dactyloscopids appropriately grouped with the blennioids? (5) How are the blennioids, zoarcoids, and notothenioids related?
B. Relationships of the Family Clinidae
George and Springer (1980) redefined the family Clinidae (Fig. 1A), excluding the Tripterygiidae, Labrisomidae, and Chaenopsidae, and adding the tribe Ophiclinini. Clinids can be distinguished by several characters, including a cord-like ligament extending from the ceratohyal to the symphysis of the dentaries and the presence of radii on all margins of the scales (Hubbs, 1952; Springer et al., 1977; Springer, 1993). The Clinidae contains three tribes: the matrotrophic (ovoviviparous) Clinini and Ophiclinini, and the oviparous Myxodini (George and Springer, 1980; Stepien and Rosenblatt, 1991). The family largely has a temperate distribution, except for five tropical species in the tribe Clinini (Fig. 2; Springer, 1982, 1993). A fossil clinid very similar to the extant Mediterranean myxodin Clinitrachus has been described from the Miocene of Romania (Bannikov, 1989; also see Springer, 1993), which is the sole known fossil record of the family. The clinids present an interesting biogeographic scenario in that the live-bearing and egg-laying tribes do not overlap in distribution (Fig. 2A), and it has been postulated that live-bearing taxa are more derived (Wourms and Lombardi, 1992). The question of origin of antitropical taxa and whether they are ancestral to tropical groups, such as most of the labrisomid tribes (except Neoclinini and Cryptotremini) and the Chaenopsidae (Briggs, 1974), may also be addressed using these groups.
C. Relationships of the Family Labrisomidae
The family Labrisomidae (Fig. 1B) has often been regarded as the sister group to the Clinidae or as part of the Clinidae (together with the Chaenopsidae; Hubbs, 1952). Studies by Springer (1993) and Hastings and Springer (1994) did not find morphological synapomorphies to define the Labrisomidae. Labrisomid scales (when present) have radii only on the anterior margin and are never small and embedded, which are apparent plesiomorphies distinguishing them from clinids (Hubbs, 1952; Stephens and Springer, 1973; George and Springer, 1980; Springer, 1993). The absence of an anterior extension of the dorsal epaxial slip to the skull is an apparent reversal to an ancestral state that may characterize the Labrisomidae (Mooi and Gill, 1995). Most labrisomids are found in the New World, except for six species of Neoclinus in the northwestern Pacific (Fukao and Okazaki, 1987) and the eastern Atlantic species Labrisomus nuchipinnis and Malacoctenus africanus (Fig. 2B, Table I; Springer, 1993). A fossil labrisomid (Labrisomus pronuchipinnis) has been described from Miocene deposits in the Mediterranean, where the family is no longer represented (Springer, 1970; George and Springer, 1980), and is the sole known fossil.
There are few known morphological synapomorphies to suggest relationships among the tribes; however, allozyme data provided some synapomorphies which support presently recognized tribal groupings (Table I; Stepien et al., 1993). Allozyme data suggested that the labrisomids are paraphyletic and that the labrisomid tribe Cryptotremini may be the sister group of the clinids (Stepien et al., 1993); these hypotheses are tested in this Chapter. Consensus trees from allozyme data failed to conclusively resolve relationships among the labrisomid tribes Cryptotremini, Paraclinini, and Starksiini (Stepien et al., 1993), which are further examined with DNA sequences. Inclusion of the tribe Neoclinini in the Labrisomidae is controversial, and Hastings and Springer (1994) suggest that it belongs in the family Chaenopsidae. Neoclinins are provisionally treated as labrisomids here, as indicated by allozyme characters (Stepien et al., 1993). Their relationships to chaenopsids and labrisomids are tested. Familial affinity of the small, rarely observed eel-like genus Stathmonotus is also examined. Stathmonotus has been classified as a labrisomid but, most recently, as a chaenopsid (Hastings and Springer, 1994).
D. Relationships of the Family Chaenopsidae
The family Chaenopsidae (tube blennies; Fig. 1C) is restricted to the tropical and temperate New World and is defined by several morphological synapomorphies (Table I; Springer, 1993; Hastings and Springer, 1994). Parsimonious phylogenies based on nuclear rDNA sequence and allozyme data in Stepien et al. (1993) supported the traditional concept of a close relationship among the families Clinidae, Labrisomidae, and Chaenopsidae, as suggested by morphological data (Hubbs, 1952; Stephens, 1963; Stepien, 1992; Springer, 1993), which is further examined in this study. The Chaenopsidae has often been regarded as most closely related to the neoclinin labrisomids (Hubbs, 1952; Stephens, 1963; Springer, 1993; Hastings and Springer, 1994). Most-parsimonious phylogenies based on allozyme and rDNA sequence data suggested that the Chaenopsidae may be the sister group to a clinid-labrisomid clade (Stepien et al., 1993). However, the next most parsimonious alternative phylogeny based on rDNA sequence data placed the Labrisomidae and Chaenopsidae as sister groups. These possible relationships are examined in this study.
E. Relationships of the Family Tripterygiidae
The Tripterygiidae (triplefin blennies; Fig. 1D) is widely distributed in temperate and tropical regions throughout the Atlantic, Indian, and Pacific Oceans. Tripterygiids have a dorsal fin divided into three distinct segments: the first two are composed of spines and the third of seven or more soft rays. They are also defined by the synapomorphy of lack of a dorsal fin spine on the pterygiophore supporting the first segmented dorsal fin ray (Table I; see Springer, 1993). They have been assumed to be related to the Clinidae/Labrisomidae/Chaenopsidae clade and to the Blenniidae (Springer, 1993), and these relationships are tested in the present study. The relationship between the two subfamilies (Lepidoblenninae and Tripterygiinae; Table I) is also examined. A fossil species (Tripterygion pronasus) has been described in Miocene deposits from the Mediterranean Sea (Arambourg, 1927; Wirtz, 1980) and one of the members of this genus is included in this study.
F. Relationships of the Family Blenniidae
The combtooth blennies (Fig. 1E), family Blenniidae, are widely distributed in the Atlantic, Indian, and Pacific Oceans and the Mediterranean Sea. Blennies are a species-rich group and are defined by the synapomorphies of their comb-like teeth (in most), the nonprotractile premaxillae, the ankylosed coracoid, and a vertical pair of processes on each side of the urohyal (Springer, 1993). Six tribes are recognized, some of which are undefined by morphological synapomorphies (Table I); four are included here. Some tribal relationships were hypothesized by Smith-Vaniz (1976). This study tests relationships among these tribes, as well as the monophyly of two of them (Salariini and Parablenniini).
G. Relationships of the Family Dactyloscopidae
The family Dactyloscopidae (sand stargazers; Fig. 1F) is found exclusively in warm temperate and tropical marine waters of the New World. Dactyloscopids are well characterized by several synapomorphies, including a unique branchiostegal pump, finger-like elements on the upper edge of the gill cover, and lack of vomerine teeth (Table I; see Springer, 1993). Springer and Freihofer (1976) and Springer (1993) placed the dactyloscopids in the Blennioidei, but various researchers have included them in other groups. Inclusion of Dactyloscopidae in the Blennioidei is tested in the present study.
H. Relationships with Other Suborders
Relationships of blenny-like perciform fishes have been debated in modern ichthyology (see Gosline, 1968, 1971; Greenwood et al., 1966; Rosenblatt, 1984; Anderson, 1994). The four blenniiform suborders recognized by Nelson (1994), Blennioidei, Zoarcoidei, Notothenioidei, and Trachinoidei, have been regarded as being closely related. A possible synapomorphy is that the pelvic fin, when present, originates in front of the pectorals in all species of the four suborders (see Springer, 1993; Nelson, 1994). However, morphological characters suggesting their relationships, including this fin placement, may alternatively be due to evolutionary convergence for bottom-dwelling modes of life (Rosenblatt, 1984). Whether these four groups represent monophyletic lineages, are each other's closest relatives, or have closer affinities with other groups is presently uncertain. The relationships among three of these suborders, Blennioidei, Zoarcoidei, and Notothenioidei, and some of their component families are examined in this study.

Members of the suborder Zoarcoidei (Fig. 1G, H, and I) are united by having a single nostril, loss of the basisphenoid, and the structure of the adductor mandibulae (Anderson, 1994). The zoarcoids are found primarily in the North Pacific (Table II; Anderson, 1994; summary in Nelson, 1994). In the present study, relationships among the families Zoarcidae (Fig. 1I), Stichaeidae (Fig. 1G), and Pholidae (Fig. 1H) are tested.

The perciform suborder Notothenioidei (Fig. 1J and K) contains biochemically derived low-temperature specialists (Eastman, 1993) that are primarily found in coastal Antarctica. Analysis of this group using molecular characters may aid in the understanding of the biogeographic origins of modern Antarctic fish fauna. Notothenioids are united by having one nostril on each side of the head and by the loss of one pectoral actinost (Table II; summarized in Eastman, 1993; Miller, 1993; Nelson, 1994). A fossil notothenioid has been described from the late Eocene of Antarctica (Balushkin, 1994). Relationships of the notothenioids to blennioids are problematic as there are no known morphological synapomorphies linking them (Eastman and Grande, 1989). The study described in this chapter also examines the relationship between the notothenioid families Nototheniidae (Fig. 1J) and Bathydraconidae (Fig. 1K). One of the hypotheses examined is whether Pagothenia is an early offshoot of the Nototheniidae, as projected by Eastman and Grande (1989).

II. Materials and Methods

A. Collection of Specimens
Fishes were collected by netting intertidally with use of the anesthetic quinaldine or subtidally by hand
TABLE II
Summary of Outgroup Taxa Sequenced^a

Taxon (genera sequenced) | N taxa | Distribution | Characters
1. Suborder Zoarcoidei | 8 families, 95 genera, 318 species | Marine; primarily North Pacific | Single nostril; no known synapomorphies
   A. Family Stichaeidae (Pricklebacks): Dictyosoma, Plectobranchus | 36 genera, 65 species | Marine; primarily North Pacific, a few North Atlantic | Elongate dorsal fin
   B. Family Pholidae (Gunnels) | 4 genera, 14 species | Marine; North Atlantic and North Pacific | Elongate dorsal fin; small pectoral fins; rudimentary or no pelvic fin
      Subfamily Apodichthyinae: Apodichthys (Xererpes) | 2 genera, 4 species | |
      Subfamily Pholinae: Pholis | 2 genera, 10 species | |
   C. Family Zoarcidae (Eelpouts): Lycodes (Aprodon), Lycodichthys, Zoarces | 45 genera, 220 species | Marine; most North Atlantic and North Pacific | All with single nostril; postorbital lateralis canal ends at the lateral extrascapulars, free of the pelvic bone
2. Suborder Notothenioidei | 5 families, 46 genera, 122 species | Many Antarctic endemics | Pelvic fins with one spine; single nostril on each side; three flat, plate-like pectoral radials
   A. Family Nototheniidae (Cod icefishes) | 17 genera, 50 species | Marine; coastal Antarctic and southern hemisphere | Gill membranes in fold across isthmus; body scaled; mouth protractile
      1. Subfamily Nototheniinae: Notothenia | 8 genera, 30 species | |
      2. Subfamily Trematominae: Pagothenia, Trematomus (Pseudotrematomus) | 4 genera, 14 species | |
   B. Family Bathydraconidae (Dragonfishes): Gymnodraco, Parachaenichthys | 10 genera, 16 species | Marine; Antarctic | Gill membranes united; mouth nonprotractile; no spinous dorsal fin

a Based on Miller (1993), Nelson (1994), and Anderson (1994).
nets while scuba diving. Specimens were sacrificed either by freezing in liquid nitrogen or on dry ice or were placed directly in 95% ethanol. Notothenioids and some zoarcoids were obtained from frozen tissue collections of George Somero (Hopkins Marine Laboratory, Pacific Grove, California). All frozen samples were stored at -80°C until use. For large specimens, either liver or muscle tissue was used for DNA extractions. For small specimens, the gut was removed and one side of the fish was used. Voucher specimens were formalin-preserved when sufficient in number and many were deposited in the Marine Vertebrates Collections at Scripps Institution of Oceanography, University of California, San Diego.
B. Preparation of DNA, Amplification, and Sequencing
Frozen tissues were pulverized in liquid nitrogen using a cylindrical stainless-steel mortar and pestle. Ethanol-preserved tissues were wrapped in foil, placed in liquid nitrogen, and pulverized with a hammer. DNA was extracted in a guanidine thiocyanate buffer (Perbal, 1988) to circumvent degradation, purified using proteinase K, RNase, phenol, and chloroform, and then precipitated following methods used in the authors' laboratory (Stepien et al., 1993; Stepien, 1995). A small sample of the DNA was run on a mini-gel to verify relative amounts and quality. Mitochondrial (mt) DNA primers used included 12S light strand 5'-AAACTGGGATTAGATACCCCACTAT-3' and 5'-GTCAGGTCAAGGTGTAGCAAT-3' and 12S heavy strand 5'-AGGAGGGTGACGGGCGGTGTGT-3' from Kocher et al. (1989) and Titus and Larson (1995). The primer for the heavy mitochondrial strand was end labeled with biotin (Hultman et al., 1989) for later separation of the double-stranded polymerase chain reaction (PCR) product by means of Dynal streptavidin magnetic beads (Dynal Corp.). Procedures, amounts of reagents, and buffers followed the Perkin-Elmer protocol in their AmpliTaq DNA polymerase kit (Perkin-Elmer Inc., N808-0167). Typical amplification parameters were 35 cycles of denaturation at 96°C for 45 sec, annealing at 53°C for 55 sec, and polymerization at 72°C for 90 sec. Amplified DNA was then bound to Dynabead M-180 streptavidin (Dynal Corp.), which produced high yields of purified, single-stranded template DNA for sequencing (Hultman et al., 1989; Uhlen, 1989). Sanger dideoxy sequencing (Sanger et al., 1977) was performed by means of Sequenase II and PCR product sequencing kits (Amersham/U.S. Biochemical Corp.), using the complementary primer and the purified, single-stranded DNA as a template. Samples from sequencing reactions were run on 6% acrylamide gels at a constant temperature of 50°C and approximately 2500 V. Samples were usually run on three separate gels for 2.5, 5, and 8 hr in order to resolve sequences at various distances up to 500 bp from the primer. Gels were transferred to blotting paper, dried for 2 hr, and visualized by autoradiography after 72 hr or longer of exposure to Kodak X-OMAT film. Sequences from gels were read into a Macintosh computer using an IBI/Kodak digitizer and MacVector-AssemblyLIGN software (International Biotechnologies, Inc., 1992).
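As a rough check on the amplification profile described above, the programmed hold time can be tallied directly. The short Python sketch below sums only the three stated steps over 35 cycles; ramp times between temperatures and any initial denaturation or final extension step are not given in the text and are therefore left out, and the variable names are illustrative only.

```python
# Hold times (seconds) for one amplification cycle, as stated in the text.
# Ramp times and any initial/final steps are NOT included (not given in the text).
CYCLE_STEPS = {
    "denaturation (96 C)": 45,
    "annealing (53 C)": 55,
    "polymerization (72 C)": 90,
}
N_CYCLES = 35

# Sum the three holds, then scale by the number of cycles.
seconds_per_cycle = sum(CYCLE_STEPS.values())
total_seconds = seconds_per_cycle * N_CYCLES

print(f"{seconds_per_cycle} s of programmed hold time per cycle")
print(f"{total_seconds} s in total ({total_seconds / 60:.1f} min) over {N_CYCLES} cycles")
```

The actual run time on a thermal cycler would be longer than this figure, because heating and cooling between steps add to every cycle.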
C. Alignment and Data Analysis
Sequences were aligned with each other using MacVector and AssemblyLIGN IBI/Kodak sequence analysis software and by hand. Pairwise (p) genetic distances, which are the proportion of nucleotide sites differing between each pair of sequences, were calculated using Phylogenetic Analysis Using Parsimony (PAUP* 4.0) (Swofford, 1996), and their standard errors were determined using MEGA (Kumar et al., 1993). Neighbor-joining (Saitou and Nei, 1987) clustering analyses were used to generate distance trees from the p distances using PAUP* 4.0 (Swofford, 1996). Support of the data set for nodes of the trees was determined by 100 bootstrap replications, and a standard error test for the interior branch lengths of the neighbor-joining tree was conducted using MEGA (Kumar et al., 1993).
To provide a very rough comparison of possible relative divergence times, a "conventional" mtDNA calibration rate of 1% sequence divergence per million years (myr) for an ectothermic animal was used (Brown et al., 1979; Avise, 1994). Preliminary results indicated that 12S rDNA sequences appeared to evolve in blennioids at moderately rapid rates in comparison with other mtDNA regions. Caution should be used with such extrapolations to evolutionary times because different nucleotide positions and genes within mtDNA may evolve at varying rates within some lineages (Gillespie, 1986; Moritz et al., 1987) and the pace of mtDNA evolution has been linked to differences in metabolic rate and/or to body size differences in some groups (Thomas and Beckenbach, 1989; Martin et al., 1992; Rand, 1994; see Section IV). Most of the blennioids examined in the present study were approximately similar in size, ranging from about 4 to 8 cm TL; inhabit similar nearshore habitats, from intertidal to approximately 30 m in depth; and are warm temperate to tropical species (see Section I and Table 1). Exceptions in this study are members of the notothenioid outgroup and the zoarcid Lycodichthys, which inhabit the much colder waters of Antarctica, and some zoarcids (i.e., Lycodes and Zoarces) and stichaeids (Plectobranchus), which inhabit deeper, colder waters of temperate regions. These taxa thus have markedly lower metabolic rates, which may influence the rates of mitochondrial evolution (see review by Rand, 1994). In the present study, approximate divergence estimates were compared with independent estimates from the fossil record, geologic events, and other genetic distance studies, including DNA and allozyme analyses, where available. For groups of taxa analyzed with both mtDNA and allozyme (Stepien and Rosenblatt, 1991; Stepien, 1992) data, regression analysis (SPSS, 1992, version 5.0.1) was used to compare the p distances with Nei's (1972) D values.
Maximum parsimony in the PAUP* 4.0 program (Swofford, 1996) was the primary method used to analyze relationships from the blennioid DNA sequences. Characters were coded as unordered, and uninformative characters and missing data were excluded. Deletions were treated as single, independent characters. Because of the size of the entire data set, 50 separate heuristic searches with random input order of taxa were used to analyze it for all taxa. The trees were rooted to the Notothenioidei and Zoarcoidei.
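The p distance defined in this section (the proportion of compared sites at which two aligned sequences differ) can be illustrated with a minimal Python sketch. It is not the PAUP* or MEGA implementation. Skipping sites with gaps or ambiguity codes and using a simple binomial standard error are assumptions made here for illustration, and the two short sequences are hypothetical.

```python
import math


def p_distance(seq_a, seq_b, skip_chars="-?N"):
    """Return (p, standard_error, n_compared) for two aligned sequences.

    Assumption: sites with a gap or ambiguity code in either sequence
    are skipped rather than scored as differences.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in skip_chars or b in skip_chars:
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = differing / compared
    # Binomial standard error of a proportion (an assumption; other
    # estimators exist and may give slightly different values).
    se = math.sqrt(p * (1.0 - p) / compared)
    return p, se, compared


# Hypothetical 10-bp aligned fragments that differ at two sites.
p, se, n = p_distance("ACGTACGTAC", "ACGTTCGTAA")
print(f"p = {p:.3f}, SE = {se:.3f}, sites compared = {n}")
```

Applied to every pair of taxa, a function of this kind yields the distance matrix that a clustering method such as neighbor joining takes as input.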
After initial PAUP heuristic searches of all taxa were completed, individual families and clades of families were analyzed separately using either exhaustive searches or the branch-and-bound algorithm (Hendy and Penny, 1982). Members of the sister family and several other outgroup taxa, determined from the prior heuristic searches of all taxa, were designated as outgroups. Independent searches tested different relative weightings for transversions and transitions, according to their relative frequencies in the data set, as well as for insertions and deletions. Consistency indices (CIs), lengths of the most-parsimonious and near-most-parsimonious trees, and strict and 50% majority-rule consensus trees were used to evaluate competing phylogenies. Support of the data set for nodes was estimated with 500 bootstrap replications of the data set and either the branch-and-bound algorithm (Hendy and Penny, 1982) or heuristic searches, when the size of the data set precluded the former.
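The bootstrap replications mentioned above rest on resampling alignment columns with replacement. The following Python sketch shows only that resampling step, under stated assumptions: the taxon names and sequences are hypothetical, and the tree search that would be run on each pseudoreplicate (branch-and-bound or heuristic in the study) is not reproduced here.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Return one bootstrap pseudoreplicate of an alignment.

    alignment: dict mapping taxon name -> aligned sequence (equal lengths).
    Columns (sites) are drawn with replacement until the replicate has
    the same number of sites as the original alignment.
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all sequences must have the same aligned length")
    n_sites = lengths.pop()
    columns = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {
        taxon: "".join(seq[i] for i in columns)
        for taxon, seq in alignment.items()
    }


# Hypothetical toy alignment; names and bases are illustrative only.
toy_alignment = {
    "taxon_A": "ACGTACGT",
    "taxon_B": "ACGTTCGT",
    "taxon_C": "ACCTTCGA",
}
replicate = bootstrap_alignment(toy_alignment, random.Random(42))
for taxon, seq in replicate.items():
    print(taxon, seq)
```

Bootstrap support for a node is then the percentage of pseudoreplicate trees in which that node is recovered.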
Distance clustering trees, such as neighbor joining, are based on reducing the character-state data set to a single number (the p distances here) between each pair of taxa. Although they are useful for comparing overall amounts of sequence divergence, as used in this study, distance models are generally not regarded as a rigorous approach for evaluating and comparing phylogenies. In contrast, maximum parsimony analyses are based on character state changes throughout the data set and allow competing phylogenies to be systematically compared (see discussions by Avise, 1994; Swofford et al., 1996). For this reason, in cases of discrepancy between the two types of trees in the present study, the parsimony tree was regarded as the more likely evolutionary scenario.
The authors also tested for possible unequal rates of nucleotide evolution due to the secondary structure in the paired (stem) versus unpaired (loop and single-stranded) elements of the mitochondrial 12S ribosomal DNA, as have been found in some other studies of nuclear and mitochondrial ribosomal DNA sequences (Wheeler and Honeycutt, 1988; Vawter and Brown, 1993; Orti et al., 1996). These influences may bias phylogenetic results (Hillis and Dixon, 1991; Dixon and Hillis, 1993; Orti et al., 1996). The authors' aligned mitochondrial 12S rDNA sequences were compared with secondary structure models for Homo sapiens (Neefs et al., 1991) and piranhas (Teleostei: Characiformes: Characidae: Serrasalminae; Orti et al., 1996) to formulate a model of blennioid secondary structure for Paraclinus integripinnis, following methods used by Orti et al. (1996). This model (Fig. 3) was then used to identify paired and unpaired regions for the other taxa according to the aligned sequences. Base composition, numbers of variable positions, transition versus transversion ratios, and numbers of informative characters (from PAUP* 4.0; Swofford, 1996) were compared in the two types of structural elements versus the entire data set. Relative rates of nucleotide substitution were determined by dividing the number of changes in the paired versus unpaired regions by the number of nucleotides in each region, following Orti et al. (1996). Frequencies of variations were compared between paired versus unpaired regions using contingency table tests (Siegel and Castellan, 1988). Separate neighbor-joining and parsimony analyses were conducted (as discussed earlier) on subsets of data for the paired versus unpaired elements using PAUP* 4.0 (Swofford, 1996). Resulting trees were then compared with each other and with analyses based on the whole data set (see earlier discussion). DNA sequences were deposited in GenBank (accession numbers U90356-U90414).
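The transition versus transversion comparison described above depends on classifying each substitution by base type: purine to purine or pyrimidine to pyrimidine is a transition, and any purine/pyrimidine exchange is a transversion. A minimal pairwise sketch in Python follows. It is an illustration only, since the study's counts came from PAUP* and were not necessarily tallied pairwise, and the example sequences are hypothetical.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def count_ti_tv(seq_a, seq_b):
    """Count transitions and transversions between two aligned sequences.

    Sites that are identical, gapped, or ambiguous are ignored
    (an assumption made for this sketch). RNA 'U' is treated as 'T'.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = 0
    transversions = 0
    a_norm = seq_a.upper().replace("U", "T")
    b_norm = seq_b.upper().replace("U", "T")
    for a, b in zip(a_norm, b_norm):
        if a == b or a not in "ACGT" or b not in "ACGT":
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    return transitions, transversions


# Hypothetical 8-bp fragments: four transitions and two transversions.
ti, tv = count_ti_tv("AACCGGTT", "GTTCAGCA")
print(ti, tv, ti / tv)
```

The same division gives the ratios in Table III; for example, the Blennioidei totals of 249 transitions and 164 transversions give 249/164, or about 1.52.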
III. Results
The aligned mitochondrial 12S rDNA data set for 59 blennioid and outgroup taxa used for analysis consists of 400 bp. Table III indicates the numbers of transitional and transversional substitutions per family and suborder. These ratios are approximately consistent among taxa at the levels of families and suborders (Table III), comprising 60% transitions and 40% transversions in the entire suborder Blennioidei and 58% transitions and 42% transversions in the outgroups (Zoarcoidei and Notothenioidei combined). The sole familial exception is a preponderance of transversions in the Nototheniidae. Transition:transversion ratios vary considerably within the groups of congeners analyzed, ranging from 0.58 in Labrisomus (N substitutions = 27) and 0.83 in Gibbonsia (N substitutions = 22) to 1.9 in Entomacrodus (N substitutions = 26) and 4.0 in Lycodes (N = 15), in contrast to their more stable proportions at the familial level (Table III). Differential weighting schemes, including weighting transversional:transitional substitutions 3:2 (determined from their relative proportions, see earlier discussion) and insertions/deletions 3:1 and 10:1, did not change the most-parsimonious trees in the PAUP analyses and are not shown.
TABLE III Numbers of Transitional and Transversional Base Substitutions in Families and Suborders(a)
Taxon                        N transitions   N transversions   Ratio   Total
Family Clinidae                   113               73          1.55     186
Family Labrisomidae               115               75          1.53     190
Family Chaenopsidae                48               35          1.37      83
Family Tripterygiidae              76               42          1.80     118
Family Blenniidae                 132               81          1.63     213
Suborder Blennioidei              249              164          1.52     413
Family Stichaeidae                  8                6          1.33      14
Family Pholidae                    15                9          1.67      24
Family Zoarcidae                   27               14          1.93      41
Suborder Zoarcoidei                59               35          1.69      94
Family Nototheniidae               20               22          0.91      42
Family Bathydraconidae             13                8          1.63      21
Suborder Notothenioidei            40               38          1.05      78
(a) Ratio is transitions/transversions. There are no significant differences in the proportions of transitions and transversions among blennioid families (χ² = 1.0, df = 4, p > 0.90), among zoarcoid families (χ² = 0.35, df = 2, p > 0.90), at the familial versus blennioid suborder level (χ² = 0.12, df = 1, p > 0.70), or among the three suborders (χ² = 2.7, df = 2, p > 0.50).
Figure 3 shows the secondary structure model for the blennioid P. integripinnis. The first 54 bp of the blennioid data set was not used in constructing the model or for further structural comparisons because of difficulty in aligning it to the piranha sequences (Orti et al., 1996; see Section II) and, consequently, in determining secondary structure. Paired elements of the 12S blennioid sequence data have fewer nucleotide changes (64 of 169 sites vary, equal to 40% of the overall variability) than do the unpaired regions (96 of 177 sites vary, equal to 60% of the overall variability). Proportions of variable sites are significantly greater in the unpaired regions (χ² = 8.7, df = 1, P < 0.005). Paired regions have a slightly greater proportion of phylogenetically informative characters (113 of 169, equal to 55% of the number of informative characters in the entire data set) than do unpaired regions (93 of 177, equal to 45% of the total number of informative characters), which is a significant difference (χ² = 7.4, df = 1, P < 0.01). The transition:transversion ratio is slightly (but not significantly) higher in the paired (2.5:1.0) than in the unpaired (1.6:1.0) areas, and the former are thus somewhat (but not significantly) less saturated (χ² = 2.2, df = 1, P < 1.0).
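The comparison of variable sites in paired versus unpaired regions is a 2 x 2 contingency test, which can be sketched in plain Python from the counts given above (64 of 169 paired sites and 96 of 177 unpaired sites vary). The text does not state whether a continuity correction was applied, so the Yates correction used here is an assumption; with it, the statistic comes out close to the reported value.

```python
def chi_square_2x2(a, b, c, d, yates=True):
    """Chi-square statistic (df = 1) for the 2 x 2 table [[a, b], [c, d]].

    With yates=True the Yates continuity correction is applied
    (an assumption; the source does not say which form was used).
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("table has an empty row or column")
    diff = abs(a * d - b * c)
    if yates:
        diff = max(0.0, diff - n / 2.0)
    return n * diff ** 2 / (row1 * row2 * col1 * col2)


# Rows: paired, unpaired. Columns: variable, invariant sites.
paired_variable, paired_total = 64, 169
unpaired_variable, unpaired_total = 96, 177
chi2 = chi_square_2x2(
    paired_variable, paired_total - paired_variable,
    unpaired_variable, unpaired_total - unpaired_variable,
)
print(f"chi-square = {chi2:.2f}")
```

Without the correction (yates=False) the same table gives a somewhat larger statistic, so the choice affects the value but not the conclusion at the 0.005 level.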
There are significant biases in nucleotide composition within the paired (24.9% G, 14.1% A, 24.9% T, 31.1% C; χ² = 227, df = 3, P < 0.0001) and unpaired (12.0% G, 40.3% A, 22.2% T, 25.5% C; χ² = 1374, df = 3, P < 0.0001) sequence regions. These nucleotide proportions are also significantly different between the paired and the unpaired areas (χ² = 1032, df = 3, P < 0.0001). Paired elements are significantly richer in guanine and cytosine nucleotides (56%), whereas unpaired sites have significantly greater numbers of adenine and thymine bases (62%; χ² = 562, df = 1, P < 0.005). Separate neighbor-joining and parsimony analyses showed only slight variations in tree topologies among paired, unpaired, and combined data sets and are thus not included.
The neighbor-joining distance tree of all blennioid genera for the entire data set, based on p distances (PAUP* 4.0; Swofford, 1996), is shown in Fig. 4. Percentages on the nodes of the trees in this study indicate bootstrap support above 50%. Figure 5 is a summary of familial groupings from a strict consensus of most-parsimonious trees, calculated using all genera and 50 independent repeated heuristic searches with PAUP* 4.0 (Swofford, 1996). Parsimonious relationships among tribes, species, and genera are shown in greater detail in Figs. 6 and 7. The neighbor-joining (Fig. 4) and parsimony trees (Figs. 5 and 6) for blennioids are similar, but differ from each other in the positions of the family Dactyloscopidae and of the labrisomid tribe Mnierpini. In neighbor joining, the dactyloscopid is closest to the family Tripterygiidae. In parsimony analyses (Figs. 5 and 6), Dactyloscopidae is the basal clade in the suborder Blennioidei. In the neighbor-joining tree (Fig. 4), Mnierpini is genetically closest to the North American myxodin clinids (Gibbonsia and Heterostichus). In the parsimony analyses (Fig. 6A), Mnierpini is depicted as the sister taxon to the clade containing the other labrisomids (and the chaenopsids), and this entire clade is then the sister group of a monophyletic Clinidae. Both neighbor-joining and parsimony analyses group the "family Labrisomidae" as paraphyletic and the Chaenopsidae as a monophyletic group contained within it. The neighbor-joining (Fig. 4) and parsimony trees (Fig. 6) also differ in some cluster relationships among the clinid and labrisomid tribes, which are separated by relatively short genetic distances. Because the standard errors of these short branch lengths in the neighbor-joining analysis are high (MEGA analyses; Kumar et al., 1993), this tree cannot adequately distinguish the order of these higher-level relationships. This may be due either to site saturation (swamping of transitions; see Brown et al., 1979), which does not appear to be the case here, or to rapid taxon divergence rates.
Figure 6A is the 50% majority-rule consensus tree of the most-parsimonious trees from a branch-and-bound search of the families Clinidae, Labrisomidae, and Chaenopsidae. A 50% majority-rule consensus of branch-and-bound search maximum parsimony trees depicting the relationships of the families Dactyloscopidae, Tripterygiidae, and Blenniidae is shown in Fig. 6B. A single most-parsimonious tree was obtained from a branch-and-bound search for the relationships of the suborders Zoarcoidei and Notothenioidei and is shown in Fig. 7. Separate exhaustive searches were also conducted for each family, and results are indicated in the legends of Figs. 6 and 7. Figure 8 shows results of the regression analysis of Nei's (1972) D from allozyme studies (reported in Stepien and Rosenblatt, 1991; Stepien, 1992) versus p distances from 12S mtDNA sequences.
FIGURE 3 Secondary structure of the labrisomid Paraclinus integripinnis, showing paired and unpaired regions. Secondary structure was not determined for the first 54 bases of the blennioid dataset (see Section III).
FIGURE 4 Neighbor-joining distance tree (PAUP* 4.0; Swofford, 1996) using p distances for 12S mtDNA sequence data for all taxa. Branch lengths are indicated by decimals, where space is available, and can be calculated by length comparisons for others. Distances among taxa may be estimated by adding the branch lengths. Bootstrap values above 50% are shown as percentage support for nodes.
A. Parsimonious Relationships of the Families Clinidae, Labrisomidae, and Chaenopsidae
The chaenopsids, labrisomids, and clinids together form a monophyletic clade in the parsimony analyses (Figs. 5 and 6) and are also most closely related to each other in the neighbor-joining tree (Fig. 4). Maximum parsimony trees (Figs. 5 and 6A) show that the Clinidae and Chaenopsidae are each monophyletic, but the Labrisomidae is paraphyletic and contains the Chaenopsidae (Fig. 6A). Members of the egg-laying tribe Myxodini are located basally within the family Clinidae in the parsimony analyses (Fig. 6A), but do not form a separate sister clade to the live-bearing tribes Clinini and Ophiclinini. Instead, the North American clinids form a monophyletic basal clade among the myxodins, and this clade is the sister group to Myxodes and the remaining clinids. Myxodes is then the sister taxon of the monotypic Mediterranean Clinitrachus and the live-bearing tribes. Next, Clinitrachus is the sister taxon to the clade containing the tribes Clinini and Ophiclinini. Within the Labrisomidae, the tribe Mnierpini (Mnierpes macrocephalus) is depicted as the basal taxon and as the sister group of the other labrisomids, as well as the Clinidae. The tribe Starksiini is shown as the next most basal labrisomid group and is then the sister group of the remaining labrisomids. The tribe Labrisomini (comprising the genera Labrisomus and Malacoctenus) is monophyletic and is the sister group of a clade grouping the tribes Neoclinini and Cryptotremini together. The genera Paraclinus and Exerpes make up the monophyletic tribe Paraclinini. The family Chaenopsidae is depicted as a monophyletic clade within the Labrisomidae, and the relationships among Stathmonotus, the Paraclinini, and the Chaenopsidae are not resolved in these analyses. Within Chaenopsidae, Chaenopsis is the sister group to the Acanthemblemaria clade, and Emblemaria is then the sister group of the entire clade.
FIGURE 5 Consensus of three most-parsimonious trees summarizing primary taxonomic groupings from 50 heuristic searches of all genera using PAUP* 4.0 (Swofford, 1996), excluding uninformative characters. The topology of these major clades was identical in all three most-parsimonious trees (CI excluding uninformative characters = 0.30, length = 1320 steps).
FIGURE 6 Most-parsimonious (MP) trees from branch-and-bound searches for (A) the families Clinidae, Labrisomidae, and Chaenopsidae (rooted to tripterygiid, blenniid, and dactyloscopid outgroups; CI excluding uninformative characters = 0.41, length = 756 steps) and (B) the families Tripterygiidae, Blenniidae, and Dactyloscopidae (rooted to clinid, labrisomid, and chaenopsid outgroups). Bootstrap values are shown as percentage support for nodes. The trees were first analyzed using the basal species for each genus with more than one species represented, i.e., Heteroclinus, Gibbonsia, Labrisomus, and Malacoctenus in tree A and Hypsoblennius, Entomacrodus, and Omobranchus in tree B. Tree A contains a trichotomy, based on strict consensus of relationships among a labrisomid-chaenopsid clade. Separate exhaustive searches were conducted using all species. These had identical topologies for the families Clinidae (rooted to Paraclinus, Labrisomus, Emblemaria, Starksia, and Mnierpes; one MP tree; CI = 0.53), Labrisomidae/Chaenopsidae (rooted to Heterostichus, Axoclinus, and Omobranchus; two MP trees; CI = 0.43), Chaenopsidae (rooted to Paraclinus, Labrisomus, Starksia, and Mnierpes; one MP tree; CI = 0.64), Tripterygiidae (rooted to Omobranchus, Rhabdoblennius, Starksia, and Mnierpes; one MP tree; CI = 0.62), and Blenniidae (rooted to Axoclinus and Myxodagnus; four MP trees, which differed in the relationships among species of Omobranchus and in relative positioning of Ecsenius and Rhabdoblennius; CI = 0.52).
B. Relationships of the Families Dactyloscopidae, Tripterygiidae, and Blenniidae
The family Dactyloscopidae is the basal group of the blennioids in the parsimony analyses (Figs. 5 and 6B) and is very divergent in the neighbor-joining tree (Fig. 4). The dactyloscopid sequenced has one deletion and one insertion that distinguish it from all other taxa analyzed in this study. The families Tripterygiidae and Blenniidae are monophyletic and group together as sister taxa in the most-parsimonious branch-and-bound tree (Fig. 6B). In the overall parsimony analysis (Fig. 5), the family Blenniidae is the basal taxon to the clade containing the Tripterygiidae as the next most basal group. The single most-parsimonious tree (Fig. 6B) shows members of the tripterygiid tribe Leptoblenninae (Axoclinus and Karalepis) as paraphyletic basal groups to the monophyletic tribe Tripterygiinae (Notoclinus, Tripterygion, and Rosenblatella). Two primary sister clades are found in the family Blenniidae: one contains the tribe Parablenniini and most of the tribe Salariini (Rhabdoblennius, Ecsenius, and Entomacrodus) as sister taxa, and the other contains the tribe Omobranchini as the sister taxon to a clade containing the tribe Nemophini and the remaining salariin (Ophioblennius).
FIGURE 7 Single most-parsimonious tree obtained from a branch-and-bound search for relationships of the suborders Zoarcoidei and Notothenioidei (rooted to a dactyloscopid, a tripterygiid, and two blenniids; CI excluding uninformative characters = 0.62, length = 451 steps). Bootstrap values are shown as percentage support for nodes. A separate exhaustive search of the Zoarcoidei (rooted to Omobranchus, Myxodagnus, Notothenia, and Gymnodraco) yielded a single most-parsimonious tree (CI excluding uninformative characters = 0.66, length = 319 steps), which was identical to the topology of the tree shown. A separate exhaustive search of the Notothenioidei (rooted to Omobranchus, Myxodagnus, Zoarces, and Pholis) yielded a single most-parsimonious tree (CI excluding uninformative characters = 0.74, length = 309 steps), which was identical to the topology of the tree shown.
FIGURE 8 Regression analysis of Nei's (1972) genetic distances (D) from allozyme data (Stepien and Rosenblatt, 1991; Stepien, 1992) versus p distances from 12S mtDNA sequences (see Fig. 4). F = 19.42, P < 0.0002.
C. Relationships of the Suborders Zoarcoidei and Notothenioidei
The two outgroup suborders, Zoarcoidei and Notothenioidei, are monophyletic and are sister taxa in the most-parsimonious trees (Figs. 5 and 7). They are also most closely related to each other by genetic distances in the neighbor-joining tree (Fig. 4). The zoarcoid families tested (Zoarcidae, Stichaeidae, and Pholidae) are each monophyletic, and Pholidae is the basal clade to a sister group comprising Zoarcidae and Stichaeidae (Fig. 7). A unique insertion unites the notothenioids, and this study depicts the families Nototheniidae and Bathydraconidae as sister groups.
IV. Discussion
A. Molecular Features of the Data Set
The secondary structure of blennioid 12S rDNA (Fig. 3) is highly consistent with that of other vertebrates (Neefs et al., 1991), especially other teleost fishes (Orti et al., 1996). Base compositional biases in the paired (G/C rich) and unpaired (A/T rich) regions have also been found in nuclear srDNA (Vawter and Brown, 1993) and other mitochondrial 12S rDNA (Orti et al., 1996) data sets. The G/C bias of the paired elements is believed to increase ribosomal subunit structural stability, and the A/T bias of the unpaired regions is thought to facilitate protein binding (Gutell et al., 1985). Transitions outweigh transversions in both the paired and the unpaired regions of the 12S rDNA blennioid data set, a bias found in all studies of mitochondrial DNA reviewed by Meyer (1993). Transition:transversion ratios are similar among sets of congeners, families, and suborders in the authors' data set (see Section III and Table III), and consistency of these ratios may suggest a retention of the phylogenetic signal at the various hierarchical levels. Differential weighting of transversions:transitions (3:2, according to their relative frequencies) for the entire sequence data set and using transversions only did not change the topologies of parsimonious trees.
The secondary structure of the 12S rDNA region does not appear to affect the phylogenetic reconstruction of blennioid taxa. Comparisons of paired and unpaired regions in some other studies have suggested that unpaired regions produce more reliable phylogenies (Wheeler and Honeycutt, 1988; Vawter and Brown, 1993; Orti et al., 1996). Dixon and Hillis (1993) suggested that when relative rates of evolution are markedly different in paired and unpaired regions, weighting may be used to compensate. In the authors' study, separate parsimony and neighbor-joining analyses of data from the paired and unpaired structural regions resulted in few changes to the overall tree topologies, compared with those based on the entire sequence (see Section III). These variant trees split some clades that are well characterized on the basis of morphology, and the separate analyses thus appeared to sacrifice the overall number of informative characters necessary to resolve these relationships. Orti et al. (1996) found that small subunit nuclear rDNA unpaired regions evolved four times as fast as paired regions in piranha taxa.
In comparison, the authors' results indicate that blennioid unpaired regions evolve more slowly, less than two times as fast as the paired regions. Informative characters are thus more evenly distributed between the paired and the unpaired elements in the authors' data set (see Section III). The entire 12S rDNA sequence data set of blennioid taxa in this study, including the paired and unpaired structural elements, contains phylogenetically informative characters.
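The "less than two times as fast" statement can be checked against the counts reported in Section III. The Python sketch below follows the approach described in Section II of dividing changes by the number of nucleotides in each region. Using the number of variable sites (64 of 169 paired, 96 of 177 unpaired) as a stand-in for the number of changes is an assumption made for illustration.

```python
def relative_rate(variable_sites, total_sites):
    """Proportion of sites in a region that vary among taxa."""
    if total_sites <= 0:
        raise ValueError("total_sites must be positive")
    return variable_sites / total_sites


# Counts from Section III; variable sites are used here as a proxy
# for the number of changes (an assumption of this sketch).
paired_rate = relative_rate(64, 169)
unpaired_rate = relative_rate(96, 177)
rate_ratio = unpaired_rate / paired_rate

print(f"paired: {paired_rate:.3f}, unpaired: {unpaired_rate:.3f}")
print(f"unpaired/paired ratio: {rate_ratio:.2f}")
```

Under that assumption the unpaired regions vary roughly 1.4 times as often as the paired regions, which is consistent with the statement above and well below the fourfold difference cited for piranhas.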
B. Overall Phylogenetic and Distance Relationships
The neighbor-joining tree based on p distances (Fig. 4) and the most-parsimonious PAUP trees (Figs. 5, 6, and 7) largely support morphological hypotheses for the relationships of these groups (Springer, 1993). A close correspondence exists between p distances based on these sequence data and distance estimates based on allozyme data (Nei's, 1972, D; Stepien et al., 1993; see Fig. 8). These similar distance ratios may suggest similar evolutionary periods of time, especially for the lower taxonomic levels common to both studies. Examples of approximate molecular clock/evolutionary time calibrations are given later in this chapter, but should be regarded with extreme caution because of difficulty in calibrating clocks, possible differences in evolutionary rates among lineages (Gillespie, 1986; Moritz et al., 1987), and possible site saturation (Brown et al., 1979) at the higher taxonomic levels. A rate of sequence divergence of 1% per million years was used for calibration in the present study, which is at the lower end of the conventional range of 1 to 2%, adjusted for ectotherms (reviewed in Avise, 1994). If the rate of molecular evolution of these taxa has been relatively constant with time, then the proportional distances will allow future calibration adjustments. If the phylogenetic signal is "swamped" at higher taxonomic levels by too many substitutions at given sites, then the divergences at the deeper branches of the distance tree (Fig. 4) are underestimated. The relative rate of mtDNA evolution has been postulated to be correlated with differences in metabolic rate, body size, and/or generation time in some animal groups (Thomas and Beckenbach, 1989; Martin et al., 1992; Martin and Palumbi, 1993; Rand, 1994). Martin et al. (1992) examined sharks and Thomas and Beckenbach (1989) tested salmonids, and both studies compared the rates of these cold-blooded animals with those of mammals. These studies suggest that the rate of substitution may be two to five times lower in ectotherms than in endotherms.
However, other studies of marine and freshwater teleosts have identified relatively high and similar rates of mtDNA substitutions in groups inhabiting a variety of different biogeographic temperature zones (Stepien, 1995). For example, Stepien (1995) found that deep-sea teleost fishes (members of the pleuronectid genus Microstomus and the scorpaenid genus Sebastolobus) inhabiting cold waters (approximately 4°C) and having very low metabolic rates (living in the oxygen minimum zone) had high levels of variability in the mtDNA control region (comparable to teleosts inhabiting shallow, warmer waters). It is also possible that the higher mtDNA evolution rates in populations of the species of Microstomus and Sebastolobus examined by Stepien (1995) may be due to the influence of warmer temperatures and/or mutagenic effects of ultraviolet radiation during their pelagic early life history stages (the larval period may extend to 1 year for the Dover sole, M. pacificus). Sharks and sea turtles (which also have slow rates of
mtDNA substitutions; Avise et al., 1992) have less exposure to radiation during early life history stages than do relatively transparent pelagic fish larvae in surface waters. Blennioid fishes typically have pelagic larvae and many have relatively long larval periods (Matarese et al., 1984; Thresher, 1984), e.g., 2 months in myxodin clinids (Stepien, 1986a). It is possible that damage to mtDNA may be extensive during this early life history period, resulting in high mutation rates. In support of using these genetic distances to roughly estimate separation times in this study, there are no marked differences in relative magnitudes separating congeners belonging to different biogeographic temperature regions, including the deep water Lycodes; the temperate shallow water Apodichthys, Gibbonsia, Heteroclinus, Hypsoblennius, Entomacrodus, and Omobranchus; and the shallow water tropical Malacoctenus, Labrisomus, and Starksia (Fig. 4). These results suggest that there may be no direct correlation between habitat temperature and the rate of mtDNA mutations among these ectothermic taxa. Total horizontal genetic distances separating all taxa in the neighbor-joining tree (Fig. 4) are equivalent to a possible divergence of approximately 30.0 ± 3.0 myr, during the mid to late Oligocene epoch (or earlier, if the calibration rate should be increased and/or if site saturation is responsible for underestimation). Distances suggest that the lineage containing the clinids, labrisomids, and chaenopsids stemmed from a common ancestor shared with the other blennioids by 23.0 ± 2.0 to 27.0 ± 3.0 myr. Ancestors of the families Tripterygiidae and Blenniidae may have similarly diverged by approximately 22.0 ± 2.0 and 26.0 ± 2.0 myr, respectively. These distances may suggest a relatively rapid diversification of blennioid higher taxa in a variety of demersal tropical and temperate habitats during the early to mid-Miocene epoch. 
Alternatively, the deeper phylogenetic radiations in this study may erroneously appear to have occurred at approximately similar times due to site saturation of the sequence. This hypothesis may be tested with more slowly evolving nuclear DNA regions, such as the ribosomal array (Stepien et al., 1993). Trees of blennioid familial relationships obtained from nuclear ribosomal DNA ITS-1 spacer sequences appear congruent with those obtained in this study (Stepien et al., 1993), supporting resolution of these relationships with this mitochondrial DNA data set. White (1986, 1989) hypothesized that many modern antitropical distributions, such as that of the family Clinidae (Fig. 2; see Stepien, 1992; Stepien et al., 1993), may have a common paleoclimatic origin in a mid-Miocene, low-latitude warming event, which appears consistent with DNA distances separating the primary
blennioid groups. For example, members of the egg-laying clinid tribe Myxodini and the temperate live-bearing clinids appear to have stemmed from an early to mid-Miocene ancestor approximately 21.0 ± 1.5 myr. In the mid-Miocene, the two live-bearing tribes Ophiclinini and Clinini may have diverged from each other about 16.5 ± 1.5 myr, and the egg-laying myxodin clinids split into North and South American groups by 13.3 ± 1.5 myr (congruent with estimates from allozyme data; Stepien, 1992; Stepien et al., 1993). Similarly, the clade containing the temperate cryptotremin and neoclinin labrisomids seems to have separated from common ancestors shared with tropical labrisomids in the New World about 16.3 ± 1.5 myr. Many of the primarily tropical clades in the labrisomid group on the neighbor-joining tree (Fig. 4) may also have diversified during the hypothesized tropical Miocene warming, including Chaenopsidae (16.3 ± 1.5 myr), Starksiini (15.4 ± 1.5 myr), the enigmatic genus Stathmonotus (15.4 ± 2.0 myr), and Labrisomini (14.6 ± 2.0 myr). Tribes in the family Blenniidae likewise appear to share this divergence pattern, including Salariini (19.3 ± 2.0 myr), Omobranchini (17.3 ± 2.0 myr), Nemophini (17.7 ± 2.0 myr), and Parablenniini (16.8 ± 1.5 myr). The longest single branch divergence within the Blennioidei leads to the Dactyloscopidae (Fig. 4), also suggesting a possible mid-Miocene divergence (approximately 18.4 ± 2.0 myr). Estimated separation times from blennioid groups not directly discussed in this text may be calculated by adding the branch lengths in Fig. 4. Ancestors of the Notothenioidei and Zoarcoidei appear to have separated from a common ancestor shared with the Blennioidei by at least 28.0 ± 3.0 myr. The zoarcoid and notothenioid lineages may have diverged from each other by the early to mid-Miocene, approximately 20.5 ± 2.5 myr. 
According to these estimates, modern zoarcoid groups may have diversified by at least 10.0 ± 0.5 myr and modern notothenioids about 9.2 ± 0.5 myr, the latter following the expansion of a true Antarctic ice cap approximately 15 myr (Van Andel, 1985; White, 1989; other researchers suggest an older date; see summaries by Eastman, 1993; Miller, 1993). In contrast to these estimates of evolutionary time, a fossil notothenioid in Antarctica dated to 38 myr (Balushkin, 1994) lends support to the metabolic rate/temperature hypothesis for a slower rate of mtDNA change in taxa inhabiting colder waters (Thomas and Beckenbach, 1989; Martin et al., 1992; Avise et al., 1992; Rand, 1994). The rate of mtDNA changes may be markedly slower in these cold water outgroups and these taxa may thus be considerably older (perhaps four times those given here). Alternatively, the calibration rate of 1% per million years may underestimate the divergence times for this entire study, although other fossil dates and allozyme data (discussed later) appear to correspond to these estimates.
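The clock arithmetic used in this section can be sketched as follows. The sketch converts a pairwise sequence divergence to a separation time at the 1% per million year calibration, then asks what rate the 38 myr notothenioid fossil would imply. The divergence value of 0.092 is hypothetical, back-calculated from the 9.2 myr estimate quoted above, and treating the fossil as dating the same node is an assumption made only for illustration.

```python
ASSUMED_RATE = 0.01  # pairwise sequence divergence per myr (1% calibration)

def clock_age_myr(divergence, rate=ASSUMED_RATE):
    """Separation time implied by a pairwise divergence at a given rate."""
    return divergence / rate

def rate_implied_by_fossil(divergence, fossil_age_myr):
    """Rate that would make the clock age equal the fossil age."""
    return divergence / fossil_age_myr

# Hypothetical divergence: the value that maps to 9.2 myr at 1% per myr.
divergence = 0.092
fossil_age = 38.0  # myr, fossil date cited in the text (Balushkin, 1994)

age = clock_age_myr(divergence)
slow_rate = rate_implied_by_fossil(divergence, fossil_age)
slowdown = ASSUMED_RATE / slow_rate

print(f"clock age: {age:.1f} myr")
print(f"fossil-implied rate: {100 * slow_rate:.2f}% per myr")
print(f"slowdown factor: {slowdown:.1f}")
```

Under these assumptions the slowdown factor is roughly fourfold, which falls inside the two- to fivefold range reported for ectotherms relative to endotherms.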
C. Evolution and Biogeography of the Family Clinidae Phylogenies of clinids from 12S mtDNA sequences (Figs. 4, 5, and 6A) and allozyme data (Stepien and Rosenblatt, 1991; Stepien, 1992, Stepien et al., 1993) yield trees that show the same ordering of relationships within the family. They differ in that the allozyme tree depicts live-bearing taxa as basal and as more closely related to the labrisomids (Stepien et al., 1993). In the mtDNA parsimony tree (Fig. 6A), the egg-laying myxodins are basal. Neighbor-joining distance analysis (Fig. 4) suggests that live-bearing taxa have the greatest degree of divergence in the family from a common labrisomid ancestor shared with the myxodins and suggests relative timing of divergences that support the allozyme tree. Morphological data (Stepien, 1992; Springer, 1993; Stepien et al., 1993) and molecular data (Stepien et al., 1993; present study) support monophyly of the Clinidae. An exception is the depiction of a close relationship of the labrisomid M. macrocephalus to the North American myxodin clinids in the authors' neighbor-joining tree (Fig. 4), which is not supported by the parsimony analyses (Figs. 5 and 6A) or by the allozyme study (Stepien et al., 1993). Examination of the mtDNA data set reveals no synapomorphies that would place the Mnierpini as part of the Clinidae, to the exclusion of other labrisomids. Inclusion of Mnierpini in the Myxodini appears unlikely based on morphology, but should be further tested and the other mnierpin genus (Dialommus) should be included. Mitochondrial DNA sequence relationships (Figs. 4 and 6A) support the morphological hypothesis that the tribes Ophiclinini and Clinini are sister groups and that inclusion of the ophiclinins (snake blennies) by George and Springer (1980) in the family Clinidae is correct. Parsimony analyses (Fig. 6A) of mtDNA, in contrast to neighbor-joining distance (Fig. 
4) and allozyme data (Stepien et al., 1993), suggest that oviparity and external fertilization are the ancestral states among the clinid/labrisomid/chaenopsid clade, supporting the hypothesis of Wourms and Lombardi (1992) that the evolution of viviparity is usually derived in fishes. The mtDNA parsimony tree (Fig. 6A) also supports the hypothesis of Penrith (1969) that the live-bearing groups Clinini and Ophiclinini are less closely related to the common clinid ancestor shared with the Labrisomidae. Evolution of matrotrophic viviparity in the Clinini and Ophiclinini may be responsible for their comparatively greater species richness,
in comparison with the less numerous oviparous Myxodini (Table I), supporting the hypothesis of Lydeard (1993) that viviparity in actinopterygian fishes may be positively correlated with speciation. Parsimony analyses of mtDNA sequences (Fig. 6A), allozymes (Stepien and Rosenblatt, 1991; Stepien, 1992; Stepien et al., 1993), and nuclear rDNA sequences (Stepien et al., 1993) support the conclusion that the live-bearing tribes Ophiclinini and Clinini form a monophyletic sister clade to the egg-laying myxodins. According to p distances, divergence of modern myxodin taxa appears to have occurred at least 16.7 ± 1.5 myr, corresponding to a mid-Miocene separation, possibly the warming of the tropics proposed by White (1986, 1989). Allozyme data (Nei's D = 0.812 ± 0.031) for the previously mentioned separation of the Australian clinin Heteroclinus and the South American myxodin Myxodes (Clinitrachus was not available to the allozyme study) similarly estimated this time as 15.4 ± 0.6 myr (Stepien, 1992; calibrated according to Grant, 1987, D of 1.0 = 19 myr). MtDNA sequence data show that the Mediterranean myxodin (the monotypic Clinitrachus argentatus) is the sister taxon of the live bearers and of the South American Myxodes. Divergence of Myxodes from a common ancestor shared with Clinitrachus is estimated at approximately 11.8 ± 1.0 myr, another apparent Miocene event. There is fossil evidence for a Miocene Clinitrachus in Romania (Bannikov, 1989), which appears congruent with the dates estimated in this chapter. The southeastern Pacific myxodin genus Myxodes is shown to be the sister group to Clinitrachus and to a monophyletic clade containing the northeastern Pacific genera Heterostichus and Gibbonsia (Fig. 6A); the latter relationship is also supported by allozyme data (Stepien and Rosenblatt, 1991; Stepien, 1992). 
Separation of the North and South American taxa may have occurred about 13.3 ± 1.5 myr, comparable to allozyme estimates of 13.5 ± 1.0 myr (Nei's D = 0.712 ± 0.031) and compatible with the mid-Miocene climatic warming hypothesis (White, 1986). The genera Heterostichus and Gibbonsia are sister groups (Fig. 6A), as shown in the analyses of allozyme and mtDNA data (Stepien and Rosenblatt, 1991; Stepien, 1992; Stepien et al., 1993). The mtDNA trees show that G. metzi is the sister species to the clade of G. montereyensis and G. elegans (Fig. 6A), discerning between the two most-parsimonious trees from allozyme data (Stepien, 1992). The sister relationship between G. elegans and G. montereyensis is supported by two morphological synapomorphies: unequally spaced posterior dorsal fin rays and prominent dorsal ocelli (Hubbs, 1952; Stepien and Rosenblatt, 1991; Stepien, 1992). The divergence of common ancestors shared by Heterostichus and Gibbonsia may have occurred during the late Miocene, estimated here as 6.5 ± 0.3 myr and as 7.8 ± 1.0 myr (Nei's D = 0.41 ± 0.031) using allozymes (Stepien and Rosenblatt, 1991; Stepien, 1992). Separation of the species of Gibbonsia appears to have occurred about 3.0 to 4.5 ± 0.2 myr. These divergence dates also correspond to those estimated from allozyme data (3.28 ± 0.09 to 4.45 ± 0.13 myr; Stepien and Rosenblatt, 1991; Stepien, 1992). Temperature changes during the Pliocene may have served as vicariant events separating formerly continuous distributions of these intertidal fishes, resulting in speciation (Stepien and Rosenblatt, 1991; Stepien, 1992).
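The allozyme dates quoted in this section follow from the Grant (1987) calibration cited above, in which a Nei's D of 1.0 corresponds to 19 myr. A minimal sketch of that conversion, using only the D values given in the text, is shown here; the function and constant names are hypothetical.

```python
MYR_PER_UNIT_D = 19.0  # calibration quoted in the text: D of 1.0 = 19 myr

def allozyme_time_myr(nei_d):
    """Separation time in myr implied by Nei's genetic distance D."""
    return nei_d * MYR_PER_UNIT_D

# Nei's D values reported in this section for three clinid separations.
reported_d = {
    "Heteroclinus vs. Myxodes": 0.812,
    "North vs. South American myxodins": 0.712,
    "Heterostichus vs. Gibbonsia": 0.41,
}

for split, d in reported_d.items():
    print(f"{split}: D = {d} gives about {allozyme_time_myr(d):.1f} myr")
```

The three results round to the 15.4, 13.5, and 7.8 myr allozyme estimates cited in the text, so the quoted dates are a linear rescaling of D and inherit any error in that calibration.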
D. Phylogenetic Relationships of the "Family Labrisomidae" Mitochondrial DNA sequences (Figs. 4 and 6A), nuclear rDNA sequences (Stepien et al., 1993), and allozyme data (Stepien et al., 1993) confirm the close relationship among the "Labrisomidae," the Chaenopsidae, and the Clinidae. Trees from mtDNA (Figs. 4 and 6A) and allozyme data (Stepien et al., 1993) show that the "labrisomids" are not monophyletic. Mooi and Gill (1995) reported that the labrisomids they examined (Labrisomus, Malacoctenus, Paraclinus, and Starksia) are characterized by a less derived type of epiaxial muscle morphology than that possessed by tripterygiids, dactyloscopids, clinids (myxodin clinids were not included), chaenopsids, and blenniids, which may be a possible character uniting them. MtDNA sequence divergences suggest that the "labrisomid" and clinid clades may have shared a common ancestor approximately 23.0 ± 2.0 myr. Estimates from allozyme divergences appear congruent in suggesting a most recent common ancestry of 22.3 ± 1.1 myr (Stepien and Rosenblatt, 1991; Stepien, 1992). George and Springer (1980) hypothesized that the Labrisomidae may not be closely related to the Clinidae, which is contradicted by a suite of molecular evidence, including allozymes (Stepien et al., 1993), nuclear rDNA sequences (Stepien et al., 1993), and the present mtDNA sequence data (Fig. 6A). "Labrisomids" lack clear morphological synapomorphies and have been referred to as a "wastebasket" of scaled blennioids not clearly falling into other families (Springer, 1993). Molecular data sets differ somewhat in the relative positionings of the clinids, "labrisomids," and chaenopsids. Parsimony (Fig. 6A) and neighbor-joining (Fig. 4) analyses of the mtDNA data set suggest that a labrisomid-chaenopsid clade is the sister group of the Clinidae, with the Chaenopsidae contained as a monophyletic clade within a paraphyletic "Labrisomidae." The nuclear rDNA data analysis was unable to
distinguish among these relationships, which varied among the most-parsimonious and two next most-parsimonious trees (Stepien et al., 1993). Analyses of allozyme data (Stepien et al., 1993) placed the Chaenopsidae as the basal clade of a paraphyletic "Labrisomidae," and a monophyletic Clinidae was the terminal group of the clade. In the allozyme trees, the chaenopsids were the sister group to sister clades comprising the Neoclinini and the remaining "labrisomids," respectively (Stepien et al., 1993). The most-parsimonious tree from mtDNA sequences suggests that there are six "labrisomid" clades: the Mnierpini, Paraclinini, the Chaenopsidae, the Neoclinini-Cryptotremini, the Starksiini, and the Labrisomini (Fig. 6A). Placement of the Mnierpini is very weakly supported, as are the relationships of the Starksiini, Neoclinini, and Cryptotremini. Parsimony analyses of mtDNA sequences suggest that the Starksiini is the sister group of the Labrisomini (Fig. 6A), whereas those based on allozyme data placed the tribe Starksiini as the sister group of either the Paraclinini or the Cryptotremini (Stepien et al., 1993). Rosenblatt and Taylor (1971) hypothesized that starksiins may be derived from either a cryptotremin or a Labrisomus-like ancestor, indicating morphological support for one of the allozyme hypotheses, as well as the mtDNA hypothesis. Low resolution for tribal relationships from the molecular data sets (Fig. 6A and Stepien et al., 1993), coupled with lack of morphological synapomorphies, leaves these relationships speculative. Although trees from allozyme data (Stepien et al., 1993) place the Neoclinini as the basal "labrisomid" group and the sister group of the chaenopsids, mtDNA sequences (Figs. 4 and 6A) suggest that it is more closely related to the tribe Cryptotremini. 
The neoclinin-cryptotremin clade is then the sister group of a clade containing the Chaenopsidae, Stathmonotus, and Paraclinini (the latter clade is unresolved by consensus of the most-parsimonious trees). In contrast, allozyme trees do not indicate a close relationship between cryptotremins and neoclinins (Stepien et al., 1993). Hubbs (1952) had placed the genus Neoclinus in the Chaenopsidae and Springer (1955) then removed it to the Clinidae-Labrisomidae, postulating that it is derived from ancestors of the Paraclinini. Stephens (1963) excluded Neoclinus from the Chaenopsidae on the basis of presence of scales, a lateral line, and four circumorbital bones. Hastings and Springer (1994) suggested that morphological characters may place Neoclinus as the sister group of the family Chaenopsidae, compatible with the allozyme study (Stepien et al., 1993). Both allozyme (Stepien et al., 1993) and mtDNA sequence data (Fig. 6A) support traditional morphological groupings of genera within the tribes (shown
in Table I), although they differ in the relationships among the tribes. For example, the genera Exerpes and Paraclinus comprising the tribe Paraclinini are sister taxa in the most-parsimonious mtDNA sequence (Fig. 6A) and allozyme trees (Stepien et al., 1993). A sister relationship among the labrisomin genera Malacoctenus and Labrisomus based on allozyme data (Stepien et al., 1993) and mtDNA sequences (Figs. 4 and 6A) is congruent with morphological similarity postulated by Hubbs (1952). Their hypothesized mid-to-late Miocene radiation of 12.8 ± 1.0 myr appears congruent with the Miocene fossil Labrisomus pronuchipinnis in the southwestern Mediterranean, where the genus is no longer represented. The modern descendant Labrisomus nuchipinnis is widespread throughout much of the western Atlantic (Springer, 1993). "Labrisomids" and clinids have been described to "raft" in pieces of drift algae, which may explain their wide dispersal capability (Hubbs, 1952; Stepien, 1986a, 1992). Their larvae and postlarvae are planktonic for about 2 months and juveniles tend to congregate in groups in drift algae, which apparently aids in dispersal across deep water areas (Stepien, 1986a).
E. Phylogeny of the Family Chaenopsidae
MtDNA sequence data confirm the monophyly of the Chaenopsidae, whose morphological synapomorphies have been analyzed by Hastings and Springer (1994). MtDNA analyses (Figs. 4 and 6A) place the chaenopsids as the sister group of some of the "labrisomids." Parsimony trees (Fig. 6A) group the chaenopsids as being closely related to Stathmonotus and the Paraclinini. Neighbor-joining analyses of genetic distances from mtDNA (Fig. 4) suggest the closest affinity of chaenopsids with Paraclinini, Starksiini, and Stathmonotus. Hastings and Springer (1994) hypothesized Stathmonotus to be the sister group of the chaenopsids, based on morphological characters. Hastings and Springer (1994) also stated that among the currently recognized tribes of "labrisomids," the Starksiini share the greatest number of apparent synapomorphies with chaenopsids. In contrast to their placement in the mtDNA study (Fig. 6A), trees based on allozyme data showed the Chaenopsidae as the basal clade and the sister group of the clades comprising the Neoclinini and the labrisomid-clinid lineage (Stepien et al., 1993). A sister relationship among neoclinins and chaenopsids (including Stathmonotus) was also hypothesized by Hastings and Springer (1994), based on morphological characters. The chaenopsids analyzed in this study (Emblemaria, Chaenopsis, and Acanthemblemaria) appear to be separated by a total divergence of approximately 15.0 ± 1.5
myr. The single most-parsimonious tree (Fig. 6A) from an exhaustive search of mtDNA data shows Chaenopsis as the sister taxon of Emblemaria, which is then the sister group of a clade containing Acanthemblemaria. In comparison, Hastings and Springer (1994) hypothesized that the Acanthemblemaria clade forms the basal sister group to a clade containing Emblemaria and Coralliozetus as sister groups.
F. Evolutionary Relationships of the Family Tripterygiidae MtDNA sequence data place the monophyletic Tripterygiidae as either the sister group of the family Blenniidae (in parsimony analyses; Fig. 6B) or as the sister group of the "labrisomid"-chaenopsid-clinid clade (in neighbor-joining and overall parsimony analyses; Figs. 4 and 5). In the latter hypothesis, the family Blenniidae is then the sister group to the clades containing the Tripterygiidae and the clinids, "labrisomids," and chaenopsids (Fig. 5). MtDNA sequences suggest that ancestors of the tripterygiids diverged about 22.0 ± 2.0 myr and that the tribes separated by 13.4 ± 1.0 myr (Fig. 4), compatible with early and mid-Miocene separations. A fossil species (Tripterygion pronasus) has been described from Miocene deposits by the Mediterranean Sea (Arambourg, 1927; discussed by Wirtz, 1980), which appears compatible with these dates. The tripterygiid tribe Lepidoblenninae (represented here by Axoclinus and Karalepis) appears to be paraphyletic, as Axoclinus is depicted as the basal group to Karalepis, which is then the sister group of the tribe Tripterygininae (represented here by Notoclinus, Tripterygion, and Rosenblatella; see Figs. 4 and 6B). Neighbor-joining (Fig. 4) and parsimony trees (Fig. 6) suggest that, of those taxa analyzed, Notoclinus forms the sister group of a clade containing Tripterygion and Rosenblatella. Arrangements of these taxa differ from those hypothesized by Fricke (1994), based on morphology. Additional tripterygiids need to be sequenced in order to further elucidate their relationships.
G. Phylogeny of the Family Blenniidae Monophyly of the combtooth blennies is supported by five morphological characters (Springer, 1968, 1993; Williams, 1990), nuclear rDNA sequences (Stepien et al., 1993), and these mtDNA sequence data (Figs. 4, 5, and 6B). Six tribes are recognized (Table I), of which mtDNA sequences from four are analyzed. Most-parsimonious trees support the idea that the tribe Parablenniini is monophyletic, which has been hypothesized based on two possible morphological synapomorphies (Williams, 1990). The tribe Salariini appears paraphyletic (Fig. 6B), with the genus Ophioblennius not grouping with the others. This needs to be investigated further. The remainder of the Salariini form a sister group to the Parablenniini. A sister relationship between the Parablenniini and the Salariini was also shown based on osteological characters (Bock and Zander, 1986). The most-parsimonious tree from mtDNA sequences (Fig. 6B) depicts the Nemophini (saber-tooth blennies) as closely related to the Omobranchini. A close relationship between the Nemophini and the Omobranchini has also been suggested by Springer (1968) based on jaws, dentition, and caudal fin osteology and by Bock and Zander (1986) based on neurocranial osteology. Most-parsimonious trees (Fig. 6B, see legend) did not resolve relationships among the Omobranchus species and some reversed the ordering of the genera Rhabdoblennius and Ecsenius from that shown. These questions need to be addressed further with additional taxa and a larger sequence data set.
H. Placement of the Family Dactyloscopidae Springer (1993) placed the dactyloscopids (sand stargazers) in the Blennioidei, but some researchers have placed them with the Uranoscopidae (e.g., Gosline, 1968). Uranoscopidae is now included in the suborder Trachinoidei (Nelson, 1994). 12S rDNA data in the present study support the morphological hypothesis (Springer, 1993) that dactyloscopids (represented here by Myxodagnus opercularis) are blennioids. Examination of the data set also provides support for considerable divergence from the other blennioids, as shown by the single longest horizontal branch on the neighbor-joining tree in Fig. 4. Morphologically, the Dactyloscopidae is also the most divergent family from the other blennioids, corroborating mtDNA data (Springer, 1993; V. G. Springer, personal communication 1996). Most-parsimonious trees in the authors' investigation (Figs. 5 and 6B) place the Dactyloscopidae as the sister group to other blennioids. The neighbor-joining tree (Fig. 4) shows it as most closely related to the tripterygiids.
I. Phylogenetic Relationships of the Suborders Zoarcoidei and Notothenioidei The outgroups Notothenioidei and Zoarcoidei form two sister clades, corresponding to their division in separate monophyletic suborders (Figs. 4, 5, and 7). Some morphologists have also hypothesized a sister relationship among the notothenioids and zoarcoids (Anderson, 1990). Results of the authors' study suggest a close relationship among the blenniiform suborders Blennioidei, Notothenioidei, and Zoarcoidei. Their relationships to the Trachinoidei are being tested. These mtDNA data suggest that the ancestors of the notothenioid and zoarcoid clade stemmed from a common ancestor shared with the Blennioidei (Figs. 4, 5, and 7) by at least 28.0 ± 3.0 myr and that the suborder lineages diverged from each other about 20.5 ± 2.5 myr. White (1987) hypothesized that deep sea groups, such as zoarcids, may have speciated during evolutionary pulses associated with oceanic anoxic events by which advancing oxygen minima promoted taxonomic diversification at intermediate depths on the continental slope by restricting isolated populations to disjunct hydrochemical refugia. Modern zoarcoid and notothenioid lineages may have diversified about 10.0 ± 0.5 and 11.6 ± 0.5 myr, respectively, according to the 1% calibration used here. In contrast, Anderson (1994) has suggested a much older origin for the suborder Zoarcoidei, as early as the Eocene in the North Pacific Ocean. He postulated that the early zoarcoids then spread throughout the Pacific Rim and that the family Zoarcidae radiated along the western coasts of the Americas during the pre-Miocene. An earlier date is also indicated by the description of a nototheniid fossil in Antarctica dated 38 myr (Balushkin, 1994). It is possible that the date discrepancy for these coldwater outgroups may be due to low sequence variability correlated with slow metabolic rate or to a calibration error in this study. If so, this fossil suggests that the true divergence dates may actually be four times greater than those indicated. Within the monophyletic Zoarcoidei, the family Pholidae (gunnels) appears to be the sister group of the families Stichaeidae (pricklebacks) and Zoarcidae (eelpouts; Figs. 4 and 7), among the taxa included here. 
The pholids may have diverged by 6.8 ± 0.5 myr (using the 1% per million year estimate), possibly during temperature changes in the Pliocene, as hypothesized for the northeastern Pacific clinid genera (Stepien and Rosenblatt, 1991; Stepien, 1992). Pholids and clinids inhabit similar algal-covered rocky intertidal areas and are sensitive to temperature changes (Stepien et al., 1991). The most-parsimonious tree depicts the genus Pholis as the sister taxon of the genus Apodichthys, which is congruent with a morphological analysis by Yatsu (1985). Among the members of the Zoarcidae included in this chapter, the genus Zoarces is depicted as the sister group of Lycodichthys, which is then the sister taxon of Lycodes (Fig. 7). The notothenioids analyzed are monophyletic and form two clades (Figs. 4 and 7), the families Nototheniidae (Notothenia, Pagothenia, and Trematomus) and Bathydraconidae (Fig. 7; Gymnodraco and Parachaenichthys), corresponding to their morphological classification (De Witt et al., 1990; Eastman, 1993). The nototheniids (cod icefishes) and bathydraconids (dragonfishes) may have been separated since at least 11.6 ± 1.5 myr (Fig. 4), following the expansion of the Antarctic ice cap hypothesized about 15 myr (Van Andel, 1985; White, 1989; other researchers have suggested a much earlier date; see discussions by Anderson, 1990; Eastman, 1993; Miller, 1993). Taxon divergence times estimated by Bargelloni et al. (1994) are very similar to the estimates described in this chapter. The two clades within the family Nototheniidae follow its morphological division in two subfamilies, with the Notothenninae (Notothenia) as the sister group of the Trematominae (Trematomus and Pagothenia). The distance (Fig. 4) and parsimony (Fig. 7) trees of the authors' study do not support the hypothesis by Eastman and Grande (1989) that Pagothenia is an early branch of the Nototheniidae. In another sequencing study, which included a smaller portion of the 12S rDNA gene (overlapping the end portion of the sequence in this study) and part of the 16S rDNA gene, Bargelloni et al. (1994) found less close correspondence with morphological groupings than the authors did. The trees of Bargelloni et al. (1994) depicted the Nototheniidae as paraphyletic, with the Bathydraconidae placed between the Notothenninae and the Trematominae, an arrangement that had low consensus and bootstrap support. However, higher consensus and bootstrap support for the authors' data and correspondence between the trees reported in this chapter (Figs. 4 and 7) and morphological-based systematics support monophyly of the Nototheniidae and a close relationship between the subfamilies Notothenninae and Trematominae. The phylogenies of Bargelloni et al. (1994), the authors' trees, and morphological characters (Table II; Eastman, 1993) support a sister relationship between the trematomin genera Trematomus and Pagothenia. Divergence of these trematomins is similar in both studies, with the authors' data suggesting about 3.6 ± 0.5 myr of separation (Fig. 4). 
These separation times are congruent with those estimated by McDonald et al. (1992) from allozyme distances.
V. Summary Analyses of mtDNA sequences from the 12S rDNA region result in phylogenies that are largely congruent with known morphological classification (summarized by Springer, 1993), supporting monophyly of the blenniiform suborders Blennioidei, Notothenioidei, and Zoarcoidei. Results also support monophyly of a Clinid-Labrisomid-Chaenopsid superfamily and the families of Clinidae, Chaenopsidae, Tripterygiidae, Blenniidae, Nototheniidae, Bathydraconidae, Zoarcidae, Stichaeidae, and Pholidae. Trees of blennioid
relationships are congruent with those based on sequences of nuclear rDNA spacer regions (Stepien et al., 1993) and are largely congruent with those based on allozyme data (Stepien et al., 1993; Stepien, 1992; Stepien and Rosenblatt, 1991). The present investigation suggests that the chaenopsids form a monophyletic clade within the "Labrisomidae." Relationships among the "labrisomids" remain enigmatic due to lack of synapomorphies discerned from DNA, allozyme, and morphological data. Phylogenies based on mtDNA data support inclusion of the family Dactyloscopidae as blennioids, and parsimony trees (Figs. 4 and 6B) suggest their placement as the basal clade. These data support a possible sister relationship between the outgroups used and the suborders Notothenioidei and Zoarcoidei, with the Trachinoidei remaining to be investigated. Molecular data also seem to support most familial radiations as occurring relatively rapidly, possibly during the early Miocene epoch 22 to 27 myr, and most tribal radiations as occurring during the mid-Miocene about 13.5 to 21 myr, using a calibration of 1% divergence per million years. These dates appear consistent with the Miocene fossils of a labrisomid (Springer, 1970; George and Springer, 1980), a clinid (Bannikov, 1989; see Springer, 1993), and a tripterygiid (Arambourg, 1927) and may be related to Miocene warming of the tropics (White, 1986, 1989). Tropical warming may have vicariantly separated formerly continuous distributions, promoting speciation (White, 1986, 1989; Stepien, 1992). Alternatively, similar divergence estimates may be artifacts of site saturation, although this does not appear to be the case given the consistency of transition to transversion rates and relatively high proportions of phylogenetically informative sites in both paired and unpaired regions coded by the 12S rDNA. 
Fossil evidence suggests that divergence of the notothenioid outgroup may actually be four times older than estimated in this chapter (Balushkin, 1994), possibly due to their low metabolic rates. Results of this study indicate that 12S mtDNA sequences are useful for resolving phylogenetic hypotheses at taxonomic levels ranging from species through suborders and that this region appears to retain phylogenetic signals for these various hierarchies. This is part of an ongoing comprehensive investigation of these groups by C. A. Stepien, using mitochondrial and nuclear DNA sequences.
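The dating argument in this summary rests on simple arithmetic: under a calibration of 1% sequence divergence per million years, a pairwise divergence of d percent corresponds to roughly d million years. The following is a minimal Python sketch of that conversion. The function name and the example divergence values are illustrative assumptions chosen to match the date ranges quoted above; they are not data from this study.

```python
def divergence_time_myr(percent_divergence, rate_percent_per_myr=1.0):
    """Convert pairwise sequence divergence (%) into elapsed time (myr).

    Assumes a strict molecular clock; the default rate is the 1% per
    million years calibration quoted in the text.
    """
    if rate_percent_per_myr <= 0:
        raise ValueError("calibration rate must be positive")
    return percent_divergence / rate_percent_per_myr


# Hypothetical divergences that would reproduce the quoted date ranges.
examples = [
    ("familial radiation, lower bound", 22.0),
    ("familial radiation, upper bound", 27.0),
    ("tribal radiation, lower bound", 13.5),
    ("tribal radiation, upper bound", 21.0),
]
for label, divergence in examples:
    print(f"{label}: {divergence}% -> {divergence_time_myr(divergence):.1f} myr")
```

The same function illustrates the caveat about the notothenioid outgroup: if a lineage evolved at one-quarter of the assumed rate, `divergence_time_myr(d, 0.25)` returns an age four times older than the default calibration gives for the same divergence.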
Acknowledgments

We thank the following persons for helping collect specimens: P. Wirtz, R. R. McConnaughey, R. H. Rosenblatt, R. E. Thresher, M. E. Anderson, E. O. Wiley, K. Amaoka, K. Kawaguchi, T. Abe, O. Okamura, G. Somero, A. A. Naffziger, L. Badzioch, S. Mesnick, K. Dickson, D. Hoese, and A. C. Gill. This manuscript benefited substantially from critical reviews by V. G. Springer, R. H. Rosenblatt, P. Wirtz, J. T. Williams, M. E. Anderson, C. Lydeard, R. R. Wilson, B. N. White, and T. D. Kocher. A pilot study for this work was begun by CAS during a Sloan Postdoctoral Fellowship in Molecular Evolution, sponsored by D. M. Hillis at the University of Texas, Austin. Data acquisition, analysis, and writing were done in the laboratory of CAS at CWRU. This study was supported by the CWRU Department of Biology, a George B. Mayer assistant professorship to CAS, and laboratory setup funds from the Ohio Board of Regents and a Howard Hughes Medical Institute grant to the Department of Biology, CWRU. KLC thanks the Howard Hughes Medical Institute summer undergraduate research program in the Department of Biology at CWRU for fellowship support. MJB was supported by the CWRU Department of Biology during a 1-year postdoctoral fellowship in the laboratory of CAS. Specimen collections in Japan by CAS were supported by the National Research Council, in Chile by National Geographic Society Grant 3615-87 to CAS and R. H. Rosenblatt, in California and Mexico by NSF BSR-8600180 to CAS, and in Portugal by a travel grant from the Centro de Ciencia e Tecnologia da Madeira (CITMA) to CAS and P. Wirtz. Undergraduate research students L. Naftalin, G. Johns, N. Valtz, H. Strick, and J. Skidmore assisted in some of the DNA extractions.
References

Acero, A. P. 1987. The chaenopsine blennies of the southwestern Caribbean (Pisces, Clinidae, Chaenopsinae). III. The genera Chaenopsis and Coralliozetus. Bol. Ecotrop. 16:1-21.
Anderson, M. E. 1990. The origin and evolution of the Antarctic ichthyofauna. In "Fishes of the Southern Ocean" (O. Gon and P. C. Heemstra, eds.), pp. 28-33. J. L. B. Smith Institute of Ichthyology, Grahamstown, South Africa.
Anderson, M. E. 1994. Systematics and osteology of the Zoarcidae (Teleostei: Perciformes). J. L. B. Smith Inst. Ichthyol. Ichthyol. Bull. 60:1-120.
Arambourg, G. 1927. Les poissons fossiles d'Oran. Mater. Carte géol. Alger (paléont.) 6:1-289.
Avise, J. C. 1994. "Molecular Markers, Natural History, and Evolution." Chapman and Hall, New York.
Avise, J. C., Bowen, B. W., Lamb, T., Meylan, A. B., and Bermingham, E. 1992. Mitochondrial DNA evolution at a turtle's pace: Evidence for low genetic variability and reduced microevolutionary rate in the testudines. Mol. Biol. Evol. 9(3):457-473.
Balushkin, A. 1994. Proeleginops grandeastmanorum gen. et sp. nov. (Perciformes, Notothenioidei, Eleginopsidae) from the late Eocene of Seymour Island (Antarctica) is a fossil notothenioid, not a gadiform. J. Ichthyol. 34(8):10-23.
Bannikov, A. F. 1989. The first discovery of scale-bearing blennies (Teleostei) in the Sarmatian of Moldavia. Paleont. J. 2:64-70.
Bargelloni, L., Ritchie, P. A., Patarnello, T., Battaglia, B., Lambert, D. M., and Meyer, A. 1994. Molecular evolution at subzero temperatures: Mitochondrial and nuclear phylogenies of fishes from Antarctica (suborder Notothenioidei), and the evolution of antifreeze glycopeptides. Mol. Biol. Evol. 11(6):854-863.
Bock, M., and Zander, C. D. 1986. Osteological characters as tools for blenniid taxonomy: A generic revision of European Blenniidae (Percomorphi; Pisces). Zool. Inst. Zool. Mus. Univ. Hamburg 1986:138-143.
Briggs, J. C. 1974. "Marine Zoogeography." McGraw-Hill, New York.
Brown, W. M., George, M., Jr., and Wilson, A. C. 1979. Rapid evolution of animal mitochondrial DNA. Proc. Natl. Acad. Sci. USA 76:1967-1971.
De Witt, H. H., Heemstra, P. C., and Gon, O. 1990. Nototheniidae. In "Fishes of the Southern Ocean" (O. Gon and P. C. Heemstra, eds.), pp. 279-331. J. L. B. Smith Institute of Ichthyology, Grahamstown, South Africa.
Dixon, M. T., and Hillis, D. M. 1993. Ribosomal RNA secondary structure: Compensatory mutations and implications for phylogenetic analysis. Mol. Biol. Evol. 10(1):256-267.
Eastman, J. T. 1993. "Antarctic Fish Biology." Academic Press, San Diego.
Eastman, J. T., and Grande, L. 1989. Evolution of the Antarctic fish fauna with emphasis on the recent notothenioids. In "Origins and Evolution of the Antarctic Biota" (J. A. Crame, ed.). Geol. Soc. Spec. Pub. 47:241-252.
Fricke, R. 1994. Tripterygiid fishes of Australia, New Zealand and the Southwest Pacific Ocean, with descriptions of 2 new genera and 16 new species (Teleostei). Theses Zoologicae, Vol. 24. Koeltz Scientific Books.
Fukao, R., and Okazaki, T. 1987. A study on the divergence of Japanese fishes of the genus Neoclinus. Jap. J. Ichth. 34(3):309-323.
George, A., and Springer, V. G. 1980. Revision of the clinid fish tribe Ophiclinini, including five new species, and definition of the family Clinidae. Smith. Contr. Zool. 307:1-30.
Gillespie, J. H. 1986. Variability of evolutionary rates of DNA. Genetics 113:1077-1091.
Gosline, W. A. 1968. The suborders of perciform fishes. Proc. U. S. Natl. Mus. 124:1-78.
Gosline, W. A. 1971. "Functional Morphology and Classification of Teleostean Fishes." University Press of Hawaii, Honolulu, HI.
Grant, W. S. 1987. Genetic divergence between congeneric Atlantic and Pacific Ocean fishes. In "Population Genetics and Fishery Management" (N. Ryman and F. Utter, eds.), pp. 225-246. Washington Sea Grant Program, Univ. of Washington Press, Seattle, WA.
Greenwood, P. H., Rosen, D. E., Weitzman, S. H., and Myers, G. S. 1966. Phyletic studies of teleostean fishes, with a provisional classification of living forms. Bull. Am. Mus. Nat. Hist. 131:339-456.
Gutell, R. R., Weiser, B., Woese, C. R., and Noller, H. F. 1985. Comparative anatomy of 16S-like ribosomal RNA. Prog. Nucleic Acid Res. Mol. Biol. 32:155-216.
Hastings, P. A. 1991. Phylogenetic relationships of the tube blennies of the genus Acanthemblemaria (Pisces: Blennioidea). Bull. Mar. Sci. 47(3):725-737.
Hastings, P. A., and Springer, V. G. 1994. A review of Stathmonotus, with redefinition and phylogenetic analysis of the Chaenopsidae (Pisces: Blennioidei). Smith. Contr. Zool. 558:1-48.
Hendy, M. D., and Penny, D. 1982. Branch and bound algorithms to determine minimal evolutionary trees. Math. Biosci. 59:277-290.
Hillis, D. M., and Dixon, M. T. 1991. Ribosomal DNA: Molecular evolution and phylogenetic inference. Quart. Rev. Biol. 66(4):411-453.
Hubbs, C. 1952. A contribution to the classification of the blennioid fishes of the family Clinidae, with a partial revision of the eastern Pacific forms. Stanford Ichth. Bull. 4:41-65.
Hultman, T., Stahl, S., Hornes, E., and Uhlen, M. 1989. Direct solid phase sequencing of genomic and plasmid DNA using magnetic beads as solid support. Nucleic Acids Res. 17:4937-4946.
International Biotechnologies, Inc. (IBI) 1992. AssemblyLIGN Sequence Assembly Software, Kodak.
Johnson, G. D. 1993. Percomorph phylogeny: Progress and problems. Bull. Mar. Sci. 52(1):3-28.
Kocher, T. D., Thomas, W. K., Meyer, A., Edwards, S. V., Pääbo, S., Villablanca, F. X., and Wilson, A. C. 1989. Dynamics of mitochondrial DNA evolution in animals: Amplification and sequencing with conserved primers. Proc. Natl. Acad. Sci. USA 86:6196-6200.
Kumar, S., Tamura, K., and Nei, M. 1993. "MEGA: Molecular Evolutionary Genetics Analysis, Version 1.01." Pennsylvania State University, University Park, PA.
Lydeard, C. 1993. Phylogenetic analysis of species richness: Has viviparity increased the diversification of actinopterygian fishes? Copeia 1993(2):514-518.
Martin, A. P., Naylor, G. J. P., and Palumbi, S. R. 1992. Rates of mitochondrial DNA evolution in sharks are slow compared with mammals. Nature 357:153-155.
Martin, A. P., and Palumbi, S. R. 1993. Body size, metabolic rate, generation time and the molecular clock. Proc. Natl. Acad. Sci. USA 90:4087-4091.
Matarese, A. C., Watson, W., and Stevens, E. G. 1984. Blennioidea: Development and relationships. In "Ontogeny and Systematics of Fishes" (H. G. Moser et al., eds.), pp. 565-573. Allen Press, Lawrence, KS.
Meyer, A. 1993. Evolution of mitochondrial DNA of fishes. In "The Biochemistry and Molecular Biology of Fishes" (P. W. Hochachka and P. Mommsen, eds.), Vol. 2, pp. 1-38. Elsevier Press, Amsterdam.
McDonald, M. A., Smith, M. H., Smith, M. W., Novak, J. M., Johns, P. E., and Devries, A. L. 1992. Biochemical systematics of notothenioid fishes from Antarctica. Biochem. Syst. Ecol. 20:233-241.
Miller, R. G. 1993. "A History and Atlas of the Fishes of the Antarctic Ocean." Foresta Institute for Ocean and Mountain Studies, Carson City, NV.
Miller, D. J., and Lea, R. N. 1972. "Guide to the Coastal Fishes of California." Fish Bulletin 157. State of California, Department of Fish and Game, Sacramento, CA.
Mooi, R. D., and Gill, A. C. 1995. Association of epaxial musculature with dorsal-fin pterygiophores in acanthomorph fishes, and its phylogenetic significance. Bull. Nat. Hist. Mus. Lond. (Zool.) 61(2):121-137.
Moritz, C., Dowling, T. E., and Brown, W. M. 1987. Evolution of animal mitochondrial DNA: Relevance for population biology and systematics. Annu. Rev. Ecol. Syst. 18:269-292.
Neefs, J. M., Van de Peer, Y., De Rijk, P., Goris, A., and De Wachter, R. 1991. Compilation of small ribosomal subunit RNA sequences. Nucleic Acids Res. 19s:1987-2015.
Nei, M. 1972. Genetic distance between populations. Am. Nat. 106:283-292.
Nelson, J. S. 1994. "Fishes of the World," 3rd Ed. Wiley, New York.
Orti, G., Petry, P., Porto, J. I. R., Jegu, M., and Meyer, A. 1996. Patterns of nucleotide change in mitochondrial ribosomal RNA genes and the phylogeny of piranhas. J. Mol. Evol. 42:169-182.
Penrith, M. L. 1969. The systematics of the fishes of the family Clinidae in South Africa. Ann. S. Afr. Mus. 55(1):1-127.
Perbal, B. 1988. "A Practical Guide to Molecular Cloning." Wiley, New York.
Rand, D. M. 1994. Thermal habit, metabolic rate and the evolution of mitochondrial DNA. TREE 9(4):125-131.
Rosenblatt, R. H. 1984. Blennioidei: An introduction. In "Ontogeny and Systematics of Fishes" (H. G. Moser et al., eds.), pp. 551-552. Based on an international symposium dedicated to the memory of Elbert Halvor Ahlstrom. Allen Press, Lawrence, KS.
Rosenblatt, R. H., and Taylor, L. R., Jr. 1971. The Pacific species of the clinid fish tribe Starksiini. Pacific Sci. 25:436-463.
Saitou, N., and Nei, M. 1987. The neighbor-joining method: A new method for reconstructing phylogenetic trees. Mol. Biol. Evol. 4:406-425.
Sanger, F., Nicklen, S., and Coulson, A. R. 1977. DNA sequencing with chain-terminating inhibitors. Proc. Natl. Acad. Sci. USA 74:5463-5467.
Siegel, S., and Castellan, N. J., Jr. 1988. "Nonparametric Statistics for the Behavioral Sciences," 2nd Ed. McGraw-Hill, New York.
Smith-Vaniz, W. F. 1976. The saber-toothed blennies, tribe Nemophini (Pisces: Blenniidae). Acad. Nat. Sci. Philadelphia 19:1-196.
Springer, V. G. 1955. The taxonomic status of the fishes of the genus Stathmonotus, including a review of the Atlantic species. Bull. Mar. Sci. Gulf Carib. 5(1):66-80.
Springer, V. G. 1968. "Osteology and Classification of the Fishes of the Family Blenniidae." U.S. Nat. Mus. Bull. 284. Smith. Inst. Press, Washington, D.C.
Springer, V. G. 1970. The western south Atlantic clinid fish Ribeiroclinus eigenmanni with discussion of the intrarelationships and zoogeography of the Clinidae. Copeia 1970(3):430-436.
Springer, V. G. 1982. Pacific plate biogeography with special reference to shorefishes. Smith. Contr. Zool. 367:1-182.
Springer, V. G. 1993. Definition of the suborder Blennioidei and its included families (Pisces: Perciformes). Bull. Mar. Sci. 52(1):427-495.
Springer, V. G., and Freihofer, W. C. 1976. Study of the monotypic fish family Pholidichthyidae (Perciformes). Smith. Contr. Zool. 216:1-43.
Springer, V. G., Smith, C. L., and Fraser, T. H. 1977. Anisochromis straussi, new species of protogynous hermaphroditic fish, and synonymy of Anisochromidae, Pseudoplesiopidae, and Pseudochromidae. Smith. Contr. Zool. 252:1-15.
SPSS, Statistical Package for the Social Sciences. 1992. Version 5.0.1.
Stephens, J. S. 1963. A revised classification of the blennioid fishes of the American family Chaenopsidae. Univ. Calif. Pub. Zool. 68:1-165.
Stephens, J. S., and Springer, V. G. 1973. Clinid fishes of Chile and Peru, with description of a new species, Myxodes ornatus, from Chile. Smith. Contr. Zool. 159:1-24.
Stepien, C. A. 1986a. Life history and larval development of the giant kelpfish, Heterostichus rostratus Girard. Fish. Bull. 84(4):809-826.
Stepien, C. A. 1986b. Regulation of color morphic patterns in the giant kelpfish, Heterostichus rostratus Girard: Genetic versus environmental factors. J. Exp. Mar. Biol. Ecol. 100:181-208.
Stepien, C. A. 1987. Color pattern and habitat differences between male, female, and juvenile giant kelpfish. Bull. Mar. Sci. 41:45-58.
Stepien, C. A. 1991. Population structures, diets, and biogeographic relationships of rocky intertidal fishes in central Chile: High levels of herbivory in a temperate system. Bull. Mar. Sci. 47(3):598-612.
Stepien, C. A. 1992. Evolution and biogeography of the Clinidae (Teleostei: Blennioidei). Copeia 1992(2):375-392.
Stepien, C. A. 1995. Population genetic divergence and geographic patterns from DNA sequences: Examples from marine and freshwater fishes. In "Evolution and the Aquatic Ecosystem: Defining Unique Units in Population Conservation" (J. Nielsen, ed.), pp. 263-287. American Fisheries Symposium 17, Bethesda, MD.
Stepien, C. A., Dixon, M. T., and Hillis, D. M. 1993. Evolutionary relationships of the blennioid fish families Clinidae, Labrisomidae, and Chaenopsidae: Congruence between DNA sequence and allozyme data. In "Symposium on Evolution of Percomorph Fishes" (G. D. Johnson, ed.). Bull. Mar. Sci. 52(1):873-921.
Stepien, C. A., Glattke, M., and Fink, K. M. 1988. Regulation and significance of color patterns of the spotted kelpfish, Gibbonsia elegans Cooper, 1864 (Blennioidei: Clinidae). Copeia 1988(1):7-15.
Stepien, C. A., Phillips, H., Adler, J. A., and Mangold, P. J. 1991. Biogeographic relationships of a rocky intertidal fish assemblage in an area of cold water upwelling off Baja California, Mexico. Pacific Sci. 45(1):63-71.
Stepien, C. A., and Rosenblatt, R. H. 1991. Patterns of gene flow and genetic divergence in the northeastern Pacific myxodin Clinidae (Teleostei: Blennioidei), based on allozyme and morphological data. Copeia 1991(4):873-896.
Swofford, D. L. 1996. "PAUP* (Phylogenetic Analysis Using Parsimony) vers. 4.0 (test version O)." Sinauer, Sunderland, MA.
Swofford, D. L., Olson, G. J., Waddell, P. J., and Hillis, D. M. 1996. Phylogenetic inference. In "Molecular Systematics, Second Ed." (D. M. Hillis, C. Moritz, and B. K. Mable, eds.), pp. 407-514. Sinauer Assoc., Inc., Sunderland, MA.
Thomas, W. K., and Beckenbach, A. T. 1989. Variation in salmonid mitochondrial DNA: Evolutionary constraints and mechanisms of substitution. J. Mol. Evol. 29:233-245.
Thresher, R. E. 1984. "Reproduction in Reef Fishes." T. F. H. Publications, Neptune City, NJ.
Titus, T. A., and Larson, A. 1995. A molecular phylogenetic perspective on the evolutionary radiation of the salamander family Salamandridae. Syst. Biol. 44:125-151.
Uhlen, M. 1989. Magnetic separation of DNA. Nature 340:733-734.
Van Andel, T. H. 1985. "New Views on an Old Planet: Continental Drift and the History of the Earth." Cambridge University Press, Cambridge.
Vawter, L., and Brown, W. M. 1993. Rates and patterns of base change in the small subunit ribosomal RNA gene. Genetics 134:597-608.
Wheeler, W. C., and Honeycutt, R. L. 1988. Paired sequence difference in ribosomal RNAs: Evolutionary and phylogenetic implications. Mol. Biol. Evol. 5(1):90-96.
White, B. N. 1986. The Isthmian link, antitropicality and American biogeography: Distributional history of the Atherinopsinae (Pisces: Atherinidae). Syst. Zool. 35:176-194.
White, B. N. 1987. Oceanic anoxic events and allopatric speciation in the deep sea. Biol. Oceanogr. 5:243-259.
White, B. N. 1989. Antitropicality and vicariance: A reply to Briggs. Syst. Zool. 38(1):77-79.
Williams, J. T. 1990. Phylogenetic relationships and revision of the blenniid fish genus Scartichthys. Smith. Contr. Zool. 492:1-30.
Wirtz, P. 1980. A revision of the eastern-Atlantic Tripterygiidae (Pisces, Blennioidei) and notes on some west African blennioid fish. Cybium 1980(21):83-101.
Wourms, J. P., and Lombardi, J. 1992. Reflections on the evolution of piscine viviparity. Am. Zool. 32:276-293.
Yatsu, A. 1985. Phylogeny of the family Pholidae (Blennioidei) with a redescription of Pholis scopoli. J. Ichthyol. 32(3):273-282.
CHAPTER 16

Major Histocompatibility Complex Genes in the Study of Fish Phylogeny

DAGMAR KLEIN
Department of Microbiology and Immunology, University of Miami School of Medicine, Miami, Florida 33136

JAN KLEIN
Max-Planck-Institut für Biologie, Abteilung Immungenetik, D-72076 Tübingen, Germany, and Department of Microbiology and Immunology, University of Miami School of Medicine, Miami, Florida 33136

AKIE SATO
Max-Planck-Institut für Biologie, Abteilung Immungenetik, D-72076 Tübingen, Germany

FELIPE FIGUEROA
Max-Planck-Institut für Biologie, Abteilung Immungenetik, D-72076 Tübingen, Germany

COLM O'HUIGIN
Max-Planck-Institut für Biologie, Abteilung Immungenetik, D-72076 Tübingen, Germany
I. Introduction

The major histocompatibility complex (Mhc) is a gene system that arose early in the evolution of vertebrates in response to an increased need for protection against parasites. Because of its key function in the immune response, which it has retained during its entire evolution, the Mhc has been studied extensively by immunologists and is consequently one of the best characterized genetic complexes in vertebrates. To fish taxonomists, the Mhc offers several advantages that other molecular systems do not provide. Foremost among these is the trans-species character of Mhc polymorphism. The functional Mhc loci are highly polymorphic and many of these polymorphisms predate speciation. Closely related species, such as those constituting the haplochromine flocks of the East African Great Lakes, share identical Mhc alleles. The frequencies of alleles can be used to determine the phylogenetic relationships among the various species of the flocks. The general approach to using Mhc genes in phylogenetic and systematic studies and the advantages, as well as possible pitfalls, are discussed.
II. Major Histocompatibility Complex (Mhc) Structure and Function

All jawed vertebrates possess a set of molecules that have a characteristic, highly conserved quaternary, tertiary, and secondary structure but, at the same time, are highly divergent in their primary structure: the major histocompatibility complex molecules (for reviews, see Klein, 1986; Srivastava et al., 1991; Kasahara et al., 1995). During their early evolution, the Mhc molecules were apparently assembled from three types of modules that arose independently (Figs. 1 and 2): the membrane-anchoring module (MAM), the immunoglobulin-like module (ILM), and the peptide-binding module (PBM; Klein and O'hUigin, 1993). The MAM is composed of a short connecting peptide (CT), a transmembrane (TM) region, and a cytoplasmic (CY) tail. The ILM consists of domains homologous to those of the immunoglobulin (Ig) superfamily proteins. The PBM resembles interleukin-8 (IL-8) and related proteins, and possibly also the endothelial-cell protein C receptor (EPCR).

FIGURE 1 Relationship between exons (E) of Mhc class I genes (A and B) and domains of class I α and β polypeptide chains. Different shading indicates modules: light, peptide-binding module (PBM); intermediate, immunoglobulin-like module (ILM); and dark, membrane-anchoring module (MAM). CT, connecting peptide; CY, cytoplasmic tail; TM, transmembrane region. Arrows indicate correspondence between exons and domains.

The three modules consist of domains whose arrangement distinguishes two types of Mhc molecules, class I and class II (Figs. 1 and 2). Molecules in both classes are heterodimers that consist of noncovalently associated α and β polypeptide chains. The class I β chain contains a single Ig-like domain (ILD), which also occurs in a free form as β2-microglobulin in tissue fluids. In the class I α chain, two peptide-binding domains (PBD), α1 and α2, constitute the PBM; one ILD joins noncovalently with the β chain ILD to form the ILM; and a single MAM fastens the molecule to the plasma membrane. In the class II molecule, PBDs of the α and β chains (α1 and β1, respectively) comprise the PBM; another domain of the α chain (α2), together with a domain of the β chain (β2), comprises the ILM; and the entire extracellular part of the molecule is fastened to the plasma membrane by two anchors, one contributed by the α and the other by the β chain. The extracellular parts of the polypeptide chains are glycosylated, rendering the Mhc molecules glycoproteins.

All class I and class II Mhc molecules thus far identified, be they from fish, amphibian, reptile, bird, or mammal, appear to have the same structure and the encoding genes the same exon-intron organization (Figs. 1 and 2). Each extracellular domain is encoded by a separate exon: E1 encodes the signal peptide; E2, E3, and E4 of the class I A genes encode the α1, α2, and α3 domains, respectively; E2 and E3 of the class II A genes encode the α1 and α2 domains, respectively; and E2 and E3 of the class II B genes encode the β1 and β2 domains, respectively. The single domain of the class I β chain is encoded in three exons, although the bulk of the sequence is specified by a single exon (E2). The number of exons specifying the membrane-anchoring domain (MAD) is somewhat more variable, both among genes and among species.

FIGURE 2 Relationship between exons (E) of Mhc class II genes (A and B) and domains of class II α and β polypeptide chains. For an explanation of symbols, see legend to Fig. 1.

The Mhc molecules are receptors that bind peptides produced by degradation of other proteins. Most of the time the peptides are derived from the body's own proteins, but in an infected animal, some of them originate from the parasite. Peptides originating from intracellular parasites, such as viruses, bind predominantly to class I molecules, whereas those derived from extracellular parasites, such as many bacteria, largely bind to class II molecules. The binding is dependent on interaction with a small number of amino acid residues of the peptide-binding region (PBR) specified by exons 2 and 3 in the case of the class I A gene and exon 2 in the case of class II A or B genes. The PBMs of class I and class II molecules are constructed somewhat differently so that they can accommodate peptides of different lengths and constitutions. Each PBM is capable of binding a large array of peptides, which, however, share amino acid residues at a few critical positions. The bound peptides, if derived from parasites, are recognized, together with parts of the Mhc molecules, by specific receptors on T lymphocytes. This recognition initiates the specific immune response to the parasite.

The structural differences between class I and class II molecules outside the PBMs may reflect the distinctive modes of biosynthesis and intracellular transport of the two proteins. Class I molecules are synthesized and loaded with peptides in the endoplasmic reticulum. Class II molecules are synthesized in the endoplasmic reticulum and loaded with peptides in the early endosomes. Peptides used by different classes differ in their origin. Peptides for class I molecules are produced by processing intracellular proteins in specialized molecular aggregates (the proteasomes) in the cytosol. Peptides for class II molecules are produced by the enzymatic degradation of extracellularly derived proteins in the endocytic vesicle.
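The exon-to-domain assignments described in this section can be summarized as a small lookup table. The Python sketch below encodes only the assignments stated in the text for the extracellular domains; the dictionary layout, the ASCII domain names (`alpha1`, `beta1`, and so on), and the helper function `pbr_exons` are illustrative assumptions, and the more variable membrane-anchoring exons are deliberately left out.

```python
# Exon -> encoded domain for the extracellular part of each gene type,
# as stated in the text. Membrane-anchoring exons vary and are omitted.
EXON_DOMAINS = {
    "class I A": {"E1": "signal peptide", "E2": "alpha1", "E3": "alpha2", "E4": "alpha3"},
    "class II A": {"E1": "signal peptide", "E2": "alpha1", "E3": "alpha2"},
    "class II B": {"E1": "signal peptide", "E2": "beta1", "E3": "beta2"},
}

# Domains that form the peptide-binding module (PBM) in each gene type.
PEPTIDE_BINDING_DOMAINS = {
    "class I A": {"alpha1", "alpha2"},
    "class II A": {"alpha1"},
    "class II B": {"beta1"},
}


def pbr_exons(gene_type):
    """Return the exons that specify the peptide-binding region (PBR)."""
    binding = PEPTIDE_BINDING_DOMAINS[gene_type]
    return sorted(
        exon for exon, domain in EXON_DOMAINS[gene_type].items() if domain in binding
    )


for gene_type in EXON_DOMAINS:
    print(gene_type, "->", pbr_exons(gene_type))
```

The lookup reproduces the statement in the text that the PBR is specified by exons 2 and 3 of the class I A gene and by exon 2 of the class II A or B genes, which is why these exons are the usual targets when variable Mhc regions are compared.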
III. Mhc as a Source of Systematic Information

Very few molecules have been studied as extensively and from so many different perspectives as those controlled by the Mhc. As a result, the Mhc products are among the best characterized glycoproteins. The main reason for this has been the desire to understand how the vertebrate immune system functions and how it originated. As such studies involve a variety of organisms, they provide not only the information sought, but also phylogenetic information. In the gene banks, Mhc sequences are well represented and thus provide a rich source of information for phylogenetic and taxonomical comparisons. Increasingly, however, Mhc genes are being studied with the sole purpose of obtaining phylogenetic information because they offer certain advantages over many other nuclear genes. For example, the Mhc genes are members of a rich multigene family which undergoes frequent rearrangements and thus constitutes a source of chromosomal mutations that can be used as characters in cladistic analysis. Another advantage is that certain regions of the Mhc genes, specifically the PBR, are highly variable. The variability is maintained by balancing selection (Hughes and Nei, 1988; Takahata et al., 1992), which retains alleles in populations as polymorphisms despite speciation events. These "trans-species polymorphisms" and their usefulness in systematics will be described in greater detail later.

Study of the Mhc provides three types of phylogenetic information: sequence data, characters stemming from macromutations, and frequency data. Sequence differences originate from point mutations and can be evaluated by using either distance or parsimony (character-based) methods. Macromutations are defined as changes that simultaneously affect more than one nucleotide, in contradistinction to point mutations, which affect one site only. They include duplications, deletions, and other chromosomal rearrangements, and insertions of repetitive elements (transposons). Frequency data are derived from the study of gene and haplotype polymorphisms. Although they are normally used to evaluate relationships among populations, they can also be used to test relationships among closely related species.
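As an illustration of how frequency data can be turned into a measure of relationship, the sketch below computes a single-locus version of the standard genetic distance of Nei (1972) from allele frequencies in two populations or species. The chapter does not specify which distance measure is used with Mhc frequency data, so the choice of Nei's distance is an assumption, and the allele names and frequencies are hypothetical.

```python
import math


def nei_distance(freq_x, freq_y):
    """Single-locus standard genetic distance of Nei (1972).

    freq_x and freq_y map allele names to frequencies (each summing to 1).
    D = -ln(I), where I = sum(x_i * y_i) / sqrt(sum(x_i^2) * sum(y_i^2)).
    """
    alleles = set(freq_x) | set(freq_y)
    jxy = sum(freq_x.get(a, 0.0) * freq_y.get(a, 0.0) for a in alleles)
    jx = sum(p * p for p in freq_x.values())
    jy = sum(p * p for p in freq_y.values())
    if jxy == 0:
        # No shared alleles: the distance is undefined (infinite).
        return math.inf
    identity = jxy / math.sqrt(jx * jy)
    return -math.log(identity)


# Hypothetical allele frequencies at one Mhc locus in three species.
species_1 = {"allele_a": 0.5, "allele_b": 0.5}
species_2 = {"allele_a": 0.5, "allele_b": 0.5}
species_3 = {"allele_a": 1.0}

print(nei_distance(species_1, species_2))  # identical frequencies, distance near 0
print(nei_distance(species_1, species_3))  # shared allele at different frequencies
```

Because trans-species polymorphism means that closely related species often share the same alleles, a frequency-based measure of this kind can still separate them when sequence comparisons of the alleles themselves cannot.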
IV. Sequences as a Source of Phylogenetic and Systematic Information

Like other genes, the Mhc genes of two species that diverged from a common ancestor accumulate substitutional differences roughly in proportion to the elapsed time (Kimura, 1983). This "molecular clock" seems to tick not only for the neutral sites of the Mhc genes (synonymous, intron, and intergenic sites), but also for sites subject to balancing selection (largely the PBR sites; see Satta et al., 1991). The latter constancy of evolutionary rate presumably reflects a constancy of selection pressure. Because of these constancies, it is possible to use Mhc sequences to infer gene and species phylogenies. Examples are given in Figs. 3, 4, and 5 in the form of phylogenetic trees constructed on the basis of fish class I and class II amino acid sequences.

FIGURE 3 Phylogenetic tree of fish class I α polypeptide chain sequences. The tree was constructed by the neighbor-joining method (Saitou and Nei, 1987); genetic distances were determined as percentage identity (Poisson corrected) between proteins. Numbers on nodes indicate percentage recovery of these nodes per 500 bootstrap replications. Pore, Poecilia reticulata, guppy (Sato et al., 1995); Brre, Brachydanio rerio, zebrafish (Takeuchi et al., 1995); Cyca, Cyprinus carpio, carp (Okamura et al., 1993); Sasa, Salmo salar, Atlantic salmon (Grimholt et al., 1993).

FIGURE 4 Phylogenetic tree of fish class II α polypeptide chain sequences. The tree was constructed by the neighbor-joining method (Saitou and Nei, 1987); genetic distances were determined as percentage identity between proteins. Numbers on nodes indicate percentage recovery of these nodes per 500 bootstrap replications. Gici, Ginglymostoma cirratum, nurse shark (Kasahara et al., 1993); Brre, Brachydanio rerio, zebrafish (Sültmann et al., 1993, 1995); Mosa, Morone saxatilis, striped bass (Hardee et al., 1995).

The usefulness of Mhc sequence information for fish taxonomy has thus far been minimal. The trees in Figs. 3, 4, and 5 are congruent with established relationships among fish taxa, but do not add new information because only very few sequences are available from different taxa. The number can, however, be expected to grow rapidly in the near future and, with it, the utility of Mhc sequence information. Moreover, Mhc genes are already being used to help resolve long-standing taxonomical problems by focusing on specific taxa. One example is the relationship among the Dipnoi, Crossopterygii, and tetrapods (reviewed by Meyer, 1995). Class I Mhc genes of the coelacanth, Latimeria chalumnae (Betz et al., 1994), and of the African lungfish Protopterus aethiopicus (A. Sato, H. Sültmann, and J. Klein, unpublished data) have been cloned and show that the coelacanth class I Mhc genes are more closely related to the amphibian homologs than any other fish class I genes, including the lungfish genes. The Mhc thus helps to resolve a dispute that so far has been based largely on morphological and paleontological data (but see Meyer and Wilson, 1990; Meyer and Dolven, 1992).

There are, however, at least two problems that could arise in applying Mhc sequences to systematic studies, one technical and the other interpretative. The technical problem lies in the difficulty of cloning Mhc genes from new taxa. Mhc sequences of distant taxa are so dissimilar that Mhc clones cannot be isolated by cross-hybridization. The only possibility is to use degenerate primers for polymerase chain reaction (PCR) amplification, but even then, success depends very much on luck and persistence. Although there are residues shared by all or most Mhc proteins of a particular class, they occur mostly at single sites scattered along the entire sequence and are therefore often not suitable for designing PCR primers. Nevertheless, Mhc genes have been cloned from different taxa and the success rate will undoubtedly increase as more sequences become available.

The interpretative problem lies in the fact that homology relationships among the Mhc genes are equivocal. The problem can be illustrated by a hypothetical example (Fig. 6). Consider an ancestral gene A that has duplicated in an ancestral species 1 and produced genes A1 and A2. The duplication then became fixed, and when two new species, 2 and 3, arose from the ancestral species 1, both duplicated genes were inherited. Since the time of the duplication, the A1 and A2 genes have been diverging from each other, first during the remaining time of existence of species 1 (time T1) and then after cladogenesis of species 1 into species 2 and 3 (time T2). Comparing A1 or A2 sequences from species 2 and 3 (orthologous genes) will reflect the species phylogeny, but comparison of A1 (A2) of species 2 with A2 (A1) of species 3 (paralogous genes) will not. The difficulty arises because it is not always possible to know whether a comparison is between orthologous or paralogous genes, especially when further duplications and deletions followed the initial event. The possibility of homoplasy exists in all multigene systems
FIGURE 5 Phylogenetic tree of fish class II β polypeptide chain sequences. The tree was constructed by the neighbor-joining method (Saitou and Nei, 1987); genetic distances were determined as percentage identity between proteins (horizontal axis: genetic distance, 0.0 to 0.3). Numbers on nodes indicate percentage recovery of these nodes per 500 bootstrap replications. Gici, Ginglymostoma cirratum, nurse shark (Bartl and Weissman, 1994); Onmy, Oncorhynchus mykiss, rainbow trout (Glamann, 1995); Sasa, Salmo salar, Atlantic salmon (Hordvik et al., 1993); Brre, Brachydanio rerio, zebrafish (Ono et al., 1992); Auha, Aulonocara hansbaenschi, cichlid fish (Ono et al., 1993c); Mosa, Morone saxatilis, striped bass (Walker and McConnell, 1994); Pore, Poecilia reticulata, guppy (Sato et al., 1995).
FIGURE 6 A hypothetical example of gene duplication and divergence within and between species. A, A1, and A2 are loci (represented by rectangles); T, time. For discussion, see text. [Diagram labels: duplication of A into A1 and A2, divergence during T1, and further divergence in the two descendant species during T2.]
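The scenario of Fig. 6 can be put into numbers with a minimal sketch. It assumes a strict molecular clock and ignores multiple substitutions at the same site; neither assumption comes from the chapter, and the function names and the rate value are hypothetical. Under these assumptions, orthologous copies in species 2 and 3 have been diverging for T2 along each lineage, whereas paralogous copies have been diverging for T1 + T2, so the paralogous distance is inflated by the factor 1 + T1/T2.

```python
def expected_divergence(rate, t1, t2):
    """Expected pairwise divergence under a strict clock (an assumption).

    rate: substitutions per site per unit time along one lineage (hypothetical)
    t1:   time from the duplication to the species split (T1 in Fig. 6)
    t2:   time from the species split to the present (T2 in Fig. 6)
    Returns (orthologous, paralogous) expected divergences.
    """
    orthologous = 2 * rate * t2         # e.g. A1 of species 2 vs A1 of species 3
    paralogous = 2 * rate * (t1 + t2)   # e.g. A1 of species 2 vs A2 of species 3
    return orthologous, paralogous


def paralogy_inflation(t1, t2):
    """Ratio of paralogous to orthologous divergence, equal to 1 + T1/T2."""
    if t2 <= 0:
        raise ValueError("t2 must be positive")
    return (t1 + t2) / t2
```

With T1 = 1 and T2 = 100 (arbitrary units) the ratio is close to 1, so a paralogous comparison is nearly as informative as an orthologous one; with T1 = T2 it doubles the apparent distance.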
16. Mhc in Fish Phylogeny
(and may not even be excluded in single gene systems because they, too, might once have been multigenic), but it is particularly acute in the Mhc, in which contractions and expansions of the cluster are frequent occurrences (Klein et al., 1993b). This problem, however, does not occur when T1 is much smaller than T2. In such situations, even the comparison of paralogous genes will provide a meaningful species phylogenetic tree. The fact that the available Mhc dendrograms are congruent with species phylogenies (Fig. 3), even though some of the former are almost certainly based on paralogous comparisons (all the Latimeria class I genes, for example, probably arose from an ancestral gene that emerged after the separation of Crossopterygii from other fish taxa), indicates that in "long distance" comparisons, paralogy is not a serious problem (this will be expanded on later). Mhc sequences, together with those of other nuclear genes, are therefore a useful source of phylogenetic information.
V. Cladistic Analysis with Macromutations
Macromutations are likely to be unique events. Although a gene can duplicate repeatedly, it is highly improbable that the different duplications will involve exactly the same DNA segment and hence be indistinguishable. This is also true of deletions, insertions, and other rearrangements. If a macromutation occurs and becomes fixed in an ancestral population before the latter gives rise to extant taxa, a synapomorphic character for cladistic analysis is generated. Macromutations are, of course, not restricted to the Mhc; they can occur at any other locus or chromosomal region. The Mhc, however, has the potential to become a very rich source of macromutations (as it is in mammals; see Mňuková-Fajdelová et al., 1994; Satta et al., 1996) for two reasons. First, a dense cluster of closely related genes is more likely to undergo rearrangements than a chromosomal region occupied by unrelated loci. Second, because of the considerable attention awarded to the Mhc, various macromutations are likely to be discovered in this chromosomal region by chance. One example of a cladistically useful macromutation serendipitously discovered during the studies of the Mhc in cichlid fishes is given below (Figueroa et al., 1995). As mentioned earlier, the organization of the Mhc exons and introns that code for the extracellular domains is the same in all genes studied thus far, with one exception. In Aulonocara hansbaenschi and other cichlids, the class IIB loci all contain an extra intron which splits the ILD-encoding exon 3 into two (Ono et al., 1993c; Fig. 7). Further examination has revealed the extra intron to be present not only in cichlids, but in all other Percomorpha examined, as well as in representative species of Atheriniformes and Cyprinodontifor-
FIGURE 7 Exon-intron organization of class IIB genes in zebrafish (Brre, Brachydanio rerio) and cichlid fish (Auha, Aulonocara hansbaenschi). Filled rectangles represent exons (E), open rectangles represent untranslated regions, and connecting lines represent introns (I). Border codon positions are indicated by numerals above the exons; numerals below two-way arrows give distances in base pairs (from Figueroa et al., 1995).
JAN KLEIN et al.
FIGURE 8 Distribution of the extra intron in Mhc class IIB genes among teleost fishes. +, presence; -, absence of the extra intron. The cladogram is based on Lauder and Liem (1983); the distribution of the extra intron is based on Figueroa et al. (1995). [Cladogram labels: Teleostei, Euteleostei, Neoteleostei; Osteoglossomorpha, Elopomorpha, Clupeomorpha, Ostariophysi, Protacanthopterygii (Salmoniformes), Paracanthopterygii, Atherinomorpha, Percomorpha, Acanthopterygii.]
mes; it is absent in Cypriniformes and Salmoniformes (Figueroa et al., 1995). This distribution supports the cladistic division of Euteleostei into Acanthopterygii and nonacanthopterygian taxa (Ostariophysi, Protacanthopterygii, and Paracanthopterygii; see Lauder and Liem, 1983; Fig. 8). The absence of the extra intron in Ostariophysi and Protacanthopterygii suggests that the intron arose after these two taxa diverged from the Neoteleostei; its presence in the different Acanthopterygii indicates that it arose before the radiation of this group and that the group might indeed be monophyletic. It has thus far not been possible to amplify class IIB genes from any of the representatives of the Paracanthopterygii tested and thus to determine whether the extra intron arose before or after the divergence of this group. The extra intron varies in length in the different species and, in some of the species, contains a hexameric repeat that is also present in the spliced transcript straddling the site interrupted by the intron in the genomic DNA. The intron may therefore have arisen by repeated tandem duplication of the hexamer (Figueroa et al., 1995).
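The cladistic reasoning in this section can be sketched in a few lines. The nested-tuple topology below is a simplified assumption made for illustration (it is not a transcription of the Lauder and Liem cladogram), and the function names are hypothetical. The sketch finds the smallest clade that contains every taxon carrying the extra intron and checks that no taxon known to lack it falls inside that clade, which is what a single origin of the intron would predict. Paracanthopterygii is left unscored because no class IIB genes could be amplified from it.

```python
# Simplified, assumed topology of the Euteleostei (illustration only).
TREE = ("Ostariophysi",
        ("Protacanthopterygii",
         ("Paracanthopterygii",
          ("Atherinomorpha", "Percomorpha"))))

# "+" intron present, "-" intron absent, None not determined.
STATE = {
    "Ostariophysi": "-",
    "Protacanthopterygii": "-",
    "Paracanthopterygii": None,
    "Atherinomorpha": "+",
    "Percomorpha": "+",
}


def leaves(tree):
    """Return the list of leaf names below a node."""
    if isinstance(tree, str):
        return [tree]
    names = []
    for child in tree:
        names.extend(leaves(child))
    return names


def smallest_clade(tree, targets):
    """Return the smallest subtree whose leaves include all targets."""
    if isinstance(tree, str):
        return tree
    for child in tree:
        if targets <= set(leaves(child)):
            return smallest_clade(child, targets)
    return tree


def single_origin_consistent(tree, state):
    """Return (clade, ok): ok is True if no '-' taxon lies inside the clade."""
    present = {taxon for taxon, s in state.items() if s == "+"}
    if not present:
        raise ValueError("no taxon carries the character")
    clade = smallest_clade(tree, present)
    absent_inside = [t for t in leaves(clade) if state.get(t) == "-"]
    return clade, not absent_inside
```

Scoring Paracanthopterygii as "+" or "-" in `STATE` moves the inferred origin of the intron to before or after the divergence of that group, which is the question the text leaves open.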
VI. Mhc Gene Frequencies in Populations Undergoing Adaptive Radiation
Functional Mhc loci are highly polymorphic in all vertebrate classes, including fishes (Ono et al., 1992, 1993b; Klein et al., 1993a). A hallmark of the Mhc polymorphism is its trans-species character: the fact that divergence of allelic lineages often predated species divergence (Klein, 1987; see Fig. 9). This long persistence of allelic lineages can be used to work out the phylogeny of species undergoing adaptive radiation, such as those of the haplochromine flock in Lake Victoria, East Africa. It is believed that the flock, which counts several hundred species, arose from a common ancestral species less than 1 million years ago (Greenwood, 1981; Meyer, 1993); in fact, at least some of the species may be less than 15,000 years old because there are indications that the lake may have dried up to a large extent some 13,000 to 15,000 years ago (Stager et al., 1986; Johnson et al., 1996). Morphological and behavioral
FIGURE 9 The principle of trans-species evolution of Mhc polymorphism. A species is represented as a gene pool and individual genes at one locus as circles (different shading indicates different alleles). Each row of circles represents one generation. Passage of ancestral polymorphism from species Z to species X and Y is shown.
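The picture in Fig. 9, a gene pool resampled row by row, corresponds to a simple resampling model. The sketch below is a neutral Wright-Fisher simulation, which is an assumption made only for illustration: it leaves out the balancing selection that acts on real Mhc loci, and the function name, allele labels, and population sizes are hypothetical. It shows two properties used in the argument: daughter species start from the ancestral allele frequencies and then drift apart, and an allele that reaches frequency zero cannot reappear.

```python
import random
from collections import Counter


def drift(freqs, n_copies, generations, rng):
    """Neutral Wright-Fisher resampling of allele frequencies at one locus.

    freqs:       dict mapping allele name to frequency (should sum to 1)
    n_copies:    number of gene copies sampled per generation
    generations: number of generations (rows of circles in Fig. 9)
    rng:         a random.Random instance, passed in for reproducibility
    """
    alleles = list(freqs)
    for _ in range(generations):
        weights = [freqs[a] for a in alleles]
        sample = rng.choices(alleles, weights=weights, k=n_copies)
        counts = Counter(sample)
        # An allele with zero copies gets weight 0 and can never be redrawn.
        freqs = {a: counts[a] / n_copies for a in alleles}
    return freqs


def shared_alleles(freqs_x, freqs_y):
    """Alleles still present in both daughter species."""
    return {a for a in freqs_x if freqs_x[a] > 0 and freqs_y.get(a, 0) > 0}
```

A typical use is to call `drift` twice on the same ancestral frequencies (species Z), once for species X and once for species Y, and then compare the two results with `shared_alleles`.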
characters of the species have been well studied (Regan, 1922; Fryer and Iles, 1972; Greenwood, 1981; Barel et al., 1977; Witte and van Oijen, 1990), but attempts to verify these classical studies using biochemical and molecular methods have failed because the species have little variability in either nuclear genes (Sage et al., 1984) or mitochondrial DNA (Meyer et al., 1990). To explain how Mhc polymorphism could be used to clarify the relationships among closely related species, consider what might have happened in the early phase of adaptive radiation. Assume that the flock was indeed founded, as all the available evidence indicates (Meyer, 1993), by a single stock. Mhc data to be described later suggest that the founding population was very large and evidently contained all the allelic lineages found in the extant flock. The founding stock was probably characterized by certain frequencies of the individual Mhc alleles. As the stock in the nascent lake split into populations occupying various emerging niches, the Mhc gene frequencies changed, especially if the splitting was accompanied by reductions in founding population sizes. The frequency changes were probably affected mainly by random genetic drift. Moreover, probably not all alleles were passed into the splitting populations so that the populations (emerging species) diverged gradually, not only with respect to allele frequencies, but also with regard to allele composition. This differentiation must have had a certain directionality reflecting the topology of the splitting process. For example, if an allele was lost in a particular ancestral node, all species derived from this node and all species derived from these species lacked this particular allele. The Mhc gene frequencies (together with frequencies at the microsatellite loci; see Sültmann and Mayer, 1997) can therefore be used for the construction of the dendrograms depicting phylogenetic relationships of the Lake Victoria haplochromines.
The proposal to use the Mhc in the study of recently diverged species may seem contradictory: earlier it was argued that for the Mhc to be suitable for phylogenetic inferences, the divergence times between taxa must be very long so that homology relationships among the genes do not influence the analysis. Now we suggest that the Mhc is also suitable for studying relationships among closely related species. Furthermore, it was mentioned earlier that the Mhc genes evolve under the influence of balancing selection and that, as a result, the Mhc gene trees do not match species trees, suggesting that the trans-specific persistence of allelic lineages makes the Mhc an unsuitable system for phylogenetic analysis. In reality, however, the proposal is not contradictory. The Mhc is appropriate for use when the taxa are either highly divergent or closely related; only in the middle range of divergence may serious problems arise. Two lines of argument can be put forward in support of using a locus under selection for phylogenetic analysis of young species, one based on theoretical considerations and the other on actual observations. In theory, natural selection may interfere with phylogenetic analysis because it may influence evolutionary rates of genes and lead to convergence of characters. An uneven, fluctuating evolutionary rate would influence the branch length of phylogenetic trees and make estimates of the time of branch divergence all but impossible. It should, however, not bias tree topology because several methods of phylogenetic reconstruction have been shown to perform well even under conditions of widely varying rates (Li et al., 1987).
Hence, if one is interested primarily in phylogenetic relationships among taxa and much less in the time of their divergence, a fluctuating evolutionary rate should not be a hindrance in using a locus under selection. Moreover, virtually all loci used in phylogenetic analysis are under negative selection, to which similar objections may apply. Finally, evidence shows that although the Mhc loci are under positive selection, they nevertheless evolve at a constant rate (Satta et al., 1991). The presence of selection should therefore not pose serious problems for using the Mhc to construct phylogenetic trees. In turning to the problem of parallelism, two types of convergence must be distinguished: in sequence and in gene frequencies. Evidence for sequence convergence at the PBR sites of functional Mhc genes is available (O'hUigin, 1995; Klein and O'hUigin, 1995). Sequence convergence, however, should not influence
the topologies of trees involving either very distantly or very closely related taxa. In the former case, the substitutions at the affected PBR sites have reached a saturation level and any effects of convergence have been obliterated. In the latter case, the probable convergences can be identified and eliminated by the removal of the PBR sites. Moreover, because the substitution rate at the Mhc loci outside the PBR sites is moderate (Satta et al., 1991), very few new substitutions can be expected (and, indeed, have been observed) to have arisen in the Lake Victoria haplochromines since they began to radiate. This slow divergence of Mhc genes precludes the use of substitutions as markers for the phylogenetic analysis of recently divergent species and makes the convergence argument irrelevant in this particular situation. As pointed out earlier, the general scarcity of sequence differences postdating speciation in Lake Victoria haplochromines makes it necessary to resort to sequence differences predating speciation and hence to the use of gene frequencies at the Mhc loci. Thus there is no contradiction in using a trans-specifically evolving genetic system and old allelic lineages to study recent speciations. The fact that sequence-based Mhc gene trees do not correlate with the species trees not only does not preclude the use of these genes in phylogeny analysis, it actually provides a unique opportunity for constructing gene frequency-based trees of the adaptively radiating species. Similar gene frequencies, like similar nucleotide substitutions, may of course be established independently in two taxa by selection. Because the selection pressure exerted on Mhc loci is from parasites, one could imagine that populations and species in different environmental niches come under the influence of different parasites and that, as a consequence, Mhc gene frequencies of these populations diverge. 
By the same token, Mhc gene frequencies in two different species exposed to the same parasites might be expected to converge so that gene frequencies will not reflect phylogenetic relationships among the species. There is, however, a powerful counterargument to this: If such convergences were taking place, the allelic lineages would not have persisted for over 30 million years. The trans-specific persistence of allelic lineages must indicate that the agent responsible for it must coevolve with the host. Hence, by focusing on old allelic lineages rather than on recent sequence variation, the gene frequency approach should provide meaningful information about the phylogenetic relationships among emerging species. If these theoretical propositions are not fully compelling, actual observations should be. The polymorphism of the HLA complex, the human Mhc, has been studied extensively in many populations and differences have been found in both allelic composition and allelic frequencies. Genetic distances have been calculated from the gene frequencies and used to construct dendrograms depicting the relationships of the various ethnic groups. The dendrograms have been shown to reflect the relationships of these groups, as inferred from the historical and archeological record as well as from the study of mitochondrial DNA, hemoglobin variants, microsatellite DNA, and other sources of genetic information, remarkably well (summarized in Cavalli-Sforza et al., 1994). Here, then, is a situation similar in many respects to that of the Lake Victoria haplochromines. The HLA system is undoubtedly under selection pressure, as is the haplochromine Mhc. Humans have spread out to inhabit far more diverse environmental niches than the haplochromines and have had many opportunities to become subjected to convergent selection pressures. The periods of relative isolation of the human populations are roughly comparable to the period of haplochromine radiation in Lake Victoria. The human populations have recently had far more opportunities for mixing than the haplochromine species during their divergence. And yet, the HLA gene frequencies still faithfully reflect the pattern of divergence of the human populations. It is believed, therefore, that there is a good chance that the Mhc gene and haplotype frequencies of the Lake Victoria cichlids (or, for that matter, of any other adaptively radiating species flock) will reflect the phylogenetic relationships among these species. The applicability of this approach to phylogenetic analysis depends on the availability of methods for rapid Mhc typing. 
FIGURE 10 An example of Mhc class II B patterns obtained by SSCP analysis of PCR products from Lake Victoria cichlid fishes. Each lane (A to H) contains DNA amplified from a different species. The primers used correspond to codon positions 114-120 and 173-179 of exon 3. PCR conditions: annealing temperature 65°C; 40 cycles, each cycle at 93°C for 15 sec, 65°C for 10 sec, and 72°C for 2 min.
In initial studies (Klein et al., 1993a; Ono et al., 1993b), DNA sequencing was used to resolve the individual Mhc genes, but, for obvious reasons, this method is not suitable for the large-scale screening of populations. There are, however, several alternative methods which involve considerably less investment in time and money but provide only a slightly lower resolution. Of these, the combination of single-stranded conformational polymorphism (SSCP) electrophoresis (Orita et al., 1989) and limited sequencing has proved to be the most economical. In this approach, locus-specific primers complementary to sequences flanking the highly polymorphic exon 2 of the class II B loci are used for amplification in the PCR and the amplification product is subjected to SSCP electrophoresis. The sensitivity of the SSCP method is such that it ideally detects differences restricted to a single site in a short DNA segment. The electrophoresis reveals the presence of different "patterns" (constellations of bands with different mobilities) among the individuals of a given population or species (Fig. 10). It is then only necessary to identify the bands of different patterns by sequencing. Individuals with the same pattern are assumed to carry the same Mhc alleles. Using this approach, it is possible to screen hundreds of samples within a short time and with a minimum of expenditure. For the approach to work, however, it is necessary to show that trans-species Mhc polymorphism is indeed widespread in a recently arisen species flock, such as that of Lake Victoria haplochromines. Although the full extent of Mhc polymorphism among cichlid fishes is not known, data obtained thus far indicate that this condition is fulfilled in the flocks of the Great East African lakes (Fig. 11). Frequent sharing of alleles between different species has been documented for both Lake Victoria, with its satellites, and Lake Malawi (Klein et al., 1993a; Ono et al., 1993b; E. Malaga, S. Kastilan, H. Sültmann, and J. Klein, unpublished data). Once frequencies of these shared alleles are determined by examining representative samples of the different species, it will be possible to begin reconstructing the phylogenies of the haplochromines from genetic distances. Because such reconstructions are based on a single locus (or a cluster of closely linked loci), they will be associated with large standard errors; however, when combined with data on microsatellite loci (see Sültmann and Mayer, 1997), they should provide solid grounds for a molecular interpretation of Lake Victoria haplochromine phylogeny and classification. In addition to sharing of identical alleles between
species, sharing of nearly identical alleles (i.e., those differing by only one or very few substitutions, in contrast to most Mhc alleles, which differ by many substitutions; in the case of haplochromines by as many as 47 substitutions in exon 2 alone) has also been observed. These alleles are presumably the result of recent divergence events, some of which may have occurred before each flock began to radiate, others after radiation. There are two ways of handling the nearly identical alleles in the present context. One could either treat each allele separately, even if it differs from another gene by a single substitution (as with genes at loci other than Mhc), or one could pool related genes and treat each group of closely related genes as one allele. The nature of the data will determine which of these two ways is the most informative. Substitutions in similar genes can also be treated as separate characters in parsimony analysis which may help define major lineages in the haplochromine flock.
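The two ways of handling nearly identical alleles, and the step from allele frequencies to a genetic distance, can be illustrated with a short sketch. The chapter does not name a distance measure, so the use of Nei's standard genetic distance here is an assumption, as are the function names and the allele labels. `pool_alleles` merges closely related alleles into one lineage before the distance is computed; skipping that step treats every allele separately.

```python
import math


def pool_alleles(freqs, groups):
    """Sum the frequencies of alleles assigned to the same lineage.

    freqs:  dict mapping allele name to frequency in one species
    groups: dict mapping allele name to lineage name (hypothetical labels);
            alleles not listed are kept as their own lineage
    """
    pooled = {}
    for allele, f in freqs.items():
        key = groups.get(allele, allele)
        pooled[key] = pooled.get(key, 0.0) + f
    return pooled


def nei_distance(x, y):
    """Nei's standard genetic distance for a single locus.

    D = -ln( sum(x_i * y_i) / sqrt(sum(x_i^2) * sum(y_i^2)) )
    Returns infinity when the two species share no allele.
    """
    alleles = set(x) | set(y)
    jxy = sum(x.get(a, 0.0) * y.get(a, 0.0) for a in alleles)
    jx = sum(v * v for v in x.values())
    jy = sum(v * v for v in y.values())
    if jxy == 0:
        return math.inf
    return -math.log(jxy / math.sqrt(jx * jy))
```

Two species that carry alleles differing by a single substitution look distinct when each allele is treated separately but identical once the alleles are pooled, which is why the choice between the two treatments depends on the data. A matrix of such distances could then be passed to a tree-building method such as neighbor joining.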
VII. Conclusion
Although the study of the fish Mhc began only recently (Hashimoto et al., 1990), Mhc genes have now been cloned from at least a dozen species representing four orders of bony fishes (Cypriniformes, Salmoniformes, Cyprinodontiformes, and Perciformes; see Table I). Mhc genes have also been identified in representative species of cartilaginous fishes, Ginglymostoma cirratum (Kasahara et al., 1992; Bartl and Weissman, 1994) and Triakis scylla (Hashimoto et al., 1992). In most species, characterization has not progressed beyond initial identification, but in a few (namely the zebrafish, Danio rerio, and some of the cichlid species of the East African Great Lakes), it has provided information about expression, exon-intron organization, linkage relationships, polymorphism, distribution of variability, and other characteristics (Klein et al., 1993a; Ono et al., 1992, 1993a,b,c,d; Sültmann et al., 1993, 1994, 1995; Takeuchi et al., 1995; Figueroa et al., 1995). As more species are covered, progress in the fish Mhc study can be expected to accelerate and be accompanied by an increased use of the Mhc in resolving problems connected with fish taxonomy and phylogeny along the lines described in this chapter. The greatest contribution of the Mhc to fish systematic studies will probably be made by the analysis of recently radiating species flocks, where the Mhc will provide one of only a few tools available for the elucidation of molecular phylogenies. In this regard, the Mhc studies on the cichlid
FIGURE 11 Evidence for trans-species Mhc class IIB gene polymorphism among haplochromine cichlids of the Lake Victoria basin. Genetic distances (horizontal axis, 0.0 to 0.15) were calculated using the two-parameter method of Kimura (1980) on available exon 2 sequences. The tree was constructed by the neighbor-joining method of Saitou and Nei (1987). Full circles indicate identical sequences in different species, whereas open circles indicate sequences differing by one nucleotide substitution. The sequences are from Klein et al. (1993a), Ono et al. (1993b), and E. Malaga, S. Kastilan, H. Sültmann, and J. Klein (unpublished data). Species abbreviations: Thsp, Thoracochromis sp. (formerly Asnu, Astatotilapia nubila); Asal, Astatoreochromis alluaudi; Hapy, Haplochromis pyrocephalus; Hani, H. nigricans; Hasa, H. sauvagii; Hapl, H. plagiodon; Havn, H. venator; Havl, H. velifer; Oral, Oreochromis alcalicus; Alal, O. alcalicus alcalicus. Lake abbreviations: V, Victoria; K, Kayugi; N, Nabugabo; A, Natron; G, Magadi.
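The two-parameter distance of Kimura (1980) named in the caption of Fig. 11 can be computed directly from a pair of aligned sequences. The sketch below is a minimal illustration rather than the procedure used for the figure: the function name is hypothetical, and skipping sites with gaps or ambiguous bases is an assumption. With P the proportion of sites differing by a transition and Q the proportion differing by a transversion, the distance is d = -(1/2) ln[(1 - 2P - Q) sqrt(1 - 2Q)].

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura two-parameter distance between two aligned DNA sequences.

    Sites where either sequence has a character other than A, C, G, or T
    are skipped (an assumption of this sketch). math.log or math.sqrt
    raises ValueError if the sequences are too divergent for the formula.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))
```

For the nearly identical alleles marked by open circles in the figure, which differ by one substitution, the distance is small and close to the raw proportion of differing sites.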
TABLE I List of Cloned Fish Mhc Genes and Gene Segments
Species
Class I A
Chondrichthyes Triakis scylla Ginglymostoma cirratum
Osteichthyes Cyprinus carpio
Class I B
Class II A
Class II B
One exon cDNA Genomic (one exon)
cDNA
Hashimoto et al. (1992) Kasahara et al. (1992) Bartl and Weissman (1994)
cDNA Genomic (partial)
cDNA
cDNA
cDNA Genomic (three exons)
Brachydanio rerio
cDNA Genomic (partial)
cDNA Genomic
cDNA Genomic (one exon)
cDNA Genomic (two exons)
Salmo salar
cDNA Genomic (one exon)
Oncorhynchus mykiss Aulonocara hansbaenschi and African cichlids
cDNA Genomic (one exon) cDNA cDNA Genomic
cDNA Genomic
Oreochromis niloticus Gymnogeophagus australis
Genomic (partial)
Perca fluviatilis Gymnocephalus cernua Melanotaenia trifasciata Gasterosteus aculeatus Fugu rubripes Morone saxatilis
cDNA
Poecilia reticulata
Latimeria chalumnae
cDNA Genomic (two exons) Genomic (three exons)
fishes of the East African Great Lakes may serve as a model for similar studies of other species flocks.
Acknowledgments
We thank Ms. Lynne Yakes as well as Ms. Donna Devine for editorial assistance and Ms. Anica Milosev for the preparation of the computer graphics. The experimental work mentioned in this contribution was supported, in part, by Grant A123667 from the National Institutes of Health, Bethesda, Maryland.
Reference
Genomic (intron 3 and flanks) Genomic (intron 3 and flanks) Genomic (intron 3 and flanks) Genomic (intron 3 and flanks) Genomic (intron 3 and flanks) Genomic (intron 3 and flanks) cDNA
cDNA Genomic (one exon)
Van Erp et al. (1996a,b) Dixon et al. (1993) Ono et al. (1993d) Hashimoto et al. (1990) Ono et al. (1992) Sültmann et al. (1993, 1994) Takeuchi et al. (1995) Grimholt et al. (1993) Hordvik et al. (1993) Grimholt et al. (1994) Glamann (1995) Klein et al. (1993a); Sato et al. (1997) Ono et al. (1993b) Dixon et al. (1993) Figueroa et al. (1995) Figueroa et al. (1995) Figueroa et al. (1995) Figueroa et al. (1995) Figueroa et al. (1995) Lim and Brenner (1995) Walker and McConnell (1994) Hardee et al. (1995) Sato et al. (1995)
Betz et al. (1994)
References Barel, C. D. N., Van Oijen, M. J. P., Witte, F., and Witte-Mass, E. 1977. An introduction to the taxonomy and morphology of the haplochromine cichlidae from Lake Victoria. Neth. J. Zool. 27: 333-389. Bartl, S., and Weissman, I. 1994. Isolation and characterization of major histocompatibility complex class II B genes from the nurse shark. Proc. Natl. Acad. Sci. USA 91:262-266. Betz, U. A. K., Mayer, W. E., and Klein, J. 1994. Major histocompatibility complex class I genes of the coelecanth Latimeria chalumnae. Proc. Natl. Acad. Sci. USA 91:11065-11069.
282
JAN KLEIN et al.
Cavalli-Sforza, L. L., Menozzi, P., and Piazza, A. 1994. "The History and Geography of Human Genes." Princeton University Press, Princeton, NJ. Dixon, B., R. J. M. Stet, Van Erp, S. H. M., and Pohajdak, B. 1993. Characterization of ~2-microglobulin transcripts from two teleost species. Immunogenetics 38:27-34. Figueroa, F., Ono, H., Tichy, H., O'hUigin, C., and Klein, J. 1995. Evidence for insertion of a new intron into an Mhc gene of perch-like fish. Proc. R. Soc. Lond. B 259:325-330. Fryer, G., and Iles, T. D. 1972. "The Cichlid Fishes of the Great Lakes of Africa. TFH Publications, Neptune City, NJ. Glamann, J. 1995. Complete coding sequence of rainbow trout Mhc I113 chain. Scand. J. Immunol. 41: 365-372. Greenwood, P. H. 1981. "The Haplochromine Fishes of the East African Lakes." Cornell University Press, Ithaca, NY. Grimholt, U., Hordvik, I., Fosse, V. M., Olsaker, I., Endresen, C., and Lie, f~. 1993. Molecular cloning of major histocompatibility complex class I cDNAs from Atlantic salmon (Salmo salar). Immunogenetics 37: 469-473. Grimholt, U., Olsaker, I., De Vries Linstrom, C., and Lie, f~. 1994. A study of variability in the MHC class II ]31 and the MHC class I a2 domain exons of Atlantic salmon (Salmo salar). Anim. Genet. 25:147-153.
Hardee, J. J., Godwin, U., Benedetto, R., and McConnell, T. J. 1995. Major histocompatibility complex class II A gene polymorphism in the striped bass. Immunogenetics 41:229-238. Hashimoto, K., Nakanishi, T., and Kurosawa, Y. 1990. Isolation of carp genes encoding major histocompatibility complex antigens. Proc. Natl. Acad. Sci. USA 87:6863-6867. Hashimoto, K., Nakanishi, T., and Kurosawa, Y. 1992. Identification of a shark sequence resembling the major histocompatibility complex class I a3 domain. Proc. Natl. Acad. Sci. USA 89:22092212. Hordvik, I., Grimholt, U., Fosse, V. M., Lie, f~, and Endresen, C. 1993. Cloning and sequence analysis of cDNAs encoding the MHC class II ]3 chain in Atlantic salmon (Salmo salar). Immunogenetics 37: 437-441. Hughes, A. L., and Nei, M. 1988. Pattern of nucleotide substitution at major histocompatibility complex class I loci reveals overdominant selection. Nature 335:167-170. Johnson, T. C., Scholz, C. A., Talbot, M. R., Kelts, K., Ricketts, R. D., Ngobi, G., Beuning, K., Ssemmanda, I., and McGill, J. W. 1996. Late Pleistocene desiccation of Lake Victoria and rapid evolution of cichlid fishes. Science 273:1091-1093. Kasahara, M., McKinney, E. C., Flajnik, M. F., and Ishibashi, T. 1993. The evolutionary origin of the major histocompatibility complex: Polymorphism of class II a chain genes in the cartilaginous fish. Eur. J. Immunol. 23:2160-2165. Kasahara, M., Flajnik, M. F., Ishibashi, T., and Natori, T. 1995. Evolution of the major histocompatibility complex: A current overview. Transplant. Immunol. 3:1-20. Kasahara, M., Vazquez, M., Sato, K., McKinney, E. C., and Flajnik, M. F. 1992. Evolution of the major histocompatibility complex: Isolation of a class II A gene from the cartilaginous fish. Proc. Natl. Acad. Sci. USA 89: 6688-6692. Kimura, M. 1980. A simple method for estimating evolutionary rates of base substitutions through comparative studies of nucleotide sequences. J. Mol. Evol. 16:111 - 120. Kimura, M. 1983. 
"The Neutral Theory of Molecular Evolution." Cambridge University Press, Cambridge, UK. Klein, D., Ono, H., O'hUigin, C., Vincek, V., Golschmidt, T., and Klein, J. 1993a. Extensive MHC variability in cichlid fishes of Lake Malawi. Nature 364: 330-334. Klein, J. 1986. "Natural History of the Major Histocompatibility Complex." Wiley, New York. Klein, J. 1987. Origin of major histocompatibility complex polymor-
phism: The trans-species hypothesis. Hum. Immunol. 19:155-162.
Klein, J., and O'hUigin, C. 1993. Composite origin of major histocompatibility complex genes. Curr. Opin. Genet. Dev. 3:923-930.
Klein, J., and O'hUigin, C. 1995. Class II B Mhc motifs in an evolutionary perspective. Immunol. Rev. 143:89-111.
Klein, J., Ono, H., Klein, D., and O'hUigin, C. 1993b. The accordion model of Mhc evolution. In "Progress in Immunology" (J. Gergely and G. Petranyi, eds.), Vol. 8, pp. 137-143. Springer-Verlag, Heidelberg.
Lauder, G. V., and Liem, K. F. 1983. The evolution and interrelationships of the actinopterygian fishes. Bull. Mus. Comp. Zool. 150:95-197.
Li, W.-H., Wolfe, K. H., Sourdis, J., and Sharp, P. M. 1987. Reconstruction of phylogenetic trees and estimation of divergence times under nonconstant rates of evolution. Cold Spring Harbor Symp. Quant. Biol. 52:847-856.
Lim, E. H., and Brenner, S. 1995. Sequence analysis of Mhc class II β-like fragments in the pufferfish, Fugu rubripes. Immunogenetics 42:432-433.
Meyer, A. 1993. Phylogenetic relationships and evolutionary processes in East African cichlid fishes. Trends Ecol. Evol. 8:279-284.
Meyer, A. 1995. Molecular evidence on the origin of tetrapods and the relationships of the coelacanth. Trends Ecol. Evol. 10:111-116.
Meyer, A., and Dolven, S. I. 1992. Molecules, fossils, and the origin of tetrapods. J. Mol. Evol. 35:102-113.
Meyer, A., Kocher, T. D., Basasibwaki, P., and Wilson, A. C. 1990. Monophyletic origin of Lake Victoria cichlid fishes suggested by mitochondrial DNA sequences. Nature 347:550-553.
Meyer, A., and Wilson, A. C. 1990. Origin of tetrapods inferred from their mitochondrial DNA affiliation to lungfish. J. Mol. Evol. 31:359-364.
Mňukova-Fajdelova, M., Satta, Y., O'hUigin, C., Mayer, W. E., Figueroa, F., and Klein, J. 1994. Alu elements of the primate major histocompatibility complex. Mamm. Genome 5:405-415.
O'hUigin, C. 1995. Quantifying the degree of convergence in primate Mhc-DRB genes. Immunol. Rev. 143:123-140.
Okamura, K., Nakanishi, T., Kurosawa, Y., and Hashimoto, K. 1993. Expansion of genes that encode MHC class I molecules in cyprinid fishes. J. Immunol. 151:188-200.
Ono, H., Figueroa, F., O'hUigin, C., and Klein, J. 1993a. Cloning of the β2-microglobulin gene in the zebrafish. Immunogenetics 38:1-10.
Ono, H., Klein, D., Vincek, V., Figueroa, F., O'hUigin, C., Tichy, H., and Klein, J. 1992. Major histocompatibility complex class II genes of zebrafish. Proc. Natl. Acad. Sci. USA 89:11886-11890.
Ono, H., O'hUigin, C., Tichy, H., and Klein, J. 1993b. Major-histocompatibility-complex variation in two species of cichlid fishes from Lake Malawi. Mol. Biol. Evol. 10:1060-1072.
Ono, H., O'hUigin, C., Vincek, V., and Klein, J. 1993c. Exon-intron organization of fish major histocompatibility complex class IIB genes. Immunogenetics 38:223-234.
Ono, H., O'hUigin, C., Vincek, V., Stet, R. J. M., Figueroa, F., and Klein, J. 1993d. New β chain-encoding Mhc class II genes in the carp. Immunogenetics 38:146-149.
Orita, M., Iwahana, H., Kanazawa, H., Hayashi, K., and Sekiya, T. 1989. Detection of polymorphisms of human DNA by gel electrophoresis as single-strand conformation polymorphism. Proc. Natl. Acad. Sci. USA 86:2766-2770.
Regan, C. T. 1922. The cichlid fishes of Lake Victoria. Proc. Zool. Soc. 11:157-191.
Sage, R. D., Loiselle, P. V., Basasibwaki, P., and Wilson, A. C. 1984. Molecular versus morphological change among cichlid fishes of Lake Victoria. In "Evolution of Fish Species Flocks" (A. A. Echelle and I. Kornfield, eds.), pp. 185-20. University of Maine at Orono Press, Orono.
Saitou, N., and Nei, M. 1987. The neighbor-joining method: A new method for reconstructing phylogenetic trees. Mol. Biol. Evol. 4:406-425.
Sato, A., Figueroa, F., O'hUigin, C., Reznick, D. N., and Klein, J. 1995. Major histocompatibility complex genes of the guppy, Poecilia reticulata: Identification and cloning. Immunogenetics 43:38-49.
Sato, A., Klein, D., Sültmann, H., Figueroa, F., O'hUigin, C., and Klein, J. 1997. Class I Mhc genes of cichlid fishes: Identification, expression, and polymorphism. Immunogenetics, in press.
Satta, Y., Mayer, W. E., and Klein, J. 1996. Evolutionary relationship of HLA-DRB genes inferred from intron sequences. J. Mol. Evol. 42:648-657.
Satta, Y., Takahata, N., Schönbach, C., Gutknecht, J., and Klein, J. 1991. Calibrating evolutionary rates at major histocompatibility complex loci. In "Molecular Evolution of the Major Histocompatibility Complex" (J. Klein and D. Klein, eds.), pp. 51-62. Springer-Verlag, Heidelberg.
Srivastava, R., Ram, B. P., and Tyle, P. (eds.). 1991. "Immunogenetics of the Major Histocompatibility Complex." VCH Publishers, New York.
Stager, J. C., Reinthal, P. N., and Livingstone, D. A. 1986. A 25,000 year history for Lake Victoria, East Africa, and some comments on its significance for the evolution of cichlid fishes. Freshwat. Biol. 16:15-19.
Sültmann, H., and Mayer, W. E. 1997. Reconstruction of cichlid fish phylogeny using nuclear DNA markers. In "Molecular Systematics of Fishes" (T. D. Kocher and C. A. Stepien, eds.), pp. 39-51. Academic Press, San Diego.
Sültmann, H., Mayer, W. E., Figueroa, F., O'hUigin, C., and Klein, J. 1993. Zebrafish Mhc class II α chain-encoding genes: Polymorphism, expression, and function. Immunogenetics 38:408-420.
Sültmann, H., Mayer, W. E., Figueroa, F., O'hUigin, C., and Klein, J. 1994. Organization of Mhc class IIB genes in the zebrafish (Brachydanio rerio). Genomics 23:1-14.
Sültmann, H., Mayer, W. E., Figueroa, F., Tichy, H., and Klein, J. 1995. Phylogenetic analysis of cichlid fishes using nuclear DNA markers. Mol. Biol. Evol. 12:1033-1047.
Takahata, N., Satta, Y., and Klein, J. 1992. Polymorphism and balancing selection at major histocompatibility complex loci. Genetics 130:925-938.
Takeuchi, H., Figueroa, F., O'hUigin, C., and Klein, J. 1995. Cloning and characterization of class I Mhc genes of the zebrafish, Brachydanio rerio. Immunogenetics 42:77-84.
Van Erp, S. H. M., Dixon, B., Figueroa, F., Egberts, E., and Stet, R. 1996a. Identification and characterization of a new major histocompatibility complex class I gene in carp (Cyprinus carpio L.). Immunogenetics 44:49-61.
Van Erp, S. H. M., Egbert, E., and Stet, R. J. 1996b. Characterization of class II A and B genes in a gynogenetic carp clone. Immunogenetics 44:192-202.
Walker, R. B., and McConnell, T. J. 1994. Variability in an MhcMosa class II β chain-encoding gene in the striped bass (Morone saxatilis). Dev. Comp. Immunol. 18:325-342.
Witte, F., and Van Oijen, M. J. P. 1990. Taxonomy, ecology and fishery of Lake Victoria haplochromine trophic groups. Zool. Verh. Leiden 262:1-47.
CHAPTER 17

The Phylogenetic Utility of the Mitochondrial Cytochrome b Gene for Inferring Relationships among Actinopterygian Fishes

CHARLES LYDEARD and KEVIN J. ROE
Aquatic Biology Program, University of Alabama
Department of Biological Sciences
Tuscaloosa, Alabama 35487

I. Introduction
The introduction of conserved "universal" primers (Kocher et al., 1989), which permit amplification of specific regions of homologous DNA via the polymerase chain reaction (PCR) (Saiki et al., 1985), has offered tremendous opportunities for macro- and microevolutionary studies for a wide taxonomic array of species. Although Kocher et al. (1989) introduced primers that can amplify portions of three different mitochondrial genes, they focused on a 307-bp segment of the cytochrome b gene. They concluded that the short cytochrome b gene sequence is "a versatile source of phylogenetic information," thus setting the stage for many future molecular systematic studies.

The cytochrome b gene is found in the mitochondrial genome of nearly all eukaryotic organisms and in many diverse prokaryotes, indicating a very ancient origin (Esposti et al., 1993). Indeed, the presence of cytochrome b and other mitochondrial genes in prokaryotes led, in part, to the now widely accepted endosymbiotic model of eukaryotic origins (Margulis, 1970; Yang et al., 1985). Cytochrome b is a transmembrane protein that is the central catalytic subunit of ubiquinol:cytochrome c reductase, an enzyme present in the respiratory chain of mitochondria. Based on the analysis of protein sequences and studies of mutants, cytochrome b is one of the best characterized proteins in terms of its structure and function (see Esposti et al., 1993). Studies of amino acid variation, in conjunction with knowledge of inferred structural models of cytochrome b, reveal some highly conservative regions (e.g., the outer surface of the protein) and other regions that exhibit considerable variability (e.g., transmembrane and innermembrane regions) (Irwin et al., 1991; Esposti et al., 1993).

Like other nuclear and mitochondrial protein-coding genes, cytochrome b exhibits evolutionary rate variation among codon positions and in types of nucleotide substitutions. For example, transitions predominate over transversions by a factor of at least 10 (Brown et al., 1982). Furthermore, transitions at third codon positions do not usually result in amino acid substitutions. As a consequence, third codon positions are under fewer selective constraints and hence evolve faster than first and second positions. The presence of both slowly and rapidly evolving codon positions and of conservative and variable regions within the cytochrome b gene suggests
that the gene may be useful for a diversity of systematic questions.

Since 1989, DNA sequences of the mitochondrial cytochrome b gene have been used for many phylogenetic studies, particularly of vertebrates, including mammals (e.g., Irwin et al., 1991; Krajewski et al., 1992), birds (e.g., Edwards et al., 1991; Avise et al., 1994), reptiles (e.g., Lamb and Lydeard, 1994; Lamb et al., 1994), amphibians (e.g., Moritz et al., 1992), and, of course, fishes. Among fishes, the cytochrome b gene has been used to address many phylogenetic questions, ranging from relationships among closely related cichlids of Lake Victoria in Africa (Meyer et al., 1990) to deep phylogenetic questions such as the relationships among living sarcopterygian fishes and tetrapods, which diverged over 400 million years ago (Meyer and Wilson, 1990; Meyer and Dolven, 1992; Hedges et al., 1993). Table I shows some examples of phylogenetic studies of fishes using cytochrome b DNA sequence data.

TABLE I Molecular Systematic Studies^a Conducted on Fishes (i.e., the Entire Paraphyletic Assemblage) That Have Employed Cytochrome b Gene Data^b

Study                          Group                   Sequence data
Kocher et al. (1989)           Cichlids                307-bp 5' region
Meyer and Wilson (1990)        Sarcopterygians         360-bp 5' region
Meyer et al. (1990)            African cichlids        363-bp 5' region
McVeigh et al. (1991)          Salmo salar             295-bp 5' region
Meyer et al. (1991)            African cichlids        363-bp 5' region
Normark et al. (1991)          Neopterygian fishes     294-bp 5' region
Bernardi and Powers (1992)     Elasmobranchs           307-bp 5' region
Grachev et al. (1992)          Cottids                 382-bp 5' region
Martin et al. (1992)           Elasmobranchs           Complete gene
Sturmbauer and Meyer (1992)    African cichlids        400-bp 5' region
Block et al. (1993)            Scombroids              600-bp 5' region
Sturmbauer and Meyer (1993)    African cichlids        402-bp 5' region
Hedges et al. (1993)           Sarcopterygians         282-bp 5' region
Meyer et al. (1994)            Poeciliids              360-bp 5' region
Orti et al. (1994)             Gasterosteids           747-bp 5' region
Patarnello et al. (1994)       Salmonids               249-bp 5' region
Sturmbauer et al. (1994)       African cichlids        402-bp 5' region
Zhu et al. (1994)              Melanotaeniids          351-bp 5' region
Bernardi and Powers (1995)     Funduline killifishes   270-bp 5' region
Grant and Riddle (1995)        Killifishes             314-bp 5' region
Lydeard et al. (1995a,b)       Poeciliids              402-bp 5' region
Slobodyanyuk et al. (1995)     Cottids                 402-bp 5' region
Schmidt and Gold (1995)        Cyprinids               512-bp 5' region

^a This list is not exhaustive and does not necessarily include all papers that stemmed from a single data set, nor does it include all papers that have used a single ray-finned fish as an outgroup for examining relationships among tetrapods.
^b Studies are listed chronologically and include the group of fishes studied and the amount of DNA sequence data (note the general increase in the number of studies over the years employing the cytochrome b gene).

Due to the widespread use of the cytochrome b gene in many molecular systematic studies and its relatively well understood structure and function, one may think it is the gene of choice, regardless of the nature of the question. However, Meyer (1994) has suggested that the popularity of the cytochrome b gene is actually a historical accident. In other words, if the focus of the Kocher et al. (1989) study had been on the 12S rRNA gene or on another protein-coding gene, use of the cytochrome b gene might never have become as prevalent as it is now. One could also argue that its popularity and widespread use are partly a function of how extremely well the primers described by Kocher et al. (L14841/H15149) and later the more popular primer pair [L14724 (Pääbo, 1990)/H15149] worked for both experienced and beginner molecular systematists. The large numbers of sequences now available for this gene allow detailed phylogenetic discrimination. Furthermore, the wealth of sequence data that has been generated for so many different types of phylogenetic questions has provided opportunities to examine the molecular evolution of the cytochrome b gene in great detail. With these studies, some investigators have concluded that the gene is of limited use for deeper phylogenetic questions (e.g., Graybeal, 1993, 1994; Hillis and Huelsenbeck, 1992; Meyer, 1994).

The objective of this chapter is to test the phylogenetic utility of this gene more fully by estimating relationships of the monophyletic Actinopterygii, or ray-finned fishes, at different hierarchic levels. Modern actinopterygians are the most diverse of all vertebrate groups and include more than 25,000 species (Nelson, 1994).
Although a well-corroborated phylogeny does not exist for the entire group, a concerted effort by many systematists since the early 1970s, relying principally on morphological data and cladistic methodology, has markedly advanced our understanding of phylogenetic relationships. In order to assess the utility of the cytochrome b gene, actinopterygian fishes representing a diverse array of taxa and divergence times were selected. Figure 1 shows a phylogenetic hypothesis of actinopterygian fishes including only the taxa examined in this chapter, which is based on morphological data by Lauder and Liem (1983), Rosen (1985), Stiassny (1986, 1991), Sanford (1990), Begle (1991), and Johnson (1992). Estimates of divergence times based on fossil evidence (Carroll, 1988; Benton, 1990; Patterson, 1993) are provided for some of the nodes showing the time span covered in this study (Fig. 1). Although incomplete, inclusion of this broad assemblage of taxa will enable a better assessment of how the gene performs for different taxonomic levels and will suggest which nucleotide substitutions reflect the most reliable phylogenetic signal.

FIGURE 1 A morphological-based phylogenetic hypothesis of actinopterygian fishes based on Lauder and Liem (1983), Rosen (1985), Stiassny (1986, 1991), Sanford (1990), Begle (1991), and Johnson (1992).

In an ideal setting, the best way to evaluate the phylogenetic utility of a gene tree would be comparison with the known species tree or at least a well-corroborated phylogeny based on an independently derived data set. Unfortunately, although a phylogenetic hypothesis of actinopterygian fishes exists (Fig. 1), it is not well corroborated; however, there are certain aspects that are agreed upon by most ichthyological systematists. Therefore, the performance of the cytochrome b gene will be evaluated using taxonomic congruence. The observation of congruent patterns between the molecular phylogeny and the morphological-based phylogeny indicates that the two independently derived phylogenies have converged on the best estimate of the true phylogeny. Areas of incongruence between morphological- and molecular-based phylogenetic hypotheses may be due to several factors, e.g., (1) the gene tree is incorrect and does not provide useful phylogenetic information at that particular hierarchic level; (2) the morphological-based tree is incorrect; (3) both trees are correct, but neither tree necessarily reflects the species tree; or (4) both trees are incorrect because data are ambiguous. Areas of incongruence found between molecular- and morphological-based phylogenies are discussed in the hope that additional morphological and molecular data will eventually reveal the factor that contributed the most to the incongruence exhibited in this "real-world" situation. Although this chapter focuses on the cytochrome b gene, this study may serve as a model for further studies that examine the utility of other genes.
II. Materials and Methods
Table II lists the 31 actinopterygian species examined in this study and their current classification (Nelson, 1994). All 12 Neotropical cichlid species and the damselfish (Pomacentrus sp., Pomacentridae) were collected from the wild (77% of specimens) or obtained from reliable aquarists (actual collection locales or sources are available from C. Lydeard). These 13 specimens were preserved in >75% ethanol, and total genomic DNA was isolated by standard phenol/chloroform extraction. Approximately 100 ng of genomic DNA provided a template for double-stranded amplification via PCR in 25 µl of a reaction solution containing each dNTP at 0.1 mM, cytochrome b primers L14724 (Pääbo, 1990) and H15915 (Irwin et al., 1991) at 1.0 µM, 4.0 mM MgCl2, 2.5 µl of 10× reaction buffer, and 1.25 units of AmpliTaq polymerase. Reactions were amplified for 32 cycles, each involving denaturation at 92°C for 45 sec, annealing at 52°C for 45 sec, and extension at 72°C for 45 sec. Single-stranded DNA was obtained by asymmetric amplification (Gyllensten and Erlich, 1988), using primer L14724 in limited quantity, concentrated on Millipore Ultrafree MC filters, and sequenced using the Sequenase version 2 kit (U.S. Biochemical) with 35S-labeled dATP. Overlapping primers L14724, L14952, L15093, L15162, L15299, L15379, L15567, and L15767 were used as sequencing primers for each specimen (Table III).
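As a quick check on the amplification conditions just described, the following minimal Python sketch converts the stated final concentrations into amounts per 25-µl reaction and totals the programmed hold time of the 32-cycle profile. It is illustrative arithmetic only, not part of the authors' protocol: the function names are hypothetical, the 1.0 µM primer concentration is assumed to apply to each primer, and ramping time between temperature steps is ignored.

```python
# Illustrative arithmetic for the PCR conditions given in the text.
# Assumptions (not stated in the source): 1.0 uM applies to each primer,
# and ramp times between temperature steps are ignored.

REACTION_VOLUME_L = 25e-6  # 25 ul reaction

FINAL_CONC_MOLAR = {
    "each dNTP": 0.1e-3,    # 0.1 mM
    "each primer": 1.0e-6,  # 1.0 uM (assumed per primer)
    "MgCl2": 4.0e-3,        # 4.0 mM
}

# One cycle: (step, temperature in degrees C, hold time in seconds).
CYCLE = [("denaturation", 92, 45), ("annealing", 52, 45), ("extension", 72, 45)]
N_CYCLES = 32


def amount_pmol(conc_molar, volume_l=REACTION_VOLUME_L):
    """Amount of a component in picomoles, from n = c * V."""
    return conc_molar * volume_l * 1e12


def programmed_minutes(cycle=CYCLE, n_cycles=N_CYCLES):
    """Total programmed hold time in minutes (no ramping included)."""
    return n_cycles * sum(seconds for _, _, seconds in cycle) / 60


if __name__ == "__main__":
    for name, conc in FINAL_CONC_MOLAR.items():
        print(f"{name}: {amount_pmol(conc):.0f} pmol per reaction")
    print(f"programmed hold time: {programmed_minutes():.0f} min")
```

Under these assumptions each reaction contains 25 pmol of each primer and 2.5 nmol of each dNTP, and the cycling steps alone account for 72 min.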
The mitochondrial cytochrome b gene sequences for the remaining 18 ray-finned fishes (Table II) were retrieved from GenBank and include the following: white sturgeon, Acipenser transmontanus (Brown et al., 1989; embl X14944); common carp, Cyprinus carpio (Chang et al., 1994; embl X61010); blacktip shiner, Lythrurus atrapiculus; cherryfin shiner, Lythrurus roseipinnis; golden shiner, Notemigonus crysoleucas (Schmidt and Gold, 1993, unpublished; gb U17271, X66456, U01318, respectively); flat loach, Crossostoma lacustre (Tzeng et al., 1992; gb M91245); rainbow trout, Oncorhynchus mykiss (Zardoya et al., unpublished observations; gb L29771); Atlantic cod, Gadus morhua (Johansen and Johansen, 1994; embl X76366); yellow bass, Morone mississippiensis; stripetail darter, Etheostoma kennicotti (Song, 1994); largemouth bass, Micropterus salmoides (Whitmore et al., 1994; gb L14074); bonito, Sarda sarda; bluefin tuna, Thunnus thynnus; common mackerel, Scomber scombrus; Mozambique tilapia, Oreochromis mossambicus; temperate bass, Dicentrarchus labrax; bogue, Boops boops; and horse mackerel, Trachurus trachurus (Cantatore et al., 1994; embl X81562 to X81568, respectively). To the best of the authors' knowledge, the data set includes all available ray-finned fish taxa or their representatives from well-corroborated clades (e.g., Cypriniformes, Cichlidae) that have complete cytochrome b gene sequence data. In addition to the actinopterygians, DNA sequences for cytochrome b were obtained from GenBank for the sharks Carcharhinus plumbeus (Martin et al., 1992; gb L08032) and Galeocerdo cuvier (Martin et al., 1992; gb L08034) and for the following tetrapods: African clawed frog, Xenopus laevis (Roe et al., 1985; gb M10217); opossum, Monodelphis domestica (Ma et al., 1993; gb X70674); human (Anderson et al., 1981); and pygmy right whale, Caperea marginata (Arnason and Gullberg, 1994; embl X75586), to serve as outgroup taxa. All DNA sequences were entered into the ESEE (the eyeball sequence editor) program (XESEE version 3.0) of Cabot and Beckenbach (1989). Nucleotide variation and substitution patterns were examined using the software package MEGA (Kumar et al., 1993; version 1.01). Phylogenies were estimated by maximum parsimony analysis using the heuristic search procedure (10 replications) of PAUP (version 3.1; Swofford, 1993). Bootstrapping (Felsenstein, 1985) was employed to measure the internal stability of data using 200 iterations. The skewness of tree length distributions as a measure of information content (Hillis and Huelsenbeck, 1992) was tested by generating 10,000 random trees. DNA sequences were submitted to GenBank (accession numbers are U88853-U88865).

TABLE II Taxonomic Position of Actinopterygian Taxa Included in the Present Study

Chondrostei
  Acipenser transmontanus (white sturgeon)
Neopterygii
  Teleostei
    Euteleostei
      Ostariophysi
        Cypriniformes
          Cyprinidae
            Cyprinus carpio (common carp)
            Lythrurus atrapiculus (blacktip shiner)
            L. roseipinnis (cherryfin shiner)
            Notemigonus crysoleucas (golden shiner)
          Balitoridae
            Crossostoma lacustre (flat loach)
      Protacanthopterygii
        Salmonoidei
          Salmonidae
            Oncorhynchus mykiss (rainbow trout)
      Neoteleostei
        Acanthomorpha
          Paracanthopterygii
            Gadiformes
              Gadus morhua (Atlantic cod)
          Acanthopterygii
            Percomorpha
              Percoidei
                Moronidae
                  Morone mississippiensis (yellow bass)
                  Dicentrarchus labrax (temperate bass)
                Centrarchidae
                  Micropterus salmoides (largemouth bass)
                Percidae
                  Etheostoma kennicotti (stripetail darter)
                Carangidae
                  Trachurus trachurus (horse mackerel)
                Sparidae
                  Boops boops (bogue)
              Labroidei
                Cichlidae
                  Old World cichlids
                    Oreochromis mossambicus (Mozambique tilapia)
                  New World cichlids
                    Geophagines
                      Satanoperca jurapari
                    Cichlasomine group A
                      "Cichlasoma (Archocentrus)" spilurum
                      "C. (Amphilophus)" citrinellum
                      "C. (Amphilophus)" labiatum
                      "C. (Thorichthys)" aureum
                      "C. (Thorichthys)" cf. aureum
                      "C. (Thorichthys)" ellioti
                      "C. (Thorichthys)" meeki
                      "C. (Nandopsis)" dovii
                      "C. (Herichthys)" carpintis
                      "C. (Herichthys)" labridens
                    Cichlasomine group B
                      Cichlasoma portalegrense
                Pomacentridae
                  Pomacentrus sp. (three-striped damselfish)
              Scombroidei
                Scombridae
                  Sarda sarda (bonito)
                  Scomber scombrus (common mackerel)
                  Thunnus thynnus (bluefin tuna)

TABLE III Amplification and Sequencing Primers

Primer              Sequence                                    Source
L14724              5'-cgaagcttgatatgaaaaaccatcgttg-3'          Pääbo (1990)
L14724 (Gludge-L)   5'-tgacttgaaraaccaycgttg-3'                 Palumbi et al. (1991)
L14952              5'-tcytcygtdrcccayat-3'                     Lydeard et al. (1995a,b)
L15162              5'-gcaagcttctaccatgaggacaaatatc-3'          Taberlet et al. (1992)
L15299              5'-gattctttgccttccactt-3'                   Present study
L15379              5'-gcagccataacaataattca-3'                  Present study
L15767              5'-tattytgactcctaattgcaga-3'                Present study
H15149              5'-aaactgcagcccctcagaatgatatttgtcctca-3'    Kocher et al. (1989)
H15915              5'-aactgccagtcatctccggtttacaagac-3'         Irwin et al. (1991)

III. Results and Discussion

A. Cytochrome b Sequence Variation

Sequences for the entire cytochrome b gene for 31 actinopterygian fishes and six outgroup taxa (two sharks, frog, opossum, pygmy right whale, and human) may be requested from the authors. The predicted amino acid translations largely follow expected patterns for cytochrome b (Esposti et al., 1993). However, exceptions on conserved amino acid residues were found in three taxa: a valine (instead of a methionine) was found for residue 139 in O. mossambicus, an asparagine (instead of an aspartic acid) was found for residue 253 in opossum, and leucines were found (instead of phenylalanines) in residues 275 and 282 for Galeocerdo. These amino acid substitutions were derived from single base changes and may plausibly represent errors (Esposti et al., 1993). However, in the present study, these were used as originally published and as submitted to GenBank. The unique nature of each substitution had no effect on phylogenetic relationships.

Table IV shows the number of variable and phylogenetically informative sites (i.e., nucleotide sites at which there are at least two different kinds of nucleotides, each represented at least twice) for each codon position of the cytochrome b gene for various putative monophyletic groups representing different times of divergence. As expected, most variation is found in the
third codon position, with the least being found in the second codon position. For example, within the family Cichlidae, 71 (17.8%), 24 (6%), and 304 (76.1%) variable nucleotide substitutions were observed in the first, second, and third codon positions, respectively. Partitioning the number of variable and phylogenetically informative nucleotides into different hierarchic levels reveals a gradual increase in the amount of observed nucleotide variation for the first and second codon positions. The third codon position exhibits a gradual increase in the number of variable nucleotides among sequences within the recently diverged Cichlasomine group A taxa (Stiassny, 1991) up to comparisons within the Percomorpha. However, comparisons among sequences within percomorphs to those among more deeply divergent taxa (Actinopterygii and all taxa) reveal little increase in the number of variable and phylogenetically informative sites in the third position. Indeed, nearly all possible third positions (98 to 99%) are variable for all these deeper hierarchic levels (Table IV). If virtually all third positions are variable, then saturation has occurred, resulting in a decrease of phylogenetic signal.

TABLE IV Number of Variable and Phylogenetically Informative (in Parentheses) Nucleotide Substitutions of Cytochrome b for Various Putative Monophyletic Groups^a

                                  Codon position
Taxa                     First        Second      Third        Amino acid
Cichlasomine group A     37 (24)      9 (3)       216 (172)    25 (14)
New World cichlids       58 (32)      19 (5)      288 (200)    43 (19)
Cichlidae                71 (36)      24 (11)     304 (223)    57 (26)
Percomorpha              140 (99)     62 (39)     370 (351)    119 (84)
Actinopterygii           159 (132)    75 (49)     376 (371)    141 (99)
All taxa                 194 (156)    113 (76)    377 (372)    194 (144)

^a Number of variable substitutions are provided for each codon position, and the total number of amino acid replacements is provided in the last column.

Scatter plots (Fig. 2) of pairwise genetic sequence differences (p distance) calculated for each codon position separately versus the number of transitions and
number of transversions among all actinopterygian fishes and outgroup taxa for each codon position reveal some interesting patterns. For each codon position, transitions typically outnumber transversions as expected. For the first codon position (Fig. 2A), a relatively clear separation between the number of transitions and transversions at genetic distance values below approximately 10% is seen (e.g., comparisons among all members of the family Cichlidae). The number of transitions increases linearly until about 10%, at which point a reduction in the rate of increase is apparent. At approximately 22%, the numbers of transitions and transversions are about equal. This point of overlap occurs among comparisons of sharks to tetrapods and ray-finned fishes and tetrapods to ray-finned fishes. The reduction in the rate of increase of transitions beyond a genetic distance of 10% indicates that transitions may not provide reliable phylogenetic information due to site saturation. Few transversional differences exist among closely related taxa (0 to 8%); however, after approximately 10%, the first codon position exhibits a linear accumulation of transversions.

The scatter plot of variation at the second codon position, which shows the absolute number of each type of substitution versus genetic distance (Fig. 2B), has markedly less variation than do the first and third positions (e.g., the family Cichlidae ranges from 0 to 7%). The second position exhibits a roughly linear increase of transitions and transversions with no signs of a rate decrease. The lack of a decrease in the rate of accumulation of transitions and transversions with increasing genetic distance suggests that saturation is not a substantial problem for evaluating deep phylogenetic relationships at the second codon position.

The third codon position scatter plot of the absolute number of transitions and transversions versus genetic distance (Fig. 2C) reveals a clear demarcation between transitions and transversions, which show a linear accumulation with genetic distance up to approximately 35% difference. After about 35% there is a marked decline in the rate of increase in the number of transitions, and the number of transversions equals or outnumbers transitions. This indicates that saturation has occurred and needs to be considered in a phylogenetic analysis. Taxa separated by genetic distances of 35-50% include Old World and New World cichlids. Larger genetic distances include those between sharks and ray-finned fishes, tetrapods and sharks, and distantly related ray-finned fishes.

FIGURE 2 Scatter plots showing number of substitutions (transitions and transversions) versus genetic difference (p distance) for each codon position (note: x and y axes show values calculated for each individual codon position). Transitions are black circles, transversions are open circles. (A) First codon position, (B) second codon position, and (C) third codon position.
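The two summaries used in this section, counts of variable and phylogenetically informative sites per codon position and pairwise p distance versus numbers of transitions and transversions, can be sketched in a few lines of Python. This is only an illustrative sketch and not the MEGA implementation used in the study: the function names and the toy alignment are hypothetical, sequences are assumed to be aligned, in frame, and of equal length, and sites with gaps or ambiguity codes are simply skipped.

```python
from collections import Counter
from itertools import combinations

PURINES = frozenset("AG")
BASES = frozenset("ACGT")


def site_classes(alignment, position):
    """Count variable and informative sites at one codon position (1, 2, or 3).

    A site is variable if it shows at least two different nucleotides, and
    informative if at least two nucleotides each occur in two or more taxa.
    """
    seqs = [s.upper() for s in alignment.values()]
    variable = informative = 0
    for i in range(position - 1, len(seqs[0]), 3):
        counts = Counter(s[i] for s in seqs if s[i] in BASES)
        if len(counts) >= 2:
            variable += 1
        if sum(1 for n in counts.values() if n >= 2) >= 2:
            informative += 1
    return variable, informative


def pair_stats(seq1, seq2, position):
    """Return (p distance, transitions, transversions) at one codon position."""
    sites = ts = tv = 0
    for a, b in zip(seq1.upper()[position - 1::3], seq2.upper()[position - 1::3]):
        if a not in BASES or b not in BASES:
            continue  # skip gaps and ambiguity codes
        sites += 1
        if a != b:
            if (a in PURINES) == (b in PURINES):
                ts += 1  # purine <-> purine or pyrimidine <-> pyrimidine
            else:
                tv += 1  # purine <-> pyrimidine
    p = (ts + tv) / sites if sites else 0.0
    return p, ts, tv


def scatter_points(alignment, position):
    """One (p, ts, tv) point per pair of taxa, as plotted in a saturation plot."""
    return [pair_stats(s1, s2, position)
            for s1, s2 in combinations(alignment.values(), 2)]


# Hypothetical toy alignment (three codons, four taxa), for illustration only.
toy = {
    "taxon1": "ATGGCACTA",
    "taxon2": "ATGGCGTTC",
    "taxon3": "ATGGCGTTA",
    "taxon4": "ATGGCACTC",
}

if __name__ == "__main__":
    for pos in (1, 2, 3):
        print(pos, site_classes(toy, pos), scatter_points(toy, pos))
```

Plotting the transition and transversion counts from `scatter_points` against p for each codon position would give plots of the kind described above; a plateau in transitions at larger distances is the pattern interpreted here as saturation.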
B. Base Compositional Bias

Base compositional bias, the occurrence of the four bases (G, A, T, and C) in unequal proportions, is common in DNA sequences. For example, the cytochrome b gene of all fishes examined to date typically exhibits an anti-G bias, particularly in the third codon position (Meyer, 1993). Table V summarizes base compositions for the 31 actinopterygian fishes and six outgroup taxa. The values of the nucleotide compositional bias index (Irwin et al., 1991) are very similar to the values reported for mammalian (Irwin et al., 1991) and avian (Kornegay et al., 1993) cytochrome b sequences and follow previously described observations for mitochondrial DNA base composition for other taxa (Brown, 1985; Kocher et al., 1989; Meyer, 1993). Although problems associated with base compositional bias are often recognized, particularly in association with nucleotide character-state reconstruction (e.g., character-state bias; Collins et al., 1994), these are problems that all systematists (including those using morphological characters) must contend with when analyzing data sets with characters that possess a biased distribution (e.g., outgroup = 0 for most characters and all terminal taxa near the crown = 1; see Collins et al., 1994). More serious problems occur when the base compositional bias varies among taxa, resulting in potentially unreliable phylogenies. Most analytical methods, including parsimony, maximum likelihood, and neighbor joining, tend to group sequences of similar nucleotide composition together regardless of their evolutionary history (Lockhart et al., 1994). The amount of interspecific variation in base composition is reflected in the standard deviation values presented in Table V. Like other values reported for mammalian (Irwin et al., 1991) and avian (Kornegay et al., 1993) cytochrome b sequences, the standard deviation values are highest for third positions, as expected because most third position changes tend to be silent substitutions, and lowest for the more highly constrained second codon positions (Table V).

C. Amino Acid Differences
Many investigators translate their DNA nucleotide data to protein data for inferring deeper phylogenetic relationships. However, because the information from the three nucleotides of a codon is collapsed into one datum, special consideration needs to be exercised in phylogenetic analyses of amino acid sequences. For instance, some amino acid replacements involve substitutions in the more rapidly evolving third codon position (Asn/Lys, Ile/Met, His/Gln, Cys/Trp), which means that if saturation is a problem for the third codon position, it may still be a problem at the amino acid level. In addition, identical amino acid replacements do not necessarily reflect homologous evolutionary change. For example, a change from phenyl-
TABLE V Base Composition at First, Second, and Third Positions of Codons

Taxa: Carcharhinus plumbeus; Galeocerdo cuvier; Frog; Opossum; Pygmy right whale; Human; Acipenser transmontanus; Cyprinus carpio; Lythrurus atrapiculus; L. roseipinnis; Notemigonus crysoleucas; Crossostoma lacustre; Oncorhynchus mykiss; Gadus morhua; Morone mississippiensis; Dicentrarchus labrax; Micropterus salmoides; Etheostoma kennicotti; Trachurus trachurus; Boops boops; Oreochromis mossambicus; Satanoperca jurapari; Cichlasoma portalegrense; "C. (Archocentrus)" spilurum; "C. (Herichthys)" carpintis; "C. (Herichthys)" labridens; "C. (Thorichthys)" cf. aureum; "C. (Thorichthys)" aureum; "C. (Thorichthys)" meeki; "C. (Thorichthys)" ellioti; "C. (Amphilophus)" citrinellum; "C. (Amphilophus)" labiatum; "C. (Nandopsis)" dovii; Pomacentrus sp.; Sarda sarda; Scomber scombrus; Thunnus thynnus; Mean; SD; Bias^a

Columns, in order: First position A, T, C, G; Second position A, T, C, G; Third position A, T, C, G
26.8 28.9 27.4 29.2 30.0 29.5 25.0 25.3 24.5 24.5 23.4 23.7 22.9 23.4 24.2 23.2 22.1 23.4 22.9 23.7 24.5 26.0 24.5 24.8 24.7 25.4 24.6 24.3 25.1 24.0 24.3 23.7 24.4 24.0 22.6 21.3 22.6
25.8 26.2 27.4 25.3 21.8 23.4 23.2 22.9 24.2 24.5 24.2 24.5 23.7 24.5 26.1 26.1 24.5 23.9 22.1 22.9 23.7 23.1 22.9 24.5 22.8 22.2 24.1 23.2 23.8 23.7 24.3 24.8 24.4 22.0 24.2 23.2 23.2
25.5 24.1 22.9 24.5 26.6 27.6 26.6 26.6 25.5 25.3 25.8 25.3 26.1 24.7 25.8 24.5 26.8 25.8 27.9 27.1 26.6 27.1 27.0 26.4 27.9 28.2 26.5 27.2 26.7 27.0 27.0 26.7 26.5 28.7 26.6 26.8 27.4
21.8 20.7 22.4 21.1 21.6 19.5 25.3 25.3 25.8 25.6 26.6 26.6 27.4 27.4 23.9 26.3 26.6 26.8 27.1 26.3 25.3 23.9 25.6 24.3 24.7 24.2 24.9 25.3 24.3 25.3 24.5 24.8 24.7 25.2 26.6 28.7 26.8
20.0 20.5 20.8 21.3 20.0 20.0 20.8 20.0 20.0 20.1 20.3 20.3 19.5 19.7 20.3 20.5 20.0 19.2 20.0 20.0 20.5 19.9 19.7 19.5 19.6 19.6 19.2 19.3 19.5 19.4 19.8 19.0 19.7 19.6 19.5 19.2 19.5
44.2 43.8 41.8 41.8 41.1 40.0 40.0 41.8 40.5 40.6 41.1 41.6 40.3 42.4 41.1 41.8 40.5 40.8 40.8 40.5 41.1 42.2 42.7 42.2 41.9 41.8 42.5 42.0 41.9 42.6 42.1 43.1 41.9 42.8 41.1 41.3 41.1
22.6 22.6 24.5 24.2 25.0 27.1 25.3 25.0 26.1 25.9 25.3 24.7 26.3 24.5 25.3 23.9 26.1 25.5 25.8 25.3 25.3 25.0 24.9 26.2 25.7 25.4 25.7 25.6 25.9 25.6 25.8 25.5 26.1 24.9 26.1 25.5 26.1
13.2 13.1 12.9 12.6 13.9 12.9 13.9 13.2 13.4 13.5 13.4 13.4 13.9 13.4 13.4 13.7 13.4 14.5 13.4 14.2 13.2 12.9 12.7 12.2 12.7 13.3 12.5 13.2 12.8 12.4 12.2 12.4 12.3 12.6 13.4 13.9 13.4
35.3 36.0 40.3 43.2 39.2 36.3 36.8 43.9 31.8 32.7 36.6 31.8 31.1 31.1 28.2 30.8 27.4 26.8 28.7 29.7 26.6 34.0 32.8 29.9 31.7 31.4 31.8 28.2 30.2 27.6 30.7 30.8 33.6 31.2 33.4 29.7 32.1
22.4 30.4 28.4 28.4 18.2 12.1 14.5 13.7 17.6 20.8 20.5 16.6 23.4 33.2 26.8 30.5 20.8 25.3 20.0 22.1 18.4 26.2 19.9 17.0 16.9 15.9 17.5 18.7 16.3 20.1 23.4 24.5 18.8 16.0 21.6 19.7 22.6
39.5 32.3 28.7 25.3 38.4 47.9 43.7 38.4 39.2 35.9 33.9 42.6 40.0 29.7 37.6 29.5 45.3 39.7 45.5 42.1 51.3 38.8 44.6 48.0 48.9 50.4 47.2 47.2 49.2 46.6 43.8 42.5 46.0 46.6 40.5 41.6 40.3
2.9 1.3 2.6 3.2 4.2 3.7 5.0 3.9 11.3 10.6 8.9 8.9 5.5 6.1 7.4 9.2 6.6 8.2 5.8 6.1 3.7 1.1 2.7 5.1 2.4 2.3 3.5 5.8 4.3 5.6 2.2 2.2 1.6 6.1 4.5 8.9 5.0
24.7 2.03
24.0 1.26
26.3 1.19
24.9 2.03
19.9 0.51
41.6 0.98
25.3 0.91
13.2 0.57
32.5 4.21
21.1 4.99
41.3 6.46
5.1 2.66
0.018
0.225
0.317
aCalculated as: C = (2/3) ~
ci
-
-
0.251,
i=1
where C is the compositional bias and ci is the frequency of the ith base.
alanine to leucine can result from a replacement substitution at either the third or the first codon positions. The authors advocate the use of DNA sequence data rather than amino acid sequences for phylogenetic analyses. The knowledge obtained from studying the patterns of nucleotide substitutions within and among the three codon positions allows for a more robust and sophisticated analysis than is possible with amino acid sequence data. Table IV summarizes the number of
variable and phylogenetically informative amino acid replacements.
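The compositional bias index defined in the footnote to Table V can be checked directly against the tabulated mean base frequencies. The following Python sketch is illustrative only: the function name `compositional_bias` is an assumption made here, and the inputs are the column means from Table V expressed as percentages.

```python
def compositional_bias(percent_a, percent_t, percent_c, percent_g):
    """Compositional bias C = (2/3) * sum(|c_i - 0.25|) over the four bases.

    Arguments are base frequencies in percent (as listed in Table V);
    they are converted to proportions before the index is computed.
    """
    proportions = [p / 100.0 for p in (percent_a, percent_t, percent_c, percent_g)]
    return (2.0 / 3.0) * sum(abs(c - 0.25) for c in proportions)


# Mean base composition (A, T, C, G) at each codon position, from Table V.
table_v_means = {
    "first": (24.7, 24.0, 26.3, 24.9),
    "second": (19.9, 41.6, 25.3, 13.2),
    "third": (32.5, 21.1, 41.3, 5.1),
}

for position, means in table_v_means.items():
    print(position, round(compositional_bias(*means), 3))
```

With the Table V means, the sketch gives 0.018, 0.225, and 0.317 for first, second, and third positions, matching the bias values in the table. An index of 0 corresponds to equal use of the four bases, and 1 to exclusive use of a single base.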
D. Variability in Areas of the Cytochrome b Gene
The cytochrome b gene is composed of a negative proton input side (N terminus, bc-, de-, fg-, C terminus), a positive proton output side (ab+, cd+, ef+, gh+), and eight transmembrane regions (Fig. 3; Esposti et al., 1993). Table VI shows the number of variable and phylogenetically informative amino acid residues within ray-finned fishes for each putative region of the cytochrome b gene following the structural model and terminology of Esposti et al. (1993). As observed for other taxa (Irwin et al., 1991; Kornegay et al., 1993), the most variation was found in the negative side (44.2%) followed by the transmembrane region (39.1%). The least variation was found within the positive proton output side (26.1%). This may be due to the requirement for proper protein-protein contacts between cytochrome b and the "Rieske" iron-sulfur subunit, which plays a major role in ubiquinol oxidation at the positive side of the membrane (Esposti et al., 1993). Meyer (1994) indicated that the 3' end of the cytochrome b gene, which is sequenced less frequently than the 5' end, is more variable and may provide valuable phylogenetic information. If the gene is arbitrarily divided into a 5' half (205 residues) and a 3' half (176 residues), greater amino acid variation is indeed found in the 3' half (40.9% versus 32.2% in the 5' half).
FIGURE 3 Structural model of cytochrome b gene following Esposti et al. (1993). The negative and positive sides correspond to the inner and outer surfaces of the mitochondrial membrane, respectively. The eight labeled boxes correspond to the eight transmembrane regions.
TABLE VI Number of Amino Acid Replacement Substitutions, Phylogenetically Informative Replacement Substitutions (in Parentheses), and Proportion of Amino Acids That Are Variable for Each Region of the Cytochrome b Gene within Actinopterygian Fishes^a
Cytochrome b region    Number of substitutions (phyl. inform.)    Percentage of variable amino acids/region
N-terminus (-)         12 (6)                                     38.7%
Transmembrane A        4 (3)                                      16.0%
ab+                    11 (9)                                     45.8%
Transmembrane B        8 (4)                                      30.8%
bc-                    3 (3)                                      37.5%
Transmembrane C        7 (5)                                      35.0%
cd+                    9 (6)                                      20.4%
Transmembrane D        12 (11)                                    44.0%
de-                    11 (9)                                     50.0%
Transmembrane E        12 (8)                                     54.5%
ef+                    5 (3)                                      13.2%
Transmembrane F        6 (4)                                      27.2%
fg-                    3 (2)                                      30.0%
Transmembrane G        11 (8)                                     47.8%
gh+                    5 (5)                                      55.5%
Transmembrane H        14 (11)                                    58.3%
C-terminus             5 (5)                                      83.3%
^a Cytochrome b regions and location of residues were determined following the cytochrome b structural model of Esposti et al. (1993).
E. Phylogenetic Analyses
Based on the analysis of nucleotide substitution patterns discussed previously, several different strategies were employed to compensate for saturation, including (1) excluding third codon positions in the analysis, (2) weighting transversions two times transitions in first codon positions, and (3) excluding transitions from the third codon position. Maximum parsimony analysis of nucleotide changes at first and second positions of codons (equal weight and unordered), excluding third codon positions, yielded one most parsimonious tree (Fig. 4) with a total length (TL) of 1127 and a consistency index (CI) of 0.371. Cytochrome b data were significantly skewed (g1 = -0.472), revealing a strong phylogenetic signal. The molecular phylogeny contained a monophyletic Tetrapoda with conventional groupings (i.e., amphibians sister to mammals, and within mammals, the metatherian opossum sister to eutherians), which is sister to the Actinopterygii. Within the actinopterygian fishes, Acipenser, the authors' chondrostean representative, is the most basal taxon and is sister to all other ray-finned fishes. Within the Euteleostei, Cyprinidae (Cyprinus, Lythrurus, and Notemigonus) is the basal-most clade, followed by Crossostoma (Balitoridae), rendering Cypriniformes paraphyletic. Gadus, which is a member of the Paracanthopterygii, is sister to Oncorhynchus (Salmonoidei) and together they are sister to the monophyletic Perciformes. Within Perciformes, Morone + Dicentrarchus (Moronidae) are sister to Boops (Sparidae), which in turn is sister to the monophyletic Scombroidei (Sarda, Thunnus, and Scomber). The aforementioned clade is sister to all other remaining Perciformes. Trachurus (Carangidae) is the next most basal taxon within the Perciformes, followed by Micropterus (Centrarchidae) + Etheostoma (Percidae), which are sister to the monophyletic Labroidei. Within the Labroidei, the Cichlidae is rendered paraphyletic with the Old World cichlid (Oreochromis) sister to Pomacentrus (Pomacentridae). Within the remaining Cichlidae, however, New World cichlids, Cichlasomine groups A + B, Cichlasomine group A, "Cichlasoma (Amphilophus)," "Cichlasoma (Thorichthys)," and "Cichlasoma (Herichthys)" are depicted as monophyletic. Bootstrap values, which are an indication of the amount of internal support for a given node, showed relatively strong support for the following groups: Actinopterygii, Euteleostei, Scombroidei, Moronidae, New World cichlids, Cichlasomine group A + B, "Cichlasoma (Amphilophus)," "Cichlasoma (Herichthys)," and "Cichlasoma (Thorichthys)." Weaker support was found for the Perciformes, Labroidei, and Cyprinidae. As noted by Hillis and Bull (1993), bootstrap values of 70% or greater can indicate substantial support for a given node.
FIGURE 4 A cladogram of the single most parsimonious tree for ray-finned fishes derived from the unweighted maximum parsimony analysis of nucleotides from the first and second codon positions. The numbers on the tree correspond to the percentage of bootstrap replicates in which the particular clade was found (200 total replications). Only values greater than 50% are shown.
Maximum parsimony analysis of nucleotides of first and second codon positions with transversions weighted two times transitions for the first codon position yielded two equally parsimonious trees (TL = 1421; CI = 0.41; g1 = -0.478). One of the two trees is identical to that shown in Fig. 4. The second tree switches the placement of Trachurus and the clade Scombroidei + Moronidae + Boops. Five equally parsimonious trees were obtained from a maximum parsimony analysis of first and second codon positions (unordered, equal weight) and transversions from the third codon position (TL = 2643; CI = 0.365; g1 = -0.467). Figure 5 shows a strict consensus tree of the five equally parsimonious trees.
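The weighting strategies used in these analyses amount to assigning a cost to each nucleotide change according to its codon position and whether it is a transition or a transversion. A minimal Python sketch of such a cost scheme follows. It assumes an in-frame alignment whose first site is a first codon position, and the function and parameter names are hypothetical. It illustrates the costs only and is not the parsimony software used for the analyses.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def substitution_type(x, y):
    """Classify a base change as 'transition', 'transversion', or None if unchanged."""
    for base in (x, y):
        if base not in PURINES | PYRIMIDINES:
            raise ValueError(f"unexpected base: {base!r}")
    if x == y:
        return None
    # Purine<->purine or pyrimidine<->pyrimidine changes are transitions.
    same_class = {x, y} <= PURINES or {x, y} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


def change_cost(site_index, x, y, tv_weight_first=2, third="transversions"):
    """Cost of changing base x to y at a 0-based alignment site.

    tv_weight_first: weight of first-position transversions relative to
        transitions (2 for the weighted analyses, 1 for equal weights).
    third: 'exclude' drops third positions entirely; 'transversions'
        counts only transversions at third positions.
    """
    kind = substitution_type(x, y)
    if kind is None:
        return 0
    codon_position = site_index % 3 + 1
    if codon_position == 1:
        return tv_weight_first if kind == "transversion" else 1
    if codon_position == 2:
        return 1
    if third == "exclude":
        return 0
    return 1 if kind == "transversion" else 0
```

Under these assumptions, `change_cost(0, "A", "C")` is 2 (a first-position transversion), whereas `change_cost(2, "C", "T")` is 0 because third-position transitions are ignored.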
The most striking difference between the gene tree including transversions from the third position (Fig. 5) and the tree excluding the third codon position (Fig. 4) is that, in the former, both tetrapods and ray-finned fishes are rendered paraphyletic due to the sister relationship of the frog and the sturgeon (Acipenser). This unconventional pattern is likely due to the saturation of transversions at the third codon position and indicates that the phylogenetic signal has been replaced largely by noise for taxa that diverged over 200 million years ago. However, including transversions from the third codon position increased the support for relationships among some of the more closely related taxa. For example, the Cypriniformes are now depicted as monophyletic, and higher bootstrap values are obtained for all nodes within the Neotropical cichlid clade. In addition, although not shown on the strict consensus tree, two of the most parsimonious trees depicted a monophyletic Cichlidae (76% bootstrap value) nested within the monophyletic Labroidei (<50% bootstrap value). Other most parsimonious solutions, however, rendered both groups paraphyletic. One other notable difference is that the Moronidae is sister to Micropterus + Etheostoma instead of Boops.
Only one most parsimonious tree resulted from maximum parsimony analysis of nucleotides of first and second codon positions (transversions weighted twice transitions for the first codon position), including transversions from the third codon position (TL = 2947; CI = 0.398; g1 = -0.429). This tree is fully resolved with greater support for monophyly of the Cypriniformes, Cyprinidae, Labroidei, and Cichlidae (Fig. 6). However, this topology shows a different set of hypothesized relationships within the Perciformes.
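The g1 values reported with each analysis are skewness statistics. Following Hillis and Huelsenbeck (1992), they describe the shape of a distribution of tree lengths, with a strongly left-skewed (negative) distribution taken as evidence of phylogenetic signal. The Python sketch below computes the moment-based form of g1 for a list of tree lengths. The function name and the tree lengths are made up for illustration, and sample-size-corrected versions of the statistic differ slightly from this form.

```python
def g1_skewness(tree_lengths):
    """Moment-based skewness g1 = m3 / m2**1.5 of a tree-length distribution."""
    n = len(tree_lengths)
    if n < 3:
        raise ValueError("need at least three tree lengths")
    mean = sum(tree_lengths) / n
    m2 = sum((x - mean) ** 2 for x in tree_lengths) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in tree_lengths) / n  # third central moment
    if m2 == 0:
        raise ValueError("all tree lengths are identical; skewness is undefined")
    return m3 / m2 ** 1.5


# Hypothetical tree lengths: a few short trees and many long ones gives a
# left-skewed distribution and therefore a negative g1.
example_lengths = [100, 118, 119, 120, 120, 121]
print(g1_skewness(example_lengths) < 0)
```

A symmetric distribution of tree lengths gives g1 near zero, which is the pattern expected when a data set contains little hierarchical signal.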
F. Congruence/Incongruence of DNA- and Morphological-Based Phylogenetic Hypotheses
Phylogenetic analyses of particular classes of cytochrome b DNA sequence data reveal a strong phylogenetic signal for studying relationships among ray-finned fishes. Using the more slowly evolving first and second codon positions and excluding the third codon position provided strong phylogenetic signal beyond the ray-finned fishes up to 450 million years ago, which was the estimated time of divergence between the Chondrichthiomorphi and the Teleostomi (Carroll, 1988; Benton, 1990). Indeed, the molecular phylogeny (Fig. 4) revealed a monophyletic Tetrapoda sister to the Actinopterygii. The Chondrostei, which diverged about 200 million years ago from the Neopterygii (Carroll, 1988), was depicted as the most basal actinopterygian fish group, as expected. In addition, the cytochrome b gene tree suggested that the Cypriniformes are the most basal euteleostean group examined in the study. This particular aspect was unresolved in the morphological-based phylogeny (Fig. 1), but agrees with a molecular phylogeny of teleosts based on the growth hormone gene (Rubin and Dores, 1995). Counter to the morphological-based phylogeny, the cytochrome b gene depicts the Salmonoidei as sister to the Paracanthopterygii, thus rendering the Neoteleostei paraphyletic. The authors' approach has been to view congruent patterns as evidence that the gene provides signal and areas of incongruence as evidence that the gene does not. However, the fact that certain components of the gene provide sound phylogenetic signal for taxa that diverged both earlier and later than the Salmonoidei/Neoteleostei split suggests that the cytochrome b gene tree may be offering some novel information worthy of further investigation.
FIGURE 5 A strict consensus tree for five equally parsimonious cladograms for actinopterygian fishes based on the maximum parsimony analysis of all changes at first and second positions as well as transversions at third positions. The numbers on the tree correspond to the percentage of bootstrap replicates in which the clade was found (200 total replications). Only values greater than 50% are shown.
Although few investigators have challenged the monophyly of the Neoteleostei, Johnson (1992) noted that of the eight total synapomorphies listed for the Neoteleostei by Rosen (1973), Fink and Weitzman (1982), Lauder and Liem (1983), and Rosen (1985), only one is common to all four lists. Johnson (1992) reviewed the evidence and offered four characters as the least ambiguous evidence for the monophyly of the Neoteleostei. Interestingly, one of the four characters (exoccipitals and basioccipitals exposed posteriorly and joined by an inverted Y-shaped suture) was initially hypothesized as a likely candidate synapomorphy for an expanded Neoteleostei, which includes the salmonids (Fink and Weitzman, 1982). In addition, although fossils only offer a minimum estimate of time of divergence, it is noteworthy that the oldest fossil salmonids are of Eocene age (Wilson and Williams, 1993), which is about 20 million years younger than the earliest known paracanthopterygian taxa (Patterson, 1993), from which they supposedly diverged. Alternatively, the fact remains that there is a large assemblage of intervening taxa between the single paracanthopterygian and salmonid examined in the authors' study and that all have been recognized as monophyletic (i.e., Eurypterygii, Ctenosquamata, and Acanthomorpha) (Stiassny, 1986; Johnson, 1992). Therefore, before offering any additional conjecture, the authors recommend collecting additional sequence data from other lineages prior to further systematic evaluation.
FIGURE 6 A cladogram of the single most parsimonious tree for actinopterygian fishes derived from the maximum parsimony analysis of all changes at first and second codon positions (transversions weighted twice transitions at the first codon position) and transversions only at the third codon position. The numbers on the tree correspond to the percentage of bootstrap replicates in which the clade was found (200 total replications). Only values greater than 50% are shown.
Perciformes (sensu Greenwood et al., 1966) are the largest and most diversified of all fish orders (9293 species; Nelson, 1994) and are probably polyphyletic (Johnson, 1993). The cytochrome b gene tree depicts a monophyletic Perciformes; however, this may reflect the nature of the sampling regime, and it agrees with the morphological-based hypothesis for the taxa included in the study. Within the Perciformes, the molecular phylogeny reveals strong support for a monophyletic Scombroidei, which, at least based on the taxa the authors included, agrees with Collette et al. (1984), Johnson (1986), and Carpenter et al. (1995). The Labroidei, which includes the families Labridae, Scaridae, Odacidae, Cichlidae, Embiotocidae, and Pomacentridae, has been recognized as monophyletic based on eight synapomorphies associated with the branchial complex (for a review see Stiassny and Jensen, 1987). However, skepticism remains due to an absence of corroborative evidence from morphological features other than the branchial complex (Johnson, 1993). The cytochrome b gene provides tentative support for the recognition of this suborder. Within the Labroidei, some support was found for the monophyly of the family Cichlidae, and strong support was found for the monophyly of Neotropical cichlids and several New World putatively monophyletic groups (sensu Stiassny, 1991). The cytochrome b gene trees failed to support the monophyly of the suborder Percoidei. This actually agrees with current views of many ichthyologists (Johnson, 1993). Strong support was found for monophyly of the Moronidae and the sister relationships of Centrarchidae and Percidae, again recognizing that not all pertinent taxa were included in the authors' study.
Although strong support was found for the monophyly of certain groups within the Perciformes (e.g., Scombroidei, Moronidae), no robust pattern of relationship among the groups was found, regardless of how data were analyzed. Indeed, different analyses typically yielded different topologies (Figs. 4-6). The fact that cytochrome b gene data yielded strong support and fully resolved nodes for taxa that diverged both before and after the divergence of the Perciformes suggests that the problem may not be due to the gene, but to the rapid radiation of the Perciformes. Patterson (1993) reviewed the fossil evidence of all acanthomorph taxa of the world from the Cretaceous and early Tertiary. He noted there were no lower Cretaceous acanthomorphs and that the upper Cretaceous contains mostly stem group acanthomorphs, including the earliest paracanthopterygians and one possible perciform. Following a 20 million year gap in the fossil record, late Paleocene (ca. 55 million years ago) fossil data reveal a diverse fauna of paracanthopterygians and perciforms, including scombrids, carangids, labrids, temperate basses, and others. The geologically sudden appearance of many extant perciform families in the fossil record offers evidence that the radiation within the Perciformes was quite rapid and may explain why the cytochrome b gene failed to recover a robust phylogeny within the group.
G. Utility of the Cytochrome b Gene
The authors have attempted to assess the phylogenetic utility of the mitochondrial cytochrome b gene by examining taxonomic congruence (Kluge, 1989) between molecular- and morphological-based phylogenetic hypotheses. Although congruence analysis can be a powerful method (Miyamoto and Cracraft, 1991), a potential shortcoming is the often implicit assumption that the traditional or conventional hypothesis reflects the true phylogeny. Ideally, a taxonomic congruence analysis should use a well-corroborated pattern of relationship for comparison. Unfortunately, although significant progress has been made regarding the systematic relationships within actinopterygian fishes (Nelson, 1989; Johnson, 1993), our understanding of particular groups and their relationships is in a state of flux and is by no means well corroborated. For example, Johnson and Patterson (1993) acknowledge homoplasy on a "massive scale" in a cladogram summarizing their views on acanthomorph interrelationships and admit that their scheme is "far from perfect." Therefore, systematists should certainly view unusual phylogenetic hypotheses from morphological and molecular studies cautiously, but with an open mind, and be willing to reexamine their data in light of new findings. Although the authors' phylogenetic analyses revealed some unlikely findings (e.g., paraphyletic Cypriniformes and Cichlidae using only first and second codon positions), other areas of incongruence correspond with areas of uncertainty for the morphological-based phylogeny as well (e.g., relationships among basal euteleosts). Despite lack of concordance for some parts of the tree, molecular- and morphological-based phylogenetic hypotheses, which span about 450 million years among all taxa, reveal a striking degree of congruence.
This high degree of congruence raises the question of why several investigators examining systematic relationships with cytochrome b DNA sequence data reported that the gene is not useful for deep phylogenetic questions (Hillis and Huelsenbeck, 1992; Graybeal, 1993, 1994). Although these studies were not necessarily conducted on fishes, it is instructive to review some of the criticisms about using the cytochrome b gene in an attempt to sort out this confusing state of affairs.
Hillis and Huelsenbeck (1992) were perhaps the first to note the shortcomings of the cytochrome b gene for phylogenetic studies. While presenting a novel approach for assessing the amount of phylogenetic signal in a data set, they examined partial nucleotide sequences among a group of vertebrates [ray-finned fish (outgroup), lungfish, coelacanth, frog, salamander, chicken, human, rat, mouse, and cow] and found a significant phylogenetic signal. However, the phylogenetic signal was largely restricted to a clade uniting the rat, mouse, and cow. Reanalysis of the remaining trees showed that the data set no longer contained significant phylogenetic information. Indeed, the most parsimonious trees yielded topologies that made little sense compared to traditional views of vertebrate relationships, thus supporting a potential lack of phylogenetic signal in the data set. Based on their findings, Hillis and Huelsenbeck (1992) concluded that the cytochrome b gene evolves too rapidly to resolve deeper levels of phylogeny beyond 400 million years ago. The objective of their study was to demonstrate how the phylogenetic signal may not be distributed throughout the branches of an estimated tree, based on the analysis of a cytochrome b data set. Nevertheless, because Hillis and Huelsenbeck (1992) concluded that the gene had potentially serious shortcomings, it is worth scrutinizing their analyses further. Hillis and Huelsenbeck (1992) examined only a 360-bp segment of the 5' region of the cytochrome b gene (from Meyer and Wilson, 1990), including all nucleotide positions in their analysis. In light of the authors' analysis of nucleotide substitution patterns, there are two probable reasons for the lack of a phylogenetic signal in Hillis and Huelsenbeck's (1992) data set: the inclusion of nucleotides from the third codon position and the limited amount of sequence data. Including data from the third codon position to assess relationships among all vertebrates increases the noise-to-signal ratio and contributes to symmetrical tree distributions. However, when nucleotides from third codon positions are excluded, few phylogenetically informative sites remain in the first and second codon positions because only about one-third of the cytochrome b gene from the more conservative 5' region was examined. The authors believe that Hillis and Huelsenbeck's (1992) study reveals the shortcomings of relying on the conservative 5' region of the cytochrome b gene for deeper phylogenetic questions, as opposed to the shortcomings of the entire gene.
Graybeal (1993) examined the phylogenetic utility of the cytochrome b gene in toads (Bufonidae) and found that the gene was useful for assessing relationships among closely related taxa, but not for deeper relationships. Graybeal (1993) found saturation in the third position among the deeply divergent taxa, but there were too few phylogenetically informative sites in the first and second codon positions to resolve deeper relationships. Although Graybeal (1993) suggested that the cytochrome b gene should not be categorically avoided for questions among deeply divergent taxa, she concluded that the cytochrome b gene and perhaps other mitochondrial proteins are not useful for resolving divergences greater than about 50 million years ago. As in the study by Hillis and Huelsenbeck (1992), only one-third of the gene was sequenced. Although the authors agree that the particular portion of the gene sequenced was not useful for resolving deep phylogenetic relationships among bufonids, they disagree about extrapolating conclusions to represent the entire cytochrome b gene and all taxa.
Several studies have been conducted in which the phylogenetic utility of the cytochrome b gene was explored either directly or indirectly by examining complete mitochondrial DNA sequences of the gene (e.g., Kumazawa and Nishida, 1993; Graybeal, 1994; Cantatore et al., 1994). Kumazawa and Nishida (1993) examined the relationships among a mouse, rat, cow, human, chicken, frog, and sea urchin. Kumazawa and Nishida's (1993) criterion for whether a gene or set of genes (all tRNA genes combined) could be regarded as a good indicator of phylogenetic relationships was based on comparing the gene tree with the well-established tree of vertebrates and on bootstrap values greater than 95%. As in the authors' data set, Kumazawa and Nishida (1993) observed a high noise-to-signal ratio when third positions were included and therefore emphasized information from the first and second codon positions. Using the sea urchin as the outgroup, they obtained high bootstrap values for the mouse-rat clade and mammal clade (nearly 95%); however, the frog was sister to the mammals instead of the chicken, suggesting problems associated with rooting the tree.
Reanalysis excluding the sea urchin, however, resulted in the correct topology and high bootstrap values. Although the combined mitochondrial tRNA genes consistently met Kumazawa and Nishida's (1993) criterion for bootstrap values >95%, the authors believe that establishing such high values as indicators of success for any gene is unrealistic. Indeed, all supposedly well-corroborated phylogenies, including those based on morphological data, would probably have to be rejected using such a criterion. Nevertheless, Kumazawa and Nishida's (1993) data indicate a substantial problem with the cytochrome b gene for divergences greater than 600 million years ago.
Cantatore et al. (1994) conducted an evolutionary analysis of the cytochrome b gene of 12 actinopterygian fishes and concluded that the gene does not seem suitable for drawing phylogenetic inferences at higher taxonomic levels within fishes. Their decision was based largely on the fact that their unrooted and rooted phenograms were not fully resolved and several nodes had low bootstrap values. The authors agree that the cytochrome b gene does not fully resolve phylogenies within the Perciformes (divergences < 60 million years ago), but it does provide some resolution for deeper nodes, suggesting that failure to elucidate these relationships is a function of the rapid speciation rate.
After identifying an initial pool of candidates and narrowing it according to practical aspects of data collection and analysis, Graybeal (1994) evaluated the phylogenetic utility of 35 nuclear and mitochondrial protein-coding genes for estimating relationships among a fish, a frog, a bird, a rodent, and a primate. Although the most parsimonious trees from both amino acids and nucleotides recover the correct topology, Graybeal (1994) concluded that the cytochrome b gene provides relatively limited resolution for taxa that diverged over 80 million years ago. Her conclusion was based on low bootstrap values and non- or nearly nonsignificant phylogenetic signal (g statistics). Given the breadth of her study, Graybeal (1994) included all third codon positions and gave equal weight to all substitutions. As previously discussed, this is clearly a mistake for assessing deep phylogenetic questions using the cytochrome b gene. Dropping third codon positions, Kumazawa and Nishida (1993) obtained bootstrap values near 95%, and their data contained a significant phylogenetic signal for assessing relationships within vertebrates.
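Several of the studies reviewed above were judged by their bootstrap values, that is, the percentage of resampled data sets in which a clade is recovered. The Python sketch below shows only the resampling step and the tally. The function names are hypothetical, and `clade_found` stands in for a caller-supplied tree search that reports whether the clade of interest appears; no particular phylogenetics package is assumed.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Resample alignment columns (sites) with replacement.

    `alignment` maps taxon name -> sequence string; all sequences must have
    the same length. The same sampled columns are used for every taxon, so
    each site pattern is kept intact.
    """
    n_sites = len(next(iter(alignment.values())))
    columns = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {taxon: "".join(seq[i] for i in columns) for taxon, seq in alignment.items()}


def bootstrap_support(alignment, clade_found, replicates=200, seed=0):
    """Percentage of pseudoreplicates for which clade_found(replicate) is true.

    `clade_found` is a hypothetical callable that would infer a tree from the
    pseudoreplicate and return True if the clade of interest is present.
    """
    rng = random.Random(seed)
    hits = sum(bool(clade_found(bootstrap_alignment(alignment, rng)))
               for _ in range(replicates))
    return 100.0 * hits / replicates
```

The default of 200 replicates mirrors the number of replications reported in the captions of Figs. 4-6; any clade recovered in more than half of the pseudoreplicates would exceed the 50% threshold used there.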
H. Summary
The cytochrome b gene appears to offer substantial phylogenetic information for assessing intrarelationships of actinopterygian fishes across a broad taxonomic range. However, careful consideration should be given regarding weighting strategies prior to conducting any phylogenetic analysis. Many criticisms leveled against the utility of the gene, particularly for questions among distantly related taxa, appear unfounded, being based on limited and/or improperly analyzed data. Other problems, such as base compositional bias, rate variation between different lineages, saturation of third codon positions, and limited variation in first and second codon positions, are not unique to cytochrome b, but are general features of all mitochondrial protein-coding genes (e.g., Kocher et al., 1989; Meyer, 1994). As Graybeal (1993) and Meyer (1994) recommend, investigators should begin a study by exploring the amount of variation found within a gene by sequencing taxa thought to be the most divergent and those thought to be the most closely related before starting a large-scale sequencing project. Molecular systematists must be aware of the problems associated with determining the utility of a gene. Most molecular systematists are attracted to questions that are largely intractable or unresolved using conventional morphological data (e.g., coelacanth-dipnoan-tetrapod relationships). When a gene tree yields a poorly resolved taxonomic tree, investigators often conclude that the gene is not informative for that particular question. An alternative hypothesis, however, is that the gene reveals true relationships. When speciation is rapid, little time may be available for synapomorphies to accrue between nodes; therefore, the tree may be unresolved. Evidence for this pattern has been found within poeciliids (Parenti, 1981; Meyer and Lydeard, 1993), birds (Avise et al., 1994), and bovine mammals (Kraus and Miyamoto, 1991). It is inappropriate to propose rapid speciation as an a priori hypothesis until sufficient data are gathered. It is necessary to gather enough morphological and/or molecular data to determine if the polytomy is retained. For example, Kraus and Miyamoto (1991) failed to obtain robust patterns of relationships for the four families of pecoran ruminants despite sequencing 2.7 kb of mitochondrial DNA. Their data suggest that the four lineages diverged over a brief geological time period. A similar lack of resolution and weak support for many nodes within the Perciformes suggest that no single gene will provide all the answers.
Acknowledgments
We are grateful to T. D. Kocher, T. Lamb, C. A. Stepien and her students, M. L. J. Stiassny, P. J. West, and two anonymous reviewers for their comments on the manuscript. Don Conkel kindly provided wild-caught cichlid specimens for study. This research was funded by the Department of Biological Sciences, University of Alabama, and the National Science Foundation (DEB-9527758 to C.L.).
References
Anderson, S., Bankier, A. T., Barrell, B. G., De Bruijn, M. H. L., Coulson, A. R., Drouin, J., Eperon, I. C., Nierlich, D. P., Roe, B. A., Sanger, F., Schreier, P. H., Smith, A. J. H., Staden, R., and Young, I. G. 1981. Sequence and organization of the human mitochondrial genome. Nature 290:457-465.
Arnason, U., and Gullberg, A. 1994. Relationship of baleen whales established by cytochrome b gene sequence comparison. Nature 367:726-728.
Avise, J. C., Nelson, W. S., and Sibley, C. G. 1994. Why one-kilobase sequences from mitochondrial DNA fail to solve the hoatzin phylogenetic enigma. Mol. Phyl. Evol. 3:175-184.
Begle, D. P. 1991. Relationships of the osmeroid fishes and the use of reductive characters in phylogenetic analysis. Syst. Zool. 40:33-53.
Benton, M. J. 1990. Phylogeny of the major tetrapod groups: Morphological data and divergence dates. J. Mol. Evol. 30:409-424.
Bernardi, G., and Powers, D. A. 1992. Molecular phylogeny of the prickly shark, Echinorhinus cookei, based on nuclear (18S rRNA) and a mitochondrial (cytochrome b) gene. Mol. Phyl. Evol. 1:161-167.
Bernardi, G., and Powers, D. A. 1995. Phylogenetic relationships among nine species from the genus Fundulus (Cyprinodontiformes, Fundulidae) inferred from sequences of the cytochrome b gene. Copeia 1995:469-473.
Block, B. A., Finnerty, J. R., Stewart, A. F. R., and Kidd, J. 1993. Evolution of endothermy in fish: Mapping physiological traits on a molecular phylogeny. Science 260:210-214.
Brown, J. R., Gilbert, T. L., Kowbel, D. J., O'Hara, P. J., Buroker, N. E., Beckenbach, A. T., and Smith, M. J. 1989. Nucleotide sequence of the apocytochrome B gene in white sturgeon mitochondrial DNA. Nucleic Acids Res. 17:4389.
Brown, W. M. 1985. The mitochondrial genome of animals. In "Molecular Evolutionary Genetics" (R. J. MacIntyre, ed.), pp. 95-130. Plenum, New York.
Brown, W. M., Prager, E. M., Wang, A., and Wilson, A. C. 1982. Mitochondrial DNA sequences of primates: Tempo and mode of evolution. J. Mol. Evol. 18:225-239.
Cabot, E. L., and Beckenbach, A. T. 1989. Simultaneous editing of multiple nucleic acid and protein sequences with ESEE. Cabios 5:233-234.
Cantatore, P., Roberti, M., Pesole, G., Ludovico, A., Milella, F., Gadaleta, M. N., and Saccone, C. 1994. Evolutionary analysis of cytochrome b sequences in some Perciformes: Evidence for a slower rate of evolution than in mammals. J. Mol. Evol. 39:589-597.
Carpenter, K. E., Collette, B. B., and Russo, J. L. 1995. Unstable and stable classifications of scombroid fishes. Bull. Mar. Sci. 56:379-405.
Carroll, R. L. 1988. "Vertebrate Paleontology and Evolution." Freeman, New York.
Chang, Y. S., Huang, F. L., and Lo, T. B. 1994. The complete nucleotide sequence and gene organization of carp (Cyprinus carpio) mitochondrial genome. J. Mol. Evol. 38:138-155.
Collette, B. B., Potthoff, T., Richards, W. J., Ueyangi, S., Russo, J. L., and Nishikawa, Y. 1984. Scombroidei: development and relationships. In "Ontogeny and Systematics of Fishes" (H. G. Moser et al., eds.), pp. 591-620. Amer. Soc. Ichthyol. Herp., Spec. Pub. 1:591-620.
Collins, T. M., Wimberger, P. H., and Naylor, G. J. P. 1994. Compositional bias, character-state bias, and character-state reconstruction using parsimony. Syst. Biol. 43:482-496.
Edwards, S. V., Arctander, P., and Wilson, A. C. 1991. Mitochondrial resolution of a deep branch in the genealogical tree for perching birds. Proc. R. Soc. Lond. B 243:99-107.
Esposti, M. D., De Vries, S., Crimi, M., Ghelli, A., Patarnello, T., and Meyer, A. 1993. Mitochondrial cytochrome b: Evolution and structure of the protein. Biochim. Biophys. Acta 1143:243-271.
Felsenstein, J. 1985. Confidence limits on phylogenies: An approach using the bootstrap. Evolution 39:783-791.
Fink, W. L., and Weitzman, S. H. 1982. Relationships of the stomiiform fishes (Teleostei), with a redescription of Diplophos. Bull. Mus. Comp. Zool. 150:31-93.
Grachev, M. A., Slobodyanyuk, S. Ja., Kholodilov, N. G., Fyodorov, S. P., Belikov, S. I., Sherbakov, D. Yu., Sideleva, V. G., Zubin, A. A., and Kharchenko, V. V. 1992. Comparative study of two protein-coding regions of mitochondrial DNA from three endemic sculpins (Cottoidei) of Lake Baikal. J. Mol. Evol. 34:85-90.
Grant, E. C., and Riddle, B. R. 1995. Are the endangered springfish (Crenichthys Hubbs) and poolfish (Empetrichthys Gilbert) fundulines or goodeids? A mitochondrial DNA assessment. Copeia 1992:209-212.
Graybeal, A. 1993. The phylogenetic utility of cytochrome b: Lessons from bufonid frogs. Mol. Phyl. Evol. 2:256-269.
Graybeal, A. 1994. Evaluating the phylogenetic utility of genes: A search for genes informative about deep divergences among vertebrates. Syst. Biol. 43:174-193.
Greenwood, P. H., Rosen, D. E., Weitzman, S. H., and Myers, G. S. 1966. Phyletic studies of teleostean fishes, with a provisional classification of living forms. Bull. Am. Mus. Nat. Hist. 131:339-456.
Gyllensten, U. B., and Erlich, H. A. 1988. Generation of single-stranded DNA by the polymerase chain reaction and its application to direct sequencing of the HLA-DQA locus. Proc. Natl. Acad. Sci. USA 85:7652-7656.
Hedges, S. B., Hass, C. A., and Maxson, L. R. 1993. Relations of fish and tetrapods. Nature 363:501-502.
Hillis, D. M., and Bull, J. J. 1993. An empirical test of bootstrapping as a method for assessing confidence in phylogenetic analysis. Syst. Biol. 42:182-192.
Hillis, D. M., and Huelsenbeck, J. P. 1992. Signal, noise, and reliability in molecular phylogenetic analyses. J. Hered. 83:189-195.
Irwin, D. M., Kocher, T. D., and Wilson, A. C. 1991. Evolution of the cytochrome b gene of mammals. J. Mol. Evol. 32:128-144.
Janke, A., Feldmaier-Fuchs, G., Kelley Thomas, W., Von Haeseler, A., and Pääbo, S. 1994. The marsupial mitochondrial genome and the evolution of placental mammals. Genetics 137:243-256.
Johansen, S., and Johansen, T. 1994. Sequence analysis of twelve structural genes and a novel non-coding region from mitochondrial DNA of Atlantic cod Gadus morhua. Biochim. Biophys. Acta 1218:213-217.
Johnson, G. D. 1986. Scombroid phylogeny: An alternative hypothesis. Bull. Mar. Sci. 39:1-41.
Johnson, G. D. 1992. Monophyly of the euteleostean clades: Neoteleostei, Eurypterygii, and Ctenosquamata. Copeia 1992:8-25.
Johnson, G. D. 1993. Percomorph phylogeny: Progress and problems. Bull. Mar. Sci. 52:3-28.
Johnson, G. D., and Patterson, C. 1993. Percomorph phylogeny: A survey of acanthomorphs and a new proposal. Bull. Mar. Sci. 52:554-626.
Kluge, A. J. 1989. A concern for evidence and a phylogenetic hypothesis of relationships among Epicrates (Boidae, Serpentes). Syst. Zool. 38:7-25.
Kocher, T. D., Thomas, W. K., Meyer, A., Edwards, S. V., Pääbo, S. F., Villablanca, F. X., and Wilson, A. C. 1989. Dynamics of mtDNA evolution in animals: Amplification and sequencing with conserved primers. Proc. Natl. Acad. Sci. USA 86:6196-6200.
Kornegay, J. R., Kocher, T. D., Williams, L. A., and Wilson, A. C. 1993. Pathways of lysozyme evolution inferred from the sequences of cytochrome b in birds. J. Mol. Evol. 37:367-379.
Krajewski, C., Driskell, A. C., Baberstock, P. R., and Braun, M. J. 1992. Phylogenetic relationships of the thylacine (Mammalia: Thylacinidae) among dasyuroid marsupials: Evidence from cytochrome b DNA sequences. Proc. R. Soc. Lond. B 250:19-27.
Kraus, F., and Miyamoto, M. M. 1991. Rapid cladogenesis among the pecoran ruminants: Evidence from mitochondrial DNA sequences. Syst. Zool. 40:117-130.
Kumar, S., Tamura, K., and Nei, M. 1993. "MEGA: Molecular Evolutionary Genetics Analysis." Institute of Molecular Evolutionary Genetics, Pennsylvania State University, University Park, PA.
Kumazawa, Y., and Nishida, M. 1993. Sequence evolution of mitochondrial tRNA genes and deep-branch animal phylogenetics. J. Mol. Evol. 37:380-398.
Lamb, T., and Lydeard, C. 1994. A molecular phylogeny of the gopher tortoises, with comments on familial relationships within the Testudinoidea. Mol. Phyl. Evol. 3:283-291.
Lamb, T., Lydeard, C., Walker, R. B., and Gibbons, J. W. 1994. Molecular systematics of map turtles (Graptemys): A comparison of mitochondrial restriction site versus sequence data. Syst. Biol. 43:543-559.
Lauder, G. V., and Liem, K. F. 1983. The evolution and interrelationships of the actinopterygian fishes. Bull. Mus. Comp. Zool. 150:95-197.
Lockhart, P. J., Steel, M. A., Hendy, M. D., and Penny, D. 1994. Recovering evolutionary trees under a more realistic model of sequence evolution. Mol. Biol. Evol. 11:605-612.
Lydeard, C., Wooten, M. C., and Meyer, A. 1995a. Cytochrome b sequence variation and a molecular phylogeny of the livebearing fish genus Gambusia (Cyprinodontiformes: Poeciliidae). Can. J. Zool. 73:213-227.
Lydeard, C., Wooten, M. C., and Meyer, A. 1995b. Molecules, morphology, and area cladograms: A cladistic and biogeographic analysis of Gambusia (Teleostei: Poeciliidae). Syst. Biol. 44:221-236.
Ma, D., Zharkikh, A., Graur, D., Vandeberg, J. L., and Li, W. 1993. Structure and evolution of opossum, guinea pig, and porcupine cytochrome b genes. J. Mol. Evol. 36:327-334.
Margulis, L. 1970. "Origin of Eukaryotic Cells." Yale University Press, New Haven, CT.
Martin, A. P., Naylor, G. J. P., and Palumbi, S. R. 1992. Rates of mitochondrial DNA evolution in sharks are slow compared with mammals. Nature 357:153-155.
McVeigh, H. P., Bartlett, S. E., and Davidson, W. S. 1991. Polymerase chain reaction/direct sequence analysis of the cytochrome b gene in Salmo salar. Aquaculture 95:225-233.
Meyer, A. 1993. Evolution of mitochondrial DNA in fishes. In "Biochemistry and Molecular Biology of Fishes" (Hochachka and Mommsen, eds.), vol. 2. Elsevier, Amsterdam.
Meyer, A. 1994. Shortcomings of the cytochrome b gene as a molecular marker. TREE 9:278-280.
Meyer, A., and Dolven, S. I. 1992. Molecules, fossils, and the origin of tetrapods. J. Mol. Evol. 35:102-113.
Meyer, A., Kocher, T. D., Basasibwaki, P., and Wilson, A. C. 1990. Monophyletic origin of Lake Victoria cichlid fishes suggested by mitochondrial DNA sequences. Nature 347:550-553.
Meyer, A., Kocher, T. D., and Wilson, A. C. 1991. African fishes. Nature 350:467-468.
Meyer, A., and Lydeard, C. 1993. The evolution of copulatory organs, internal fertilization, placentae and viviparity in killifishes (Cyprinodontiformes) inferred from a DNA phylogeny of the tyrosine kinase gene X-src. Proc. R. Soc. Lond. B 254:153-162.
Meyer, A., Morrissey, J. M., and Schartl, M. 1994. Recurrent origin of a sexually selected trait in Xiphophorus fishes inferred from a molecular phylogeny. Nature 368:539-542.
Meyer, A., and Wilson, A. C. 1990. Origin of tetrapods inferred from their mitochondrial DNA affiliation to lungfish. J. Mol. Evol. 31:359-364.
Miyamoto, M. M., and Cracraft, J. 1991. Phylogenetic inference, DNA sequence analysis, and the future of molecular systematics. In "Phylogenetic Analysis of DNA Sequences" (M. Miyamoto and J. Cracraft, eds.), pp. 3-17. Oxford University Press, New York.
Moritz, C., Schneider, C. J., and Wake, D. B. 1992. Evolutionary relationships within the Ensatina eschscholtzii complex confirm the ring species interpretation. Syst. Biol. 41:273-291.
Nelson, G. 1989. Phylogeny of major fish groups. In "The Hierarchy of Life" (B. Fernholm, K. Bremer, and H. Jornvall, eds.). Elsevier, Amsterdam.
Nelson, J. S. 1994. "Fishes of the World," 3rd Ed. Wiley, New York.
Normark, B. B., McCune, A. R., and Harrison, R. G. 1991. Phylogenetic relationships of neopterygian fishes, inferred from mitochondrial DNA sequences. Mol. Biol. Evol. 8:819-834.
Orti, G., Bell, M. A., Reimchen, T. E., and Meyer, A. 1994. Global survey of mitochondrial DNA sequences in the threespine stickleback: Evidence for recent migrations. Evolution 48:608-622.
Pääbo, S. 1990. Amplifying ancient DNA. In "PCR Protocols: A Guide to Methods and Applications" (M. A. Innes, D. H. Gelfand, J. J. Sninsky, and T. J. White, eds.), pp. 159-166. Academic Press, San Diego.
Palumbi, S., Martin, A., Romano, S., McMillan, W. O., Stice, L., and Grabowski, G. 1991. "The Simple Fool's Guide to PCR." University of Hawaii, Honolulu, HI.
Parenti, L. R. 1981. A phylogenetic and biogeographic analysis of cyprinodontiform fishes (Teleostei, Atherinomorpha). Bull. Am. Mus. Nat. Hist. 168:335-557.
Patarnello, T., Bargelloni, L., Caldara, F., and Colombo, L. 1994. Cytochrome b and 16S rRNA sequence variation in the Salmo trutta (Salmonidae, Teleostei) species complex. Mol. Phyl. Evol. 3:69-74.
Patterson, C. 1993. An overview of the early fossil record of acanthomorphs. Bull. Mar. Sci. 52:29-59.
Roe, B. A., Ma, D. P., Wilson, R. K., and Wong, J. F. 1985. The complete nucleotide sequence of the Xenopus laevis mitochondrial genome. J. Biol. Chem. 260:9759-9774.
Rosen, D. E. 1973. Interrelationships of higher euteleostean fishes. In "Interrelationships of fishes" (P. H. Greenwood, R. S. Miles, and C. Patterson, eds.), pp. 397-513. Zool. J. Linn. Soc. 53: Suppl. 1.
Rosen, D. E. 1985. An essay on euteleostean classification. Am. Mus. Novit. 2827:1-57.
Rubin, D. A., and Dores, R. M. 1995. Obtaining a more resolute teleost growth hormone phylogeny by the introduction of gaps in sequence alignment. Mol. Phyl. Evol. 4:129-138.
Saiki, R. K., Scharf, S., Faloona, F., Mullis, K. B., Horn, G. T., Erlich, H. A., and Arnheim, N. 1985. Enzymatic amplification of β-globin genomic sequences and restriction site analysis for diagnosis of sickle cell anemia. Science 230:1350-1354.
Sanford, C. P. J. 1990. The phylogenetic relationships of salmonoid fishes. Bull. Brit. Mus. Nat. Hist. (Zool.) 56:145-153.
Schmidt, T. R., and Gold, J. R. 1993. The complete sequence of the mitochondrial cytochrome b gene in the cherryfin shiner, Lythrurus roseipinnis (Teleostei: Cyprinidae). Copeia 1993:880-883.
Schmidt, T. R., and Gold, J. R. 1995a. Systematic affinities of Notropis topeka (Topeka shiner) inferred from sequences of the cytochrome b gene. Copeia 1995:199-204.
Slobodyanyuk, S. J., Kirilchik, S. V., Pavlova, M. E., Belikov, S. I., and Novitsky, A. L. 1995. The evolutionary relationships of two families of cottoid fishes of Lake Baikal (East Siberia) as suggested by analysis of mitochondrial DNA. J. Mol. Evol. 40:392-399.
Song, Choon-Bok. 1994. "Molecular Evolution of the Cytochrome b Gene among Percid Fishes." Unpublished dissertation thesis, University of Illinois at Urbana-Champaign, IL.
Stiassny, M. L. J. 1986. The limits and relationships of the acanthomorph teleosts. J. Zool. Lond. (B) 1986:411-460.
Stiassny, M. L. J. 1991. Phylogenetic intrarelationships of the family Cichlidae: An overview. In "Cichlid Fishes: Behavior, Ecology, and Evolution" (M. H. A. Keenleyside, ed.), pp. 1-35. Chapman and Hall, London.
Stiassny, M. L. J., and Jensen, J. S. 1987. Labroid intrarelationships revisited: Morphological complexity, key innovations, and the study of comparative diversity. Bull. Mus. Comp. Zool. 151:268-319.
Swofford, D. L. 1993. "PAUP: Phylogenetic Analysis Using Parsimony, Version 3.1.1." Illinois Nat. Hist. Surv., Champaign, IL.
Sturmbauer, C., and Meyer, A. 1992. Genetic divergence, speciation and morphological stasis in a lineage of African cichlid fishes. Nature 359:578-581.
Sturmbauer, C., and Meyer, A. 1993. Mitochondrial phylogeny of the endemic mouthbrooding lineages of cichlid fishes from Lake Tanganyika in eastern Africa. Mol. Biol. Evol. 10:751-768.
Sturmbauer, C., Verheyen, E., and Meyer, A. 1994. Mitochondrial phylogeny of the Lamprologini, the major substrate spawning lineage of cichlid fishes from Lake Tanganyika in eastern Africa. Mol. Biol. Evol. 11:691-703.
Taberlet, P., Meyer, A., and Bouvet, J. 1992. Unusual mitochondrial DNA polymorphism in two local populations of blue tit (Parus caeruleus). Mol. Ecol. 1:27-36.
Tzeng, C. S., Hui, C. F., Shen, S. C., and Huang, P. C. 1992. The complete nucleotide sequence of the Crossostoma lacustre mitochondrial genome: Conservation and variations among vertebrates. Nucleic Acids Res. 20:4853-4858.
Whitmore, D. H., Thai, T. H., and Craft, C. M. 1994. The largemouth bass cytochrome b gene. J. Fish Biol. 44:637-645.
Wilson, M. V. H., and Williams, R. R. G. 1993. Phylogenetic, biogeographic, and ecological significance of early fossil records of North American freshwater teleostean fishes. In "Systematics, Historical Ecology, and North American Freshwater Fishes" (R. L. Mayden, ed.). Stanford University Press, Stanford.
Yang, W., Oyaizu, Y., Oyaizu, H., Olson, G. J., and Woese, C. R. 1985. Mitochondrial origins. Proc. Natl. Acad. Sci. USA 82:4443-4447.
Zhu, D., Jamieson, B. G. M., Hugall, A., and Moritz, C. 1994. Sequence evolution and phylogenetic signal in control-region and cytochrome b sequences of rainbow fishes (Melanotaeniidae). Mol. Biol. Evol. 11:672-683.
Taxonomic Index
A
Abramites, 231f Abudefduf, 4, 116f, 117t, 119-123 Acanthemblemaria, 246 t, 256f, 258f, 259, 265-266
Acantholingua, 146 Acanthomorpha, 287f, 288 t, 297-298 Acanthopterygii, 276, 287f, 288 t, 294, 299 ACESTRORHYNCHIDAE, 221f Acestrorhynchus, 220f, 221f, 228, 229f, 231f, 237 Acipenser, 129, 137, 140, 288, 292 t, 294-295, 296f, 297f ACIPENSERIDAE, 129, 137, 140, 288, 292 t, 294-295, 296f, 297f Acnodon, 227-228 Actinopterygii, 2, 4, 285-303 Adinia, 189-191, 192t, 193f Aequidens, 124
Ammocrypta, 2, 76-77, 79, 81, 87, 91, 93, 130-131, 138 ANABLEPIDAE, 164t, 167, 176, 186, 171-172f, 173f, 179f, 180f
Anableps, 164t, 167, 176, 186, 172-173f, 179f, 180f ANGUILLIDAE, 276f Anisotremus, 116f, 117 t Anodus, 219 ANOSTOMIDAE, 220t, 221f, 231f, 236, 238 Anotophysi, 276f Apareiodon, 231f, 238 Aphanius, 164 t, 167-168, 172f, 173f-174f, 175, 177, 180f, 181, 188 APHYOCHARACIDAE, 221f
ALOPIIDAE, 202, 203f, 210-212, 214, 216 Alticorpus, 26
Aphyocharax, 221f, 229f, 231f Aplocheilichthyinae, 164 t Aplocheilichthys, 164t, 171f- 172f, 175, 179f, 186 APLOCHEILIDAE, 76, 155, 164t, 168, 171f- 172f, 175, 179f, 186, 288 t Aplocheiloidei, 164 t Apodichthyinae, 251 t Apodichthys (Xererpes), 251 t, 256f, 259f, 260f, 262, 267 Apteronotus, 232f ARAPAIMIDAE, 276f Argentinoidei, 237, 276f Astatotilapia, 180f Astatoreochromis, 43f, 44, 280f Astyanax, 125, 221f, 227-228, 229f, 231f, 237-238
Alutera, 117 t Amblyglyphidodon, 123
Atheriniformes, 275 Atherinoidei, 276f
ALBULIDAE, 276f Alcolapia, 43f, 44 Alestes, 220f, 221f, 228, 229f, 231f, 232, 233f, 234f, 235-236 ALESTIDAE, 220t, 221f, 228, 231f, 232, 235-236, 238 Alestiinae, 220f, 221f, 227f, 228, 230, 232, 236, 238 Alopias, 203f, 205f, 207f, 209, 210f, 211, 212f, 215f
Atherinomorpha, 276f Auchenionchus, 246 t, 256f, 258f Auha, 275 Aulonocara, 274f, 275, 281 t Aulopiformes, 276f Axoclinus, 246 t, 256f, 258f, 259f, 260, 266
B
Baione, 149 BALISTIDAE, 117 BALITORIDAE, 204, 224, 227f, 288 t, 292 t, 294, 296f, 297f Barbus, 239 bass, 129, 140, 274, 288-289, 298 BATHYDRACONIDAE, 247f, 250, 251 t, 253 t, 260f, 261,267 Batrachoidiformes, 276f Beloniformes, 276f Beryciformes, 276f BLENNIIDAE, 93, l17t, 119, 120t, 245270 Blennioidei, 4, 245-270 bogue, 289 bonito, 289 Boops, 288 t, 289, 294-295, 292 t, 294f,
296f, 297f Boulengerella, 224, 227f, 228, 229f, 231f, 232f, 233f, 234f, 236, 238 Brachydanio, 273f, 274f, 275, 281 t Brachymystax, 146-148, 149f Brycon, 125, 221f, 228, 229f, 231f, 238 Bryconinae, 221f, 228, 238
butterflyfish, 27
C Callochromis, 18 t Caperea, 294f, 296f, 297f CARACIDAE, 228, 238-239 CARANGIDAE, 288 t, 289, 292 t, 294f, 295, 296f, 297f, 298
Carassius, 233f, 234f CARCHARHINIDAE, 202, 209, 214-215, 289, 292 t, 294f, 296f, 297f Carcharhiniformes, 204
Carcharhinus, 205f, 207f, 210f, 211f, 214-215, 289, 292 t, 294f, 296f, 297f Carcharias, 202, 203f, 205f, 207f, 209, 210f, 211f, 212f, 214-215 Carcharodon, 203f, 205f, 207f, 209, 210f, 211, 212f, 215f Carnegiella, 220f, 231f carp, 204-206, 207f, 210f, 211f, 273, 288 catfish, 199-218, 230-232, 235-237 Catoprion, 227-228 CENTRARCHIDAE, 120, 288 t, 292 t, 294f, 295, 296f, 297f, 298
Cetopsis, 232f CETORHINIDAE, 202, 209, 211-212, 214 Cetorhinus, 203f, 205f, 207f, 209, 21Of, 211-212, 214, 215f, 216 CHAENOPSIDAE, 117 t, 245, 246 t, 247f, 247, 249-250, 253 t, 254, 256, 258f, 259, 262-266, 268 Chaenopsis, 246 t, 254, 256f, 258f, 259, 264-266, 268
CICHLIDAE, 2, 4-5, 8-9, 13, 15, 16f, 18, 19-20, 25-37, 39-51, 97-111, 117, 124, 129, 140, 164, 188, 173f174f, 18Of, 219, 239, 271-283, 286, 287f, 288 t, 289-291, 292 t, 294f, 295-296, 297f, 298 ciscoe, 145, 227f, 230 CITHARINIDAE, 220f, 221f, 227f, 228, 230, 231f, 236-238 Citharinus, 220f, 227f, 228, 229f, 230231, 232f, 238 CLINIDAE, 245-270 Clinini, 246 t, 249, 256f, 257, 258f, 263264 Clinitrachus, 246 t, 247, 249, 256f, 257, 258f, 264 Clupea, 232, 233f, 234f CLUPEIDAE, 276f Clupeiformes, 221f, 276f Clupeochara, 219 Clupeoidei, 276f Clupeomorpha, 276f Cnesterodon, 164t, 186, 171f-172f, 179f cod, 76, 129, 137, 288 cod icefish, 251 t, 256f, 260f, 267 coelacanth, 273 COELACANTHIDAE, 273, 275, 281 t Coelacanthiformes, 273, 275, 281 t Colossoma, 227f, 228, 229f, 231f, 238239 Coralliozetus, 266 Coregoninae, 145, 149f
Chaetodon, 117 t
Corynopoma, 231f
CHAETODONTIDAE, 27, 117 Chalceus, 221f, 227f, 229, 231f, 232, 233f, 234f, 236, 238 Channiformes, 276f Channoidei, 276f char, 145-162 CHARACIDAE, 123-125, 219-243, 253 Characidium, 231f, 238 Characiformes, 2, 4, 5, 123, 219-243, 253, 276f Characinae, 220f, 221f Chasmodes, 93
COTTIDAE, 286 t, 288 t
Cheirodon, 221f, 229f, 231f Cheirodontinae, 227f, 232, 238 CHILODONTIDAE, 220t, 221f, 231f Chilodus, 231f, 238 CHIROCENTRIDAE, 276f CHIRONEMIDAE, 227f, 238, 246 t Chondrichthiomorphi, 287, 295 Chondrichthyes, 281 t, 287f, 294f, 295,
296f, 297f
Chondrostei, 287f, 288 t
Chromis, 117 t Cichla, 43f, 44 Cichlasoma, 287f, 288 t, 289f, 292 t, 294f,
296f, 297f
Crenichthys, 164t, 171f, 172f, 179f, 186, 191 CRENUCHIDAE, 231f Cristivomer, 149 Crossopterygii, 273, 275 Crossostoma, 204, 205f, 207f, 210f, 211f, 224, 227f, 232f, 288, 292 t, 294 t, 296f, 297f
Cryptotremini, 246 t, 247, 249, 256f, 257, 258f, 265 Crystallaria, 76-77, 130-131, 138 CTENOLUCIIDAE, 220t, 221f, 224, 227f, 228-229, 231f, 236, 238 Ctenolucius, 220f, 228, 229f, 231f, 238 Ctenosquamata, 287f, 297 Cubanichthyinae, 164 t Cubanichthys, 164t, 176, 181, 186, 171f-174f, 179f, 180f
Culaea, 164t, 173f, 174f, 180, 188 CURIMATIDAE, 220t, 221f, 227f, 230, 231f, 237 CYNODONTIDAE, 221f Cynolebias, 164t, 171f-172f, 179f, 186 CYNOPOTAMIDAE, 221f Cynopotamus, 221f, 229f, 231f
Cyphocharax, 227f, 231f Cyphotilapia, 43f, 44-45, 48 CYPRINIDAE, 92-93, 189, 204, 227f, 231-232, 234f, 235-236, 239, 279, 281 t, 286 t, 288, 292 t, 294-295, 296f, 297f
Cypriniformes, 2, 163, 164t, 189-197, 221f, 222, 227f, 232f, 236-237, 239, 276f, 279, 287f, 288 t, 289, 294-295, 296f, 297f, 298 Cyprinodon, 164 t, 186, 188, 171f-174f, 179f, 180f
CYPRINODONTIDAE, 5, 93, 164, 167-168, 170, 171f-174f, 173, 175-177, 179f, 180f, 180t, 181-182, 186, 188 Cyprinodontiformes, 4, 7, 163-188, 189-197, 276f, 279 Cyprinodontinae, 164t, 176-177, 181 Cyprinodontoidei, 163-188 Cyprinus, 204, 205f, 207f, 210f, 211f, 232f, 233f, 234f, 273f, 275, 281 t, 288 t, 292 t, 294, 296f, 297f
Cyrtocara, 43f, 44f
D
Dacrylopteriformes, 276f DACTYLOSCOPIDAE, 245-270 damselfish, 4, 121, 288 Danio, 279, 233f, 234f darter, 2, 4, 8, 75-96, 129-142, 288 DENTICIPITIDAE, 276f Denticipitoidei, 276f Dialommus, 263 Dicentrarchus, 288 t, 289, 292 t, 294, 296f,
297f Dictyosoma, 25/t, 256f, 260f Didelphis, 294f, 296f, 297f Diodon, 117 t DIODONTIDAE, 117 Dipnoan, 273 DISTICHODONTIDAE, 219, 220 t, 221f, 227f, 228, 230-238 Distichodus, 228, 229f, 230-233, 234f, 235-236, 238 dragonfish, 251 t, 256f, 260f, 267 DUSSUMIERIIDAE, 276f
E
Ecsenius, 246 t, 256f, 258f, 259f, 261, 266 eelpout, 251 t, 256f, 260f, 267
Eigenmannia, 232f, 233f, 234f elasmobranchs, 199-218, 286 t electric fish, 232, 235-237 ELOPIDAE, 276f Elopomorpha, 276f
EMBIOTOCIDAE, 289
Geophagine, 287f, 288 t
Hoplias, 219, 220f, 224, 227f, 228-229,
Emblemaria, 246 t, 256f, 258f, 259, 265-
Gephyrocharax, 231f GERREIDAE, 116f, 117 t Gerres, 116f, 117 t Gibbonsia, 246 t, 254, 256f, 258f, 261-
236 huchen, 145 Hucho, 146-149, 150f, 151f Huchonini, 147 t Hybopsis, 93 Hydrocynus, 219, 228, 229f, 231f Hyphessobrycon, 125 HYPOPOMIDAE, 124 Hypopomus, 124
266
Empetrichthys, 191 ENGRAULIDAE, 276f Entomacrodus, 246 t, 254, 256f, 258f, 259f, 261-262 Eretmodini, 98- 99, 103-109 Eretmodus, 99, 103, 104f, 105 ERYTHRINIDAE, 219, 220t, 221f, 224, 227f, 228, 231f, 238 ESOCIDAE, 93, 227f, 232, 237 Esociformes, 221f, 237 Esocoidei, 237 Esox, 93, 227f, 232, 233f, 234f, 237 Etheostoma, 75-96, 130-131, 134-135, 136f, 138-139, 288, 292 t, 294f, 295, 296f, 297f Etheostomatini, 130-131 Eurypterygii, 287 t, 297 Euteleostei, 221f, 237, 276, 287 t, 288 t, 294-295, 296f, 297f, 298 Exerpes, 246 t, 256f, 257, 258f, 265
F
Floridichthys, 164 t, 173f- 174f, 18Of, 188 Fluviphylacinae, 164 t Fluviphylax, 164t, 171f-172f, 175, 179f, 186
Fontinus, 191, 195 Fugu, 281 t
262, 264
Ginglymostoma, 274f, 279, 281 t GINGLYMOSTOMATIDAE, 274f, 279, 281t Glandulocaudinae, 219, 230, 231f Gnathocharax, 221f, 224, 227f, 229f, 231f Gnathochromis, 18f Gnatholepis, 120 Gobiesociformes, 276f GOBIIDAE, 120 Gonorhynchiformes, 221f, 225, 227f, 230, 232f, 236-237, 276f GOODEIDAE, 164t, 186, 171f-172f, 179f, 191 gunnel, 246 t, 250, 251 t, 256f, 260f, 267 guppy, 274f GYMNARCHIDAE, 276f Gymnocephalus, 76, 130-131, 134f, 135, 136f, 137, 281 t Gymnocorymbus, 220f, 227f, 232, 233f, 234f, 235-236, 238 Gymnodraco, 251 t, 256f, 260f, 267
193-195
G
GADIDAE, 76, 129, 138, 288, 292 t, 294, 296f, 297f Gadiformes, 276f, 288 t Gadus, 76, 129, 137, 288, 292 t, 294, 296f, 297f Galaxioidea, 276f
Galeocerdo, 205f, 207f, 210f, 211f, 289, 292 t, 294f, 296f, 297f Gambusia, 76, 93 Garmanella, 164t, 170, 173f-174f, 180f, 188 GASTEROPELECIDAE, 219, 220f, 220t, 230, 231f, 232, 235-236, 238 Gasteropelecus, 231-232, 233f, 234f, 235-236, 238 Gasterosteiformes, 276f
Gasterosteus, 281 t, 286 t GASTROMYZONTIDAE, 224
261-262 I ICHTHYOBORIDAE, 220 t ISONIDAE, 76 Isurus, 203f, 205f, 207f, 209, 211-212
1
Jenynsia, 164t, 167,171f-172f, 176,179f, 186
Jordanella, 164 t, 171f- 174f, 179f, 180f, 186, 188
Julidochromis, 43f, 44f, 44-45
Gymnogeophagus, 281 t GYMNOTIDAE, 125, 232 Gymnotiformes, 221f, 227f, 231-232, 234f, 235, 237, 276f Gymnotus, 125
FUNDULIDAE, 4, 93, 164 t, 168f, 171f174f, 179f, 180f, 181, 186, 189-197,
286t Fundulus, 93, 168f, 164t, 171f-174f, 179f, 180f, 181, 186, 188-191, 192t,
Hypostomus, 227f, 232f, 233f, 234f Hypsoblennius, 247 t, 256f, 258f, 259f,
H
HAEMULIDAE, 116f, 117 t, 120 t
Haemulon, 120t Halichoeres, 120 HALOSAURIDAE, 276f haplochromine, 25-37, 43-47, 48f, 97, 99, 276-280 Haplochromis, 25-37, 43f-44 f, 45-47,
48f hatchetfish, 219 HEMIODIDAE, 221f, 231f HEMIODONTIDAE, 219, 220t, 221f, 236, 238 Hemiodus, 231f, 233f, 234f, 236 HEPSETIDAE, 220t, 221f, 231f, 238 Hepsetus, 219, 220f, 228-230, 231f, 238 Heterochromis, 44 Heteroclinus, 246 t, 256f, 258f, 261-262, 264 Heterostichus, 246 t, 254, 256f, 258f, 259f, 264 HIODONTIDAE, 276f
Holacanthus, 116f, 117 t HOLOCENTRIDAE, 120 t Holocentrus, 120 t
K
Karalepis, 246 t, 256f, 259f, 260, 266 kelpfish, 245-270 killifish, 5, 164, 286 t klipfish, see kelpfish
Kneria, 227f, 232f KNERIIDAE, 227f, 230, 232f Kosswigichthys, 164t, 167, 170, 173f174f, 175, 177, 180f, 181, 188 L
Labeo, 239 Labeotropheus, 27 t, 30-32 LABRIDAE, 117, 120, 298 LABRISOMIDAE, 245-270 Labrisomini, 246 t, 256f, 257, 258f, 263, 265 Labrisomus, 246 t, 249, 254, 256f, 257, 258f, 261 - 265 Labroidei, 287, 288 t, 294f, 295, 296f, 298 Lamna, 203f, 205f, 207f, 209, 21Of, 211f,
212f, 215f LAMNIDAE, 199 - 218 Lamniformes, 199-218 Lampridiformes, 276f lamprologine, 44 Latimeria, 273, 275, 281 t LATIMERIIDAE, 273, 275 LEBIASINIDAE, 219, 221f, 228-229,
231f
lenok, 145, 146t, 147t Lepidogalaxoidei, 237, 276f Lepomis, 120
Leporinus, 231f, 232f, 233f, 234f, 236 Leptoblenninae, 246 t, 260 Leptolucania, 189-190, 192t, 193f Lethrinops, 18 t, 25 Limnochromis, 18 t loach, 204-206, 288 Lophiiformes, 276f LORICARIIDAE, 227f Lucania, 189-190, 192t, 193f Luciopercinae, 130f Luciopercini, 130 lungfish, 273, 299 LUTJANIDAE, 116-117
Lutjanus, 116f, 117t Lycodes (Aprodon), 251 t, 252, 254, 256f, 260f, 261-262, 267 Lycodichthys, 251 t, 252, 256f, 260f, 267 Lythrurus, 92-93, 288, 292 t, 294, 296f, 297f M
mackerel, 289
Malacoctenus, 246 t, 249, 256f, 257, 258f, 261 - 262, 264- 265
Malapterurus, 323f Mallotus, 76 mbuna, 2, 5, 25-37 Megachasma, 203f, 205f, 207f, 209, 210f, 211f, 212, 214, 215f, 216 MEGACHASMIDAE, 202, 209, 212, 214,216 MEGALOPIDAE, 276f
Megupsilon, 164 t, 173f- 174f, 180f Melanochromis, 27, 30-32, 34, 41, 43f, 44f, 164 t, 173f- 174f, 180f, 188 Melanotaenia, 76, 281 t MELANOTAENIIDAE, 76, 130, 281 t, 286t Melichthys, 117 t Metynnis, 227-228, 232, 233f, 234f, 236,
Morone, 274f, 281 t, 288, 292 t, 294, 296f, 297f MORONIDAE, 274f, 281 t, 288 t, 289, 292 t, 294, 296f, 297f MULLIDAE, 117 t
Mulloidichthys, 117 t MURAENIDAE, 276f Myctophiformes, 276f Mylesinus, 227f, 228 Myleus, 227-228 Mylossoma, 227f, 228 Myxodagnus, 247 t, 256f, 258f, 259f, 260f, 266
Myxodes, 246 t, 256f, 257, 258f, 264 Myxodini, 246 t, 248f, 249, 256f, 257, 258f, 263-264 N
Nannobrycon, 227f, 232, 233f, 234f Nannostomus, 227f, 228, 229f, 231f, 232f Negaprion, 205f, 207f, 21Of, 211f Nemophini, 247 t, 259f, 260-261,263264, 266 Neoclinini, 246 t, 247, 256f, 258f, 257, 265 Neoclinus, 246 t, 247, 249, 256f, 258f, 260, 265 Neolamprologus, 43f, 44-45 Neopterygii, 286 t, 287f, 288 t, 295 Neoteleostei, 221f, 276, 287f, 288 t, 295,
296f NOTACANTHIDAE, 276f Notemigonus, 288, 292 t, 294, 296f, 297f
Nothobranchius, 164 t, 171f- 172f, 179f, 186, 288 t Notoclinus, 246 t, 256f, 259f, 260, 266 NOTOPTERIDAE, 276f Notopteroidei, 276f Notothenia, 251 t, 256f, 260f, 266-267 NOTOTHENIIDAE, 247f, 250, 251 t, 253 t, 256f, 257, 261, 267 Nototheniinae, 251 t, 267 Notothenioidei, 245-270
238
Micropterus, 288, 292 t, 294f, 295, 296f, 297f Microstomus, 129, 140, 262 minnow, 237
Mitsukurina, 203f, 205f, 207f, 210f, 211, 212f, 215f MITSUKURINIDAE, 202, 211, 212f Mnierpes, 246 t, 256f, 257, 258f, 259f, 263 Mnierpini, 246 t, 254, 256f, 257, 258f, 263, 265 MONAcANTHIDAE, 117 t, 288 t, 289, 292 t, 294, 296f, 297f MORMYRIDAE, 276f
O ODACIDAE, 298 ODONTASPIDIDAE, 202, 203f, 209, 211,212f, 214-216 Odontaspis, 202, 203f, 205f, 207f, 209, 21Of, 211, 212f, 214, 215f, 216 Oligosarcus, 221f, 228, 229f, 231f, 237238 Omobranchini, 247 t, 256f, 259f, 261, 263, 266
Omobranchus, 247 t, 256f, 258f, 259f, 260f, 261-263, 266
Oncorhynchus, 2, 4, 5, 40, 53-73, 76, 145-146, 148-149, 154-159, 204,
205f, 207f, 21Of, 211f, 274f, 281 t, 288, 292 t, 294, 296f, 297f Ophiclinini, 246 t, 247f, 253, 256f, 257, 258f, 263-264 Ophiclinus, 246 t, 249, 256f, 257, 258f, 263
Ophioblennius, 117t, 119, 120t, 246 t, 256f, 259f, 260f, 261, 266 Opthalmotilapia, 18 t Oreochromis, 27, 30, 40-41, 43f, 44, 280f, 281 t, 288 t, 289, 292 t, 294f, 295,
296f, 297f Orestias, 164t, 167, 173f, 174f, 175, 177, 180f, 181 Orestini, 164 t, 177 OSMERIDAE, 76, 237 Osmeriformes, 221f, 276f Ostariophysi, 219-243, 276, 288 t OSTEOGLOSSIDAE, 276f Osteoglossomorpha, 276f Otophysi, 221f, 230-32, 235-237, 276f
P
pacus, 221, 228 Pagothenia, 250, 251 t, 256f, 260f, 267 PANTODONTIDAE, 276f Parablenniini, 247 t, 250, 256f, 259f, 260-261, 263 Parablennius, 247 t, 259f, 250, 256f, 259f, 263, 266 Paracanthopterygii, 276, 287f, 288 t, 294, 297-298 Parachaenichthys, 251 t, 256f, 260f, 267 Paracheirodon, 227f, 232, 233f, 234f, 235-236, 238 Paraclinini, 246 t, 249, 256f, 257, 258f, 265 Paraclinus, 246 t, 249, 253-254, 255f, 256f, 257, 258f, 259f, 264-265
Paracyrichromis, 18 t Parafundulus, 189 Parahucho, 146-150, 152, 158-159 Paralabrax, 129, 140 Paranthias, 116f, 117t Parasalmo, 155 Parkneria, 232f PARODONTIDAE, 220t, 221f, 231f Perca, 76, 130-131, 133, 134f, 135, 136f, 137, 281 t Percarina, 130, 138 perch, 76, 130-131, 133, 134f, 135-139 PERCIDAE, 2, 4, 8, 75-96, 129-142, 281 t, 287f, 288, 292 t, 295, 296f, 297f, 298 Perciformes, 27, 39-49, 97-109, 129-140,
245-270, 276f, 287f, 294-295, 296f, 297f, 298, 300 Percina, 76-77, 130-131, 134f, 135, 136f, 137-139 Percinae, 130-131 Percini, 130 Percoidei, 288 t Percomorpha, 275, 287f, 288 t-289 t, 290 Percopsiformes, 276f
Perissodus, 18 t Petrochromis, 106f, 246 t Petroscirtes, 247 t, 256f, 259f Phenacogrammus, 227f, 228, 229f, 232, 233f, 234f, 235-236 PHOLIDAE, 247f, 250, 251 t, 253 t, 260f, 261-262, 267 Pholinae, 251 t
Pholis, 251 t, 256f, 260f Piaractus, 227-228, 239 pike-perch, 130f, 138-139, 237 Pimelodella, 124 PIMELODIDAE, 124, 227f Pimelodus, 233f, 234f piranha, 219-244, 253-254 Plancterus, 189-191, 192t, 193f, 195 Plectobranchus, 251 t, 252, 256f, 260f PLEURONECTIDAE, 129, 140, 262 Pleuronectiformes, 276f Poecilia, 164t, 171f-172f, 179f, 186, 273f,
274f, 281 t
PSEUDOCARCHARIIDAE, 202, 209, 214, 216
Pseudocorynopoma, 221f Pseudotropheus, 18 t, 20-32, 34, 41, 43f, 44f, 45 Pygocentrus, 225f-226f, 227-228, 229f, 231f, 238-239 Pygopristis, 222 Pyrrhulina, 228, 229f, 231f R rainbow fish, 76, 130 RAPHIODONTINAE, 229, 238
Rhabdoblennius, 246 t, 256f, 258f, 259f, 260f, 261, 266
Rhamphichthys, 232f, 233f, 234f Rhamphochromis, 18 t Rhaphiodon, 221f, 229, 231f, 238 RIVULIDAE, 164 t Rivulus, 76, 155, 164t, 168, 171f-172f, 179f, 186 rockfish, 25-37
Roeboides, 123-125 Romanichthyini, 130
Romanichthys, 76, 130-131, 138 Rosenblatella, 246 t, 256f, 259f, 260, 266 ruffe, 76, 130-131, 134f, 135, 136f, 137-
286t
297f Poptella, 221f, 228, 229f, 231f, 237-238 PRIACANTHIDAE, 117 t
Priacanthus, 117 t prickleback, 251 t, 256f, 260f
Prionace, 205f, 207f, 210f, 211f Pristobrycon, 227-228 PROCHILODONTIDAE, 220 t, 221f, 231f, 237 Prochilodus, 227f, 230, 231f PROFUNDULIDAE, 164 t, 171f-172f, 179f, 186, 189, 190f Profundulus, 164t, 170, 171f-172f, 173, 175, 178f-179f, 180t, 181, 186, 190f, 192t Protacanthopterygii, 221f, 232, 235, 237, 276, 288 t PROTOPTERIDAE, 273 Protopterus, 273
Pseudocarcharias, 203f, 205f, 207f, 209, 210f, 211f, 212f, 214, 215f, 216
SCHILBEIDAE, 227f Scomber, 288 t, 289, 292 t, 294f, 295, 296f,
297f SCOMBRIDAE, 286 t-288 t, 289, 292 t, 294f, 295, 296f, 297f, 298 Scombroidei, 286 t, 287f, 288 t, 294f, 295, 296f, 297f, 298
Scorpaena, 116f, 117 t SCORPAENIDAE, 25-37, 116f, 117t, 129, 140, 262 Scorpaeniformes, 25-37, 116f, 117 t Sebastolobus, 129, 140, 262 SERRANIDAE, 116f, 117t, 129, 140 SERRASALMIDAE, 221f Serrasalminae, 219-243, 253 Serrasalmus, 227-228 shark, 2, 4, 114, 199-218, 262, 274, 289, 290 shiner, 288 Siluriformes (Nematognathi), 124, 221f, 227f, 231-233, 234f, 235, 276f Simochromis, 98-99, 105, 106f, 107-109 smelt, 237 sole, 129, 140 SPARIDAE, 288 t, 289, 292 t, 294-295,
Rypticus, 116f, 117 t
296f, 297f Spathodus, 99, 103, 104f, 105, 109 Sphyrna, 205f, 207f, 210f, 211f stargazer, 245, 247t, 250, 256f, 259f Starksia, 246 t, 249, 256f, 258f, 259f,
S Salariini, 246 t, 250, 256f, 259f, 261, 263, 266 Salminus, 219, 228, 229f, 231f
261-262, 264 Starksiini, 246 t, 249, 256f, 257, 258f, 262-263, 265 Stathmonotus, 246 t, 249, 256f, 257, 258f, 263, 265 Stegastes, 120
140
POECILIIDAE, 76, 93, 164t, 171f-172f, 175-176, 179f, 186, 274f, 281 t, POMACENTRIDAE, 4,116f, 117 t, 119 120, 121 t, 122-123, 287f, 288, 292 t, 294f, 295, 296f, 297f Pomacentrus, 288, 292 t, 294f, 295, 296f,
Salmo, 53-54, 146t, 148t, 149t, 233f, 234f, 237, 281f, 286t salmon, 56, 76, 145-162 SALMONIDAE, 2, 4-5, 40, 53-73, 145-162, 237, 262, 286 t, 288, 292 t, 294,
296f, 297f Salmoniformes, 53-73, 221f, 222, 227f, 232, 237, 276, 279 Salmoninae, 2, 65, 145-162 Salmonoidei, 237, 276f, 287, 288f, 294-295 Salmothymus, 146 Salvelinus, 53, 145-162
Steindachneria, 231f Stethaprioninae, 237, 251 t, 252 STICHAEIDAE, 247f, 250, 250-251, 253 t, 256f, 261, 267
Sticharium, 246 t, 256f, 258f, 256f Stizostedion, 8, 76, 130-131, 132f, 133-139 Stomiiformes, 276f sturgeon, 129, 137, 140, 288, 295 sunfish, 139 swordfish, 140 Synbranchiformes, 276f
Synodontis, 233f, 234f
Salvethymus, 146 t Sarcopterygii, 286 Sarda, 288 t, 289, 292 t, 294f, 295, 296f,
T
297f Sarotherodon, 40 Satanoperca, 288 t, 292 t, 294f, 296f, 297f sauger, 76, 131, 133-135, 136f, 137-139
taimen, 146 t, 147 t Tanganicodus, 99, 103, 104f, 105, 109 Teleostei, 75-96, 121,219, 245, 253, 262,
SCARIDAE, 298
Teleostomi, 262, 295 Tetradontiformes, 276f
Schilbe, 227f, 233f, 234f
287f, 288t
Tetragonopterinae, 220f, 221f, 227f, 229, 237- 238 Tetragonopterus, 221f, 229, 231f, 237-238 tetra, 219-245
Thalassoma, 117t, 120t Thorachromis, 280f Thorichthys, 43f, 44 thornyhead, 129, 140 Thunnus, 288, 289, 292 t, 294f, 295, 296f,
297f Thymallinae, 145
Thymallus, 148f Tilapia, 40 tilapiine, 28, 40-41, 43f, 44, 97, 289 Tomeurus, 164t, 171f-172f, 179f, 186 Trachinoidei, 250, 266-268 Trachurus, 288 t, 289, 292 t, 294f, 295, 296-297 Trematominae, 251 t, 267 Trematomus, 251 t, 256f, 260f, 267
Triakis, 281f Trichomycterus, 232f Triportheus, 221f, 227f, 229f TRIPTERYGIIDAE, 245-270
Tripterygiinae, 246 t, 260, 266 Tripterygion, 246 t, 250, 256f, 259f, 260, 266 Tropheini, 99, 106 Tropheus, 39, 98-99, 100f, 101f, 102f, 103-109 trout, 2, 4, 5, 53-73, 140, 145-162, 204-206, 274f, 288 tuna, 288, 289, 292 t, 294f, 295, 296f,
297f Tylochromis, 43f, 44 Tyrannochromis, 43f, 44
U URANOSCOPIDAE, 266
W
walleye, 8, 76, 130-131, 132f, 133-135, 136f, 137t, 138-140 whitefish, 145
X
Xenisma, 191, 193f, 194-195 Xenotilapia, 18 t Xenotoca, 164t, 171f-172 f , 179f, 186 Xiphias, 140 XIPHIIDAE, 140
Xiphophorus, 164t, 168, 171f, 172f, 179f, 186
Z
zander, 133-135, 136f, 137-138 zebrafish, 273f, 274f, 275f, 279 Zeiformes, 276f Zingel, 76, 130-131, 138 Zoarces, 251 t, 252, 256f, 260f, 267 ZOARCIDAE, 247, 250, 251 t, 253 t, 256f, 261,267 Zoarcoidei, 245-270
Zoogoneticus, 164t, 171f, 172f, 179f, 186, 250, 251 t, 253 t, 260-261, 263, 266-268
Zygonectes, 164t, 171f-172f, 179f, 186, 191, 193f, 195
Subject Index
A Adaptive radiation, 2, 4, 8, 39, 47, 65, 97-99, 100f, 101, 103-105, 107-108, 145, 239, 276-279, 289 Adenosine triphosphatase 6 gene, mitochondrial DNA, 119, 124f, 156-158 Adenosine triphosphatase 8 gene, mitochondrial DNA, 124f Africa, biogeography, see also Great Lakes of Africa, 4, 8, 13, 15, 18, 25-37, 39-40, 42, 43f, 44-46, 48, 97-111, 219-243, 248f, 249, 286 Alaska, biogeography, 50, 153-154 Allopatric distributions, 26, 39, 47, 108, 121, 131, 138-139, 153, 159, 164, 170 Allozymes, 2, 5-6, 26-29, 32, 34, 40, 45, 47, 54, 65-66, 77, 92, 115-116, 121-122, 131, 138-139, 145-148, 150-156, 158, 189, 191, 195, 245, 249, 252, 261f, 264-265, 267-268, 271, 273, 276-278 Amino acid replacement, 16-19, 82 t-85 t, 87, 204, 285, 289t, 291-293 AMOVA, see Analysis of Molecular Variance Analysis of Molecular Variance, 92, 120, 120t, 133 Antarctica, biogeography, 250, 251 t, 263, 267 Arctic, biogeography, 146, 150, 153-154 Asia, biogeography, 113, 150-154, 157 Atlantic Ocean, biogeography, see also Caribbean Sea and Mediterranean Sea, 7, 53, 76, 116f, 121-122, 124, 245-270
ATPase gene, see Adenosine triphosphatase genes Australia, biogeography, 246f, 248f, 256-259, 263-264
B
Baja California, biogeography, see also Gulf of California and Mexico, 54, 55f, 56-57, 58t, 62, 65, 189-197, 248f, 249, 264-265 Balancing selection, 277 Base Composition bias, see Nucleotide bias BIOSYS, 31 Bootstrapped based compatibility test, 7, 165, 167, 169 Bottleneck, see Population bottleneck
C
California, biogeography, 53-73, 189-197 Caribbean Sea, biogeography, 113-128, 167 Central America, biogeography, 39, 113-128, 189-197, 246t, 248t, 249, 264-266 Character weighting, see Weighting Clustal analysis program, 42, 204, 222 COI, see Cytochrome oxidase I gene, mitochondrial DNA COII, see Cytochrome oxidase II gene, mitochondrial DNA Codon position variation, see also Third codon position variation, 13-24,
163-188, 199-217, 232-235, 285-300 Combining molecular and morphological data, 7, 87f, 91-92, 163-188 Continental drift, 39, 93, 220, 238-239 Control region or D-loop, mitochondrial DNA, 2-5, 26, 27 t, 40, 45, 56-58, 65-66, 97-111, 129-142, 156-158, 170, 199-218 primers, 56, 132 secondary structure, 132-134, 137-140 tandem repeats, 4, 133-134, 136-137 terminal associated sequences, 133, 134f, 137 Convergent evolution, 2, 26, 39, 49, 199 Costa Rica, biogeography, 124-125 Covarion model, 14, 18-19 Cretaceous period, 167, 202, 298 Cytochrome b gene, mitochondrial DNA, 3-4, 20-21, 40, 75-96, 104f, 121-123, 131, 140, 145-162, 189-197, 199-218, 285-300 primers, 78, 191, 202, 286, 288, 289t secondary structure, 285-286, 292-293 Cytochrome oxidase I gene, mitochondrial DNA, 116, 119, 121-123, 166t Cytochrome oxidase II gene, mitochondrial DNA, 157
D Delta-u distance statistic, 29, 35 D-(displacement) loop, see control region
E
Eocene Age, 214, 250, 267, 297 Ependymin gene, nuclear DNA, 5, 219-243 primers, 222 Erie, Lake, 77, 131, 136-139 Europe, biogeography, 129-143, 145-162
F Fingerprinting, 41 Fossil evidence, 7, 15, 21, 114-115, 150, 157-158, 189, 201-202, 213-214, 215f, 239, 245, 247, 249, 268, 286, 297-298 Founder effects, 40, 107, 124, 139, 277 Four-fold degenerate codon position, 14, 28 Fst analysis, 59, 76, 133
G Gamma distribution/distance model, 6, 14, 18-20, 23, 132, 135f Geminate taxa, 113-128 Gene flow, 65, 89-90, 92-93, 98, 107, 115-116, 119-120, 152-153 Gondwana supercontinent, 167, 239 Great Barrier Reef, 122 Great Lakes of Africa, see also Malawi, Lake, Tanganyika, Lake and Victoria, Lake, 25-35, 39-40, 44, 46, 97-111, 129, 239, 271-283 North America, 76-77, 80f, 129-140 Growth hormone genes GH-IC and GC-IIC, nuclear DNA, 28, 145-159 Gulf of California, see also Baja California, 65-66, 121-122 Gulf of Mexico, coastal biogeography, 73-96
H
Hardy-Weinberg equilibrium, 31-32, 57, 59
I
IAM, see Infinite alleles model Indels, 5, 42, 44, 44f, 45, 48 Indian Ocean, biogeography, 122, 246 t-247 t, 250 Infinite alleles model, 29, 57 Internal transcribed spacers, see rRNA Introns, 28, 145-162, 167, 272-273, 275-276, 279 Isozymes, see Allozymes
ITS, see rRNA, internal transcribed spacers
J Jukes-Cantor distances/model, 6, 14, 24, 102f, 165, 209f
K
Karyotype, 146, 151, 153, 155-156, 158-159 Kimura's two-parameter distance method, 6, 14f, 24, 43f, 56, 104, 106, 108, 117t, 118-119, 123f, 124f, 209f, 227, 235, 280f
L Lactate dehydrogenase-B gene, nuclear DNA, 191 LDH, see lactate dehydrogenase-B gene, nuclear DNA Lineage sorting, 26
M
MacClade, 81, 133, 222, 224f, 235 Macromutations, 273, 275-276 MacVector-AssemblyLIGN software, 132, 252 Major Histocompatibility Complex, nuclear DNA, 27, 271-283 methodology, 278-279, 281 t primers, 274, 278-279 secondary structure, 271-273 Malawi, Lake, 5, 25-37, 39-40, 42-45, 47-48, 97-99, 107-109, 279 Maximum-likelihood analysis, 33, 117 t, 123t, 152, 165-166, 191, 194, 222, 226, 228, 230-231, 235-236, 291 Mediterranean Sea, 161, 246-247, 248f, 249-250, 256-259, 263-265 MEGA, see Molecular Evolutionary Genetics Analysis METREE, see Minimum Evolutionary Tree Mexico, biogeography, see also Baja California and Gulf of California, 39, 53-73, 189-197 MHC, see Major Histocompatibility Complex, nuclear DNA MICROSAT, 31, 47, 47f Microsatellite DNA, nuclear, 2, 5, 9, 25-37, 39-51, 53-73, 107, 164, 279 primers, 45, 56 protocols, 52, 56, 63-64 Minimum Evolutionary Tree, 123f
Minisatellite repeats, 28-35 Miocene epoch, 93, 114, 115f, 239, 249-250, 262-263, 265-268 Mississippi River, drainage system, 75-96 Mitochondrial DNA, see also individual genes, 2-5, 8-9, 13-24, 26-35, 40, 44, 45, 47, 53-73, 75-96, 98-99, 100f, 101f, 102-109, 113-116, 119-125, 129, 131-132, 137-140, 145-162, 165, 191, 209, 219, 243, 245-270, 277-278, 285-286, 299-300 Molecular clock, 6-7, 9, 13-23, 113-128, 139, 252, 261-268, 273 Molecular Evolutionary Genetics Analysis, 132, 135f, 151f, 156 t, 157, 169, 222, 252, 254, 289 Mutation models, 13-24, 29, 34, 45, 47, 54-55, 57, 61 t, 66, 117-119
N NADH dehydrogenase subunit 2 gene, mitochondrial DNA, 13, 15-16, 18-22, 119, 199-218 primers, 202 NADH dehydrogenase subunit 3 gene, mitochondrial DNA, 151-154, 157-158 NADH dehydrogenase subunit 5 gene, mitochondrial DNA, 153 NADH dehydrogenase subunit 6 gene, mitochondrial DNA, 153 ND2, see NADH dehydrogenase subunit 2 gene, mitochondrial DNA ND3, see NADH dehydrogenase subunit 3 gene, mitochondrial DNA ND5, see NADH dehydrogenase subunit 5 gene, mitochondrial DNA ND6, see NADH dehydrogenase subunit 6 gene, mitochondrial DNA Neighbor Joining, 6, 19, 29f, 32, 43, 45, 47f, 56-59, 62f, 63f, 77f, 86f, 102f, 104f, 123f, 124f, 133, 135f, 147f, 148f, 149f, 150f, 151-152, 154f, 155f, 157f, 203f, 209, 210f, 211f, 212f, 221f, 227-229, 231, 234f, 235-236, 252-254, 256, 258f, 259, 260f, 262-265, 266, 273f, 274f, 276f, 280f, 291 Nei's genetic distances, 29, 33, 47, 87, 254, 261f, 262, 264 New Zealand, biogeography, 246 t NJ, see Neighbor joining Non-synonymous codon substitutions, 3, 87, 94 North America, biogeography, 53-73, 76-94, 113-114, 129-131, 137-139,
145-162, 167, 189-197, 262-264 Nuclear DNA, see also Microsatellites, Polymerase Chain Reaction, Random Amplification of Polymorphic DNA, specific gene or region, 2, 4-5, 8-9, 13, 19, 25, 27-35, 39-51, 54-55, 65, 98, 103, 107-109, 140, 145-159, 164-165, 167, 169-170, 174f, 191, 199-200, 219-243, 245-283, 291, 300 Nucleotide bias/skew, 6, 17, 19, 116-119, 134, 200-201, 204-209, 254, 291, 300
O Oligocene epoch, 262 One-step mutation model, 29, 32, 35, 57, 59, 61 t, 63f, 66
P
Pacific, see also Indian Ocean coastal region biogeography, 53-73, 79, 129, 145-162, 267 Ocean biogeography, 7, 54, 113-119, 121-124, 153, 155-158, 245-270 Paleocene epoch, 167, 215f, 289 Panama, Isthmus of, 2, 7-8, 113-128, 194 PAUP, see Phylogenetic Analysis Using Parsimony PCR, see Polymerase Chain Reaction PHYLIP, see Phylogenetic Inference Package Phylogenetic Analysis Using Parsimony, 43, 44f, 78-81, 100f, 106f, 133, 136f, 147, 148f, 149, 151f, 154, 155f, 157f, 165, 167-170, 171f, 172f, 173f, 174f, 191, 204, 211, 214, 222, 230, 231f, 232, 234f, 252-254, 257f, 258f, 259f, 260f, 285, 289 Phylogenetic Inference Package, 31, 47, 56-57, 59f, 62f-63f, 151-152, 169, 192 Phylogenetic Tail Probability Test, 168 Plate tectonics, see Continental drift Pleistocene epoch, 65-66, 94, 120, 125, 139, 153, 239 Pliocene epoch, 93, 113-118, 125, 131, 138, 153, 194, 262, 264, 267 Poisson distribution model, 6, 14-15, 16f, 18, 23 Polymerase Chain Reaction, see also specific genes or regions, 3, 5, 8, 31, 40-42, 43f, 45-47, 54, 76, 78, 132, 163, 191, 202, 251-252, 274, 278, 285, 288
protocols, 31, 41, 56, 76, 78, 191, 202, 222f, 251-252, 279 Population bottleneck, 124, 129 PTP, see Phylogenetic Tail Probability Test R
Random Amplification of Polymorphic DNA, 5, 27, 39-51, 164 primers, 40-41, 42f, 43f protocols, 41-43 vectorette technique, 41, 42f RAPD, see Random Amplification of Polymorphic DNA REAP, see Restriction Enzyme Analysis Package Restriction Enzyme Analysis Package, 133 Restriction Fragment Length Polymorphisms, 3, 5, 26-27, 40, 76, 107, 119-120, 122, 131, 138-139, 147-149, 152, 154-158, 191 RFLP, see Restriction Fragment Length Polymorphisms Ribosomal DNA, see rRNA and tRNA genes rRNA genes, 145-162 12S rRNA gene, mitochondrial DNA, 3-5, 219-243, 245-270, 286 protocols, 251 primers, 251 secondary structure, 222-224, 225f, 228, 253, 255f, 261-262, 268 16S rRNA gene, mitochondrial DNA, 3-5, 7, 167, 168f, 170, 176f, 177f, 178f, 219-243, 267, 286 secondary structure, 4, 167, 168f, 170, 176f, 177f, 178f, 181-182, 222-224, 226f, 268 18S rRNA gene, nuclear DNA, 4, 154 Internal transcribed spacers, nuclear DNA, 4-5, 145-162, 245, 261-268 Rst analysis, 57, 59, 61 t, 62f S Saltines, 34-35 Saturation, 7-8, 13-24, 54, 200, 206-209, 224-225, 236-239, 268, 291, 295, 299-300 Sea of Cortez, see Gulf of California, Mexico Secondary structure, see also specific genes, 3, 18, 28, 181-182, 222-224, 225f, 226f, 228, 252-254, 255f, 261, 268, 271-274, 285-286, 293 Single Step Mutation Model, see One-step Mutation Model
Single-Stranded Conformation Polymorphism, 42, 278-279 SMM, see Stepwise Mutation Model South America, biogeography, see also Panama, Isthmus of, 39, 113-114, 124f, 219-243, 245-270 Species flocks, 2, 5, 8, 25-35, 40, 45, 47-48, 97-99, 271-282 SSCP, see Single-Stranded Conformation Polymorphism St. Clair, Lake, 77, 131, 136-139 Stepwise Mutation Model, 29, 34, 45, 47, 54-55, 57, 61 t, 66 Stock structure, 66, 129-140 Substitution rates, 13-24, 132 Substitutions, see codon position variation, non-synonymous codon, saturation, and third codon position variation Superior, Lake, 131, 136-139 Sympatric distributions, 39, 97, 99, 103, 105, 108-109, 145-146, 150, 153, 155
T Tamura-Nei model of nucleotide substitution, 19-20, 123f, 132, 135f Tanganyika, Lake, 4, 8, 39-40, 43f, 44-45, 97-111 Tertiary period, 113, 298 Tetraploidy, 145, 156 Third codon position variation, 2, 5, 14-16, 17f, 19-21, 23, 165, 170, 177f, 199-217, 232-235, 285, 289-292, 295, 299-300 Topology-dependent Phylogenetic Tail Probability Test, 7, 133, 136, 138, 140, 165, 167-170, 173, 175f, 177f, 178f, 182, 191, 194 T-PTP, see Topology-dependent Phylogenetic Tail Probability Test tRNA genes, mitochondrial DNA, 40, 78, 94, 158, 237, 299 Transition: transversion ratios, 6, 14-15, 16f, 17f, 19-21, 24, 78-79, 81, 86f, 87, 100f, 101f, 102f, 117t, 118, 134, 139, 156-157, 169-170, 174f, 192 t, 193, 204f, 206-207, 208f, 209, 210f, 212-213, 223-224, 226, 228-229, 231-232, 234, 236, 252-254, 261, 268, 285, 290t, 291, 293, 295-296, 297f Trophic specialization, 26, 47, 219 Two-phase mutation model, 15, 29, 34, 55, 57 Tyrosine kinase x-src gene, nuclear DNA, 5, 7, 167, 170, 171f, 175f,
178f
U
Unweighted Pairs Group Method with Arithmetic Averages, 117t, 118, 163 UPGMA, see Unweighted Pairs Group Method with Arithmetic Averages
V Vicariant biogeography, 93, 113-128, 167, 239, 267-268 Victoria, Lake, 4-5, 39-40, 40, 43f, 44-48, 97-99, 109, 271-283, 286 Viviparity, 167, 176, 246 t, 249, 263-264
W
Weighting, 6, 79, 81, 86, 87f, 92, 94, 101f-102f, 123f, 163-188, 200-201, 209, 222, 227-228, 230-233, 236, 252, 254, 261, 293, 295, 297, 300
FIGURE 1 Body form and coloration in two representatives of each of three mbuna genera. Taxa are, from top to bottom: Melanochromis auratus, Melanochromis parallelus, Pseudotropheus zebra "BB," Pseudotropheus zebra "black dorsal," Labeotropheus fuelleborni, and Labeotropheus trewavasae. All individuals are male except M. auratus and L. trewavasae; the latter individual displays the "OB" color morph, which occurs in females and occasional males of many mbuna species. Photographs reproduced with permission from Konings (1990).