Gene Families: Studies of DNA, RNA, Enzymes and Proteins: Proceedings of the October 5-10, 1999 Congress, Beijing, China, the 10th International Congress on Isozymes


GENE FAMILIES Studies of DNA, RNA, Enzymes and Proteins


Editors: Guoxiong Xue, Yongbiao Xue, Zhihong Xu, Roger Holmes, Graeme L. Hammond & Hwa A. Lim

World Scientific


Published by
World Scientific Publishing Co. Pte. Ltd.
P O Box 128, Farrer Road, Singapore 912805
USA office: Suite 1B, 1060 Main Street, River Edge, NJ 07661
UK office: 57 Shelton Street, Covent Garden, London WC2H 9HE

British Library Cataloguing-in-Publication Data A catalogue record for this book is available from the British Library.

GENE FAMILIES: STUDIES OF DNA, RNA, ENZYMES AND PROTEINS Copyright © 2001 by World Scientific Publishing Co. Pte. Ltd. All rights reserved. This book, or parts thereof, may not be reproduced in any form or by any means, electronic or mechanical, including photocopying, recording or any information storage and retrieval system now known or to be invented, without written permission from the Publisher.

For photocopying of material in this volume, please pay a copying fee through the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, USA. In this case permission to photocopy is not required from the publisher.

ISBN 981-02-4384-7

Printed in Singapore by World Scientific Printers

GENE FAMILIES Studies of DNA, RNA, Enzymes and Proteins
Proceedings of the October 5-10, 1999 Congress, Beijing, China
The 10th International Congress on Isozymes

Editors

Guoxiong Xue Institute of Developmental Biology Chinese Academy of Sciences, Beijing, China

Yongbiao Xue, Ph.D. Institute of Developmental Biology Chinese Academy of Sciences, Beijing, China

Zhihong Xu, Ph.D. Peking University, Beijing, China

Roger Holmes, D.Sc. University of Newcastle, New South Wales, Australia

Graeme L. Hammond, M.D. Yale University, School of Medicine, Connecticut, USA

Hwa A. Lim, Ph.D., MBA D'Trends Inc., California, USA

World Scientific

Singapore • New Jersey • London • Hong Kong

Honorary President: Clement L. Markert Chair: Zhihong Xu Co-Chairs: Boqing Qiang Fangzhen Sun Guoxiong Xue (Executive) Laining Yu

International Executive Committee: Carla Frova (Italy) Erwin Goldberg (USA) Roger Holmes (Australia) Hans Jornvall (Sweden)

Masataki Mori (Japan) C. Schnarrenberger (Germany) John VandeBerg (USA) Guoxiong Xue (China)

National Advisory Committee: Songlin Chen Yongfu Chen Zhu Chen Miao Du Guofan Hong Yunde Hou Jifang Huang Kam-Len D. Lee Zhensheng Li Cheng Ma Bo Tian Guihai Wang Guanhua Xu Yongbiao Xue Longfei Yan Shaoyi Yan Qifa Zhang Rongquan Zhang Lihuang Zhu Zuoyan Zhu

International Advisory Committee: Antonio Blanco (Argentina) W. Richard Chegwidden (USA) Jacques Drouin (Canada) Clara Gorodezky (Mexico) Hwa A. Lim (USA) Jose Luis Millan (USA) Atsushi Nakazawa (Japan) Eviatar Nevo (Israel) P.R.K. Reddy (India) Francisco Salzano (Brazil) Maria de Fatima L. Santos (Portugal) John Scandalios (USA) Wolfgang Scheffrahn (Switzerland) Michael J. Siciliano (USA) Oleg Serov (Russia) Y.H. Tan (Singapore) Athanasios Tsaftaris (Greece) Jerry Wang (Canada) Diter von Wettstein (USA) Suren Zakian (Russia)

Sessions and Session Chairpersons: (In chronological order) Plenary Symposium:

John VandeBerg Gerald Stranzinger Erwin Goldberg Che-Kun Shen Roger Holmes Boqing Qiang Yongbiao Xue

Gene Families and Isozymes:

Desmond Cooper Henry Weiner Richard Chegwidden

Population Variation of Gene Families:

Robert Gracy

Gene Structure and Mapping:

Jiayang Li

Gene Families and Human Diseases:

Hans Jornvall

Gene Families and Evolution:

Eviatar Nevo Shusen Liu

Genetic Mutations and Diseases:

Masami Muramatsu

Genomes and Bioinformatics:

Hwa A. Lim Cheng Jing

Gene Families and Plants:

Desmond Cooper Claus Schnarrenberger

Gene Families and Gene Expression:

Jacques Drouin John R. McCarrey Frederick Sweet

Mammalian Gene Families:

Yongfu Chen

Gene Families and Biotechnology:

Zuoyan Zhu P.R.K. Reddy

Young Scientists:

Fuchu He Fangzhen Sun

Congress Liaison: Wang Ning

ACKNOWLEDGEMENTS Technical & Logistics Support for Proceedings Production DTrends, Inc., USA

Financial Support for Proceedings Production DTrends, Inc., USA

MS WORD Editor Hwa A. Lim [email protected]

Manuscript Review Committee
Charles H. Blomquist (Health Partners Ramsey Clinic, Minnesota)
Xiao Zhuo Chen (Ohio University, Ohio)
Paolo Fortina (University of Pennsylvania School of Medicine, Pennsylvania)
Erwin Goldberg (Northwestern University, Illinois)
Robert Gracy (University of North Texas Health Science Center, Texas)
Perry B. Hackett (University of Minnesota, Minnesota)
Graeme Hammond (Yale University School of Medicine, Connecticut)
Roger Holmes (University of Newcastle, Australia)
Larry Kricka (University of Pennsylvania, Pennsylvania)
Hwa A. Lim (DTrends, Inc., California)
John R. McCarrey (Southwest Biomedical Foundation, Texas)
Jose Luis Millan (The Burnham Institute, California)
Peter Parsons (University of La Trobe, Australia)
Tim Robbins (University of Nottingham, UK)
Claus Schnarrenberger (Freie Universität Berlin, Germany)
Frederick Sweet (Washington University School of Medicine, Washington)
Alan R. Templeton (Washington University, Missouri)
John L. VandeBerg (Southwest Biomedical Foundation, Texas)
Guoshun Wang (University of Iowa College of Medicine, Iowa)
Henry Weiner (Purdue University, Indiana)
Diter von Wettstein (Washington State University, Washington)
Edgar Wingender (Research Group Bioinformatics, GBF, Germany)

National Sponsoring Organizations Chinese Academy of Sciences National Natural Science Foundation of China Chinese Association for Science & Technology Chinese Committee for International Union of Biological Sciences Changjiang Fisheries Institute, Chinese Fisheries Academy of Sciences Molecular Developmental Biology Open Lab, Chinese Academy of Sciences Plant Molecular Developmental Biology Lab, Institute of Developmental Biology

Industry Organization Sponsoring Congress DTrends, Inc., USA http://www.d-trends.com

Industry Exhibitors Gene Company Ltd. Amersham Pharmacia Biotech Shanghai Sangon Co. Ltd. Bio-Rad LTI Co. Promega Co. EG&G Co. Perkin-Elmer Applied Biosystems Eppendorf Olympus Co. Nikon Co. Novo Nordisk Beijing Tianxiangren Biotechnology Co. Ltd. Beijing Liuyi Instrument Factory Pall Co. Millipore Biotinge-Tech Co. Ltd. Beijing SBS Biotechnology Inc.

PREFACE

As Chairman of the International Executive Committee of the 10th International Congress on Genes, Gene Families, and Isozymes, I am pleased to make the opening remarks for this special volume. The meeting, held in Beijing, coincided with the celebrations of the 50th anniversary of the founding of the People's Republic of China. The Congress was organized by Dr. Guoxiong Xue and his Co-Chairs, Drs. Boqing Qiang, Fangzhen Sun, and Laining Yu. The Chair of the Congress, Dr. Zhihong Xu, is Vice President of the Chinese Academy of Sciences and, since the Congress, has also been appointed President of Peking University. As Presiding Chair of the Congress, and on behalf of all of the participants, I thank the organizers and the Chair for the effort they committed to making the meeting a huge success scientifically as well as socially and culturally.

The 10th Congress continued the precedent, established at the 9th Congress, of emphasizing genes and gene families as the primary determinants of isozymes and of protein multiplicity. Toward that goal, the organizers structured the Congress around the following themes:

Gene Families and Isozymes
Gene Families and Enzymes
Population Variation of Gene Families
Gene Families and Human Disease
Genetic Mutation and Disease
Gene Families and Evolution
Mammalian Gene Families
Gene Families and Plants
Gene Families and Gene Expression
Gene Families and Biotechnology
Genomes and Bio-information

The 10th Congress also continued the tradition of exploring the interfaces among various biological disciplines, rather than focusing on individual disciplines as is most common at scientific meetings. Cross-fertilization was highly evident among cognate disciplines in the biological sciences in studies of disease, evolutionary biology, medical genetics and gene regulation. Moreover, research was presented involving a wide range of organisms including bacteria, protozoa, plants, and mammals.

This volume contains selected papers from the Congress, all of which have gone through a rigorous peer review process. These papers are representative of the breadth of scientific topics discussed at the meeting and of the high scientific quality of the meeting. I thank the authors, the reviewers, and the editors for their commitment to the excellence of this volume as a lasting tribute to the Congress and to the field that was founded by Professor Clement Markert.

The 10th Congress began with a tribute to Professor Markert, who passed away just four days before the opening of the meeting. Professor Markert had been named as the Honorary President of the Congress, in recognition of his role in discovering isozymes and pioneering the concepts of isozymes and gene families. Professor Graeme Hammond presented a moving tribute to Professor Markert and to the professional and personal contributions that he made to science and to society throughout his life. The 10th Congress was held 40 years after Dr. Markert and his colleagues first published the concept of isozymes.

The Congress had a strong international flair, with presentations from scientists representing 29 countries and regions. They included Australia, Austria, Belorussia, Brazil, Canada, the Czech Republic, Denmark, Finland, Germany, Greece, Hong Kong Special Administrative Region of China, India, Ireland, Israel, Italy, Japan, Malaysia, The Netherlands, People's Republic of China, Portugal, Russia, Singapore, Spain, Sweden, Switzerland, Taipei, Thailand, United Kingdom, and USA.

As has been traditional for this series of Congresses, the organizers arranged a variety of special social and cultural activities that complemented the scientific activities of the Congress. These included a superb welcoming reception, a visit to the Chinese Opera, and a day trip to the Great Wall and other local attractions.

Since its inception in 1961, this series of congresses has provided an opportunity for scientists working in a wide variety of fields involving isozymes and gene families to interact and to learn from the isozyme concept as it is applied diversely to many biological disciplines. I look forward to the 11th International Congress on Genes, Gene Families, and Isozymes, to be hosted by Dr. Hans Jornvall of the Karolinska Institute in Stockholm in 2001, the 40th anniversary of this series of congresses.

September 2000 JOHN L. VANDEBERG

Director, Southwest Regional Primate Research Center San Antonio, Texas, USA E-mail: [email protected]

OBITUARY

Clement L. Markert (1917-1999)

The concept of isozymes was developed by Clement Markert and Freddy Moller in 1959,¹ which paved the way for extensive studies on enzyme, protein and gene multiplicity across all living organisms. This important scientific discovery has had a profound influence on the biological sciences for more than 40 years, and provided the basis for regular international meetings to discuss the biological and biomedical implications of enzyme multiplicity. More recently, this concept has been extended to a wide range of gene families for DNA, RNA, proteins and enzymes.

As the Honorary President of the 10th International Congress on Genes, Gene Families and Isozymes, held in Beijing, China, in October 1999, Dr Markert was planning to attend and participate in his 10th 'Isozyme' Conference. Unfortunately, it was announced at the Congress Opening by Clem's friend and collaborator from Yale University, Dr Graeme Hammond, that he had passed away four days earlier following a recent illness. As Clem would have wanted it, the Congress proceeded in his absence and was an outstanding success. All of us attending the meeting, however, were saddened by the absence of a wonderful scientist who established the field of gene families and isozymes as a fundamental concept of living organisms. We also missed his friendship, good humor and contributions of criticism, advice and commendation of the papers presented at the Congress.

Clem Markert was born on April 11, 1917 in Las Animas, Colorado, USA, and passed away in Colorado Springs on October 1, 1999. His 82 years were filled with hard work, adventure, outstanding science, international travel, love for his wife Margaret and their children Alan, Robert and Betsy, and, in his younger years, controversy. Clem graduated from the University of Colorado in 1940, following activities in Spain fighting with the International Brigades against the fascist regime in that country in 1938. During the Second World War, he served in the U.S. merchant navy, following completion of his Masters degree at UCLA. After the war, he completed his Ph.D. in 1948 at Johns Hopkins University in Baltimore, followed by a two-year postdoctoral fellowship at the California Institute of Technology. His first academic appointment, as Assistant Professor, was at the University of Michigan during 1950-56. It was during this period that he came under scrutiny by the Committee for Un-American Activities, commonly referred to as the McCarthy Committee. The response from Clem and Margaret, and the outstanding support provided by his scientific colleagues and friends, reflected their strength and resilience, and their desire to ensure that free speech was a protected right.

His appointment at Johns Hopkins University during 1956-65 was enormously productive and rich in biological discovery and conceptual development. In 1957, the 'zymogram' technique was published jointly with Robert Hunter, a colleague from the University of Michigan.² This method combined the resolving power of starch gel electrophoresis for multiple forms of enzymes with the specificity derived from histochemistry in the staining of enzymes. It was applied initially to esterases and subsequently to many other enzymes, including lactate dehydrogenase. It was the latter path-breaking work that led to the 'isozyme' concept, and to many scientific discoveries around the world, with studies on micro-organisms, plants and animals revealing the extensive multiplicity of genes, gene families and isozymes in all biological species.

In 1965, Dr. Markert moved to Yale University to become Chairman of the Department of Biology during 1965-1971, and continued on at Yale as Professor of Biology until 1986. During 1974-86, he served as the Director of the Center for Reproductive Biology, reflecting his pioneering role in the related field of transgenics. Together with his collaborator, Jon Gordon, Clem developed a powerful tool for developmental genetics, involving the microinjection or micromanipulation of nuclei of mammalian eggs.

After 21 years at Yale University, Clem and Margaret moved to North Carolina State University, where he was appointed as a Distinguished University Professor during 1986-93. There, he joined a longstanding friend and colleague, John Scandalios, who had been appointed to a similar prestigious position supported by the State of North Carolina, in its enhancement program for scientific research. At the age of 76, Clem became an Emeritus Distinguished Professor of the University, and returned to live in Colorado Springs, which is near where the Markert family spent their summer break at their 'cabin' high in the Santa Cruz Mountains.

During this distinguished career, Clem was recognized with many honors and awards, including election to the National Academy of Sciences (governing Council member during 1970-71, 1977-1980); American Institute of Biological Sciences (President, 1965); American Genetic Association (President, 1980); American Society of Naturalists (Vice-President, 1967); American Society of Zoologists (President, 1967); and many other societies. He served as a scientific editor on a number of journals and other publications, including positions as Managing Editor of the Journal of Experimental Zoology (1963-85); member of the Editorial Boards of the Archives of Biochemistry and Biophysics, Differentiation, Cancer Research, Developmental Genetics, and Transgenics; and Editor of the Proceedings of the 3rd, 4th and 5th Isozyme Congresses and of the Prentice-Hall Series in Developmental Biology.

Ten international congresses have now been held on isozymes and gene families: the first (1961) and second (1966) in New York under the sponsorship of the New York Academy of Sciences, and the third (1974) at Yale University, chaired by Clem with outstanding support from Margaret. Subsequent Congresses have been regularly held at various locations, including Austin, Texas (4th, 1982); Island of Kos, Greece (5th, 1986); Toyama, Japan (6th, 1989); Novosibirsk, Russia (7th, 1992); Brisbane, Australia (8th, 1995); San Antonio, Texas (9th, 1997); and Beijing, China (10th, 1999). All of these Congresses, with the exception of the last, were attended by Clem, who played major roles in them all through the delivery of Plenary Lectures, as Congress President or Plenary/Symposium Chairman, and through strong participation in the scientific and social activities of the Congresses.

We have lost not only a great scientist and pioneer in the field of isozymes and gene families, but also a friend, mentor, student and postdoctoral supervisor, collaborator, editor, referee and supportive critic for many of us. Clem was strongly supported throughout his distinguished career by his wife Margaret, who accompanied him to conferences, Isozyme Congresses and other visits with friends and colleagues around the world. They were always generous hosts, inviting us into their homes in Ann Arbor, Baltimore, New Haven, Raleigh and Colorado Springs, as well as to their mountain retreat in the Santa Cruz mountains.

Clem was a strong person in every sense of the word, physically and mentally. He was an engaging conversationalist with a remarkably broad knowledge of the biological sciences, and with strong interests in science policy, which were expressed at the highest levels in government and scientific organizations. He also was an internationalist, with a healthy cohort of 'foreigners' undertaking graduate and postdoctoral study. His contribution to the development of science in many countries, including China and Russia, is well known.

Clem Markert will be remembered by many scientists, colleagues and friends around the world and has given all of us a lasting legacy in the biological sciences, with the isozyme concept now being applied in a broad range of gene families in molecular biology and biochemistry, cellular differentiation, developmental biology, biomedical science, gene regulation, structure and function of enzymes and isozymes, transgenics, and population biology.

Farewell Clem. Our condolences go to Margaret and the Markert family with our love and best wishes.

September 2000 ROGER S. HOLMES

Vice-Chancellor and President, The University of Newcastle Callaghan, New South Wales, AUSTRALIA Email: vc@newcastle.edu.au

¹ C.L. Markert and F. Moller, Multiple forms of enzymes: Tissue, ontogenetic and species specific patterns. Proc. Natl. Acad. Sci. USA 45 (1959) pp. 753-763.

² R.L. Hunter and C.L. Markert, Histochemical demonstration of enzymes separated by zone electrophoresis in starch gels. Science 125 (1957) pp. 1294-1295.

TABLE OF CONTENTS

Preface

Obituary

Clement Markert
G. L. Hammond

Identification of Novel Gene Family Members Based on Efficient Full-Length cDNA Cloning
J. Gu, X.-Y. Wu, M. Ye, Q.-H. Zhang, Z.-G. Han, H.-D. Song, Y.-D. Peng and Z. Chen

Strategies for Testis Specific Gene Expression
E. Goldberg

Oxidized Isoforms as Diagnostic Biomarkers of Alzheimer's Disease
R. W. Gracy, J. M. Talent, C. Malakowsky, R. Dawson, P. Marshall and C. C. Conrad

Transgenic Fish and Biosafety
W. Hu and Z. Zhu

Aldehyde Dehydrogenases of Human Corneal and Lens Epithelial Cells
R. S. Holmes

X-Chromosome Inactivation During Spermatogenesis: The Original Dosage Compensation Mechanism in Mammals?
J. R. McCarrey

Molecular Evolution and Environmental-Stress
E. Nevo

Nitric Oxide Related Enzymes and Coronary Artery Disease
X. L. Wang

Pathways, Compartmentation and Gene Evolution
C. Schnarrenberger and C. F. Martin

Tomato CF Genes for Resistance to Cladosporium fulvum
C. M. Thomas, M. S. Dixon and J. D. G. Jones

Gene Expression and Intermolecular Forces in Estrogen/Receptor Binding
Q. Chen, S. Adler and F. Sweet

Probing for the Basis of the Low Activity of the Oriental Variant of Liver Mitochondrial Aldehyde Dehydrogenase
B. Wei and H. Weiner

S RNases and Self and Non-self Pollen Recognition in Flowering Plants
Y. Xue, H. Cui, Z. Lai, W. Ma, L. Liang, H. Yang and Y. Zhang

The Roles of Carbonic Anhydrase Isozymes in Cancer
W. R. Chegwidden, I. M. Spencer and C. T. Supuran

Biochip and Miniaturization
J. Zhang, W.-L. Xing, Y.-X. Zhou and J. Cheng

Functional Genomics: A Platform for the Discovery of New Therapies
D. Cohen

A Novel Mathematical Analysis of Human Leukocyte Antigen (HLA) Polymorphism
B. Feng, D. Pan, S. Chen, Z. Ye and A. Xu

Characterization of a New Tissue-Specific Mutation of the Yellow Gene Which Supports Transvection
J.-L. Chen, J. Liu, K. Huisinga, P. Geyer, J. Morris and C.-T. Wu

MHC Class II Suppression by Trophoblast cDNAs
G. L. Hammond, D. Mandapati, J. Davila, M. A. Coady and A. L. M. Bothwell

Null Activity Mutation of Phenoloxidase in Drosophila melanogaster
N. Asada, N. Kawamoto and T. Hatta

Molecular Information Fusion for Metabolic Networks
R. Hofestadt, M. Lange and U. Scholz

Intron-Size and Exon Polymorphisms in the Mouse Tissue-Nonspecific Alkaline Phosphatase Gene
N. Frohlander and J. L. Millán

Lipoxygenases and Cyclooxygenases of the Testis of Rat
S. Neeraja, P. Reddanna and P. R. K. Reddy

Effect of Heterogeneous Sperm and Hybridization of DNA Fragment in Allogynogenetic Silver Crucian Carp
D. Xia, G. Xue and L. Zhang

Gene Expression During Carrot Somatic Embryogenesis
N. Wu, F. Diao, M. Qi, Y. Cheng, L. Zhang, M. Huang and F. Chen

Epigenetic Modifications in Maize Parental Inbreds and Hybrids and their Relationship to Hybrid Vigor and Stability
A. S. Tsaftaris, A. N. Polidoros and E. Tani

CIS-Elements and Transcription Factors Regulating Antioxidant Gene Expression in Response to Biotic and Abiotic Signals
J. G. Scandalios and L. M. Guan

Index

Photo 1. A rest during a picnic on an island on the Ob Sea - a man-made sea in Siberia, Russia. (l to r): Mrs. and Erwin Goldberg, John Scandalios, Clement Markert, Mrs. and Athanasios Tsaftaris, (unidentified), Eviatar Nevo, Robert Gracy, Michael Crawford. Prof. Markert was an active participant in every single congress since the inception of the Congress series in 1961. (The 7th International Congress on Isozymes, Novosibirsk, Russia, September 6-13, 1992).

Photo 2. A business dinner at Oleg Serov's residence. (l to r): Leonid Korochkin (Congress Co-Chair), Roger Holmes, Oleg Serov (Congress Co-Chair), John Scandalios and Clement Markert. (The 7th International Congress on Isozymes, Novosibirsk, Russia, September 6-13, 1992).

Photo 3. Clement Markert opening the 8th International Congress on Isozymes. (The 8th International Congress on Isozymes, Brisbane, Australia, June 25-July 1, 1995).

Photo 4. Margaret and Clement Markert, between sessions at the Congress Auditorium. (The 8th International Congress on Isozymes, Brisbane, Australia, June 25-July 1, 1995).

Photo 5. Sidney Altman and Clement Markert. Sid, 1989 Chemistry Nobel laureate for the discovery of catalytic properties of RNA, is Clement's longtime colleague and friend. Sid was one of the six Nobel keynote speakers at the Congress to commemorate Clement's 80th birthday. (The 9th International Congress on Isozymes, San Antonio, USA, April 14-19, 1997).

Photo 6. Clement Markert speaking at a dinner to celebrate his 80th birthday. (The 9th International Congress on Isozymes, San Antonio, USA, April 14-19, 1997).

Photo 7. Guoxiong Xue and Clement Markert. Guoxiong is Executive Co-Chair of the 10th International Congress on Isozymes, Genes, and Gene Families. Clement was Honorary President of the Congress. This photo was taken in 1995, Brisbane, Australia.

Photo 8. Congress banquet at a Yunnan restaurant in Beijing. The Congress Chairman, Professor Zhihong Xu (President, Peking University), is taking part in one of the entertainment programs. Conspicuously absent is Prof. Clement Markert, who passed away four days before the 10th Congress commenced. (The 10th International Congress on Isozymes, Beijing, China, October 5-10, 1999).

CLEMENT MARKERT 1917-1999

GRAEME L. HAMMOND

Yale University School of Medicine, Department of Surgery, 333 Cedar Street - 121 FMB, New Haven, CT 06510 USA E-MAIL: [email protected]

Thank you Dr. VandeBerg, members and guests of the Chinese Academy of Sciences and participants in the Congress.

In April 1997, Clem Markert was diagnosed with carcinoma of the colon and underwent right hemi-colectomy. The pathology report showed 19 positive lymph nodes in a tumor that had invaded through the bowel serosa. He underwent three months of chemotherapy and was then discovered to have widespread pleural pulmonary metastases. For the next two years, he led a very active life with trips to Alaska and Africa, and boating on the Columbia and Snake Rivers. His course for the past three months, however, was one of steady deterioration - to the point that he knew he would be unable to attend the 10th International Congress. He died on the evening of October 1st, 1999 in Colorado Springs.

As a surgeon and member of the Department of Surgery at Yale, I collaborated with Clem for many years during his tenure as Chairman and member of the Department of Biology at Yale. This collaboration began in an unusual way. I was investigating how the ischemic myocardium worked - an issue of great importance to medicine and to patients with coronary artery disease. During these studies, I recognized that there must be fundamental changes in the way the heart uses energy - in short, that it must be able to function anaerobically and, as the terminal step in glycolysis, be able to convert pyruvic acid to lactic acid. However, the heart never normally makes this conversion. I searched the literature for an explanation and came across papers by Clement Markert from Johns Hopkins describing the theory of isozymes and showing that the LDH isozyme pattern differed from organ to organ depending upon its energy requirements. I tried to contact Dr. Markert at Johns Hopkins and was told finally that he was, in fact, at my own institution. When I broached the idea to him that the LDH isozyme pattern in ischemic myocardium must be changing to favor lactic acid formation, he responded that this would be tantamount to Lamarckian biology. However, we later showed that the LDH pattern did change and he quickly said, "We both learned a lot from this." His openness, honesty, and high principles attracted many followers from around the world and were responsible, along with such work as his discovery of isozymes, for his election to The National Academy of Sciences.

Because of the unfortunate terminal nature of his disease in late September, his wife, Margaret, asked if I would give his Presidential Address, which follows and which he entitled:

Isozymes: A Brief Historical Perspective

"Since the initial recognition of multiple molecular forms of enzymes (isozymes) by Markert and Moller (1959), isozymes have been extensively studied or used as markers in a wide range of studies, with virtually every known organism, and at all levels of biological organization. On numerous occasions since the establishment of the isozyme concept, researchers from around the world have gathered in international congresses to discuss their work and to be brought up to date on the technologies employed and the novel applications of isozymes. Each Congress has resulted in significant publications that have proved helpful to scientists from a wide range of disciplines. The study of isozymes has provided insights into the structure and function of the genome, the regulation of gene function during cell differentiation and development, and the structure, function, and evolution of isozymes and their encoding genes per se.

With the advent of new technologies and developments in molecular biology came a rapid expansion in the dissection of genes encoding isozymes. Prior knowledge of isozyme structure and function served as a critical foundation of information on which research at the DNA and RNA levels could be based. The significance of gene families encoding functionally related isozymes is becoming increasingly apparent. As the genomes of various organisms from microbes to higher eukaryotes are being resolved, the question of the product of the various genes will be greatly impacted by the early and current studies with isozymes and gene families.

Since the first Isozyme Congress was held in New York thirty-eight years ago (1961), there have been several revolutions in biology. Each development has had an impact on our science, and the work presented at each of the subsequent Congresses has in turn impacted biology, medicine, and agriculture in significant ways. The International Congress on Genes, Gene Families, and Isozymes has provided and will continue to provide a unique forum for international communication among biologists. As in the past, I am certain that the 10th Congress being held here in Beijing will prove to be another significant milestone in the dissemination of important information that will further enhance the use of isozymes as basic to all aspects of the biological sciences. It is obvious that isozymes will be a continuing part of biological research and will play a central role in enlarging and deepening our understanding of biological organization. A rich, rewarding, expanding, and exciting future is clearly in store for the field of isozymes as we analyze the structural and functional organization of the genome that creates the metabolic patterns which collectively make all organisms what they are."

Clement L. Markert, 1999

In closing, I would like to add two personal notes. The first is from Dr. Erwin Goldberg, Professor of Biochemistry, Molecular Biology, and Cell Biology at Northwestern University, who has known Clem for many years. Dr Goldberg writes: "While discoveries in biology move the field forward, it is rare that an individual's accomplishments can have such an impact on an entire discipline. That is Clem Markert's legacy for present and future biologists."

Finally, from myself, I would like to add that Clem Markert significantly affected the lives and careers of many people in this audience, including my own, and that his imprint on clinical medicine and surgery was just as great as it was on biology. His identification of isozymes is used every day in every major hospital in the world for diagnosing pulmonary emboli, myocardial infarction, and ischemia in virtually every organ. His ability to conceptualize led to our present understanding of cell and organ stress, the impending or actual presence of cell death and how to reverse these effects before death of the patient. This is Clem Markert's legacy for present and future clinicians.

IDENTIFICATION OF NOVEL GENE FAMILY MEMBERS BASED ON EFFICIENT FULL-LENGTH CDNA CLONING

JIAN GU 1,2, XIN-YAN WU 1, MIN YE 2, QING-HUA ZHANG 2, ZE-GUANG HAN 1, HUAI-DONG SONG 3, YONG-DE PENG 1,3, AND ZHU CHEN 1,2

1 Chinese National Human Genome Center at Shanghai, 351 Guo Shoujing Road, Pudong, Shanghai, 201203, China
2 Shanghai Institute of Hematology, Rui-Jin Hospital, Shanghai Second Medical University, 197 Rui-Jin Road II, Shanghai, 200025, China
3 Shanghai Institute of Endocrinology, Rui-Jin Hospital, Shanghai Second Medical University, 197 Rui-Jin Road II, Shanghai, 200025, China

Email: [email protected]

A combination of EST analysis, bioinformatics, primer walking, reverse-transcription PCR and RACE has been widely used to obtain novel full-length cDNAs. By applying this method we have cloned 600 novel full-length cDNA sequences, mainly from the endocrine and hematopoietic systems. Some of these genes can be grouped into gene families, including transcription factors and genes involved in vesicle trafficking and signal transduction. Many of the novel genes also show homology to genes discovered in evolutionarily lower organisms. Bioinformatic analysis combined with experimental methods was used to identify new members of known gene superfamilies.

1 Introduction

The Human Genome Project is now at a historic turning point, moving from structural genomics to functional genomics. According to announcements from both the public and private sequencing efforts, a working draft of the human genome sequence will be obtained soon, although finishing will take longer [1,2]. Gene discovery and the understanding of genetic information will require annotation of the sequence data with bioinformatic tools [3]. At the same time, cloning of full-length cDNA has been listed as one of the major tasks of the next phase of genomic science [1]. Integrating cDNA sequences with genomic sequences will greatly facilitate the identification of transcriptional units and gene isoforms, and of mRNA levels and specificity in cells and tissues as a result of genome expression. In addition, cDNA projects link directly to protein structural biology and exert a significant impact on medical genetics and the biotechnology and pharmaceutical industries.

Several approaches have been used to identify full-length cDNAs, but the most efficient and popular is EST sequencing. The concept of the EST project was first proposed in the early 1980s, when some scientists recognized that short stretches of cDNA sequence could be used as markers for genes [4]. Only ten years later did this concept become reality, with a large output of EST data as sequencing became more automated and efficient [5]. The dbEST database has grown rapidly in the last decade as EST projects on different tissues and a diverse collection of organisms have been conducted. With the completion of genomic sequencing and gene identification in some model organisms, rapid cloning of important human genes by leaping over taxonomic boundaries, from genes identified in model organisms to those embedded in the more complex genomes of human and mouse, is now a possibility. Moreover, EST-based STSs have been widely used in the construction of gene-based physical maps. ESTs also offer valuable experimental evidence of transcription, in contrast to computational programs that predict exons. They are therefore a powerful tool for predicting the exon-intron organization of genes and for identifying alternative splicing events and cases of unusual genome organization.

Gene expression profiles in a specific tissue or cell type reflect its functional features; identifying novel genes preferentially expressed in that tissue or cell type is therefore important for clarifying the molecular basis of particular physiological processes. Functional assays of the novel genes obtained could be of great value in both research and commercial domains. Over the last 3 years, we have been cataloging the expressed sequence tags (ESTs) from cDNA libraries relatively enriched in full-length cDNA from CD34+ hematopoietic stem/progenitor cell (HSPC) populations and the hypothalamus-pituitary-adrenal (HPA) endocrine system. This approach proved very successful for both gene expression profiling and the efficient discovery of novel genes [6]. Bioinformatic analysis of each open reading frame (ORF) in a large number of putative full-length cDNA sequences provided important clues about structural and functional characteristics in the context of cell compartments.
Gene families and groups were identified through homology searches across a wide range of species. Information from public databases allowed us to assign chromosomal localizations to the majority of the novel genes and to obtain the genomic organization of some of them; this should eventually become possible for all of them. Moreover, gene expression patterns were examined using both "electronic Northern" analysis and cDNA arrays, so that genes with cell- or tissue-specific expression could be selected for further functional studies.

2 Materials and Methods

2.1 EST sequencing and data analysis

CD34+ cells were harvested from cord blood and bone marrow by gradient separation followed by two rounds of separation with anti-CD34 McAb-conjugated MACS (Miltenyi Biotec, Germany). RNA extraction, lambda ZAPII cDNA library construction, Bluescript phagemid template preparation, sequencing strategy and data management were carried out as previously reported by our group [6]. Libraries of the HPA endocrine system were constructed by the classical strategy according to the manufacturer's protocol (Stratagene, USA). The sequencing primers were universal primers, including M13 Reverse and/or Forward and T3 and/or T7 primers; the sequencing mix was BigDye Terminator (Perkin Elmer). The 5' or 3' end ESTs generated were categorized into known-gene, known-EST and novel-EST groups by searching the GenBank database with the Blast and Fasta programs integrated in the GCG package (Madison, Wisconsin) (release 108 and later, depending on the time of the work).

2.2 Full-length cDNA open reading frame cloning

The unknown-gene clones were candidates for novel full-length cDNA ORF cloning. The insert sequences of the HSPC clones were obtained by a combination of primer extension, partial deletion and subclone sequencing. AutoAssembler (Perkin Elmer) was used to assemble the sequences into contigs, and DNA Strider (Version 1.0) was used to analyze the reading frames of the contigs. For clones with partial reading frames, 'in silico' EST assembly and rapid amplification of cDNA ends (RACE) were applied. Appropriate Marathon-ready cDNA libraries (Clontech, Palo Alto, CA) were selected as RACE templates, and the gene-specific primers (GSP) were designed from the HSPC clone sequences. The whole open reading frames were thus obtained and confirmed by RT-PCR.

2.3 Chromosomal mapping

Electronic mapping - For novel genes, dbEST was searched to find matching ESTs, and the UniGene database (http://www.ncbi.nlm.nih.gov/UniGene) was then used to determine the tissue expression pattern and chromosomal location of these genes. cDNAs that matched genomic DNA sequence data also provided mapping information.

Radiation hybrid - In addition to electronic mapping, the Stanford G3 and GeneBridge 4 Radiation Hybrid panels (Research Genetics Inc, Huntsville, AL) were used as a complementary method to map the novel genes [7]. The PCR results were submitted to the Radiation Hybrid Mapping Server at the Stanford Human Genome Center (http://www-shgc.stanford.edu) and to the Whitehead Institute / MIT Center for Genome Research (http://www-genome.wi.mit.edu/cgi-bin/contig/rhmapper.pl). The servers returned the SHGC or MIT framework markers linked to the submitted genes with a LOD score >6.0.
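The LOD cutoff applied to the radiation hybrid results can be illustrated with a minimal Python sketch. The record layout, gene labels, marker names and scores are hypothetical and are not taken from the SHGC or MIT servers; only the threshold of 6.0 comes from the text.

```python
from dataclasses import dataclass

LOD_THRESHOLD = 6.0  # cutoff stated in the text (LOD score > 6.0)


@dataclass(frozen=True)
class MarkerLink:
    """One gene-to-framework-marker linkage result (hypothetical layout)."""
    gene: str    # query gene label
    marker: str  # framework marker name
    panel: str   # radiation hybrid panel, e.g. "G3" or "GB4"
    lod: float   # two-point LOD score reported for the pair


def accepted_links(links, threshold=LOD_THRESHOLD):
    """Keep links whose LOD score is strictly above the threshold.

    The result is sorted so that the best-supported marker comes first.
    """
    kept = [link for link in links if link.lod > threshold]
    return sorted(kept, key=lambda link: link.lod, reverse=True)
```

A link with a LOD score of exactly 6.0 is rejected here because the text states the criterion as ">6.0"; whether the servers treated the boundary this way is an assumption.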

2.4 Preliminary structure and function analysis with bioinformatics

Sequence Similarity Comparison - The GCG package contains release versions of the EMBL and GenBank databases, in which known genes and predicted ORFs are deposited. All amino acid sequences encoded by our novel genes were searched with the tfasta program of the GCG package against the nucleic acid sequence sub-databases of several important model organisms, namely E. coli, S. cerevisiae, C. elegans, Drosophila, Arabidopsis, and mammals (excluding primates). There were two reasons for choosing this homology search strategy: first, the databases contained many more nucleic acid sequences than amino acid sequences; second, amino acid sequences are more conserved through evolution than nucleic acid sequences. In this study, two amino acid sequences were considered homologues when they shared more than 25% similarity over a region of 50-100 amino acids or had a Z-score higher than 200.

Fundamental Structural and Functional Elements Searching - The motifs and profile scan programs in the GCG package and prosite at the ExPASy website (http://www.expasy.ch/tools/scnpsite.html) were used to scan the primary structure of the peptides for motifs. Programs such as peptidestructure, plotstructure, pepplot, coilscan and hthscan in the GCG package were used to analyze the secondary structure of the proteins. spscan (GCG package), SignalP (http://www.cbs.dtu.dk/services/SignalP/) and TMHMM (http://www.cbs.dtu.dk/services/TMHMM-1.0/) were used to predict signal peptides and α-helical transmembrane domains in the novel ORFs, so as to identify secreted or membrane-anchored proteins. To obtain more information about some genes, psort (http://www.psort.nibb.ac.jp:8800) and NNPSL (http://www.predict.sanger.ac.uk/nnpsl_mult.cgi) were used to predict subcellular localization.

Gene Expression Pattern - The UniGene and dbEST databases were searched for gene expression patterns, an approach referred to as electronic Northern.
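The idea behind an electronic Northern can be sketched in a few lines of Python: the ESTs matching a gene are counted per source tissue and scaled by the number of ESTs sequenced from that tissue. The function name, the input layout and the scaling to hits per 10,000 ESTs are assumptions made for illustration; the text does not state how the counts were normalized, and UniGene and dbEST are not queried here.

```python
from collections import Counter


def electronic_northern(est_tissues, library_sizes, per=10_000):
    """Estimate a tissue expression profile from EST counts.

    est_tissues   : list of tissue names, one entry per EST matching the gene
    library_sizes : dict mapping tissue name -> total ESTs sequenced from it
    per           : scaling factor (hits per `per` ESTs), an arbitrary choice

    Returns a dict mapping every tissue in library_sizes to its scaled count.
    """
    hits = Counter(est_tissues)
    profile = {}
    for tissue, total in library_sizes.items():
        if total <= 0:
            raise ValueError(f"library size for {tissue!r} must be positive")
        profile[tissue] = per * hits.get(tissue, 0) / total
    return profile
```

Scaling by library size matters because a tissue that has been sequenced more deeply would otherwise appear to express every gene more strongly.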
The expression patterns of some of these novel genes were also examined by Northern blotting and semi-quantitative RT-PCR.

Functional assays of zinc finger genes

Functional Analysis of Putative Transregulatory Domain: Construction of Expression Plasmids - To define the transregulatory properties of the zinc finger genes, we selected three of them, namely ZNF191, ZNF253 and ZNF255, for further functional assay. The non-zinc finger regions of these genes were inserted into the yeast plasmid pGBT9 and the mammalian cell plasmid pM (Clontech). Both pGBT9 and pM contain the DNA-binding domain of GAL4 (GAL4-BD, 1-147 aa), driven by the ADH1 and SV40 promoters, respectively. The target sequences were inserted into pGBT9 or pM to generate fusion genes encoding the GAL4-ZNF191, GAL4-ZNF253 and GAL4-ZNF255 chimeras. The amplified regions and the junctions in these constructs were verified by DNA sequencing.

Yeast One-Hybrid System - The yeast one-hybrid system was used to detect DNA-protein interaction. The yeast reporter strain Y187 (Clontech), which contains an integrated lacZ reporter construct, was transformed with hybrid expression plasmids encoding the different GAL4 fusion proteins, the negative control pGBT9, the weak positive control pGBT9-HA (hemagglutinin) and the strong positive control pCL1 encoding full-length wild-type GAL4, according to the protocol of the TransAct(TM) Assay Kit (Clontech). Qualitative and quantitative analyses of β-galactosidase were performed with the colony-lift filter assay and with the liquid culture assay using o-nitrophenyl β-D-galactopyranoside (Sigma) as substrate, respectively.

Mammalian Cell Transfection - In the mammalian assay system, the recombinant pM plasmids with the different GAL4 fusions, the negative control pM and the positive control pM3-VP16 encoding a herpes virus protein were cotransfected with LipofectAMINE (Gibco/BRL) into NIH3T3 or CHO cells together with the reporter plasmid pGAL45tkLUC, which contains five consensus GAL4 binding sites and the thymidine kinase (TK) minimal promoter upstream of the luciferase gene. Different ratios of test plasmid to reporter plasmid were compared in the transfection assays. Luciferase was assayed according to the protocol of the Luciferase Assay System (Promega), and relative light units (RLU) were measured on a luminometer (Lumat LB9507).

3 Results

In total, 50,000 ESTs were generated from CD34+ cells and the endocrine system, from which 750 novel open reading frames (ORFs) were obtained: 600 full ORFs and 150 partial ORFs (available at http://www.chgc.sh.cn). Only the full ORFs were submitted to further functional analysis. After homology and motif searching, the 600 ORFs were divided into 7 functional categories according to the functions of their homologous genes or the functional domains they may contain, as shown in Table 1. Genes with unknown functions were compared with genes discovered in lower model organisms, from viruses to plants, and 151 of them showed homology, as shown in Table 2. Most of the full-length cDNAs were about 500-1500 nucleotides long and their deduced peptides about 100-300 amino acids long, suggesting that a more efficient full-length cloning strategy should be developed to obtain longer genes.
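The homology criterion given in Section 2.4 (more than 25% similarity over a region of 50-100 amino acids, or a Z-score above 200) can be written as a simple predicate. This is a minimal sketch and not the GCG implementation: the function and parameter names are hypothetical, and the reading of "a region of 50-100 amino acids" as "an aligned region of at least 50 residues" is an assumption.

```python
def is_homologue(similarity_pct, region_len, z_score=None,
                 min_similarity=25.0, min_region=50, min_z=200.0):
    """Apply the two alternative homology criteria stated in the text.

    similarity_pct : percent similarity over the aligned region
    region_len     : length of the aligned region in amino acids
    z_score        : optional Z-score of the alignment (None if unavailable)

    Assumption: any aligned region of at least `min_region` residues
    qualifies for the similarity criterion.
    """
    by_similarity = similarity_pct > min_similarity and region_len >= min_region
    by_z_score = z_score is not None and z_score > min_z
    return by_similarity or by_z_score
```

Either criterion alone is sufficient, so a short alignment with a Z-score above 200 is still accepted, while an alignment at exactly 25% similarity is not.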

Table 1. Functional category of genes.

Category                    Gene number
Cell division               26
Cell signaling              50
Cell structure/mobility     13
Cell/Organism defense       9
Gene/Protein expression     99
Metabolism                  69
Unclassified                334

Table 2. Genes with homology to those from lower creatures.

Creature                    Gene number
Cowpox virus                3
Bacillus subtilis           3
Haemophilus somnus          1
Saccharomyces cerevisiae    41
Caenorhabditis elegans      79
Drosophila melanogaster     13
Arabidopsis thaliana        11
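As a consistency check, the counts in Tables 1 and 2 can be totalled and compared with the figures quoted in the text. The short Python sketch below only restates the table values; the variable names are illustrative.

```python
# Gene counts transcribed from Table 1 (functional categories).
TABLE_1 = {
    "Cell division": 26,
    "Cell signaling": 50,
    "Cell structure/mobility": 13,
    "Cell/Organism defense": 9,
    "Gene/Protein expression": 99,
    "Metabolism": 69,
    "Unclassified": 334,
}

# Gene counts transcribed from Table 2 (homology to lower organisms).
TABLE_2 = {
    "Cowpox virus": 3,
    "Bacillus subtilis": 3,
    "Haemophilus somnus": 1,
    "Saccharomyces cerevisiae": 41,
    "Caenorhabditis elegans": 79,
    "Drosophila melanogaster": 13,
    "Arabidopsis thaliana": 11,
}

total_orfs = sum(TABLE_1.values())                  # all full ORFs
classified = total_orfs - TABLE_1["Unclassified"]   # ORFs with an assigned category
lower_organism_hits = sum(TABLE_2.values())         # ORFs with lower-organism homologues
```

The totals agree with the text: 600 full ORFs, 266 ORFs assigned to a functional category (the number quoted in the Discussion as sharing homology with genes of known function), and 151 genes with homologues in lower organisms.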

Further analysis of the genes with homology to known genes revealed that some of them belong to gene families. The largest families are the zinc finger and leucine zipper families, with 17 members each. Gene families related to vesicle transport are also abundant in our catalogues, including 6 ras-related proteins, 3 VAMP proteins and 2 sec22 proteins. We also identified gene families involved in signal transduction, for example the PKA and PTP families, with 6 and 1 members, respectively.

The zinc finger gene family is one of the largest human gene families and plays an important role in the regulation of transcription [8]. This large family may be divided into many subfamilies, such as the Cys2/His2 type, glucocorticoid receptor, ring finger, GATA-1 type, GAL4 type and LIM families [9-10,13]. In Cys2/His2 type zinc finger genes, a highly conserved consensus sequence, TGEKPYX (X representing any amino acid), lies between adjacent zinc finger motifs. Zinc finger proteins containing this structure are called Krüppel-like zinc finger proteins because the structure was first found in the Drosophila Krüppel protein [11]. In our study, we found 14 typical C2H2 zinc finger genes and 3 ring finger genes. Bioinformatic analyses revealed that ZNF191, ZNF253, ZNF254, ZNF255, ZNF256, and ZNF257 were novel genes belonging to the Krüppel-like zinc finger gene family. The deduced amino acid sequences of these genes contain 3-18 tandemly repeated zinc finger motifs related to the Drosophila Krüppel gene family at the C-terminus and possible transcriptional regulatory elements such as the KRAB and SCAN boxes at the N-terminus. The amino acid "knuckle" between zinc finger motifs, typified by the amino acid sequence TGE(R/K)P(F/Y)X, was also highly conserved in all six deduced amino acid sequences. From these features it was reasonable to predict that all six genes encode DNA-binding proteins with transcriptional regulatory properties.

A Novel Trans-regulatory Domain KRNB

Analysis of Non-ZF Regions - The deduced 368 amino acid sequence of ZNF191 has 4 consecutive typical Krüppel-like zinc finger motifs at the C-terminus and is rich in acidic amino acids in the non-zinc finger region. An 81 amino acid stretch at the N-terminus of these genes was highly conserved and has been designated the SCAN box [12]. In addition to 3, 14, 13 and 4 tandemly arranged typical Cys2/His2 zinc finger motifs, respectively, the ZNF253, ZNF254, ZNF256 and ZNF257 genes contain a Krüppel-associated box (KRAB) in their non-zinc finger regions. These domains, each of approximately 75 amino acids, are all located in the N-terminal part of the proteins and are enriched in hydrophobic and negatively charged residues, with an L(X6)L motif at the core. This core is flanked by residues (e.g. E, L, V, and C) that are frequently found in α-helices. Although ZNF255 has 18 tandem zinc finger motifs homologous to Krüppel-like zinc fingers, its deduced amino acid sequence contains a previously undefined domain of approximately 81 amino acids at the N-terminus of the protein. This region is homologous to regions of a few zinc finger genes, such as FDZF2 (GenBank accession number U95044) and Q14588, and is enriched in hydrophobic amino acids (e.g. G, I, A, L, F) and negatively charged acidic amino acids (e.g. D) (Fig. 1). This new domain was therefore named the Krüppel-related novel box (KRNB).
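The sequence patterns named in this section, the inter-finger "knuckle" TGE(R/K)P(F/Y)X and the L(X6)L core, translate directly into regular expressions. The Python sketch below is illustrative only and is not the GCG motifs or prosite implementation used in the study. The simplified C2H2 finger pattern is an assumption added for the example, and the helper name is hypothetical.

```python
import re

# Patterns taken from the text, written as regular expressions.
LINKER = r"TGE[RK]P[FY]."   # knuckle TGE(R/K)P(F/Y)X, X = any residue
LEU_CORE = r"L.{6}L"        # L(X6)L core, also the leucine zipper spacing

# Assumption: a simplified C2H2 finger, C-x(2,4)-C-x(12)-H-x(3,5)-H.
C2H2_SIMPLIFIED = r"C.{2,4}C.{12}H.{3,5}H"


def find_motif(seq, pattern):
    """Return the 0-based start positions of all matches of `pattern`.

    A zero-width lookahead is used so that overlapping occurrences
    (e.g. consecutive L-x6-L heptads) are all reported.
    """
    return [m.start() for m in re.finditer(f"(?=(?:{pattern}))", seq.upper())]
```

A hit for L(X6)L alone is weak evidence, since the pattern is short and occurs by chance in many proteins, which is consistent with the additional checks on secondary structure and nuclear localization signals applied to the leucine zipper candidates.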

Figure 1. Amino acid alignment of the non-zinc finger region from ZNF255 and related proteins. Conserved amino acid residues are shown in the same color.

Expression Pattern of ZNF191, ZNF253, ZNF254, ZNF255, ZNF256, and ZNF257 - Northern blots and semi-quantitative RT-PCR were carried out to examine the tissue and cell expression patterns of these zinc finger genes (data not shown). ZNF191 was expressed in almost all tissues and cell lines except heart. The other five genes were selectively expressed in different tissues. Within the hematopoietic system, ZNF191 and ZNF255 were expressed in all lineages, whereas ZNF253 expression was restricted to monocytic (U937) and immature erythroid (K562) cells. ZNF256 and ZNF257 tended to be expressed in myelomonocytic lineages (HL-60 and U937), although a low expression level could be detected in T lymphocytes (MOLT-4) and early erythroid cells (K562). ZNF253 expression was observed in all lineages except for K562 cells.

Functional Analysis of KRNB in Comparison with SCAN and KRAB Transcriptional Regulatory Domains - To further address the function of the six genes isolated in the present work, ZNF191, ZNF253 and ZNF255 were chosen for study of their transcriptional regulatory activities, since they contain SCAN, KRAB and KRNB, respectively. The recombinant pGBT9-ZNF191 encoding the GAL4-ZNF191 chimera and the control plasmids pGBT9, pGBT9-HA, and pCL1 were used to transform yeast strain Y187. The qualitative and quantitative analyses of β-galactosidase indicated that ZNF191 might be a transactivator in Y187, since substantial β-galactosidase activity was observed with the GAL4-ZNF191 chimera compared with the controls (Fig. 2A). However, when a recombinant pM encoding the GAL4-ZNF191 chimera was cotransfected with the luciferase reporter plasmid into the mammalian cell lines CHO and NIH3T3, it failed to stimulate expression of the reporter gene. The luciferase activity was even lower than the basal activity obtained with pM (Fig. 2B and C).

In contrast, the yeast one-hybrid system and mammalian cell transfection gave coherent results for the KRAB domain of ZNF253. After Y187 was transformed with pGBT9-ZNF253 encoding the GAL4 BD-ZNF253 (1-174 aa) chimera, both qualitative and quantitative assays of β-galactosidase showed a suppressive effect of the ZNF253 non-zinc finger region on transcription of the reporter gene lacZ, with β-galactosidase activity lower than that obtained with pGBT9, which gives minimal basal stimulation. A similar transcriptional repressor effect was observed in mammalian cells, in that the recombinant pM encoding the GAL4-ZNF253 fusion significantly inhibited expression of the reporter plasmid pGAL45tkLUC in the CHO and NIH3T3 cell lines (Fig. 2A, B and C).


Figure 2. Functional analysis of the putative transregulatory domains of ZNF191, ZNF253 and ZNF255 by the yeast one-hybrid system and mammalian cell transfection. Each value represents the mean of three replicate assays. The error bars indicate the standard deviation from the mean; where error bars are not visible, the standard deviation was too small to be shown. A, quantitative analysis of β-galactosidase in yeast reporter strain Y187 transformed with hybrid expression plasmids encoding different GAL4 fusion proteins: GAL4 BD-ZNF191 (1-251 aa), GAL4 BD-ZNF253 (1-174 aa) and GAL4 BD-ZNF255 (1-81 aa). Y187 was also transformed with the negative control pGBT9, the weak positive control pGBT9-HA (hemagglutinin) and the strong positive control pCL1 encoding full-length wild-type GAL4. B and C, analysis of luciferase in CHO and NIH3T3 cells, respectively, cotransfected with pM-derived plasmids containing the GAL4 BD and the reporter plasmid pGAL45tkLUC. The recombinant pM plasmids encoding GAL4 BD-ZNF191 (1-251 aa), GAL4 BD-ZNF253 (1-174 aa) and GAL4 BD-ZNF255 (1-81 aa), the negative control pM and the positive control pM3-VP16 encoding a herpes virus protein were cotransfected into CHO and NIH3T3 cells at different molar ratios of test plasmid to reporter plasmid. Open columns and filled columns represent ratios of 1:1 and 1:3, respectively.
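The summary statistics described in the legend of Figure 2 (mean of three replicate assays with the standard deviation as error bar) and the comparison of each construct with the empty-vector control can be sketched as follows. The function names are hypothetical, no measured values from the study are reproduced, and the use of the sample standard deviation (n - 1 denominator) is an assumption, since the legend does not say which estimator was used.

```python
from statistics import mean, stdev


def summarize(replicates):
    """Return (mean, sample standard deviation) of replicate readings."""
    if len(replicates) < 2:
        raise ValueError("at least two replicates are needed for a standard deviation")
    return mean(replicates), stdev(replicates)


def relative_activity(sample, control):
    """Ratio of the mean reporter signal of a construct to that of the control.

    Values above 1 suggest activation relative to the control (e.g. pM or
    pGBT9 alone); values below 1 suggest repression.
    """
    control_mean = mean(control)
    if control_mean == 0:
        raise ValueError("control mean is zero; ratio is undefined")
    return mean(sample) / control_mean
```

Expressing each construct as a ratio to the empty vector makes yeast β-galactosidase units and mammalian relative light units comparable in direction, although not in magnitude.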


To assess the transcriptional regulatory properties of ZNF255, which contains the KRNB domain, the same experimental procedures were performed. The non-zinc finger region (1-81 aa), including the KRNB domain, was subcloned into pGBT9 and pM to form in-frame fusions, which were used to transform Y187 and to transfect the mammalian cell lines, respectively. Interestingly, the fusion protein GAL4-ZNF255 stimulated expression of the reporter gene lacZ in yeast, whereas slight transcriptional suppression was observed in both the CHO and NIH3T3 cell lines (Fig. 2A, B and C).

Chromosome Localization of ZNF191, ZNF253, ZNF254, ZNF255, ZNF256, and ZNF257 - Using FISH, ZNF191 was mapped to chromosome 18q12.1. Interestingly, ZNF253, ZNF254, ZNF255, ZNF256, ZNF257, ADCAHA01 and CBCBHDIO were all mapped to chromosome 19 by the RH technique, with ZNF253, ZNF254 and ZNF257 located at 19p13 and ZNF255, ZNF256, ADCAHA01 and CBCBHDIO at 19q13 (Fig. 3).

Figure 3. Seven novel zinc finger genes were localized on chromosome 19 by RH mapping and STS searching.

The ring finger is another subfamily of the zinc finger family and comprises two subtypes, the C3HC4 and C3H2C3 ring fingers [13]. This subfamily contains several functionally important tumor-related genes, such as PML and BRCA1 [14,15]. We identified three members of this subfamily in our library; further functional analysis of these genes is under way.

The leucine zipper is a structural motif found in a class of transcription factors and is characterized by an L-X(6)-L pattern [16]. Seventeen novel full-length cDNAs were found to contain this pattern, and in 6 of them the pattern is located in an α-helix or coil region. When NLS searching was performed, only 3 of them showed this signal.

4 Discussion

Since many genes show tissue- or developmental stage-related differential expression, cloning full-length cDNAs on the basis of EST analysis in different tissues is a useful approach to gene identification, especially for genes subject to temporal-spatial regulation. In the strict sense, a full-length cDNA should cover both the ORF and the complete 5' and 3' UTRs. Although a number of methods have been used to overcome the technical obstacles to obtaining the 5' end of a cDNA [17], it is still difficult to reach the transcription start site in many cases. However, as the most important functional information of the mRNA is contained in the ORF, cDNAs containing entire ORFs are often considered full-length. By combining several technologies, including construction of libraries enriched in full-length cDNA, in silico cloning and RACE, we have established a relatively efficient and cost-effective system for obtaining full-length cDNAs, or more precisely cDNAs including entire ORFs. This system has generated the first resource of cDNAs with entire ORFs for novel genes expressed in human CD34+ HSPCs and the neuro-endocrine system.

A major challenge for genomic science today is to elucidate the functions of the very large number of newly discovered genes. In this work, we applied the currently available bioinformatic tools to analyze the structural and functional characteristics of each ORF. Some experimental assays were also performed to explore the functions of some important genes. By BLAST search, a total of 266 of the 600 ORFs were found to share homology with genes of known function, which offers important clues for choosing appropriate functional assays in further study. We accordingly divided them into several gene families involved in transcriptional regulation, vesicle transport, signal transduction and so on.
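What "a cDNA containing an entire ORF" means operationally can be illustrated with a small six-frame ORF scan. This is a minimal sketch and not the procedure of DNA Strider, which was used in the study: it assumes the standard stop codons, requires an ATG start, and treats the longest ATG-to-stop stretch as the candidate ORF. All names are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def longest_orf(seq):
    """Find the longest complete ORF (ATG ... stop) in all six frames.

    Returns (strand, start, end) with 0-based, end-exclusive coordinates
    on the scanned strand, the stop codon included, or None if no
    complete ORF exists.
    """
    seq = seq.upper()
    best = None  # (strand, start, end, length)
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first in-frame ATG opens a candidate ORF
                elif start is not None and codon in STOP_CODONS:
                    length = i + 3 - start
                    if best is None or length > best[3]:
                        best = (strand, start, i + 3, length)
                    start = None  # look for the next ORF in this frame
    return None if best is None else best[:3]
```

A clone whose longest reading frame runs off the 5' end without an in-frame ATG, or off the 3' end without a stop codon, would not be reported by this scan; such clones correspond to the partial ORFs that were extended by in silico assembly and RACE.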
The Cys2/His2 type zinc finger gene family is one of the largest gene families; each member has repeated zinc finger motifs in which a finger-like structure is formed by 2 cysteines and 2 histidines coordinating one zinc ion [18]. It is estimated that about one third of the members of this huge family are Krüppel-like genes, characterized by the presence of the highly conserved connecting sequence "TGEKPYX" between adjacent zinc finger motifs. Substantial evidence indicates that Krüppel proteins are important players in many physiological processes as transcriptional regulators. The Krüppel-like zinc finger family has many subfamilies, defined by their non-zinc finger regions, and these subfamilies play distinct roles in the transcriptional regulation of target genes. So far, several domains have been found in non-zinc finger regions, such as the KRAB, FAX (finger-associated box), POZ (poxvirus and zinc finger), SCAN, FAR (finger-associated repeats) and PR domains [12,19-22]. These domains may affect transcription directly or indirectly. There is evidence that the KRAB domain, present in one third of the Krüppel-like zinc finger genes, functions as a transcriptional repressor [23]. Of note, 4 genes cloned in the present work contain a KRAB domain, and the experiments on ZNF253 did show transcriptional repressive activity. In view of its wide occurrence, it is reasonable to suggest that the KRAB domain plays an important role in transcriptional regulation. However, the results reported by different authors on the transregulatory properties of other domains are not always consistent. In this study, the SCAN domain from ZNF191 showed distinct activities in different experimental systems: it acted as a slight transactivator in yeast cells but as a transrepressor in CHO and NIH3T3 cells. It is possible that the properties of SCAN are determined by the gene and/or cell context, with distinct transcriptional machineries. A previously undefined domain, named here KRNB, was discovered in ZNF255. This domain, when fused with the GAL4 BD, upregulated expression of the lacZ reporter gene in yeast, although no obvious effect was observed in either CHO or NIH3T3 cells. It is thus possible that KRNB functions as a conditional transactivator.
Previous work showed that in humans more than 40 zinc finger genes are clustered on chromosome 19p13 and more than 10 on chromosome 19q13 [24-25]. Our chromosomal localization results agree with this finding: 7 of the genes cluster in the 19p13 or 19q13 regions, the exception being ZNF191, which has been mapped to chromosome 18. The precise functions of these genes remain to be elucidated. Exploring the functions of novel genes that are homologous to genes of known function may both reveal new functions and confirm known ones. One important clue to the possible functions of these novel genes is their expression pattern (available on website: www.chgc.sh.cn). It is interesting to note that ZNF253, ZNF254, ZNF256, and ZNF257 are selectively expressed in certain leukemia cell lines representing different lineages, and could thus be related to differentiation and maturation in hematopoiesis. In contrast, ZNF191 and ZNF255 are ubiquitously expressed in leukemia cell lines. For novel genes without ascertained functions, bioinformatic tools are used to search for functional motifs and domains and to predict their possible subcellular localization, and thus to suggest the pathways in which they may be involved.


The difficulty was how to deal with the majority of the ORFs, which lacked obvious functional information. We therefore evaluated the conservation of the sequences through evolution. As a result, 151 ORFs showed over 25% similarity at the amino acid level to sequences identified in organisms including E. coli, S. cerevisiae, C. elegans, Drosophila, Arabidopsis and mammals. Although a large proportion of these evolutionarily conserved genes are of unknown function, this analysis provides at least the following information: first, they are likely to exert important biological functions; second, the lower organisms containing homologous sequences can be used as models for functional studies by gene knock-out or other methods. Moreover, efforts have been made to approach gene function by searching for distinct motifs, including those related to subcellular localization. For orphan genes with no known homologues, de novo functional analysis should be undertaken, together with continued comparison against the genetic information of newly sequenced genomes of model organisms. New approaches, such as more efficient functional assays and 3D modeling software, need to be developed to speed the shift from structural genomics to functional genomics.

5 Acknowledgement

This work was supported in part by the Chinese High Tech Program (863), the Chinese National Key Program for Basic Research (973), the National Natural Science Foundation of China, Shanghai Commission for Science and Technology, and the Clyde Wu Foundation of SIH. The authors thank all members of SIH, SIE and CHGC for their constructive discussion and encouragement. Reference 1.

2.

3. 4.

Collins F.S., Patrinos A., Jordan E., Chakravarti A., Gesteland R., Walters L., New goals for the U.S. Human Genome Project: 1998-2003. Science 282 (1998) pp. 682-689. Venter J.C., Adams M.D., Sutton G.G., Kerlavage A.R., Smith H.O., Hunkapiller M. Shotgun sequencing of the human genome. Science 280 (1998) pp. 1540-1542. Burge C , Karlin S., Prediction of complete gene structures in human genomic DNA. /. Mol. Biol. 268 (1997) pp. 78-94. Putney S.D., Herlihy W.C, and Schimmel P., A new troponin T and cDNA clones for 13 different muscle proteins, found by shotgun sequencing. Nature 302 (1983) pp. 718-721.

18 5.

6.

7.

8. 9.

10.

11. 12.

13. 14.

15.


STRATEGIES FOR TESTIS SPECIFIC GENE EXPRESSION

ERWIN GOLDBERG

Department of Biochemistry, Molecular Biology and Cell Biology, Northwestern University, Evanston, IL 60208 U.S.A. E-mail: [email protected]

The mammalian testis is a unique organ programmed for both endocrine function and germ cell production, functions that are clearly interdependent. The germ cell component displays distinctive developmental properties, illustrated by programmed molecular events that occur at the onset of and during spermatogenesis. These include activation and inactivation of numerous genes yielding protein products with distinct or modified properties. Genes expressed during spermatogenesis can be classified as "housekeeping" or structural. Both categories include testis specific isozymes and isoforms. One such example is LDH-C4, a member of the lactate dehydrogenase gene family that is transcribed only during prophase of the first meiotic division. We have cloned and sequenced the promoter of this gene and demonstrated functionality. Even though this gene and protein are well studied, there remains the question of why LDH-C4 supplants the other lactate dehydrogenases in testis and sperm metabolism. A second example of a unique protein is provided by calpastatin. This protein is the endogenous inhibitor of calpain, a cytoplasmic cysteine protease. Unlike ldhc, the calpastatin gene uses an alternative promoter to produce a truncated, testis specific isoform of the somatic calpastatin. Testis calpastatin (tCAST) is transcribed and translated in round spermatids. The promoter region and coding exon are located within an intron of the somatic gene. We have co-localized the testis calpastatin and calpain to the region of the sperm between the plasma membrane and outer acrosomal membrane, where it may presumably be a player in the events associated with the acrosome reaction and/or with sperm-egg fusion. A third example is UDP-N-acetylhexosamine pyrophosphorylase, described originally as AgX, the product of an alternatively spliced mRNA. A 16 amino acid deletion in the protein product results in a change in substrate specificity.
The large number of testis specific and testis abundant isozymes and protein isoforms suggests that they are not a biological curiosity, but rather are required for both full and complete spermatogenesis and for sperm function. Mechanisms regulating testis-specific gene expression, and structure/function aspects of testis gene expression, will be addressed in this report.

1 Specific Gene Expression During Spermatogenesis

Development of an undifferentiated stem cell into the highly specialized spermatozoon is a complex process. Spermatogenesis occurs in three stages. First, a stem cell, the spermatogonium, undergoes a series of mitotic divisions resulting in renewal of the stem cell population, apoptosis, or commitment to enter the meiotic pathway. This in turn leads to the spermatocyte, which in the meiotic phase undergoes two divisions that yield the haploid spermatid. The spermatid then undergoes extensive remodeling involving cytoplasmic reduction and nuclear chromatin condensation for delivery to the egg during fertilization. In addition to well-studied morphological changes, spermatogenesis is characterized by activation and repression of many genes, including those encoding isozymes and protein isoforms unique to the testis [1,2]. This paper describes three examples of regulatory strategies resulting in expression of testis specific proteins.

2 Alternative Splice Variants

There are a number of examples of alternative splice variants in germ cells and other tissues [1]. AgX, subsequently named SPAG2 for Sperm Antigen 2, was discovered in a screen of a human testis cDNA expression library with a pool of sera containing antibodies that agglutinated spermatozoa [3]. AgX cDNAs isolated from testis and placenta cDNA libraries (AgX-1 and AgX-2, respectively) differed by a 48-bp deletion in the open reading frame (ORF). The AgX-1 and AgX-2 ORFs encoded putative peptide chains of 505 and 521 amino acids (~55.5 and ~67.3 kDa), respectively. Both AgX isoforms occur in the testis, but AgX-1 appears to be the only species present in spermatozoa. Immunofluorescence analysis of human spermatozoa detected AgX in the principal piece of the tail. Subsequently, we showed by immunoelectron microscopy that it localizes to the outer dense fibers, structural filaments associated with the mammalian sperm axoneme [4]. Southern analysis of human genomic DNA with a probe common to both AgX isoforms indicated a single AgX gene; alternative splicing is therefore the likely mechanism for production of these variant mRNAs. The AgX isoforms differed by a 16 amino acid deletion, suggesting that the AgX-1 mRNA resulted from splicing out of a "miniexon", as has been suggested for mRNAs that differ by a small insertion, e.g. the C1 and C2 heterogeneous ribonuclear proteins [5]. Furthermore, alternative splicing of short exons has been proposed as an "on/off switch" for the testis isoforms of other proteins such as the cyclic adenosine 3',5'-monophosphate response element binding protein (CREB) [6]. The tight control of gene expression by alternative splicing occurs frequently in differentiating tissues such as the testis [7]. When we originally described AgX [3], the cDNA nucleotide and derived amino acid sequences were not similar to any sequences in the GenBank, EMBL, SwissProt, or PIR data libraries. Recently, Mio et al. [8] reported that the cDNA for human UDP-N-acetylglucosamine pyrophosphorylase (UAP1) was identical to AgX. The substrate for this enzyme, N-acetylglucosamine-1-phosphate (GlcNAc-1-P), is a ubiquitous and essential metabolite and plays important roles in several metabolic processes. Subsequently, Wang-Gillam et al. [9] found that the substrate specificity of these isoforms differs. The amino acid deletion in AgX-1 changes UDP-N-acetylhexosamine pyrophosphorylase specificity from UDP-GlcNAc to UDP-GalNAc. The significance of this shift in substrate specificity is not immediately apparent, but identification of AgX-1 as UDP-GalNAcPP should help unravel its function in spermatozoa.

3 Alternative Promoter Usage: Testis Specific Calpastatin (tCAST)

A testis specific isoform of calpastatin was identified in a screen of a human testis cDNA expression library with serum from an infertile woman [10]. Three transcripts were detected in human testes by Northern blots, the smallest of which (1.9 kb) was specific to testis [11]. The testis specific transcript contained 186 bp of unique sequence at the 5' end. The remainder of the molecule was virtually identical to somatic calpastatin. Calpastatin is the endogenous inhibitor of calpain, a widely distributed cysteine protease. The calpain/calpastatin system is Ca2+ activated and plays important roles in membrane fusion events and vesicle formation. Calpain has been detected in porcine and human sperm [12,13], suggesting critical involvement in sperm function and fertilization. The importance of Ca2+ to calpain activation and the absolute requirement for Ca2+ influx in initiating the acrosome reaction [14,15] support this contention. Isoforms of calpastatin have been described in several tissues. Differences arise from alternate splicing and exon skipping. The testis isoform of calpastatin, however, is the product of unique promoter usage and a single exon residing within intron 14 of the somatic calpastatin gene [11]. The overall structure of tCAST is similar to that of the testis specific isozyme of angiotensin-converting enzyme (ACE). A testis specific mRNA encodes t-ACE [16,17], which arises from a unique promoter and single exon in intron 12 of the somatic ACE gene [18,19]. A similar transcriptional strategy generates calspermin from the gene encoding Ca2+/calmodulin-dependent protein kinase IV. The calspermin transcript is produced by utilization of a testis-specific promoter located within an intron of the calmodulin kinase IV gene [20]. Functional data for the testis isoforms, including tCAST, have yet to establish a specific role for each during spermatogenesis.
In the case of tCAST, we have learned that this isoform localizes to the space between the plasma membrane and outer acrosomal membrane of the sperm [21] and appears to be associated with the acrosomal vesicle during spermiogenesis (Li & Goldberg, in preparation). As noted above, a functional role in the acrosome reaction seems plausible and is amenable to testing. The more compelling question concerns the selection of the intronic promoter that initiates tCAST transcription. Possibly, one or more testis specific trans-activation factors are involved in this regulatory strategy.

4 Unique Gene Expression

The testis specific isozyme of lactate dehydrogenase (LDH-C4) has been studied extensively and has served as an important model for testis specific gene expression. Lactate dehydrogenases became the foundation for the isozyme concept, which Clement Markert formulated in 1959. Since that time the LDH literature has increased logarithmically and studies on LDH have been applied to evolution, protein structure, function and diversity, clinical manifestations and gene expression. The evolution of the LDH gene family, tissue distribution of LDH isozymes and physiological implications have been described on numerous occasions and in exquisite detail (see, for example, [22]). Molecular cloning technology applied to the ldhc gene in my laboratory has confirmed the origin in mammals of ldhc as a duplication of ldha [23]. Additionally, we have cloned and sequenced the promoter region of ldhc and demonstrated its function by in vitro transcription assays [24] and in vivo as a transgene construct [25]. Surprisingly, our transgene studies revealed that the promoter was active only during the pre-meiotic stages of spermatogenesis, even though the protein accumulates throughout germ cell development and differentiation. Whether this is due to stable mRNA or low turnover of the protein remains to be established. The lack of a reliable germ cell culture system makes analyses of testis gene expression in general, and of ldhc gene expression in particular, difficult. Additionally, the question of LDH-C4 function during spermatogenesis or as a sperm enzyme remains open. Our approach to this question, therefore, is targeted disruption of the gene by homologous recombination. Difficulties in preparing a suitable targeting construct have been resolved by sequencing the entire gene. The gene is large (14 kb) and contains an abundance of intronic repetitive elements (Olssen, unpublished observations), which tend to confound the analysis of targeting constructs. Nevertheless, we (Goldberg & Millan, unpublished observations) are completing this project to obtain the ldhc-/- mutant for phenotypic analysis.

5 Summary

Specific gene expression during spermatogenesis seems to have become the norm rather than the exception. The variety of strategies reflects the complexity of the process. Alternative splice sites, alternate promoter usage and cell specific gene expression occur in many cells during development and differentiation. The uniqueness of these gene regulatory paradigms in the testis lies in timing, distribution and specialization of the final product, the spermatozoon.

6 Acknowledgements

The many students who contributed to studies on LDH-C4 are acknowledged as coauthors of the publications from my laboratory. As a personal reflection, I recall that it was my first meeting with Clem Markert at an AIBS meeting in Bloomington, Indiana in 1961 that turned me on to look for multiple forms of LDH in spermatozoa. Blanco and Zinkham, and I, simultaneously reported in Science the discovery of LDH-X (its operational designation) in 1963. Subsequently, I visited with Clem at Johns Hopkins University and clarified the existence of the C subunit of lactate dehydrogenase. Clem's perception, interest and collaboration were instrumental in supporting my long association with this isozyme. This work was supported by NIH HD05863, NIH Sub-5-U54-HD29099, and P 30 HD28048.

References

1. Goldberg E., Minireview: Transcriptional regulatory strategies in male germ cells. J. Androl. 17 (1996) pp. 628-632.
2. Hecht N.B., Molecular mechanisms of male germ cell differentiation. BioEssays 20 (1998) pp. 555-561.
3. Diekman A.B. and Goldberg E., Characterization of a human antigen with sera from infertile patients. Biol. Reprod. 50 (1994) pp. 1087-1093.
4. Diekman A.B., Olson G., and Goldberg E., Expression of the human antigen SPAG2 in the testis and localization to the outer dense fibers in spermatozoa. Molec. Reprod. Develop. 50 (1998) pp. 284-293.
5. Nakagawa T.Y., Swanson M.S., Wold B.J., and Dreyfuss G., Molecular cloning of cDNA for the nuclear ribonuclear particle C proteins: A conserved gene family. Proc. Natl. Acad. Sci. USA 83 (1986) pp. 2007-2011.
6. Waeber G., Meyer T.E., LeSieur M., Hermann H.L., Gerard N., and Habener J.F., Developmental stage-specific expression of cyclic adenosine 3',5'-monophosphate response element-binding protein CREB during spermatogenesis involves alternative exon splicing. Mol. Endocrinol. 5 (1991) pp. 1418-1430.
7. Smith C.W.J., Patton G., and Nadal-Ginard B., Alternative splicing in the control of gene expression. Annu. Rev. Genet. 23 (1989) pp. 527-577.
8. Mio T., Yabe T., Arisawa M., and Yamada-Okabe H., The eukaryotic UDP-N-acetylglucosamine pyrophosphorylases. J. Biol. Chem. 273 (1998) pp. 14392-14397.
9. Wang-Gillam A., Pastuszak I., and Elbein A.D., A 17-amino acid insert changes UDP-N-acetylhexosamine pyrophosphorylase specificity from UDP-GalNAc to UDP-GlcNAc. J. Biol. Chem. 273 (1998) pp. 27055-27057.

10. Liang Z.G., O'Hern P.A., Yavetz B., Yavetz H., and Goldberg E., Human testis cDNAs identified by sera from infertile patients: a molecular biological approach to immunocontraceptive development. Reprod. Fertil. Develop. 6 (1994) pp. 297-305.
11. Li S., Liang Z.-G., Wang G.-Y., Yavetz B., Kim E.D., Ngai K.-L., and Goldberg E., Characterization of a membrane associated mouse testis calpastatin. Biol. Reprod. (in Press, 2000).
12. Rojas F.J., Brush M., and Moretti-Rojas I., Calpain-calpastatin: a novel, complete calcium-dependent protease system in human spermatozoa. Molec. Human Reprod. 5 (1999) pp. 520-526.
13. Schollmeyer J.E., Identification of calpain II in porcine sperm. Biol. Reprod. 34 (1986) pp. 721-731.
14. Green D.P., The induction of the acrosome reaction in guinea-pig sperm by the divalent metal cation ionophore A23187. J. Cell. Sci. 32 (1978) pp. 137-151.
15. Talbot P., Summers R.G., Hylander B.L., Keough E.M., and Franklin L.E., The role of calcium in the acrosome reaction: an analysis using ionophore. J. Exp. Zool. 198 (1976) pp. 383-392.
16. Ehlers M.R.W., Fox E.A., Strydom D.J., and Riordan J.F., Molecular cloning of human testicular angiotensin-converting enzyme: the testis isozyme is identical to the carboxyl-terminal half of endothelial angiotensin-converting enzyme. Proc. Natl. Acad. Sci. USA 86 (1989) pp. 7741-7745.
17. Bernstein K.E., Martin B.M., Bernstein E.A., Linton J., Striker L., and Striker G., The isolation of angiotensin-converting enzyme cDNA. J. Biol. Chem. 263 (1988) pp. 11021-11024.
18. Howard T.E., Shai S.-Y., Langford K.G., Martin B.M., and Bernstein K.E., Transcription of testicular angiotensin-converting enzyme (ACE) is initiated within the 12th intron of the somatic ACE gene. Mol. Cell. Biol. 10 (1990) pp. 4294-4302.
19. Langford K.G., Shai S.-Y., Howard T.E., Kovac M.J., Overbeek P.A., and Bernstein K.E., Transgenic mice demonstrate a testis-specific promoter for angiotensin-converting enzyme. J. Biol. Chem. 266 (1991) pp. 15559-15562.
20. Means A.R., Cruzalegui F., LeMangueresse B., Needleman D.S., Slaughter G.R., and Ono T., A novel Ca2+/calmodulin-dependent protein kinase and a male germ cell-specific calmodulin-binding protein are derived from the same gene. Molec. Cell. Biol. 11 (1991) pp. 3960-3971.
21. Yudin A.I., Goldberg E., Robertson K.R., and Overstreet J.W., Calpain and calpastatin are located between the plasma membrane and outer acrosomal membrane of cynomolgus macaque spermatozoa. J. Androl. (in Press, 2000).
22. Markert C.L., Isozymes: model systems for analyzing the origin, evolution, regulation, and function of gene families. In: Gene Families: Structure, Function, Genetics and Evolution. Holmes, R.S. and Lim, H.A. (Eds.) (World Scientific Publishing Co, New Jersey, 1995) pp. 3-7.
23. Millan J.L., Driscoll E.E., LeVan K.M., and Goldberg E., Epitopes of human testis-specific lactate dehydrogenase deduced from a cDNA sequence. Proc. Natl. Acad. Sci. USA 84 (1987) pp. 5311-5315.
24. Zhou W., Xu J., and Goldberg E., A 60-bp core promoter sequence of murine lactate dehydrogenase C is sufficient to direct testis-specific transcription in vitro. Biol. Reprod. 51 (1994) pp. 425-432.
25. Li S., Zhou W., Doglio L., and Goldberg E., Transgenic mice demonstrate a testis-specific promoter for lactate dehydrogenase (LDH). J. Biol. Chem. 273 (1998) pp. 31191-31194.

OXIDIZED ISOFORMS AS DIAGNOSTIC BIOMARKERS OF ALZHEIMER'S DISEASE

ROBERT W. GRACY, JOHN M. TALENT, CHRISTINA MALAKOWSKY, RACHEL DAWSON, PAM MARSHALL, AND CRAIG C. CONRAD

Molecular Aging Unit, Department of Molecular Biology and Immunology, University of North Texas Health Science Center, Fort Worth, Texas, 76107, USA Email: [email protected]

Senile plaques, which consist of beta-amyloid (Aβ), are the major pathology found in Alzheimer's Disease (AD). Aβ is particularly sensitive to oxidation, but can also produce reactive oxygen species (ROS) during Aβ-fibril formation. Cells from AD subjects are more sensitive to oxidation than non-AD age-matched controls, and it appears that a number of proteins are preferentially oxidized in plasma samples from AD compared to non-AD subjects. We are using immunoprobes specific for oxidized proteins to elucidate the mechanism of oxidative damage and apoptosis in the neuron and to evaluate the potential of oxidized isoforms as biomarkers for early detection of AD.

1 Introduction

1.1 Oxygen and ROS

Oxidative metabolism is more efficient than anaerobic metabolism; however, incomplete oxygen metabolism leads to cytotoxic reactive oxygen species (ROS). There are many different types of ROS, including oxygen radicals (e.g., superoxide anion, hydroxyl radical), non-radical oxygen species (e.g., hydrogen peroxide, ozone), reactive lipids and carbohydrate derivatives (e.g., hydroxynonenal, malondialdehyde, ketoamines, or ketoaldehydes), as well as others. These ROS can spontaneously react with virtually all cellular macromolecules (e.g., proteins, lipids, and nucleic acids), causing undesirable damage and cell death. For recent reviews see: [1,2,4,5,8]. As seen in Figure 1, damaging ROS also arise from environmental exposure. For example, cigarette smoke, air/water pollutants, ozone, some food additives, and medications all contain powerful oxidizing compounds that directly cause ROS or indirectly generate ROS during breakdown and catabolism. Furthermore, low-level cosmic irradiation, x-rays and other types of electromagnetic irradiation can generate ROS. Even the ultraviolet light in sunlight can induce photooxidations.


[Figure 1 diagram. Recoverable labels: in vivo ROS (activated astrocytes or glial cells, etc.); external ROS (pollutants, UV radiation, etc.); cell death.]

Figure 1. The production of Reactive Oxygen Species (ROS). Damaging oxygen/nitrogen species can be generated in vivo or from the environment. Severe damage can result in cell death via necrotic or apoptotic pathways.

ROS are also produced by the cellular immune system to combat infections. Macrophages kill invading microorganisms by generating toxic ROS. Because ROS can damage cells indiscriminately, some of the host's cells also succumb to the macrophage attack on the invading microorganism. In the case of chronic inflammations, such as autoimmune responses, much of the tissue damage is due to ROS generated by the immune system.

1.2 ROS Damage and Aging

Susceptibility to oxidative stress is more pronounced with age. Organisms accumulate oxidative damage with age, and ROS are implicated in the fundamental process of aging. This has been substantiated both in vitro and in vivo. Cells and tissues exposed to low-level ROS accumulate oxidized proteins similar to those observed in aged cells and tissues. Furthermore, when laboratory animals are fed a caloric-restricted diet (to increase life span), the amount of oxidative damage to cellular components is reduced [2]. This evidence supports a correlation between age and the accumulation of oxidatively damaged proteins.

ROS react with DNA and cause mutations that can lead to cancer. Beckman and Ames [1] have estimated that the steady-state level of DNA damage is approximately 150,000 oxidative adducts per cell, and these oxidative modifications may contribute to half of all human cancers. Furthermore, oxidized proteins may represent 30-50% of the total cellular protein of an old individual [2]. These modifications can result in peptide fragmentation, cross-linking, and amino acid modifications. Essentially every amino acid in a protein is potentially susceptible to chemical modification by oxidation. Such modifications can result in changes in the protein secondary and tertiary structures, and these conformational changes may expose previously shielded regions to further oxidations, or to other types of spontaneous modifications such as deamidation [6].

The turnover of modified or damaged proteins also decreases with increasing age. Modified or damaged molecules are more readily degraded in young cells and tissues than similar proteins in old cells and tissues; this slower turnover in old cells may interfere with the cell's ability to maintain homeostasis.

The accumulation of such oxidized proteins with age was originally thought to result from random oxidation events. However, different ROS are not equally damaging to all amino acids, and different proteins exhibit different susceptibilities to such damage. Schoneich and Yang [13] have pointed out the importance of peptide sequence and neighboring groups in the oxidation potential of methionine-containing peptides. In addition, protein oxidation can result in free radical propagation. The amyloid beta peptide (Aβ) in the brains of patients with Alzheimer's Disease is an example. The Aβ peptide contains 40 amino acids, of which only one methionine residue (Met35) can be "easily" oxidized [15]. In contrast, peptides that contain the same amino acids, but in the reverse sequence or scrambled sequences, do not become oxidized. This emphasizes the importance of specific amino acid sequences for susceptibility to oxidation. Because Aβ can also generate free radicals, it is believed to contribute to the oxidative damage and neurotoxicity that occur in the brains of Alzheimer's patients [7,9,10]. Substitution of cysteine for Met35 eliminates the toxic effects of Aβ toward cells in culture [16].

2 Results

2.1 Oxidative Damage and Alzheimer's Disease

We now recognize that many of the age-related neurodegenerative diseases such as Parkinson's, Alzheimer's and other dementias are either caused or exacerbated by oxidative damage. This can be explained because the brain is particularly susceptible to ROS damage. First, the brain relies on very large amounts of oxygen (e.g., approximately 20% of the total body oxygen consumption is for brain metabolism). Secondly, brain tissue contains a high concentration of unsaturated fatty acids that are highly susceptible to oxidation. Thirdly, the brain contains high levels of iron but has a relatively low capacity for iron binding. Iron catalyzes the spontaneous generation of ROS. Finally, the brain has relatively low levels of antioxidants. Thus, ROS may play an important role in the etiology of many types of chronic neuropathies. Figure 2 shows the pathological cascade believed to take place during the development of AD. Mutations in several different genes give rise to the abnormal production of the Aβ peptide, which can lead to increased ROS as discussed above [14]. For example, mutations in genes for the Amyloid Precursor Protein (APP), or in genes encoding the enzymes that cleave this protein, result in the accumulation of Aβ. Aβ is believed to mediate the oxidative damage, but it is not clear whether it does this directly or indirectly (or both). Some data suggest that as the peptide undergoes aggregation to oligomers, it generates ROS as a consequence of packing of the nontoxic Aβ monomers into a toxic oligomer. Other studies suggest that Aβ causes the stimulation of glial cells (Aβ is not toxic to glial cells), and that the resulting hyperactivity of the glial cell generates ROS.

[Figure 2 diagram. Recoverable labels: mutation in APP, PS-1, PS-2; increased production and aggregation of Aβ; ROS generation; oxidative neuronal injury; inflammatory response (including astrocytosis); hyperphosphorylation of tau and neurofibrillary tangle (NFT) formation; neuronal cell death; dementia.]


In young cells, these processes do not result in large amounts of accumulated oxidized proteins, but in old cells, the oxidation is greater and leads to neurodegeneration. This could be due to an age-related lack of neuroprotective agent(s), the loss of antioxidants with age, or the failure of old cells to recognize and destroy oxidized proteins. Both genetic predisposition and environmental factors play key roles in the age of onset of AD. The addition of Aβ to cells in culture can cause cell death (Figure 3). Cells from AD patients are more susceptible to oxidative damage than non-AD controls.

[Figure 3 bar chart. Recoverable labels: treatments NT, Aβ, HBO, Aβ/HBO; legend AD, Control; asterisks mark some bars.]

Figure 3. Survival of AD and control fibroblasts following exposure to the Aβ peptide (residues 25-35), hyperbaric oxygen (HBO), or both. Control and AD cells were grown to a density of 125,000 cells, then incubated as follows: no treatment (NT); 50 µM Aβ; 3 atm of hyperbaric oxygen (HBO); and 50 µM Aβ + HBO (Aβ/HBO). Each bar represents the average and SEM of three separate 30 mm culture dishes.
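The "average and SEM of three separate dishes" in the Figure 3 caption can be sketched as follows. This is a minimal illustration only: the function name `mean_and_sem` and the three survival readings are hypothetical and are not data from this study.

```python
from math import sqrt
from statistics import mean, stdev

def mean_and_sem(values):
    """Return (mean, standard error of the mean) for replicate measurements."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two replicates to estimate SEM")
    # SEM = sample standard deviation divided by the square root of n.
    return mean(values), stdev(values) / sqrt(n)

# Hypothetical percent-survival readings from three dishes (illustrative only).
example_dishes = [72.0, 75.0, 78.0]
avg, sem = mean_and_sem(example_dishes)
print(f"mean = {avg:.1f}, SEM = {sem:.2f}")  # prints "mean = 75.0, SEM = 1.73"
```

With only three dishes per bar, the SEM is sensitive to a single outlying dish, which is worth keeping in mind when comparing bar heights.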

At the subcellular level, damage caused by exposure to ROS and Aβ may lead to increased membrane permeability, which results in calcium leaking into the cell. The elevated intracellular calcium may activate calmodulin and stimulate the inducible isoform of nitric oxide synthase. Nitric oxide in the presence of superoxide spontaneously forms peroxynitrite, which can modify tyrosine to nitrotyrosine. Such nitrotyrosine modifications could affect phosphorylation cascades, such as the hyperphosphorylation of tau in neurofibrillary tangle formation. Ultimately, the consequences of ROS damage lead to apoptosis of the neuron, causing dementia.

2.2 Oxidized Isoforms as Biomarkers of Alzheimer's

Two forms of AD have been characterized: early onset (familial) and late onset (sporadic, greater than 97% of all cases). Because the pathology of both forms of AD is similar, the mechanisms that lead to neuronal death due to excessive Aβ deposition are thought to be similar. The initial stages of AD begin long before clinical symptoms are apparent. The ability to detect pre-clinical AD would offer opportunities to develop and test preventative measures (e.g. antioxidants). Unfortunately, postmortem observation of brain tissue is the only reliable method to date for the 100% confirmation of AD. Postmortem confirmation of AD relies on the presence of senile amyloid plaques and neurofibrillary tangles of the aggregated and phosphorylated tau protein. Clearly, predictive diagnostic biomarkers for AD are needed. Genetic biomarkers can be used to predict familial AD, but this is only a small (less than 3%) subset of patients likely to develop the disease. Furthermore, genetic biomarkers are of little use for monitoring the development, progression or prevention of AD. For such purposes, oxidized protein isoforms would offer the best potential diagnostic test. Also, the oxidative damage of AD may not be restricted to proteins in the brain, since the ROS may damage cells that make up the blood-brain barrier. Moreover, the antioxidant defense systems are compromised in AD brains. Thus, it is likely that specific oxidized proteins may be found in the blood or cerebrospinal fluid (CSF) of persons susceptible to or suffering from AD. The identification of such blood or CSF oxidized protein biomarkers may be the key to diagnosing AD. Furthermore, the degree of oxidation of such isoforms might reflect the level of progression of the disease, similar to the glycosylation of hemoglobin (HbA1c) in the diabetic. We are using Western blots coupled with immunological staining to identify specifically oxidized AD protein biomarkers.
We have found several potential biomarkers in the blood serum. Figure 4 shows a Western blot that has been immunostained and quantified for oxidized proteins. Although the protein fingerprints are similar when stained for total protein (not shown), immunostaining reveals that specific proteins were more oxidized in the AD samples compared to the age-, gender-, and race-matched controls. Figure 4A (band 1) represents one possible biomarker. The data in Figure 4B show that the level of oxidation of band 1 is increased nearly 3-fold in the serum from AD patients compared to non-AD controls. This increase in oxidation appears to be specific for the protein(s) in band 1. This specific oxidation of band 1 is apparent when band 2 is quantified. Band 2 is not specifically oxidized because there are no apparent changes in oxidation when AD and non-AD controls are compared.
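A fold increase of the kind quoted for band 1 can be sketched as the ratio of group means. The function name `fold_change` and all integrated density values below are hypothetical, chosen only to illustrate a specifically oxidized band next to an unchanged one; they are not the measurements behind Figure 4.

```python
from statistics import mean

def fold_change(case_values, control_values):
    """Ratio of the case-group mean to the control-group mean."""
    control_mean = mean(control_values)
    if control_mean == 0:
        raise ValueError("control mean is zero; fold change is undefined")
    return mean(case_values) / control_mean

# Hypothetical integrated density values (IDV), four sera per group.
ad_band1 = [26.0, 29.0, 32.0, 29.0]
control_band1 = [9.0, 11.0, 10.0, 10.0]
ad_band2 = [20.0, 22.0, 21.0, 21.0]
control_band2 = [21.0, 20.0, 22.0, 21.0]

print(f"band 1: {fold_change(ad_band1, control_band1):.1f}-fold")  # prints "band 1: 2.9-fold"
print(f"band 2: {fold_change(ad_band2, control_band2):.1f}-fold")  # prints "band 2: 1.0-fold"
```

Reporting a reference band alongside the candidate band, as the text does with band 2, helps separate a specific increase in oxidation from a general difference in loading or staining.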


2.3 Antioxidants

Many antioxidants exist in vivo. These include metabolites (e.g., glutathione, NAD(P)H, cysteine), enzymes (superoxide dismutase, catalase), and vitamins (e.g., vitamins A, C and E). It has also been proposed that some proteins contain specific regions of antioxidant amino acids that serve as the last line of protection against ROS damage [11]. Since the antioxidant defenses may become compromised with age, and especially in potential AD subjects, antioxidants may prove to be useful in preventative therapy. For example, vitamin E has been shown to slow the progression of AD [12]. Estrogen replacement therapy has also been used for prevention and treatment of AD. It is now recognized that this is due to the antioxidant properties of estrogen. In animal models, powerful antioxidants have been reported to reverse the damage caused by ROS. Furthermore, these compounds, when administered to senile animals, decreased levels of oxidized proteins in their brains and restored short-term memory [3]. However, more research is needed before the optimal antioxidants can be prescribed. For example, it is not known which antioxidants may work best, or what the optimal dosages or delivery routes are.

[Figure 4. Panel A: Western blot with lanes labelled AD and C (control), bands 1 and 2 marked. Panel B: bar chart of band density for band 1 and band 2, AD versus Control.]

Figure 4. Oxidation of specific proteins in Alzheimer's Disease. Panel A is a representation of a 1-D Western blot of blood sera from AD and non-AD control subjects (all samples are matched for age, gender, and race). The difference in specific oxidative damage can be determined by visual inspection of band 1 and band 2. Panel B shows the Integrated Density Values (IDV x 10) of the band areas and shows that band 1 is more specifically oxidized in AD samples compared to non-AD controls. In contrast, band 2 is equally oxidized in both AD and non-AD subjects. Values represent the mean ± SEM for four persons in each group. The data were analyzed statistically by ANOVA, and the means with significant differences (P<0.05) were further compared by Tukey-Kramer adjustments for multiple comparisons. The means showing a significant difference (P<0.05 level) are identified by asterisks (*).
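The summary statistics named in the caption (mean ± SEM per group, and the fold change between AD and control) can be illustrated with a minimal sketch. The density values below are hypothetical numbers chosen for illustration only; they are not the study data, and the function names `mean_sem` and `fold_change` are likewise assumptions. The ANOVA and Tukey-Kramer steps are not reproduced here.

```python
from math import sqrt
from statistics import mean, stdev


def mean_sem(values):
    """Return (mean, standard error of the mean) for one group of replicates."""
    return mean(values), stdev(values) / sqrt(len(values))


def fold_change(case, control):
    """Ratio of the case-group mean to the control-group mean."""
    return mean(case) / mean(control)


# Hypothetical integrated density values (illustrative only, NOT the study data),
# four persons per group as in Figure 4.
bands = {
    "band 1": {"AD": [27.0, 30.0, 33.0, 30.0], "Control": [9.0, 10.0, 11.0, 10.0]},
    "band 2": {"AD": [20.0, 22.0, 18.0, 20.0], "Control": [19.0, 21.0, 20.0, 20.0]},
}

for name, groups in bands.items():
    ad_mean, ad_sem = mean_sem(groups["AD"])
    c_mean, c_sem = mean_sem(groups["Control"])
    ratio = fold_change(groups["AD"], groups["Control"])
    print(f"{name}: AD {ad_mean:.1f} ± {ad_sem:.2f}, "
          f"control {c_mean:.1f} ± {c_sem:.2f}, fold change {ratio:.2f}")
```

With these made-up values, band 1 behaves like a specifically oxidized band (a large AD-to-control ratio) and band 2 like a non-specific one (a ratio near 1), which is the contrast the figure is meant to convey.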

3

Conclusions

3.1 Use It or Lose It

Is oxidative damage inevitable? Several new insights suggest a more optimistic alternative. Nerve growth factors and neuroprotective agents may prevent or even repair age-related chronic neurodegenerative diseases. The positive effects of a mentally stimulating "enriched environment" on neurological development are very promising. Both prospective and retrospective studies have suggested that people with a mentally stimulating lifestyle are less likely to develop dementia than those in a mentally less challenging environment. Animal studies have also clearly demonstrated this. For example, rats raised in a mentally enriched environment exhibited increased cognitive performance, neurogenesis, survival of neurons and memory. These physiological effects were substantiated by molecular changes that included decreased apoptotic death of hippocampal cells, and activation and induction of neurotrophic factors. Moreover, an enriched environment protected against chemically-induced seizures [17]. While the applications of these findings to humans are not fully understood, it appears that "use it or lose it" as applied to the brain has a scientific basis.

In summary, it is clear that humans are living longer and that the average life expectancy will continue to increase, even without genetic intervention. Those extra years in an aerobic environment will result in additional exposure to ROS. This increased exposure will occur at an age when antioxidant defenses may be marginal. Thus, as we grow older it becomes even more important that we understand the balance between oxidative damage and antioxidants. As we better understand the sources of ROS and the sites most vulnerable to critical oxidative damage, we can better design antioxidants and strategies to minimize or avoid such oxidative damage. The use of oxidized protein isoforms may play a critical diagnostic role in the development of these strategies.

4

Acknowledgements

This research was supported by grants from the Robert A. Welch Foundation (BK0502) and the Alzheimer's Association (IIRG-98-037).

References

1. Beckman K.B. and Ames B.N., Oxidative decay of DNA, J. Biol. Chem. 272 (1997) pp. 19633-19636.
2. Berlett B.S. and Stadtman E.R., Protein oxidation in aging, disease, and oxidative stress, J. Biol. Chem. 272 (1997) pp. 20313-20316.
3. Carney J.M. and Carney A.M., Role of protein oxidation in aging and in age-associated neurodegenerative diseases, Life Sci. 55 (1994) pp. 2097-2103.
4. Chisolm G.M. III, Hazen S.L., Fox P.L., and Cathcart M.K., The oxidation of lipoproteins by monocytes-macrophages. Biochemical and biological mechanisms, J. Biol. Chem. 274 (1999) pp. 25959-25962.
5. Gracy R.W., Talent J.M., Kong Y., and Conrad C.C., Reactive oxygen species: The unavoidable environmental insult?, Mutation Research 428 (1999) pp. 17-22.
6. Gracy R.W., Yuksel K.U., and Schnackerz K.D., Isozymes and Aging: How do enzymes wear out? In Isozymes: Organization and Roles in Evolution, Genetics and Physiology (River Edge, NJ: World Scientific Publishing Company, 1994) pp. 127-146.
7. Harris M.E., Hensley K., Butterfield D.A., Leedle R.A., and Carney J.M., Direct evidence of oxidative injury produced by the Alzheimer's beta-amyloid peptide (1-40) in cultured hippocampal neurons, Exp. Neurol. 131 (1995) pp. 193-202.
8. Henle E.S. and Linn S., Formation, prevention, and repair of DNA damage by iron/hydrogen peroxide, J. Biol. Chem. 272 (1997) pp. 19095-19098.
9. Hensley K., Carney J.M., Mattson M.P., Aksenova M., Harris M., Wu J.F., Floyd R.A., and Butterfield D.A., A model for beta-amyloid aggregation and neurotoxicity based on free radical generation by the peptide: relevance to Alzheimer disease, Proc. Natl. Acad. Sci. U.S.A. 91 (1994) pp. 3270-3274.
10. Koppal T., Drake J., Yatin S., Jordan B., Varadarajan S., Bettenhausen L., and Butterfield D.A., Peroxynitrite-induced alterations in synaptosomal membrane proteins: insight into oxidative stress in Alzheimer's disease, J. Neurochem. 72 (1999) pp. 310-317.
11. Levine R.L., Berlett B.S., Moskovitz J., Mosoni L., and Stadtman E.R., Methionine residues may protect proteins from critical oxidative damage, Mech. Ageing Dev. 107 (1999) pp. 323-332.
12. Sano M., Ernesto C., Thomas R.G., Klauber M.R., Schafer K., Grundman M., Woodbury P., Growdon J., Cotman C.W., Pfeiffer E., Schneider L.S., and Thal L.J., A controlled trial of selegiline, alpha-tocopherol, or both as treatment for Alzheimer's disease. The Alzheimer's Disease Cooperative Study, N. Engl. J. Med. 336 (1997) pp. 1216-1222.
13. Schoneich C. and Yang J., Oxidation of methionine peptides by Fenton systems: The importance of peptide sequence, neighboring groups and EDTA, J. Am. Chem. Soc. Perk T 2 (1996) pp. 1915-1924.
14. Selkoe D.J., Translating cell biology into therapeutic advances in Alzheimer's disease, Nature 399 (1999) pp. A23-A31.
15. Watson A.A., Fairlie D.P., and Craik D.J., Solution structure of methionine-oxidized amyloid beta-peptide (1-40). Does oxidation affect conformational switching?, Biochemistry 37 (1998) pp. 12700-12706.
16. Yatin S.M., Varadarajan S., Link C.D., and Butterfield D.A., In vitro and in vivo oxidative stress associated with Alzheimer's amyloid beta-peptide (1-42), Neurobiol. Aging 20 (1999) pp. 325-330.
17. Young D., Lawlor P.A., Leone P., Dragunow M., and During M.J., Environmental enrichment inhibits spontaneous apoptosis, prevents seizures and is neuroprotective, Nat. Med. 5 (1999) pp. 448-453.

TRANSGENIC FISH AND BIOSAFETY

HU WEI AND ZHU ZUOYAN

State Key Laboratory of Freshwater Ecology and Biotechnology, Institute of Hydrobiology, the Chinese Academy of Sciences, Wuhan 430072, China
zyzhu@ihb.ac.cn

The first batch of transgenic fish was produced by microinjection of the human growth hormone gene into fish eggs, and a model of transgenic fish was then established. Compared with the control fish, the transgenics not only grew faster but also utilized dietary protein more efficiently. The transgenics had significantly higher body contents of dry matter and protein, but lower lipid content. Two groups of mice, fed with "all-fish" transgenics or with control fish, did not show any significant differences in physiological and pathological characteristics. In addition, the genetic and ecological safety of the transgenics is evaluated.

1

Introduction

Fish are at a lower stage of evolution but are the most diverse of the vertebrates. They appeared around 500 million years ago, and today there are about 21,700 to 28,000 species, which account for almost half of all vertebrate species [1]. Fishes are generally the most fecund of vertebrates, some producing thousands of eggs on a periodic basis. These eggs are usually large and transparent. Fertilization occurs externally and embryonic development can be easily followed. These advantages make fish excellent animal models for both field and laboratory studies [2]. Additionally, fish have enormous potential to meet human requirements for protein. In fact, fish culture is one of the earliest activities of human civilization. Extensive husbandry information is available, based on thousands of years of practical experience from fish farmers and hobbyists. The first record of pond culture in the world dates back 2500 years to the Handbook of Fish Culture, in which the author, Fan Li, described the domestication and cultivation of common carp in ponds. Fish culture techniques have advanced but have always fallen short of the demands of the expanding human population. The development of high technology has become a pressing matter for the aquaculture industry. Fortunately, the advancement of modern biology, especially molecular genetics, makes directed fish breeding possible. Using gene transfer [3], transgenic fish with traits such as growth enhancement, high food utilization efficiency and disease resistance can be produced. Thus, gene transfer technology will revolutionize traditional fish breeding. Transgenic fish are now on the verge of commercialization in China, Canada and other countries. However, because transgenic fish are genetically modified organisms (GMOs), their biosafety has raised deep concern. In this paper, we focus on transgenic fish breeding and its biosafety.

2

Gene Transfer in Fish

Gene transfer is based on two major advances in molecular and developmental biology. The first is molecular cloning and the second is embryonic micromanipulation. Molecular cloning enables the isolation and cloning of a single gene that codes for a unique protein, while micromanipulation enables "foreign gene(s)" to be introduced into an organism's genome. In the early 1980s, recombinant genes were cloned and transferred into host animals, of which the transgenic "super mouse" was the most exciting achievement in transgenic studies [4]. Using microinjection, the recombinant gene MThGH, in which a mouse metallothionein-I gene promoter drives a DNA sequence coding for human growth hormone, was introduced into fish eggs, and the first batch of transgenic fish was produced in China in 1984 [3]. The principal advantage of microinjection is its efficiency in generating transgenic lines that express most genes in a predictable manner, and the method has been the most widely and successfully used for generating transgenic animals. Microinjection, however, is labor-intensive and time-consuming. To avoid these disadvantages, more convenient techniques, e.g. electroporation or sperm-mediated DNA transfer, were subsequently developed. With the new methods, transgenics were made in a number of species such as salmon (Oncorhynchus tshawytscha), goldfish (Carassius auratus), loach (Misgurnus anguillicaudatus), zebrafish (Danio rerio), common carp (Cyprinus carpio) and shellfish (Pinctada maxima Jameson) [5-11]. In recent years, more than 10 laboratories throughout the world have succeeded in generating transgenic fish in a variety of species using different foreign gene constructs and transgenic techniques. When a novel gene is transferred into the fertilized eggs of fish, it may behave in different ways. It may replicate, with its copies persisting for several cell divisions.
It may integrate into the chromosomal DNA of some host cells and generate transgenic somatic cells. It may integrate into the chromosomal DNA of the host progenitor germ cells, in which case the founder will pass the transgene on to the F1 progeny. Alternatively, it may be lost in a few of the founder embryos. It was in a fish model that the mechanisms controlling foreign gene replication, integration, expression, biological function, and the pattern of inheritance by germ-line transmission were intensively investigated by Zhu et al. [12]. Southern blot hybridization revealed a dynamic process for the foreign gene in host fish, including replication, degradation, concatemer formation (polymerization), and possible integration during embryogenesis. Within a few minutes of being transferred into the eggs, the linear form of the foreign DNA was converted into different physical forms: circular, dimer and concatemers. The dimer and concatemers were the preferred forms of foreign gene replication in the host. Replication began at a very early stage of cleavage, and a marked onset occurred at the late-blastula to early-neurula stages. After the neurula stage, most of the foreign gene migrated with the host chromosomal DNA, as revealed by agarose gel electrophoresis [12]. It was suggested that the foreign gene at these stages was in the form of large concatemers or had possibly integrated into the host genome. If the transferred DNA is incorporated into the chromosomal DNA of some germ cells, the founder fish are expected to be mosaic in their germ line and the transgene will be passed to some of the progeny. The unintegrated circular, monomer and dimer forms of the "foreign DNA" were degraded after the stage of muscular reaction. Northern hybridization showed that transcripts of the hGH gene could only be found after the late-gastrula stage, which was consistent with the timing of differentiation in fish embryogenesis.

As a result of the expression of the MThGH gene, a few of the transgenic individuals showed dramatic growth enhancement, while the growth rate of others did not change or even decreased. This unexpected observation is reasonable when multi-site integration and transgenic mosaicism are taken into account. In fertilized fish eggs the pronucleus is invisible, so the foreign gene could only be microinjected into the cytoplasm rather than the pronucleus. Foreign gene integration therefore spanned a long period during embryogenesis, resulting in multi-site integration and consequently transgenic mosaicism. There were three categories of integration sites: (1) functional integration, in which the transgene is suitable for expression; (2) silent integration, in which the transgene is integrated in a region of the host chromosome where expression is abolished; (3) toxic integration, in which the transgene interrupts or breaks sequences that are critical to normal cellular function, such as those of so-called "house-keeping" genes. Among these categories, only functional integration of the GH gene could result in the generation of fast-growing fish [12].
The specific growth rates (SGR) of the MThGH-transgenic founders and the F1 to F4 generations were significantly higher than those of the controls [12-16]. Food consumption, growth and energy budget of MThGH-transgenic fish compared with controls have been thoroughly worked out. When fed fresh tubificid worms, the energy budget equation for MThGH-transgenic F2 fish is 100C = 8.9F + 0.63U + 49.03R + 41.44G, whereas that for the controls is 100C = 7.37F + 1.14U + 53.36R + 38.13G [15], where C is the energy from food, F is the energy lost in faeces, U is the energy lost in nitrogenous excretion, R is the energy channelled to metabolism and G is the energy channelled to growth. Compared with controls, transgenic fish had a significantly higher proportion of food energy channelled to F and G and a significantly lower proportion channelled to R and U. The transgenic fish saved 6.62% of the total food energy partly for growth improvement. This phenomenon is known as "fast-growing and less-eating" [15]. Growth and feed utilization by MThGH-transgenic F4 fish fed diets containing different protein levels have also been studied [16]. In comparison with the controls, intakes of protein and energy were significantly higher in the transgenics fed the 20% protein diet, and recovered energy, as a proportion of protein intake, was significantly higher in the transgenics fed the 40% protein diet.
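The two energy budget equations quoted above for F2 fish can be compared component by component with a short sketch. The coefficients are taken directly from the text; the names `TRANSGENIC`, `CONTROL`, `is_balanced` and `budget_shift` are illustrative assumptions. The sketch only restates the published coefficients and does not attempt to derive the 6.62% figure, whose calculation is not given in the text.

```python
# Energy budget coefficients quoted in the text for F2 fish [15],
# each expressed as a percentage of food energy C.
TRANSGENIC = {"F": 8.9, "U": 0.63, "R": 49.03, "G": 41.44}
CONTROL = {"F": 7.37, "U": 1.14, "R": 53.36, "G": 38.13}


def is_balanced(budget, tol=0.01):
    """Check that F + U + R + G accounts for 100% of food energy."""
    return abs(sum(budget.values()) - 100.0) <= tol


def budget_shift(treated, reference):
    """Per-component difference (treated - reference) in percentage points."""
    return {k: round(treated[k] - reference[k], 2) for k in treated}


# Both budgets should close at 100% of food energy.
assert is_balanced(TRANSGENIC) and is_balanced(CONTROL)

shift = budget_shift(TRANSGENIC, CONTROL)
for component, delta in shift.items():
    print(f"{component}: {delta:+.2f} percentage points")
```

Under these coefficients the share of food energy going to growth (G) is 3.31 percentage points higher in the transgenics and the share going to metabolism (R) is 4.33 points lower, which matches the direction of the differences described in the text.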


These results indicate that at a lower dietary protein level, transgenics achieved higher growth rates mainly by increasing feed intake, whereas at a higher dietary protein level they achieved higher growth rates mainly through higher energy conversion efficiency. That is to say, the transgenics were more efficient in utilizing dietary protein than the controls, which gave them a significantly higher specific growth rate [16]. The transgenic fish had significantly higher body contents of dry matter and protein, but lower contents of lipid, than the controls [17]. The apparent digestibility of amino acids tended to be higher in the transgenics than in the controls, especially in fish fed diets with lower protein levels. The proportions of 17 amino acids did not differ between the transgenics and the controls [17]. Thus, the transgenic fish have higher nutritional value than the control fish. It is reasonable to expect that fish with the traits "fast growing and less eating" and "higher content of protein and lower content of lipid" will help meet humanity's increasing requirement for protein. Nevertheless, in view of biosafety considerations, neither the mouse MT-1 gene promoter nor the hGH structural gene is suitable for fish breeding [12]. For this reason, researchers cloned the common carp β-actin (CA) gene [18] and the grass carp growth hormone (gcGH) gene [19], and generated a new construct, pCAgcGH, an "all-fish" genomic construct cloned in pUC118 [20]. The pCAgcGH construct consists of the β-actin gene promoter of the common carp and the whole transcription unit of the GH gene from grass carp. Because the β-actin gene promoter of common carp is a powerful promoter and grass carp is a fast-growing farmed species, it is reasonable to believe that the pCAgcGH gene will be a strong "generator" to promote fish growth rates [20].
In the spring of 1997, this construct was microinjected into the fertilized eggs of Yellow River carp and a batch of CAgcGH-transgenic fish was produced. The transgenic fish reached 2.75 kg in 5 months, while the largest of the non-transgenic controls weighed 1.1 kg. More strikingly, the heaviest 17-month-old transgenic fish, at 7.65 kg, was more than double the weight of its non-transgenic siblings [unpublished data]. To date, more than ten "all-fish" recombinant genes have been constructed in different laboratories since the first "all-fish" expression vector was created in 1990 [21-22]. The most dramatic "super fish" was produced by Devlin et al., who inserted an all-salmon gene construct (pOnMTGH1) into coho salmon (Oncorhynchus kisutch Walbaum). On average, the transgenic salmon were more than 11-fold heavier than the controls, with a range from no growth stimulation to one individual 37 times larger than the controls [23].

3

The Biosafety of Transgenic Fish

As described earlier, the Yellow River carp carrying the "all-fish" gene construct CAgcGH has potential in the aquaculture industry. However, transgenics are genetically modified organisms (GMOs), and great care must be taken before any GMO is put to use. Their food safety, genetic safety and ecological safety should be strictly evaluated. At present, the widely accepted principle for the safety evaluation of foods produced by modern biotechnology is the "substantial equivalence principle" set out by the OECD (Organization for Economic Cooperation and Development) in 1993 [24]. According to this principle, the safety class of the "all-fish" gene construct CAgcGH has been assessed. The CAgcGH transgene consists of the β-actin gene promoter from common carp and the whole transcription unit of the GH gene from grass carp. Both grass carp and common carp are farmed species, and they belong to the same family, Cyprinidae. Between the two species, the similarity of the exons of the GH gene is 84.1-93.2% [19] and the homology of the amino acid sequence of the GH polypeptide is 97% [25]. As a result of the transgene's expression, the serum of the transgenic Yellow River carp contains grass carp GH at 2-10 ng/mL. Is it safe for consumers? Food safety was studied in Kunming mice; groups of 120 mice each were fed fresh meat juice of the CAgcGH-transgenic Yellow River carp and of the non-transgenic controls, respectively. The feeding dosage per day was 10 g/kg body weight and 5 g/kg body weight, respectively. The pathological standards issued by the Chinese National Ministry of Health were used as the reference for evaluating the results. Compared with the control group, the test group of mice did not show any significant differences in growth, reproduction, general blood appearance, blood biochemical indicators, histochemical analysis of tissues, etc. The same results were also obtained for the F1 generation of the two groups (unpublished data from Zuoyan Zhu's lab).
In addition, GH polypeptides are unstable and are degraded rapidly by acid, alkali, heat, etc. Thus, both the transgene construct and its expression product in the transgenics should be as safe as those in the "parental" species for use as food resources. We concluded that the "all-fish" gene-transferred Yellow River carp is substantially equivalent to the control with respect to food safety.

As to the genetic and ecological safety of GH-transgenic fish, Cui et al. have made preliminary studies in a polyculture system [26]. They stocked GH-transgenic red common carp with crucian carp (Carassius auratus L.), grass carp, bighead carp (Aristichthys nobilis Richardson) and silver carp (Hypophthalmichthys molitrix Cuvier et Valenciennes) in very well isolated ponds. PCR analyses showed that the GH transgene could flow only among individuals within a species, not between species, by natural reproduction. In addition, the transgenics and the controls did not differ in their effect on the growth of crucian carp, grass carp, bighead carp and silver carp. Furthermore, the total yield in the polyculture system with the transgenic common carp was higher than that with the non-transgenic ones. The results suggested that stocking the transgenics in ponds would raise fish productivity without transgene flow between species.

In fact, gene flow between fish species occurs in water bodies from time to time. For example, crossbreeding between closely related species happens naturally. Artificial crossbreeding is carried out even more frequently on fish farms, and not enough attention is paid to keeping the offspring from escaping into natural water bodies. Each fish genome contains about 10^5 genes, so 10^5 genes are added from one species to the other when crossbreeding occurs. In producing transgenics, by contrast, only one gene from one species is added to the other. In the CAgcGH-transgenic Yellow River carp, one GH gene from grass carp is added to the common carp genome; the mass of gene flow is 1/10^5 of that when the two species crossbreed. In other words, the risk of stocking "all-fish" gene-transferred fish should be much less than, or at most substantially equivalent to, that of stocking crossbred fish. In short, according to the substantial equivalence principle, the safety class of the "all-fish" transgenic Yellow River carp could be placed at the safest level, "level I". Technically speaking, the CAgcGH-transgenic common carp, as well as another transgenic fish in Canada, may be among the first commercial transgenics to be adopted by the market. Applying to the government for approval of commercial cultivation, and winning public understanding and support, is urgently needed [25].

4

The Prospect of Transgenic Fish

The transgenic fish produced so far are far from being a genetically homogeneous strain. Scientists are pursuing transgenics with site-specific integration, controllable expression and stable germ-line transmission of the transgene. The foreign gene constructs transferred into fish are usually less than 10 kb in size. These smaller constructs might lack some elements important for the regulation of gene activity. With the development of artificial chromosomes capable of cloning DNA fragments up to 2 megabases long, it is now possible to use intact genomic loci as transgenes [27]. Consequently, the foreign genes in the transgenics would sit in a context similar to their native sequence environment, enabling the transferred genes to be expressed at levels comparable to that of the corresponding endogenous gene, as well as in a tissue-specific and developmentally correct manner. In 1998, a simple method for modification of bacterial artificial chromosomes (BACs) through Chi-stimulated homologous recombination was developed and used for zebrafish transgenesis. Modified BAC constructs microinjected into zebrafish embryos can display the correct spatiotemporal gene expression pattern. More importantly, those embryos show less mosaicism and improved foreign gene expression in specific cells compared with the smaller constructs [28]. It is suggested that artificial chromosomes have potential for producing transgenics with site-specific integration and controllable expression.

The "gene targeting" technique may be an alternative way of producing a pure line of transgenic fish. Gene targeting, i.e. homologous recombination between a DNA sequence residing in the chromosome and a newly introduced cloned DNA sequence, allows the transfer of any modified gene into the host genome of living cells and holds promise for obtaining embryonic cell lines carrying artificially modified, site-specifically integrated genes [29]. Fish are at a lower evolutionary stage and the totipotency of fish cells (both embryonic and somatic cells) is much higher than that of other vertebrate cells [30]. Additionally, embryonic stem-like cell lines have been partially achieved in zebrafish and medaka (Oryzias latipes) [31-33]. Meanwhile, nuclear-cytoplasmic hybrid fish can be produced between different species and even between different genera via nuclear transplantation [34-35]. Therefore, by nuclear transplantation with gene-targeted embryonic cells, a genetically homogeneous strain of transgenic fish with desired traits could be generated.

As mentioned earlier in this article, genetic safety and ecological safety are two important aspects of the biosafety of transgenic fish. The fundamental way to solve the genetic and ecological problems of transgenic fish for aquaculture is to make them sterile. Fortunately, polyploid breeding in fish has become very popular [36] and artificial tetraploid fish have been obtained via hybridization of common carp with goldfish [37]. By crossing the tetraploid fish with the haploid transgenics, the resulting transgenics are sterile. It is reasonable to expect that stocking infertile transgenic fish will minimize their impact on the water ecosystem.

5

Acknowledgements

The project was supported by the National High-Tech (863) programme through the grant to Professor Zhu Zuoyan (Grant No. 101-05-02-01, 101-06-02-02, 819-0405).

References

1. Nelson J.S., Fishes of the World. 2nd edition, (A Wiley-Interscience Pub, New York, 1984).
2. Powers D.A., Fish as model systems. Science 246 (1989) pp. 352-358.
3. Zhu Z., Li G., He L., Chen S., Novel gene transfer into the fertilized eggs of goldfish (Carassius auratus L. 1758). Z Angew Ichthyol 1 (1985) pp. 31-34.
4. Palmiter R.D., Brinster R.L., Hammer R.E., Trumbauer M.E., Rosenfeld M.G., Birnberg N.C., Evans R.M., Dramatic growth of mice that develop from eggs microinjected with metallothionein-growth hormone fusion gene. Nature 300 (1982) pp. 611-615.
5. Symonds J.E., Walker S.P. and Sin F.Y.T., Electroporation of salmon sperm with plasmid DNA: evidence of enhanced sperm/DNA association. Aquaculture 119 (1994) pp. 313-327.
6. Yu J.K., Yan W., Zhang Y.L., Shen Y. and Yan S.Y., Sperm mediated gene transfer and method of detectation of integrated gene by PCR. Acta Zoologica Sinica 40 (1994) pp. 96-99.
7. Tsai H.J., Tseng F.S. and Liao I.C., Electroporation of sperm to introduce foreign DNA into the genome of loach (Misgurnus anguillicaudatus). Can. J. Fish. Aquat. Sci. 52 (1995) pp. 776-787.
8. Khoo H.W., Ang L.H., Lim H.B., et al., Sperm cells as vectors for introducing foreign DNA into zebra fish. Aquaculture 107 (1992) pp. 1-19.
9. Xie Y., Liu D., Zou J., Li G., Zhu Z., Gene transfer via electroporation in fish. Aquaculture 111 (1993) pp. 207-213.
10. Li G., Cui Z., Zhu Z., Huang S., Introduction of foreign gene carried by sperms. Acta Hydrobiol Sin 20 (1996) pp. 242-247.
11. Hu Wei, Yu Dahui, Wang Yapin, Wu Kaichang and Zhu Zuoyan, Electroporated sperm mediates gene transfer in Pinctada maxima (Jameson). Chinese J. Biotech 23 (2000) pp. 156-160.
12. Zhu Z., Xu K., Xie Y., Li G., He L., A model of transgenic fish. Scientia Sinica (B) 2 (1989) pp. 147-155.
13. Xu K., Wei Y., Guo L., Zhu Z., The effects of growth enhancement of human growth hormone gene transfer and human growth administration on crucian carp (Carassius auratus gibelio, Bloch). Acta Hydrobiol Sinica 15 (1991) pp. 103-109.
14. Wei Y., Xie Y., Xu K., et al., Heredity of human growth hormone gene in transgenic carp (Cyprinus carpio L). Chinese J Biotech 8 (1992) pp. 140-144.
15. Cui Z., Zhu Z., Cui Y., Li G., Xu K., Food consumption and energy budget in MThGH-transgenic F2 red carp (Cyprinus carpio L. red var.). Chinese Science Bulletin 41 (1996) pp. 591-596.
16. Fu C., Cui Y., Hung S.S.O., Zhu Z., Growth and feed utilization by F4 human growth hormone transgenic carp fed diets with different protein levels. J Fish Biol 53 (1998) pp. 115-129.
17. Fu C., Growth and feed utilization by F4 human growth hormone transgenic red carp, Cyprinus carpio L.: effects of dietary protein level. Thesis for M.Sc. (Supervised by Zuoyan Zhu) submitted to the Institute of Hydrobiology, Chinese Academy of Sciences (1998).
18. Liu Z., Zhu Z., Roberg K., et al., Isolation and characterization of β-actin gene of carp (Cyprinus carpio). DNA Sequence 1 (1990) pp. 125-136.
19. Zhu Z., He L., Chen T.T., Primary-structural and evolutionary analyses of growth-hormone gene from grass carp (Ctenopharyngodon idellus). Eur J Biochem 207 (1992) pp. 643-648.
20. Zhu Z., Generation of fast growing transgenic fish: methods and mechanisms. In Transgenic Fish. Hew C.L. and Fletcher G.L. (eds.), (World Scientific Publishing, Singapore, 1992) pp. 92-119.
21. Cui Z., Zhu Z., Several interesting questions about breeding transgenic fish. J Biotech (in Chinese) 136 (1998) pp. 1-10.
22. Liu Z., Moav B., Faras A.J., Guise K.S., Kapuscinski A.R., and Hackett P.B., Development of expression vectors for gene transfer into fish. Bio/Technology 8 (1990) pp. 1268-1272.
23. Devlin R.H., Yesaki T.Y., Biagi C.A., Donaldson E.M., Swanson P., Chan W.K., Extraordinary salmon growth. Nature 371 (1994) pp. 209-210.
24. OECD (Organization for Economic Cooperation and Development), Safety evaluation of foods produced by modern biotechnology: concepts and principles. (OECD, Paris, 1993).
25. Zhu Z., Zeng Z., Open a door for transgenic fish to market. J Biotech (in Chinese) 1 (2000) pp. 1-7.
26. Cui Z., Biosafety assessment of GH-transgenic common carp (Cyprinus carpio L.). Thesis for Ph.D. (Supervised by Zuoyan Zhu) submitted to the Institute of Hydrobiology, Chinese Academy of Sciences (1998).
27. Peterson K.R., Production of transgenic mice with yeast artificial chromosomes. TIG 13 (1997) pp. 61-66.
28. Jessen J.R., Meng A.M., McFarlane R.J., Paw B.H., Zon L.I., Smith G.R. and Lin S., Modification of bacterial artificial chromosomes through Chi-stimulated homologous recombination and its application in zebra fish transgenesis. Proc. Natl. Acad. Sci. USA 95 (1998) pp. 5121-5126.
29. Capecchi M.R., Altering the genome by homologous recombination. Science 244 (1989) pp. 1288-1292.
30. Zhu Z., Growth hormone gene and the transgenic fish. In Agricultural Biotechnology. Edited by You C.B. and Chen Z.Z., (China Science and Technology Press, Beijing, 1992) pp. 106-116.
31. Wakamatsu Y., Ozato K., Sasado T., Establishment of a pluripotent cell line derived from a medaka (Oryzias latipes) blastula embryo. Mol Mar Biol Biotechnol 3 (1994) pp. 185-191.
32. Sun L., Bradford C.S., Ghosh C., Collodi P., Barnes D.W., ES-like cell cultures derived from early zebrafish embryos. Mol Mar Biol Dev 4 (1995) pp. 193-199.
33. Hong Y., Schartl M., Establishment and growth responses of early medakafish (Oryzias latipes) embryonic cells in feeder layer-free cultures. Mol Mar Biol Biotechnol 5 (1996) pp. 93-104.
34. Tung T.C., Nuclear transplantation in teleosts. I. Hybrid fish from the nucleus of carp and the cytoplasm of crucian. Scientia Sinica 23 (1980) pp. 517-523.
35. Yan S.Y., Lu D.Y., Zhu Z.Y., et al., Nuclear transplantation in teleosts. II. Hybrid fish from the nucleus of crucian and the cytoplasm of carp. Scientia Sinica (B) 27 (1984) pp. 1029-1034.
36. Zhang J. and Sun X., The survey and prospect of fish genetics and breeding research. In: The Selected Paper of Breeding in Jian Carp (Cyprinus carpio var. Jian). Written by Zhang J., Sun X. et al. (Science Press, Beijing, 1994) pp. 1-10.
37. Liu Yun, Propagation physiology of main cultivated fish in China. (Agricultural Publishing House, Beijing, 1993) pp. 145-148.

ALDEHYDE DEHYDROGENASES OF HUMAN CORNEAL AND LENS EPITHELIAL CELLS

ROGER S HOLMES

The University of Newcastle, Callaghan, NSW 2308, Australia

Email: vc@newcastle.edu.au

Aldehyde dehydrogenase (ALDH) isozymes, ALDH1 and ALDH3, and albumin, are the major soluble proteins within human corneal and lens epithelial cells. These ALDHs may perform a variety of functions in human anterior eye tissues: the oxidation of UVR-induced peroxidic aldehydes; the maintenance of high levels of reduced coenzyme within these cells; serving as 'crystallin' proteins to assist in the transmission of visible light; and the biological filtration of UV-B radiation. It also appears likely that NAD(P)H contributes strongly to the absorption of UV-A and UV-B by corneal and lens epithelial cells.

1 Introduction

Human aldehyde dehydrogenases (ALDHs) are members of a complex gene family, comprising at least eight genes including a group of NAD-dependent ALDH (EC 1.2.1.3) isozymes [1,2] and ALDH4, a related enzyme, γ-aminobutyraldehyde dehydrogenase [3]. These enzymes catalyse the NAD-dependent oxidation of a wide range of biological aldehydes to their corresponding carboxylic acids, including aldehydes derived from the metabolism of ethanol and other alcohols (e.g. acetaldehyde), lipid peroxides, biogenic amines, retinol, γ-aminobutyrate, amino acids, monoamines, diamines and polyamines [4-7]. The most extensively investigated human ALDHs are the major liver cytosolic (ALDH1) and mitochondrial (ALDH2) isozymes [8,9], and the major stomach/corneal ALDH3 enzyme [10,11]. Human ALDH1 and ALDH2 are tetrameric enzymes comprising distinct subunits of 500 amino acids that share 68% sequence homology [1,12], whereas human ALDH3 is a dimeric enzyme whose subunits comprise 453 amino acid residues, with <40% sequence homology with ALDH1 and ALDH2 [11,13]. The genes encoding these isozymes are differentially localised on the human genome: ALDH1, chromosome 9q21; ALDH2, chromosome 12q24; and ALDH3, chromosome 17p11.2 [see 2]. The three dimensional structures for the homologous isozymes from sheep (ALDH1) [14], bovine (ALDH2) [15] and rodent (ALDH3) [16] sources have been recently reported. These showed major similarities, although distinct differences were observed, particularly for the substrate binding tunnels and the subunit-subunit binding domains, which were consistent with their differential substrate specificities and quaternary structures.


Corneas from a diverse range of mammalian species, including human, pig, cow, sheep, baboon, mouse, rat and opossum, exhibit very high levels of ALDH3 activity [11,17-22], which has been purified and characterised from several of these sources [11,20-22]. In addition, very high levels of ALDH1 have been reported in human cornea and lens epithelial cells, and the purified enzymes isolated and characterised [23,24]. This paper reviews recent studies on the distribution, properties and role(s) for human ALDH1 and ALDH3, which exist in very high levels within human corneal and lens epithelial cells, and discusses the likely roles for these enzymes in these anterior eye tissues.

2 Transmission of Visible Light and the Absorption of UV Radiation by Anterior Eye Tissues

The cornea and lens perform essential roles in providing a transparent path and focused transmission of visible light onto the retina. In addition, these anterior eye tissues protect the eye, particularly photosensitive retinal cells and the lens, from ultraviolet radiation (UVR) induced tissue damage by the absorption of biotic UVR in the 300-320 nm (UV-B) and 320-400 nm (UV-A) wavelength ranges, respectively (see Figure 1) [25,26].

Figure 1. Absorption of UV Radiation by the Mammalian Cornea and Lens. [Diagram labels: UV-B radiation (292-320 nm); UV-A radiation (320-400 nm); Visible Light.]

2.1 Lens

The transparency of the lens is essential for the transmission of visible light, which is achieved by the presence of very high levels of stable soluble proteins and the absence of blood vessels, nerves, nuclei and organelles within the lens fibre cells [27-29]. These proteins have been extensively investigated and designated as 'crystallins', which are members of a complex gene family [30]. Crystallins are present in the lens fibre cells in concentrations of up to 60 percent of wet weight, and are closely and regularly packed in order to achieve the properties required for proper light refraction and transmission. Anterior epithelial cells exist in a single layer surrounding the lens, and following further differentiation and lateral migration, develop into fibre cells which are deposited into concentric layers overlaying older cells [see 27-29]. Lens 'crystallins' are highly stable proteins, are resistant to the deleterious effects of ageing, light radiation, oxidative free radicals and heat, and are retained throughout life as the major soluble proteins of the human lens [30]. A number of pigments have been reported in the lens [31], which are oxidation or conjugated glucoside products of tryptophan, namely kynurenine. These compounds are apparently responsible for the filtering of UV-A radiation by the mammalian lens, and the protection of the retina from UV-A induced damage [see 32].

2.2 Cornea

The cornea is similarly transparent, although its structure and protein composition differ from those of the lens. It is composed of three major sections: an epithelium at the air exposed surface consisting of 5-7 layers of epithelial cells; a thick layer of collagenous lamellae (the stroma); and the monolayer of endothelial cells facing the aqueous humour. In contrast to the lens, the cornea is subject to turnover, with the epithelial cells having a life of around a week following generation in the basal layer, differentiation within the epithelial layer, and final migration to the surface, where they are dispersed by the tear fluid. Apart from the transparent structural collagen proteins of the stroma, the major proteins of the mammalian cornea are soluble proteins localised within the epithelial cell layer [33]. In rabbit and human cornea, albumin is the major soluble protein and is predominantly localised within the epithelial cells [34,35]. Class 3 ALDH (ALDH3 in humans) is also a major soluble protein of mammalian corneal epithelial cells [20,22,23,36,37], and in humans, class 1 (ALDH1) occurs as one of three major soluble proteins: albumin, ALDH3 and ALDH1 [23,24].

The chromophores responsible for the absorption of UVR below 320 nm by the mammalian cornea have not been determined, although a number of compounds have been proposed and investigated. Below 290 nm, the response by ocular tissues to UV-B radiation is predominantly associated with the corneal epithelium, indicating almost total absorption by this cell layer [38]. Major soluble proteins such as albumin and ALDH are the likely candidates responsible for this absorption, given the reported action spectrum for UV-B induced photokeratitis of the mammalian cornea, and its similarity with the absorption spectrum of soluble proteins, particularly ALDH3, which has a relatively high tryptophan content [13,39]. Cenedella and coworkers [40] have suggested that cholestylene, a dehydration product of cholesterol, may also contribute to abiotic UV-B filtration by the cornea. Biotic UV-B radiation, however, commences at 292 nm at the earth's surface following filtration of sunlight by the ozone layer [41], and the filtration role by the cornea of UV-B in the 292-320 nm range is of major biological significance.

This paper examines possible roles for ALDHs in serving as biological filters of UV-B radiation by corneal and lens epithelial cells of the eye, or in generating high levels of NAD(P)H within these cells, which are the likely absorbers of UV-B and UV-A radiation by lens and corneal epithelial cells [51,52].

3 ALDHs are Major Soluble Proteins of Human Corneal and Lens Epithelial Cells

Figure 2 illustrates an SDS-polyacrylamide gel resolving protein subunits extracted from human corneal and lens homogenates, as well as purified human corneal ALDH3 and human lens ALDH1 isozymes [27]. Together with albumin, which is the major soluble protein of human corneal epithelial cells [34,35], ALDH3 and ALDH1 are present in very high levels, and comprise approximately 5% and 3% respectively of total soluble protein of human cornea [35]. Lens ALDH1 is also present as a significant protein zone within human lens extracts, although this represents only 12% of the soluble protein present, given the very high levels of 'crystallin' proteins present in this tissue. Immunohistochemical studies performed using antibodies prepared against human ALDH3, ALDH1 and albumin have demonstrated that these major soluble proteins are predominantly located within cornea and lens epithelial cells. The concentrations for these enzymes within corneal and lens epithelial cells are therefore very much higher than the above values, with levels being reported in excess of 5 µM for both ALDH isozymes [23,24]. It is difficult to provide a similar estimate for the concentration of ALDH1 within lens epithelial cells. This enzyme is localised within a single cell layer surrounding the lens fiber cells and cortex of the lens, which are packed with crystallin soluble proteins and lack detectable ALDH1 antigenic activity [35]. A high concentration for human lens epithelial cell ALDH1 is however apparent, given the levels of protein and activity observed in crude lens extracts, and the strong anti-ALDH1 immunochemical activity observed for these cells [35]. The amino acid and genetic sequences for the ocular ALDH1 and ALDH3 isozymes and associated genes encoding these enzymes have not been reported, and it is not known whether these are identical with, or highly homologous to, the corresponding major liver ALDH1 and stomach ALDH3 enzymes previously described [10,11].

The presence of very high levels of ALDH3 in bovine corneal epithelial cells, at approximately 30% of the soluble protein, has led to its description as a corneal 'crystallin' that contributes to the structural integrity and transparency of the cornea [39]. Uma and coworkers [42] have suggested that corneal ALDH serves in a detoxifying role for 'free radicals', generated following UV-B absorption, through reaction with ALDH sulphydryl groups. A major role in the filtration of UV-B radiation by the mammalian cornea has also been proposed for the major ALDH isozymes present, possibly as ALDH.reduced coenzyme complexes [20,22-24]. Moreover, Mitchell and Cenedella [43] have suggested the name 'absorbin' for corneal ALDH3, as a result of its high concentrations within corneal epithelial cells, and its proposed role in UV-B absorption.

Figure 2. Human Corneal and Lens ALDHs Analysed by SDS Gel Electrophoresis and Stained with Coomassie Blue (from King and Holmes) [23]. Lane 1: protein standards with subunit MWs given at the side of the gel; Lane 2: pure lens ALDH1; Lane 3: lens homogenate; Lane 4: pure lens ALDH1 and corneal ALDH3; and Lane 5: corneal homogenate. [Gel labels: Albumin, ALDH3, ALDH1; MW markers 67 and 43.]

4 Ocular ALDH Substrate Specificities

Human corneal and lens ALDH3 and ALDH1 are capable of oxidising a wide range of aldehydes, and have been proposed as playing a major role in the metabolism of peroxidic aldehydes, generated following UV-B induced lipid peroxidation within these mammalian anterior eye tissues [20-24]:

R.CHO + NAD(P) => R.COOH + NAD(P)H

The physiological substrates for these enzymes are presently unknown; however, a number of 'biological aldehydes' have been investigated in terms of their kinetic properties with ocular ALDHs. Human corneal ALDH3 prefers medium chain aldehydes derived from lipid peroxidation as substrates, including 4-hydroxynonenal and trans-2-hexenal [11,17-23]. This property supports a major detoxifying role for ALDH in anterior eye tissues, particularly within epithelial cells of cornea and lens, where the enzyme is predominantly located [35]. ALDH3 is inactive with another common peroxidic aldehyde, malondialdehyde; however, ALDH1, which exhibits an overlapping substrate specificity with ALDH3, is also capable of oxidising malondialdehyde [5].
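Written with explicit charges and with the water molecule that supplies the second oxygen atom, the reaction scheme above balances as shown below. This is the standard textbook form of the NAD(P)-dependent aldehyde dehydrogenase reaction, given here for reference; it is not notation taken from the studies cited.

```latex
\[
\mathrm{R{-}CHO} + \mathrm{NAD(P)^{+}} + \mathrm{H_2O}
\;\longrightarrow\;
\mathrm{R{-}COOH} + \mathrm{NAD(P)H} + \mathrm{H^{+}}
\]
```

One reduced coenzyme molecule is produced per aldehyde oxidised, which is why sustained ALDH activity can contribute to the pool of NAD(P)H within these cells.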

5 Proposed Roles for Corneal and Lens ALDHs as Major Soluble Proteins and in the Metabolism of Peroxidic Aldehydes

Figure 3 gives a diagrammatic illustration of the various roles that may be played by corneal ALDH3 and ALDH1, which occur as major soluble proteins within corneal epithelial cells, and of reduced coenzymes, generated following oxidative activities for these enzymes. These roles include:

• the metabolism of lipid peroxidic aldehydes, generated following UV radiation impacts on membranes of corneal epithelial cells [44,45]. Toxic aldehydes such as 4-hydroxynonenal and hexenal serve as excellent substrates for both ALDH3 and ALDH1, and another common peroxidic aldehyde, malondialdehyde, is efficiently metabolised by ALDH1 [5,11,22].

• together with albumin, which is the major soluble protein of the human cornea [34,35], ALDH3 and ALDH1 may serve as 'crystallin' proteins [24]. These major soluble proteins of corneal epithelial cells provide a stable, soluble and transparent environment for corneal epithelial cells, consistent with the efficient transmission of visible light by the cornea to the lens and retina of the eye. In addition, they may serve a detoxifying role for hydroxyl ions, generated following UV induced oxidative processes [42].

• the maintenance of very high levels of NADH within corneal epithelial cells, following the reduction of NAD to NADH in the catalytic oxidation of peroxidic aldehydes. Very high concentrations of NAD and NADH have been reported for corneal epithelial cells (> 0.5 mM) [46], which provide a major coenzyme reservoir for oxidative metabolism within these cells. In addition, the high levels of NADH may protect these cells against oxidative damage, or play a more direct role in filtering UV-B radiation, either as free coenzyme [51,52], or as ALDH.NAD(P)H binary complexes [24].

• in association with the class IV alcohol dehydrogenase (ADH), which occurs in very high levels within corneal epithelial cells [47,48], ALDH1 and ALDH3 may serve a role in the regulation of NAD(P)/NAD(P)H ratios within these cells [24].

Roles similar to those proposed for corneal ALDH1 and ALDH3 may be undertaken by the major soluble protein, ALDH1, within lens epithelial cells. Of particular significance are the very high levels for both oxidised and reduced forms of NAD and NADP (> 5 mM) reported for these cells, an order of magnitude higher than those reported for corneal epithelial cells [46]. These massive levels for NADH and NADPH may assist in protecting against oxidative processes, and in a further UV filtration role, prior to the passage of visible and UV light, particularly UV-A radiation, into the lens. Rao and Zigler [49] have investigated the pyridine nucleotide concentrations within the lenses of a wide range of species. They concluded that the very high levels present for some species are the result of the associated high content of pyridine nucleotide binding 'crystallins', such as lactate dehydrogenase or alcohol dehydrogenase. In addition, these workers recognised that NADH and NADPH are potent reducing agents, and may protect the lens fiber cells against oxidative stress effects [50].

More recent studies have supported a direct role for NAD(H) in the absorption of UV-B and UV-A by anterior mammalian eye tissues. Atherton and coworkers [51] have undertaken fluorescence measurements of human and rabbit lens epithelial cells, and have observed absorption and fluorescence spectra consistent with a role for NADPH within these cells. Moreover, Dillon and coworkers [52] have studied the optical properties of the anterior segments of primate, rat and bovine eyes, and have concluded that NAD(P)H are the most likely absorbing species of UV-A and UV-B for these species.
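As a rough sense of scale for these coenzyme concentrations, the Beer-Lambert law (A = ε·c·l) can be applied to free NADH. The sketch below is illustrative only and rests on stated assumptions: it uses the standard molar absorptivity of NADH at its 340 nm peak (about 6220 M^-1 cm^-1), it treats the whole reported pool as reduced coenzyme (an upper bound), and the 50 µm path length is a hypothetical stand-in for the thickness of an epithelial layer rather than a value from the studies cited. Absorptivity within the UV-B band is lower than at 340 nm, so the numbers are not an estimate of UV-B filtration.

```python
# Beer-Lambert sketch: absorbance of free NADH across a thin cell layer.
# EPSILON_NADH_340 is the standard textbook value at the 340 nm peak.
# PATH_LENGTH_CM is a hypothetical layer thickness (50 micrometres).

EPSILON_NADH_340 = 6220.0   # M^-1 cm^-1
PATH_LENGTH_CM = 50e-4      # 50 um expressed in cm (assumption)


def absorbance(conc_molar, epsilon=EPSILON_NADH_340, path_cm=PATH_LENGTH_CM):
    """Return absorbance A = epsilon * c * l."""
    return epsilon * conc_molar * path_cm


def transmitted_fraction(a):
    """Return the fraction of incident light transmitted, T = 10^-A."""
    return 10.0 ** (-a)


# Concentrations quoted in the text: > 0.5 mM (cornea) and > 5 mM (lens).
for label, conc in [("corneal epithelium", 0.5e-3), ("lens epithelium", 5e-3)]:
    a = absorbance(conc)
    print(f"{label}: A = {a:.4f}, transmitted = {transmitted_fraction(a):.3f}")
```

Whatever path length is assumed, a tenfold higher concentration gives a tenfold higher absorbance, which is the point of the comparison between lens and corneal epithelial cells.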

Figure 3. Diagrammatic Illustration of a Corneal Epithelial Cell. A: The Transmission of Visible Light by 'Crystallin' Proteins: Albumin, ALDH1 and ALDH3; B: The Role of ALDH and ADH in the Metabolism of Peroxidic Aldehydes, the Maintenance of Reduced Coenzyme Levels and in the Inactivation of Hydroxyl Ions; C: The Absorption of Abiotic UV-B (<292 nm) by 'Crystallin' Proteins (as for A); and D: The Absorption of Biotic UV-B (292-320 nm) by ALDH.NAD(P)H (ALDH3; ALDH1) and/or NAD(P)H.

6 Summary

Cornea and lens play essential roles in protecting eye tissues from UV radiation induced damage by the absorption of biotic UV-B and UV-A respectively, thereby protecting photosensitive cells of the cornea, lens and retina. Epithelial cells of the cornea and lens are particularly rich sources of ALDH and ADH activities, which are capable of metabolising toxic peroxidic aldehydes, generated following UV radiation absorption. ALDH may also contribute to the biological filtration of UV light within corneal and lens epithelial cells, and/or play a role in generating high levels of NADH and NADPH. These reduced coenzymes may then protect these cells from oxidative damage, and are the most likely compounds responsible for the absorption of UV-A and UV-B within anterior eye tissues. Finally, the availability of 'knock-out' mice for ocular ALDH1 and ALDH3 will clarify the proposed roles for these enzymes, which exist in very high levels within mammalian corneal and lens epithelial cells.

7 Acknowledgements

This research was supported in part by a grant from the National Health and Medical Research Council of Australia. I gratefully acknowledge the excellent contributions of many colleagues and research students in the major projects undertaken on ocular ALDH since 1986.

References

1. L.C. Hsu, W-C. Chang and A. Yoshida, Gene 151 (1994) pp. 285-289.
2. S.W. Lin, J.C. Chen, L.C. Hsu, C.L. Hsieh and A. Yoshida, Genomics 34 (1996) pp. 376-380.
3. G. Kurys, P.C. Shah, A. Kikonyogo, A. Reed, W. Ambroziak and R. Pietruszko, Eur. J. Biochem. 218 (1993) pp. 311-320.
4. R. Pietruszko, in Isozymes: Current Topics in Biological and Medical Research, eds. M.C. Rattazzi, J.G. Scandalios and G.S. Whitt (New York, Alan R. Liss, 1983) pp. 195-217.
5. E.M. Algar and R.S. Holmes, in Enzymology and Molecular Biology of Carbonyl Metabolism, eds. H. Weiner and T.G. Flynn (New York, Alan R. Liss, 1989) pp. 93-104.
6. W. Ambroziak and R. Pietruszko, J. Biol. Chem. 266 (1991) pp. 13011-13018.
7. P.A. Dockham, M-O. Lee and N.E. Sladek, Biochem. Pharmacol. 43 (1992) pp. 2453-2469.
8. N.J. Greenfield and R. Pietruszko, Biochim. Biophys. Acta 483 (1977) pp. 35-45.
9. H. Weiner, in Gene Families: Structure, Function, Genetics and Evolution, eds. R.S. Holmes and H.A. Lim (Singapore, World Scientific Press, 1996) pp. 87-94.
10. S-J. Yin, C-S. Liao, S-L. Wang, Y-J. Chen and C-W. Wu, Biochem. Genet. 27 (1989) pp. 321-332.
11. G. King and R.S. Holmes, Biochem. Mol. Biol. Intl. 31 (1993) pp. 49-63.
12. J. Hempel, H. Nicholas and R. Lindahl, J. Biol. Chem. 267 (1992) pp. 3030-3037.
13. L.C. Hsu, W-C. Chang, A. Shibuya and A. Yoshida, J. Biol. Chem. 267 (1992) pp. 3030-3037.
14. S.A. Moore, H.M. Baker, T.J. Blythe, K.E. Kitson, T.M. Kitson and E.N. Baker, Structure 6 (1998) pp. 1541-1551.
15. C.G. Steinmetz, P. Xie, H. Weiner and T.D. Hurley, Structure 5 (1997) pp. 710-711.
16. Z-J. Liu, Y-J. Sun, J. Rose, Y-J. Chung, C-D. Hsiao, W-R. Chang, I. Kuo, J. Perozich, R. Lindahl, J. Hempel and B-C. Wang, Nature Structural Biology 4 (1997) pp. 317-326.
17. R.S. Holmes and J.L. VandeBerg, Exp. Eye Res. 43 (1986) pp. 383-396.
18. R.S. Holmes, in Biomedical and Social Aspects of Alcohol and Alcoholism, eds. K. Kuriyama, A. Takada and H. Ishii (Amsterdam, Elsevier Science Publishers, 1988) pp. 51-57.
19. R.S. Holmes, B. Cheung and J.L. VandeBerg, Comp. Biochem. Physiol. 93B (1989) pp. 271-277.
20. M. Abedinia, T. Pain, E.M. Algar and R.S. Holmes, Exp. Eye Res. 51 (1990) pp. 419-426.
21. S. Evces and R. Lindahl, Arch. Biochem. Biophys. 274 (1989) pp. 518-529.
22. J. Downes and R.S. Holmes, Biochem. Mol. Biol. Intl. 30 (1993) pp. 525-535.
23. G. King and R.S. Holmes, in Enzymology and Molecular Biology of Carbonyl Metabolism 6, eds. H. Weiner, D.W. Crabb and T.G. Flynn (New York, Plenum Press, 1996) pp. 19-27.
24. G. King and R.S. Holmes, J. Exp. Zool. 282 (1998) pp. 12-17.
25. E.A. Boettner and J.R. Wolters, Invest. Ophthalmol. 1 (1962) pp. 776-783.
26. S. Zigman, Surv. Ophthalmol. 27 (1983) pp. 317-326.
27. W.W. de Jong, W. Hendriks, J.M.W. Mulders and H. Bloemendal, Trends in Biochem. Sci. 14 (1989) pp. 365-368.
28. E.R. Berman, in Biochemistry of the Eye (New York, Plenum Press, 1991) pp. 362-370.
29. H. Maisel, in The Ocular Lens (New York, Marcel Dekker, 1985).
30. J. Piatigorsky and G. Wistow, Cell 57 (1989) pp. 197-206.
31. R. van Heyningen, Nature 230 (1970) pp. 393-394.
32. R.B. Kurzel, M.L. Wolbarsht and B.S. Yamanashi, in Photochemical and Photobiological Reviews, ed. K.C. Smith (New York, Plenum Press, 1977) pp. 133-165.
33. W.S. Holt and Kinoshita, Invest. Ophthalmol. Vis. Sci. 12 (1973) pp. 114-126.
34. L. Zhu and R.K. Crouch, Cornea 11 (1992) pp. 567-572.
35. G. King, L. Hirst and R.S. Holmes, in Enzymology and Molecular Biology of Carbonyl Metabolism 7, eds. H. Weiner, E. Maser, D.W. Crabb and R. Lindahl (New York, Kluwer/Plenum Publishers, 1999) pp. 189-198.
36. C. Verhagen, R. Hoekzema, G.M. Verjans and G.M. Kijlstra, Exp. Eye Res. 53 (1991) pp. 283-284.
37. T.D. Gondhowiardjo, N.J. van Haeringen, H.J. Volker-Dieben, H.W. Beekhuis, J.H.C. Kok, G. van Rij, L. Pels and A. Kijlstra, Cornea 12 (1993) pp. 146-154.
38. D.G. Pitts, Amer. J. Optom. Physiol. Optics 55 (1978) pp. 19-35.
39. D.E. Jones, M.D. Brennan, J. Hempel and R. Lindahl, Proc. Natl. Acad. Sci. USA 85 (1988) pp. 1782-1786.
40. R.J. Cenedella, L.L. Linton and C.P. Moore, Biochem. Biophys. Res. Commun. 186 (1992) pp. 1647-1655.
41. M. Waxler, in Effects of Changes in Stratosphere Ozone and Global Climate, ed. J.G. Titus (Washington DC, US Environmental Agency, 1986) pp. 147-153.
42. L. Uma, J. Hariharan and D. Balasubramanian, Exp. Eye Res. 63 (1996) pp. 117-120.
43. J. Mitchell and R.J. Cenedella, Cornea 14 (1995) pp. 266-272.
44. H. Esterbauer, H. Zollner and R.J. Schaur, ISI Atlas of Science: Biochemistry 1 (1988) pp. 311-332.
45. K.C. Bhuyan and D.K. Bhuyan, Curr. Eye Res. 3 (1984) pp. 67-81.
46. F.J. Giblin and V.N. Reddy, Exp. Eye Res. 31 (1980) pp. 601-609.
47. J. Downes, J.L. VandeBerg and R.S. Holmes, Cornea 11 (1992) pp. 560-566.
48. J.E. Downes and R.S. Holmes, in Enzymology and Molecular Biology of Carbonyl Metabolism 5, eds. H. Weiner, D.W. Crabb and T.G. Flynn (New York, Plenum Press, 1995) pp. 349-354.
49. C.M. Rao and J.S. Zigler, Photochem. Photobiol. 56 (1992) pp. 523-528.
50. J.S. Zigler and C.M. Rao, FASEB J. 5 (1991) pp. 223-225.
51. S.J. Atherton, C. Lambert, J. Schultz, N. Williams and S. Zigman, Photochem. Photobiol. 70 (1999) pp. 823-828.
52. J. Dillon, L. Zheng, J.C. Merriam and E.R. Gillard, Photochem. Photobiol. 71 (2000) pp. 225-229.

X-CHROMOSOME INACTIVATION DURING SPERMATOGENESIS: THE ORIGINAL DOSAGE COMPENSATION MECHANISM IN MAMMALS?

JOHN R. McCARREY

Department of Genetics, Southwest Foundation for Biomedical Research, P.O. Box 760549, San Antonio, TX 78245, USA

Eutherian mammals utilize random X-chromosome inactivation during embryogenesis to achieve dosage compensation between XX female and XY male somatic cells. However, during gametogenesis the single X becomes inactivated in spermatocytes and both X chromosomes become active in oocytes. The function of these changes in X-chromosome activity in germ cells remains unclear. Marsupials use non-random inactivation of the paternal X to achieve dosage compensation in somatic cells. These observations are consistent with recent molecular analyses regarding the mechanism of X-chromosome inactivation in eutherians, and support the hypothesis that changes in X chromosome activity observed during gametogenesis in eutherians represent evolutionary remnants of an earlier mammalian dosage compensation mechanism which still operates in marsupials.

1 Introduction

In most mammals, sex determination is based on heteromorphic (XX/XY) sex chromosomes. This presents a potential problem regarding differential dosage of genes on the X chromosome that are not directly related to development of the sexual phenotype. Many animals, especially mammals, do not tolerate aneuploidy well [17]. Thus, it is not surprising that mechanisms have evolved to compensate for differences in dosages of X-linked genes. In all mammals studied to date, differential dosage of X-linked genes in XX females and XY males is compensated by inactivity of one of the two X chromosomes in each somatic cell in females. This mechanism, termed X-chromosome inactivation (XCI), results in expression of a single copy of all affected X-linked genes in somatic cells, regardless of the number of X chromosomes present in each cell [18,32].

Interestingly, meiotic germ cells represent the one exceptional cell type in which this dosage compensation mechanism is not utilized. Thus, in meiotic oocytes both X chromosomes are transcriptionally active, whereas in spermatocytes the single X chromosome is silenced. While proposals have been put forth concerning potential contributions of X-chromosome activity or inactivity to meiotic processes during gametogenesis [18,41], the function of this differential activity in these cells remains largely enigmatic. In the absence of any demonstrable function, an intriguing possibility is that in eutherian mammals inactivation of the single X chromosome during spermatogenesis and reactivation of the second X chromosome during oogenesis may remain as evolutionary remnants of a previously important mechanism that is no longer critically required. These observations are consistent with a hypothesis originally proposed by Cooper [12] that an ancestral mammalian dosage compensation mechanism was based upon differential X chromosome activity during gametogenesis, and that a mechanism of this sort still functions in metatherian (marsupial) mammals.

2 X-Chromosome Activity During Eutherian Development

The developmental scheme of XCI in eutherian ("placental") mammals is depicted in Figure 1. A paternal X chromosome is delivered to female embryos by the sperm in an initially inactive state (as are the autosomes contributed by the sperm), but it soon becomes transcriptionally active such that there are two active X chromosomes at the morula stage in female embryos. Then, at the late blastula stage, one of the two X chromosomes in each cell undergoes inactivation (reviewed in [18]). That same X chromosome is then retained in an inactive state in all daughter cells deriving from each embryonic cell in which XCI originally occurred [15]. The choice of which X chromosome (paternally or maternally derived) becomes inactivated is random in each cell of the embryo proper. In addition, a counting mechanism functions to ensure that all but one X becomes inactivated, leaving only a single active X in each cell. This mechanism achieves dosage compensation in all somatic cells and premeiotic germ cells deriving from these embryonic cells. The somatic cells retain this dosage compensated state for the duration of their life span. However, in females the germ cells, which enter meiotic prophase at the late embryonic/early fetal stage, reactivate the second X chromosome such that oocytes possess two transcriptionally active X chromosomes. This leads to transmission of an active maternal X chromosome to offspring of either sex, regardless of which X chromosome is segregated into the functional ovum. Thus male embryos inherit an active maternal X that remains active in all somatic cells and premeiotic germ cells. Following puberty in males, premeiotic spermatogonia begin to give rise to spermatocytes as spermatogenesis proceeds. Coincident with this event, the single X chromosome in these XY cells undergoes inactivation. This results in transcriptional silencing of all X-linked structural genes in primary spermatocytes. 
Interestingly, although XCI is random in cells of the embryo proper in eutherians, it is nonrandom in the extraembryonic trophectoderm. Thus it has been shown that XCI is imprinted in trophectoderm cells, such that the paternal X (Xp) is consistently inactivated, while the maternal X (Xm) remains active [50,58].
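The counting rule for random XCI in the embryo proper (one X stays active, every other X is silenced, and the choice is inherited by daughter cells) can be sketched in a few lines. This is a toy model for illustration only: the function names, the labels `Xm` and `Xp`, and the seeded random generator are assumptions of the sketch, not part of any published model, and imprinted XCI in the trophectoderm is deliberately not represented.

```python
import random


def inactivate(x_chromosomes, rng):
    """Pick one X at random to stay active; mark all others inactive.

    Returns a dict mapping each X label to True (active) or False.
    Labels must be distinct, e.g. ["Xm", "Xp"].
    """
    active = rng.choice(x_chromosomes)
    return {x: (x == active) for x in x_chromosomes}


def divide(cell_state):
    """Daughter cells inherit the parent's XCI state unchanged."""
    return dict(cell_state), dict(cell_state)


rng = random.Random(0)  # seeded only to make the sketch repeatable

female_cell = inactivate(["Xm", "Xp"], rng)      # XX: one of two stays active
male_cell = inactivate(["Xm"], rng)              # XY: the single X stays active
triple_x = inactivate(["X1", "X2", "X3"], rng)   # counting rule: n - 1 silenced

daughter_a, daughter_b = divide(female_cell)
print(female_cell, male_cell, triple_x)
```

In this sketch every cell ends up with exactly one active X whatever the number of X chromosomes supplied, which is the dosage-compensated state described for somatic cells.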

[Figure 1 diagram: columns Stage, Female, Male and Status; rows Gametes, Zygote, Gastrula - Adult (Somatic), Fetal Germ Cells and Adult Germ Cells.]

Figure 1: X-chromosome activity during eutherian development. The transcriptional activity states of X chromosomes in male and female eutherian mammals are depicted at various stages of development and gametogenesis. Open circles indicate transcriptionally active X chromosomes, diagonally hatched circles indicate transcriptionally inactive X chromosomes, horizontally hatched circles indicate Y chromosomes. The imprinted status of an active X chromosome in oocytes and an inactive X chromosome in sperm leads to non-random activity of the maternal X (Xm) in female embryos at the zygotic stage, followed by reactivation of the paternal X (Xp) by the morula stage when the imprinted state of X-chromosome activity is erased and a stage of X-chromosome "equivalency" is reached in cells of the embryo proper. Thus, at the morula stage male and female embryos are not dosage compensated, but this is rectified shortly thereafter at the blastula stage when random XCI produces a compensated state with a single active X chromosome in each cell which persists in somatic cells from the gastrula stage to adults. In female embryos, reactivation of the previously silent X occurs in oocytes, as the maternally imprinted state (Xm) is reestablished on both X chromosomes. Following puberty in males, the paternal imprint becomes established (Xp) and inactivation of the single active X occurs in spermatocytes, leading to a state that is not compensated for X-chromosome activity in meiotic germ cells.

3

X-Linked Gene Expression During Spermatogenesis in Eutherians

Molecular studies of different X-linked structural genes have revealed at least three different expression patterns during spermatogenesis (Table I). A common pattern affecting several X-linked housekeeping genes includes active expression in premeiotic spermatogonia followed by cessation of expression in meiotic spermatocytes which is maintained in postmeiotic spermatids. A second pattern involves expression in spermatogonia and inactivation in spermatocytes, followed by reactivation in spermatids. A third pattern shows initiation of expression in postmeiotic spermatids, with no previous expression in either spermatogonia or


J. R. McCarrey

spermatocytes. The common feature of all three patterns is inactivation in spermatocytes, suggesting that XCI during spermatogenesis may contribute to meiotic processes. Interestingly, it is also at the onset of meiosis that a change in X chromosome activity occurs during oogenesis. Thus both the X and Y chromosomes are transcriptionally inactive in spermatocytes, while both X chromosomes are transcriptionally active in oocytes. This has led to suggestions that changes in X chromosome activity may contribute to meiosis-specific processes [18,41].

Table I. Patterns of X-Linked Gene Expression During Eutherian Spermatogenesis.

Premeiotic Spermatogonia | Meiotic Spermatocytes | Postmeiotic Spermatids | Example X-linked Genes
+ | - | - | Pgk1, Pdha1, Phka, Zfx
+ | - | + | Ube1x
- | - | + | Akap82, Smage

4

Regulation of Eutherian XCI

Although XCI, originally known as the Lyon hypothesis [32], was first described nearly 40 years ago as a mechanism to achieve dosage compensation in mammals, it has only been in recent years that some insight has been obtained into the molecular mechanism underlying this phenomenon. Studies of X chromosome deletions and X/autosome translocations defined a region on the X chromosome termed the X-inactivation center (Xic) to which other portions of the X chromosome must remain physically attached in cis if they are to undergo XCI [11,30,46]. Alleles at a locus known as the X-controlling element (Xce), which maps within the Xic, were shown to differentially influence the randomness of inactivation of maternal or paternal X chromosomes (reviewed in [10]). More recently it was shown that a gene termed X-inactive specific transcripts (Xist), which also maps to the Xic [4,8], but is distinct from the Xce [50], is transcribed exclusively from the inactive X chromosome in somatic cells in mice and humans, and is believed to act as a primary regulator of XCI [6]. It is still not known how the Xist gene, which produces an RNA that does not encode a protein, but rather remains in the nucleus coating the inactive X chromosome [7,9], acts to initiate XCI. However, knock-out experiments have demonstrated that expression of the Xist gene is indeed required for the initiation of XCI in embryonic cells of the mouse [34,43]. Thus, this gene appears to be involved in the basic mechanism of XCI in eutherian mammals. The knock-out experiments also demonstrated that while disruption of the transcribed portion of the Xist gene precluded inactivation of the X chromosome on


which the disrupted Xist allele resides, it did not impair the counting mechanism, since a second X chromosome bearing a normal Xist gene underwent nonrandom XCI in every cell. This is despite previous demonstrations that the cell counts X chromosomes by counting Xic's [44]. While the precise molecular mechanism responsible for counting remains obscure, it appears that it works by regulating expression of Xist genes such that all but one are ultimately expressed, leading to inactivation of all but one of the X chromosomes present in each cell [40]. Xist expression leading to the random inactivation seen in the embryo proper of eutherian species is initiated at a slightly later stage of embryogenesis than that in the trophectoderm, and is believed to occur after the imprint on the Xist genes has been lost. More recent results indicate that initiation of XCI in embryonic cells is mediated by changes in Xist RNA stability [42,49]. Thus there is an initial period when both homologues of Xist simultaneously express a relatively unstable RNA species. This corresponds to the period when both X chromosomes are transcriptionally active at the morula stage, and is followed by a switch to expression of a long-lived, stable RNA from the Xist gene on the X chromosome that will become inactivated, and subsequently by extinction of transcription of any RNA from the Xist gene on the X chromosome that will remain active. Initial results suggested that the unstable and stable RNAs were produced from different promoters of the Xist gene, and that a "developmentally regulated promoter switch" might mediate initiation of XCI [21]. However, more recent results indicate that expression of unstable "sense" (Xist) transcripts from both active X chromosomes in the early embryo is followed by expression of an "antisense" (Tsix) transcript from the Xist locus on the X chromosome that will remain active [29]. 
Tsix expression appears to lead to extinction of Xist expression, and this allows that X chromosome to remain active. In the absence of antisense Tsix, a stable Xist transcript is expressed from the remaining X chromosome(s) which initiates the inactivation process [34]. Regardless of the molecular mechanism involved, the transient period of equal expression of unstable RNA from both Xist homologues appears to correspond to the period at which any inherited imprints on the two Xist alleles are lost, and thus represents a period of equivalency [1,40]. This is then followed rapidly by implementation of the counting mechanism leading to transcription of a stable RNA from all but one Xist homologue, and hence inactivation of all but one X chromosome. Interestingly, a period of equivalent expression of unstable Xist RNAs from both homologues has not been observed in cells of the trophectoderm where non-random, imprinted XCI occurs (Ref [21] and N. Brockdorff, personal communication). Since DNA methylation appears to be involved in regulating imprinted expression of Xist genes in the trophectoderm, and since alleles of Xist become differentially methylated on the active and inactive X chromosomes in the embryo proper, it has been suggested that this could represent a manifestation of the counting mechanism [14,28,40]. If the relevant regulatory methylation patterns are established in the upstream 5' portion of the Xist gene, this could explain how


ablation of only a small portion of the 5'-flanking region plus a significant portion of the transcribed region of this gene led to disruption of inactivation, but not of the counting mechanism in the knock-out experiments. It has also been suggested that an element internal to the Xist gene is involved in choosing the X to be inactivated and that when this element is eliminated from one X, the wild-type X will be non-randomly inactivated [33]. Recently it was shown that both strands of the Xist locus are transcribed at various stages during development, such that both the previously described Xist transcript and its antisense "Tsix" transcript are produced [29,56]. The function of the Tsix transcript in regulating XCI and/or Xist function is yet to be clarified. DNA methylation has also been strongly implicated in stabilization of the inactivated state of the X chromosome in somatic cells in eutherian species [17,18,37]. The efficacy of this stabilizing mechanism is evident in human females in whom XCI normally remains stable for their entire life span of up to 100 years or more. That methylation contributes to this stability is evidenced by treatment of human female somatic cells with a demethylating agent which results in the precocious loss of XCI and aberrant expression of genes on the previously inactive X [25]. In addition to DNA methylation, histone deacetylation has been shown to be a potentially stabilizing mechanism contributing to a repressive, condensed chromatin structure on the inactive X in eutherian somatic cells [20]. Indeed, histone acetylation has now been shown to facilitate a general mechanism of transcriptional regulation [19]. A mechanistic relationship has been deduced in eutherian cells between DNA hypermethylation, histone deacetylation and chromatin condensation.
Thus, methylated DNA binds the methyl-CpG-binding protein MeCP2 which, in turn, associates with histone deacetylases [39] leading to deacetylation of histones in chromatin in the methylated domain and subsequent condensation of chromatin structure, and hence, repression of transcription [26]. This effect can be manifest in either a global chromosomal manner as is the case for the eutherian inactive X chromosome [20], or on an individual gene basis as is the case for certain tissue-specific genes [16]. This is relevant because certain X-linked structural genes, especially in humans, escape inactivation [18]. Clearly there are important molecular differences distinguishing X-linked loci that undergo XCI and maintain their inactive status from those that either escape XCI or undergo a rapid reactivation. Unlike the situation in female somatic cells, XCI in spermatogenic cells and that in oogonia prior to reactivation of the second X chromosome are transient events. Thus, it is perhaps not surprising that, also unlike the situation in somatic cells, the inactivated X in germ cells is not stabilized by DNA methylation [35] or histone deacetylation [2]. Expression of the Xist gene has also been shown to occur prior to and during XCI in spermatogenesis in eutherians [36]. However, disruption of this gene does not appear to impair spermatogenesis [34] nor inactivation of X-linked structural genes during spermatogenesis (unpublished results). Thus, in eutherians, XCI during spermatogenesis may represent a critically required event during male


meiosis, or it may be an artifact of this process. The latter concept is supported by the observation that, during first meiotic prophase in spermatocytes, the sex chromosomes are sequestered in the "sex vesicle" or "sex body," an ill-defined, non-membrane bound compartment from which both transcriptional complexes, including RNA polymerase II, and RNA splicing components appear to be excluded [3].

5

X-Chromosome Activity and Dosage Compensation in Marsupials

Marsupials routinely display nonrandom activity of the maternal X chromosome (Xm) and inactivity of the paternal X chromosome (Xp) in somatic cells of adult females [13,54]. Such a phenotype could result either from the inheritance of an inactive Xp produced by a stable XCI event during spermatogenesis as depicted in Figure 2, or through an imprinted XCI event during embryogenesis. That the former mechanism may be operating in marsupials is supported by the absence to date of any reported observation of two active X chromosomes in a single cell at any stage of development in female marsupial embryos. Thus previous studies based on X-linked isozyme patterns and X-chromosome cytology showed the presence of an inactive Xp in the bilaminar blastocyst which corresponds to the eutherian blastula, and in the trilaminar blastocyst which corresponds to the eutherian gastrula [22,23,45,54]. No published data currently exist regarding X-chromosome activity or inactivity during spermatogenesis in any marsupial species. Pedigree data from several species of Australian Macropodids [52], and two species of American opossums, Didelphis virginiana [47] and Monodelphis domestica [53], indicate that both the Xp and the Xm in oocytes are potentially active when transmitted to progeny in marsupials. Limited evidence based on studies of X-chromosome encoded isozymes in one species of kangaroo suggests that the Xp is inactive in oogonia, and possibly in early oocytes as well [5,24]. Very recent data based on molecular analysis of polymorphic mRNAs showed that the paternal X chromosome in kangaroo oocytes does indeed undergo reactivation, but at a much later stage than in eutherian oocytes [57]. XCI appears to be less stable in marsupial cells than it is in eutherians. Thus reactivation of previously inactive X-linked genes has been observed, such that both the maternal and paternal alleles are often expressed within the same cell [48].
Interestingly, there appear to be locus-specific differences in the likelihood of reactivation of X-linked genes in marsupials. Thus, while biallelic expression of the X-linked G6pd gene [47] and also the Hprt gene [38] (indicating unstable inactivation at these loci) has been frequently observed in Didelphis, another X-linked gene, Pgk-1, consistently showed monoallelic expression (indicating stable inactivation at this locus) in the same species [47]. This reflects the absence in genes on the inactive X in marsupials of the hypermethylation that stabilizes the repressed transcriptional state of the inactive X in eutherians. One X-linked gene,


G6pd, has been examined in particular detail in both Didelphis and the wallaroo (Macropus robustus), and shows no hypermethylation on the inactive X [27,31]. Other parameters associated with transcriptional repression of genes on the eutherian inactive X, including retardation in replication timing [6], condensed chromatin structure [54], and histone deacetylation [55] have been shown to be associated with the marsupial inactive X as well. Despite significant efforts, no homologue of the Xist gene, nor any other gene that might regulate XCI, has been identified in any marsupial species to date.

[Figure 2 diagram. Title: Proposed X-chromosome activity during marsupial development. Columns: Stage, Female, Male, Status. Stage labels: Gametes; Zygote; Morula; Gastrula - Adult (Somatic); Fetal Germ Cells; Adult Germ Cells. Status labels: Active X in Egg, Inactive X in Sperm; Non-Random; Non-Random; Non-Random; X Reactivation in Oocytes; X Inactivation in Spermatocytes.]

Figure 2: Proposed X-chromosome activity during marsupial development. The proposed transcriptional activity states of X chromosomes in male and female metatherian (marsupial) mammals are depicted at various stages of development and gametogenesis. Open circles indicate transcriptionally active X chromosomes, diagonally hatched circles indicate transcriptionally inactive X chromosomes, horizontally hatched circles indicate Y chromosomes. It is proposed that XCI during spermatogenesis and X-chromosome reactivation during oogenesis lead to imprinted, non-random activity of the maternal X (Xm) and inactivity of the paternal X (Xp) in female embryos at the zygotic stage, and that this state is maintained as such thereafter in all somatic cells. Reactivation of the paternal X during oogenesis is proposed to coincide with establishment of a maternal imprint on this X in oocytes, while inactivation of the maternally-derived X during spermatogenesis establishes the paternal imprint on this X in spermatocytes.

6

Hypothesis

As described above and in Figure 2, and as originally proposed by Cooper [12], the nonrandom XCI phenomenon observed in marsupials is most easily attributed to a mechanism based on stable XCI during spermatogenesis and X-chromosome


reactivation during oogenesis. These events alone could lead to the production of female embryos with one inactive (Xp) and one active (Xm) homologue, and would thus not require any XCI event in the embryo to achieve dosage compensation. Mechanistically this is much easier to conceive of than is the random XCI process that operates in eutherians. Two diagnostic features of this mechanism that would distinguish it from that operating in eutherians would be the stable inactivation of the single X in spermatogenic cells such that the transcriptionally repressed state would persist throughout spermatogenesis and early embryogenesis, and the lack of an XCI event during embryogenesis. Two predicted effects of such a mechanism would be the nonrandom activity of the Xm as is found in marsupials, and the absence of any counting mechanism to preclude supernumerary activity of multiple Xm chromosomes, which has yet to be investigated in marsupials. These characteristics could vary in the extent to which they confer evolutionary advantage or disadvantage relative to the mechanism currently operating in eutherians. Nonrandom XCI results in a functional hemizygosity effect, since only genes on the Xm will be expressed, except in those cases in which genes on the Xp undergo reactivation — a possible selective influence favoring the relatively high incidence of such reactivation observed in marsupials [47]. Nevertheless, this remains a potential problem for those genes that typically remain stably repressed on the marsupial inactive X. The absence of a counting mechanism would lead to functional aneuploidy if more than one Xm were inherited by a single embryo. This could lead to embryonic lethality, and hence an advantageous early elimination of probable non-productive, sterile individuals from the population. 
On the other hand, in the case of XX/XXX mosaics, a counting mechanism of the sort that exists in eutherians could facilitate embryonic viability such that the euploid XX oogenic cells could then be used for reproduction. The disadvantages associated with nonrandom, non-counted XCI could have favored the evolution of additional mechanistic features leading to the random, counted XCI mechanism found in eutherian species. This could be achieved by reactivation of the Xp in the early embryo to achieve a stage of "equivalency" between the Xp and Xm, followed by counting and random inactivation of all but a single X chromosome in each cell in the late blastula. Indeed this may be mediated by the promoter switching mechanism described in the mouse Xist gene that mediates a switch from equivalent, biallelic expression of unstable Xist RNAs from both X chromosomes in early embryonic cells to monoallelic expression of a stable Xist RNA from the X to be inactivated in later embryonic cells. As noted above, it is at this stage of equivalent, biallelic expression that the counting and random XCI mechanisms function in eutherians. Since biallelic expression of the unstable Xist RNA is mediated by transcription from an upstream promoter, I would suggest that it was the addition of this promoter, which is therefore likely unique to the Xist gene in eutherian species, that allowed eutherians to superimpose counting and random inactivation mechanisms onto the preexisting, more primitive, nonrandom dosage compensation mechanism.
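The contrast drawn in this section between a counted, random mechanism and an imprinted, non-counted mechanism can be illustrated with a toy calculation. The Python sketch below is only an illustration of the argument, not a model taken from the cited literature. The function names, the "m"/"p" encoding of parental origin, and the example karyotypes are all assumptions introduced here.

```python
from typing import Dict, List


def active_x_counted(x_origins: List[str]) -> int:
    """Counted model (assumed to stand for the eutherian embryo proper):
    all but one X chromosome are inactivated, whatever their parental origin."""
    return 1 if x_origins else 0


def active_x_imprinted(x_origins: List[str]) -> int:
    """Imprinted, non-counted model (assumed to stand for the proposed
    marsupial mechanism): every paternal X ("p") is inherited inactive and
    every maternal X ("m") stays active, with no counting step."""
    return sum(1 for origin in x_origins if origin == "m")


# Hypothetical karyotypes, listing only the parental origin of each X.
KARYOTYPES: Dict[str, List[str]] = {
    "XmXp": ["m", "p"],
    "XmY": ["m"],
    "XmXmXp": ["m", "m", "p"],
    "XmXpXp": ["m", "p", "p"],
}

if __name__ == "__main__":
    for label, origins in KARYOTYPES.items():
        counted = active_x_counted(origins)
        imprinted = active_x_imprinted(origins)
        print(f"{label}: counted={counted}, imprinted={imprinted}")
```

Under these assumptions both models leave one active X in XmXp and XmY cells, whereas only the non-counted model leaves two active X chromosomes in an XmXmXp cell, which corresponds to the functional aneuploidy discussed above.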


If this hypothesis is valid, then the XCI event we observe during spermatogenesis in extant eutherian species may not represent an indispensable mechanism in those species, but could simply be an evolutionary remnant of what once was an indispensable mechanism of dosage compensation that operated in predecessors of eutherian mammals. Confirmation of this hypothesis will require experiments to test the predictions that in marsupials there is a stable XCI event during spermatogenesis and no XCI event during embryogenesis, along with additional experiments showing that XCI is not strictly required for normal spermatogenesis in eutherians. However, it is also possible that while XCI during spermatogenesis originally served to establish dosage compensation in female offspring, it has also become an important part of the meiotic process and may still be indispensable in eutherians.

References

1. Ariel, M., Robinson, E., McCarrey, J. and Cedar, H., Gamete specific methylation correlates with imprinting of the murine Xist gene. Nature Genetics 9 (1995) pp. 312-315.
2. Armstrong, S.J., Hulten, M.A., Keohane, A.M., and Turner, B.M., Different strategies of X-inactivation in germinal and somatic cells: histone H4 underacetylation does not mark the inactive X chromosome in the mouse male germline. Exp. Cell Res. 230 (1997) pp. 399-402.
3. Ayoub, N., Richler, C., and Wahrman, J., Xist RNA is associated with the transcriptionally inactive XY body in mammalian male meiosis. Chromosoma 106 (1997) pp. 1-10.
4. Borsani, G., Tonlorenzi, R., Simmler, M.C., Dandolo, L., Arnaud, D., Capra, V., Grompe, M., Pizzuti, A., Muzny, D., Lawrence, C., Willard, H.F., Avner, P., and Ballabio, A., Characterization of a murine gene expressed from the inactive X chromosome. Nature 351 (1991) pp. 325-328.
5. Briscoe, D.A., Robinson, E.S., and Johnston, P.G., Glucose-6-phosphate dehydrogenase and lactate dehydrogenase activity in kangaroo and mouse oocytes. Comp. Biochem. Physiol. 75B (1983) pp. 685-688.
6. Brockdorff, N., Ashworth, A., Kay, G.F., Cooper, P., Smith, S., McCabe, V.M., Norris, D.P., Penny, G.D., Patel, D., and Rastan, S., Conservation of position and exclusive expression of mouse Xist from the inactive X chromosome. Nature 351 (1991) pp. 329-331.
7. Brockdorff, N., Ashworth, A., Kay, G.F., McCabe, V.M., Norris, D.P., Cooper, P.J., Swift, S., and Rastan, S., The product of the mouse Xist gene is a 15 kb inactive X-specific transcript containing no conserved ORF and located in the nucleus. Cell 71 (1992) pp. 515-526.
8. Brown, C.J., Ballabio, A., Rupert, J.L., Lafreniere, R.G., Grompe, M., Tonlorenzi, R., and Willard, H., A gene from the region of the human X inactivation center is expressed exclusively from the inactive X chromosome. Nature 349 (1991) pp. 38-44.
9. Brown, C.J., Hendrich, B.D., Rupert, J.L., Lafreniere, R.G., Xing, Y., Lawrence, J., and Willard, H.F., The human XIST gene: analysis of a 17 kb inactive X-specific RNA that contains conserved repeats and is highly localized within the nucleus. Cell 71 (1992) pp. 527-542.
10. Cattanach, B.M., Control of X-chromosome inactivation. Annu. Rev. Genet. 9 (1975) pp. 1-18.
11. Cattanach, B.M., Rasberry, C., Evans, E.P., Dandolo, L., Simmler, M.C., and Avner, P., Genetic and molecular evidence of an X-chromosome deletion spanning the tabby (Ta) and testicular feminization (Tfm) loci in the mouse. Cytogenet. Cell Genet. 56 (1991) pp. 137-143.
12. Cooper, D.W., Directed genetic change model for X chromosome inactivation in eutherian mammals. Nature 230 (1971) pp. 292-294.
13. Cooper, D.W., VandeBerg, J.L., Sharman, G.B., and Poole, W.E., Phosphoglycerate kinase polymorphism in kangaroos provides further evidence for paternal X inactivation. Nature New Biol. 230 (1971) pp. 155-157.
14. Courtier, B., Heard, E., and Avner, P., Xce haplotypes show modified methylation in a region of the active X chromosome lying 3' to Xist. Proc. Natl. Acad. Sci. USA 92 (1995) pp. 3531-3535.
15. Davidson, R.G., Nitowsky, H.M., and Childs, B., Demonstration of two populations of cells in the human female heterozygous for glucose-6-phosphate dehydrogenase variants. Proc. Natl. Acad. Sci. USA 50 (1963) pp. 481-485.
16. Eden, S., Hashimshony, T., Keshet, I., Cedar, H., and Thorne, A.W., DNA methylation models histone acetylation. Nature 394 (1998) pp. 842.
17. Gartler, S.M. and Riggs, A.D., Mammalian X-chromosome inactivation. Annu. Rev. Genet. 17 (1983) pp. 155-190.
18. Heard, E., Clerc, P., and Avner, P., X-chromosome inactivation in mammals. Annu. Rev. Genet. 31 (1997) pp. 571-610.
19. Imhof, A. and Wolffe, A.P., Transcription: gene control by targeted histone acetylation. Curr. Biol. 8 (1998) pp. R422-R424.
20. Jeppesen, P. and Turner, B.M., The inactive X chromosome in female mammals is distinguished by a lack of histone H4 acetylation, a cytogenetic marker for gene expression. Cell 74 (1993) pp. 281-289.
21. Johnston, C.M., Nesterova, T.B., Formstone, E.J., Newall, A.E.T., Duthie, S.M., Sheardown, S.A. and Brockdorff, N., Developmentally regulated Xist promoter switch mediates initiation of X inactivation. Cell 94 (1998) pp. 809-817.
22. Johnston, P.G., Dean, D., VandeBerg, J.L., and Robinson, E.S., HPRT activity in embryos of a South American opossum Monodelphis domestica. Reprod. Fertil. Dev. 6 (1994) pp. 529-532.
23. Johnston, P.G., and Robinson, E.S., Glucose-6-phosphate dehydrogenase expression in heterozygous kangaroo embryos and extra-embryonic membranes. Genet. Res. 45 (1985) pp. 205-208.
24. Johnston, P.G., Robinson, E.S., and Johnston, D.M., Dictyate oocytes of a kangaroo (Macropus robustus) show paternal inactivation at the X-linked Gpd locus. Aust. J. Biol. 38 (1985) pp. 79-84.
25. Jones, P.A., Taylor, S.M., Mohandas, T., and Shapiro, L., Cell cycle-specific reactivation of an inactive X-chromosome locus by 5-azadeoxycytidine. Proc. Natl. Acad. Sci. USA 79 (1982) pp. 1215-1219.
26. Jones, P.L., Veenstra, G.J., Wade, P.A., Vermaak, D., Kass, S.U., Landsberger, N., Strouboulis, J., and Wolffe, A.P., Methylated DNA and MeCP2 recruit histone deacetylase to repress transcription. Nat. Genet. 19 (1998) pp. 187-191.
27. Kaslow, D.C. and Migeon, B., DNA methylation stabilizes X chromosome inactivation in eutherians but not in marsupials: evidence for multistep maintenance of mammalian X dosage compensation. Proc. Natl. Acad. Sci. USA 84 (1987) pp. 6210-6214.
28. Kay, G.F., Barton, S.C., Surani, M.A., and Rastan, S., Imprinting and X chromosome counting mechanisms determine Xist expression in early mouse development. Cell 77 (1994) pp. 639-650.
29. Lee, J.T., Davidow, L.S., and Warshawsky, D., Tsix, a gene antisense to Xist at the X-inactivation centre. Nat. Genet. 21 (1999) pp. 400-404.
30. Leppig, K.A., Brown, C.J., Bressler, S.L., Gustashaw, K., Pagon, R.A., Willard, H.F. and Disteche, C.M., Mapping of the distal boundary of the X-inactivation center in a rearranged X chromosome from a female expressing XIST. Hum. Mol. Genet. 2 (1993) pp. 883-888.
31. Loebel, D.A.F. and Johnston, P.G., Methylation analysis of a marsupial X-linked CpG island by bisulfite genomic sequencing. Genome Res. 6 (1996) pp. 114-123.
32. Lyon, M.F., Gene action in the X chromosome of the mouse (Mus musculus). Nature 190 (1961) pp. 373.
33. Marahrens, Y., Loring, J., and Jaenisch, R., Role of the Xist gene in X chromosome choosing. Cell 92 (1998) pp. 657-664.
34. Marahrens, Y., Panning, B., Dausman, J., Strauss, W., and Jaenisch, R., Xist-deficient mice are defective in dosage compensation but not spermatogenesis. Genes & Devel. 11 (1997) pp. 156-166.
35. McCarrey, J.R., Berg, W.M., Paragioudakis, S.J., Zhang, P.L., Dilworth, D.D., Arnold, B.L., and Rossi, J.J., Differential transcription of Pgk genes during spermatogenesis in the mouse. Developmental Biology 154 (1992) pp. 160-168.
36. McCarrey, J.R. and Dilworth, D.D., Expression of Xist in mouse germ cells correlates with X-chromosome inactivation. Nature Genetics 2 (1992) pp. 200-203.
37. Migeon, B.R., X-chromosome inactivation: molecular mechanisms and genetic consequences. TIG 10 (1994) pp. 230-235.
38. Migeon, B.R., DeBeur, S.J. and Axelman, J., Frequent derepression of G6pd and Hprt on the marsupial inactive X chromosome associated with cell proliferation in vitro. Exp. Cell Res. 182 (1989) pp. 597-609.
39. Nan, X., Ng, H.H., Johnson, C.A., Laherty, C.D., Turner, B.M., Eisenman, R.N., and Bird, A., Transcriptional repression by the methyl-CpG-binding protein MeCP2 involves a histone deacetylase complex. Nature 393 (1998) pp. 386-389.
40. Norris, D.P., Patel, D., Kay, G.F., Penny, G.D., Brockdorff, N., Sheardown, S.A., and Rastan, S., Evidence that random and imprinted Xist expression is controlled by preemptive methylation. Cell 77 (1994) pp. 41-51.
41. Odorisio, T., Mahadevaiah, S.K., McCarrey, J.R. and Burgoyne, P.S., Transcriptional analysis of the candidate spermatogenesis gene Ube1y and of the closely related Ube1x shows that they are co-regulated in spermatogenic cells. Dev. Biol. 180 (1996) pp. 336-343.
42. Panning, B., Dausman, J., and Jaenisch, R., X chromosome inactivation is mediated by Xist RNA stabilization. Cell 90 (1997) pp. 907-916.
43. Penny, G.D., Kay, G.F., Sheardown, S.A., Rastan, S., and Brockdorff, N., Requirement for Xist in X chromosome inactivation. Nature 379 (1996) pp. 131-137.
44. Rastan, S., Non-random X-chromosome inactivation in mouse X-autosome translocation embryos: location of the inactivation center. J. Embryol. Exp. Morphol. 78 (1983) pp. 1-22.
45. Robinson, E.S., Samollow, P.B., VandeBerg, J.L., and Johnston, P.G., X-chromosome replication patterns in adult, newborn, and prenatal opossums. Reprod. Fertil. Dev. 6 (1994) pp. 533-540.
46. Russell, L.B., X-autosome translocations in the mouse: their characterization and use as tools to investigate gene inactivation and gene action. In: Cytogenetics of the Mammalian X Chromosome, Part A: Basic Mechanisms of X Chromosome Behavior. Prog. Top. Cytogenet. v. 3A. Sandberg, A.A. (ed) (Liss, New York, 1983) pp. 205-250.
47. Samollow, P.B., Ford, A.L., and VandeBerg, J.L., X-linked gene expression in the Virginia opossum: differences between the paternally derived Gpd and Pgk-A loci. Genetics 115 (1987) pp. 185-195.
48. Samollow, P.B., Robinson, E.S., Ford, A.L., and VandeBerg, J.L., Developmental progression of Gpd expression from the inactive X chromosome of the Virginia opossum. Devel. Genet. 16 (1995) pp. 367-378.
49. Sheardown, S.A., Duthie, S.M., Johnston, C.M., Newall, A.E.T., Formstone, E.J., Arkell, R.M., Nesterova, T.B., Alghisi, G.-C., Rastan, S., and Brockdorff, N., Stabilization of Xist RNA mediates initiation of X chromosome inactivation. Cell 91 (1997) pp. 99-107.
50. Simmler, M.C., Cattanach, B.M., Rasberry, C., Rougeulle, C. and Avner, P., Mapping the murine Xce locus with (CCA)n repeats. Mamm. Genome 4 (1993) pp. 523-530.
51. Takagi, N., and Sasaki, M., Preferential inactivation of the paternally derived X chromosome in the extraembryonic membranes of the mouse. Nature 256 (1975) pp. 640-642.
52. VandeBerg, J.L., Developmental aspects of X chromosome inactivation in eutherian and metatherian mammals. J. Exp. Zool. 228 (1983) pp. 271-286.
53. VandeBerg, J.L., Johnston, P.G., Cooper, D.W., and Robinson, E.S., X-chromosome inactivation and evolution in marsupials and other mammals. Isozymes Curr. Top. Biol. Med. Res. 9 (1983) pp. 201-218.
54. VandeBerg, J.L., Robinson, E.S., Samollow, P.B., and Johnston, P.G., X-linked gene expression and X-chromosome inactivation: marsupials, mouse, and man compared. Isozymes: Curr. Topics Biol. Med. Res. 15 (1987) pp. 225-253.
55. Wakefield, M.J., Keohane, A.M., Turner, B.M., and Marshall-Graves, J.A., Histone underacetylation is an ancient component of mammalian X chromosome inactivation. Proc. Natl. Acad. Sci. USA 94 (1997) pp. 9665-9668.
56. Warshawsky, D., Stavropoulos, N., and Lee, J.T., Further examination of the Xist promoter-switch hypothesis in X inactivation: Evidence against the existence and function of a P0 promoter. Proc. Natl. Acad. Sci. USA 96 (1999) pp. 14424-14429.
57. Watson, D., Whicker, A., Loebel, D.A., Robinson, E.S., and Johnston, P.G., Single nucleotide primer extension (SNuPE) analysis of the G6PD gene in somatic cells and oocytes of a kangaroo (Macropus robustus). Genetical Res. In press.
58. West, J.D., Freis, W.I., Chapman, V.M., and Papaioannou, V.E., Preferential expression of the maternally derived X chromosome in the mouse yolk sac. Cell 12 (1977) pp. 873-882.

MOLECULAR EVOLUTION AND ENVIRONMENTAL-STRESS

E. NEVO

Institute of Evolution, University of Haifa, Haifa 31905, Israel

E-mail: nevo@research.haifa.ac.il

The major unresolved problem haunting evolutionary genetics is how much of genome organization and genetic diversity at the protein, DNA, and chromosomal levels is adaptive, processed by natural selection, and contributes to differential fitness. Rephrased, what is the relative importance of selective (natural selection) and seemingly nonselective (mutation pressure in the broadest sense, recombination, migration and genetic drift) factors in genome evolution? Since 1974 the Institute of Evolution at the University of Haifa has tested genetic diversity and spatiotemporal divergence in diverse and numerous natural populations of plants, fungi, and animals at local, regional, and global scales. The results indicate that protein and DNA diversities are heavily structured, vary nonrandomly in all populations, are positively correlated with ecological heterogeneity and stress, and are often negatively correlated with population size. Natural selection appears to be the main evolutionary driving force, overriding other forces in orienting evolution. Neutrality and near-neutrality theories of molecular evolution are rejected by the evidence as prime movers of evolution. Spatiotemporal abiotic and biotic environmental heterogeneity and stress can orient genome architecture and maintain genetic diversity in nature. Modern molecular ecological genomics can highlight the evolution of genetic polymorphism and genome evolution.

1 Introduction

1.1 Evolutionary genetics, environmental heterogeneity and stress

Growing evidence indicates that environmental diversity, or niche-width [55], and stress [3,10] have a profound influence, through natural selection, on molecular and organismal evolution [2,4,5,8,25,26,31,39] in the fossil and living biota. Climatic diversity, change, and stress are among the major determinants of evolutionary change. Likewise, thermal and chemical stresses are substantial in molecular dynamics. Here, I will briefly overview the research program on genetic diversity at the molecular, protein, and DNA levels, chromosomal patterns, and genome size conducted since 1974 at the Institute of Evolution, University of Haifa, Israel, aiming at shedding light on the enigma of genetic diversity and genome evolution in nature (Fig. 1). For statistical analyses and discrimination between alternative hypotheses, see the individual cited papers.

We have studied the structure and dynamics of genetic diversity and divergence at the protein (isozyme) and DNA levels in natural populations of diverse plants and animals across phylogeny at macro- and microgeographic scales, affected by climatic, thermal, and chemical stresses [reviewed in 26-31, 36]. Our ecological theaters involved, foremost, the entire area of Israel as a natural laboratory [30,32,34], characterized by two gradients of increasing aridity, southward towards the Negev Desert and eastward towards the Rift Valley and the Jordanian and Syrian Deserts, in both above- and underground organisms. Secondly, we studied several microscale settings, including edaphic, topographic, and climatic stresses in both plants and animals with varied mating systems. Thirdly, we tested global patterns, primarily based on literature survey. I will first briefly overview the older and new evidence at all scales, global, regional and local, emphasizing the new evidence at the DNA level, then review theoretical models that might explain genetic diversity and divergence in nature.

[Figure 1 diagram: "Genetic diversity studies at the Institute of Evolution, Haifa University, Israel, 1971-1999". Genetic diversity (397). Scale: microgeographic (72), macrogeographic (197). Stresses: thermal (2), chemical (42), climatic (123). Genetic probe: isozymes (181), DNA (90). Nature (248), laboratory (250). Organisms: humans (41), animals (176), plants (155), wild progenitors (144). Evolution (274), theoretical (69), speciation (68), adaptation (220), genetic mapping (28). Applied (110): (I) breeding programs of cereals (82); (II) genetic monitoring: quality of marine environment (28).]

Figure 1. Research programme of genetic diversity at the Institute of Evolution, University of Haifa, Israel, categorized according to organism, stress, scale, and genetic probe. Numbers in parentheses indicate the number of publications in each category [36].

2 Evidence

2.1 Global analysis of genetic polymorphism

We have used the entire globe as a large-scale ecological-genetic laboratory. We analyzed globally the correlates of biotic factors involving ecological, demographic, and life history variables with the level of genetic diversity in natural populations of animals and plants [39]. This review involved 1111 species studied for allozymic variation for an average of 23 gene loci each, and a biotic profile characterized by 21 variables (7 ecological, 5 demographic, and 9 life history and other biological characteristics). The patterns and correlates of genetic diversity revealed in the global analysis over many unrelated species, subdivided into different abiotic and biotic regimes, strongly implicate selection in the genetic differentiation of species. Natural selection in several forms, but most likely through the mechanisms of spatiotemporally varying environments at the various life cycle stages of organisms, appears to be an important evolutionary force causing change at the molecular level in many species. Other evolutionary forces, including mutation, migration, and genetic drift, certainly interact with natural selection, either directly or indirectly, and thereby contribute differentially, according to circumstance, to population genetic differentiation at the molecular level. The final orientation of the evolutionary process, however, is determined by natural selection.

2.2 Regional analysis of genetic polymorphism in Israel and the Near East

Our regional analyses of genetic diversity extended over Israel and the Near East. We have used Israel, with its remarkable physical and biotic diversities, as a medium- to large-scale ecological-genetic laboratory [30-32,34,36] and conducted an ecological test of protein polymorphism in 13 unrelated genera of plants, invertebrates and vertebrates, involving 21 species, 142 populations and 5474 individuals [37] (Fig. 2), following our extensive studies of allozyme diversity in 38 species in Israel [27]. Each individual, population, and species was tested, on average, for 27 enzymatic gene loci. These species varied in population size and structure, life histories and biogeographic origins, but they largely share geographically short (260 km) and ecologically stressful gradients of increasing aridity in Israel, both eastward, towards the Syrian and Jordanian deserts, and (mainly) southward, towards the Negev desert. We found genetic parallelism across most taxa and most loci. Observed average heterozygosity, H, and gene diversity, He, were positively and overall significantly correlated with rainfall variation, which is lowest in the northwest and highest in the southern deserts, i.e., with a wider, fluctuating climatic temporal niche in or near the desert [37] (Fig. 2). Similar trends were found for several DNA systems in subterranean mole-rats, S. ehrenbergi [49]. This result corroborates the environmental theory of genetic diversity, which regards ecological heterogeneity and stress as major determinants of genetic divergence, overriding other operating factors.
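The two diversity measures named above can be made concrete with a small sketch. It assumes that diploid genotypes at a single locus are recorded as pairs of allele labels; the sample data and function names are hypothetical and are not taken from [37]. H is computed as the observed fraction of heterozygous individuals, and He as one minus the sum of squared allele frequencies, the usual definition of gene diversity without a small-sample correction.

```python
from collections import Counter


def observed_heterozygosity(genotypes):
    """Fraction of individuals carrying two different alleles at the locus."""
    heterozygotes = sum(1 for a, b in genotypes if a != b)
    return heterozygotes / len(genotypes)


def gene_diversity(genotypes):
    """Expected heterozygosity, 1 - sum(p_i^2), from sample allele frequencies."""
    allele_counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(allele_counts.values())
    return 1.0 - sum((n / total) ** 2 for n in allele_counts.values())


# Hypothetical single-locus sample of five diploid individuals
# ("F" and "S" stand for two electrophoretic alleles).
sample = [("F", "F"), ("F", "S"), ("S", "S"), ("F", "S"), ("F", "F")]
H = observed_heterozygosity(sample)   # 2 heterozygotes out of 5 individuals
He = gene_diversity(sample)           # allele frequencies: F = 0.6, S = 0.4
```

In a survey of the kind described here, such per-locus values would be averaged over all loci scored in a population before being correlated with an environmental variable.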

[Figure 2 plot: regression lines of gene diversity, He, on rainfall variation for the individual taxa, together with the average across taxa; x-axis: rainfall variation.]

Figure 2. Parallel genetic patterns in the level of gene diversity, He, of 13 enzymatic systems averaged across all taxa studied here. The average regression line is: He = -0.0556 + 0.4803 RV, where RV is rainfall variation [37]. Legend: Theba pisana (landsnail); Dociostaurus curvicercus (grasshopper); Bufo viridis (toad); Sphincterochila (landsnail); Gryllotalpa gryllotalpa (mole-cricket); Hordeum spontaneum (wild barley); Rana ridibunda (water frog); Hyla arborea (tree frog); Agama stellio (lizard); Gerbillus allenbyi (fossorial mouse); Triticum dicoccoides (wild wheat); Spalax ehrenbergi (mole-rat); Acomys cahirinus (spiny mouse). The only change is that Gryllotalpa gryllotalpa has been divided into two species, G. tali (2n=19) and G. marismortui (2n=23), and the regression line represents G. tali [37].
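As a worked illustration of the average regression line quoted in the caption, the sketch below evaluates it at two rainfall-variation values. The two input values and the function name are assumptions chosen for this example; they are not data points from [37].

```python
def predicted_gene_diversity(rainfall_variation):
    """Average regression line from the Figure 2 caption: He = -0.0556 + 0.4803 RV."""
    return -0.0556 + 0.4803 * rainfall_variation


low_rv, high_rv = 0.25, 0.40                  # illustrative values, assumed for this sketch
he_low = predicted_gene_diversity(low_rv)     # about 0.064
he_high = predicted_gene_diversity(high_rv)   # about 0.137
```

On this line, moving from the lower to the higher rainfall-variation value roughly doubles the predicted average gene diversity, which is the direction of the trend described in the text.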

Notably, heterozygosity, H, is negatively correlated with population size: H increases, whereas population size declines drastically, towards the deserts. In general, population size is highly correlated with effective population size, Ne (which was not estimated in our studies). This pattern contradicts a basic postulate of neutrality theory, which predicts a positive correlation between H and Ne [13]. Our results are clearly inconsistent with the genetic drift theory [56] or the neutral theory of molecular evolution, even in its milder form of near neutrality [51]. Our results suggest that natural selection, through environmental range and stress, appears to be an important genetic differentiating evolutionary force at the protein and DNA levels, overriding other factors, i.e., it drives molecular evolution [36]. Similar results were obtained in extensive ecological-genetic studies in wild wheat [38], wild barley [43], and in subterranean mole rats across the Near East and Asia Minor [44,46]. Genetic diversity, either in the wild cereals aboveground or in mole rats underground, is primarily determined by climatic selection.

2.3 Local population genetics across sharp ecological contrasts: past microgeographical studies in plants and animals

Microgeographical studies in nature, particularly those involving sharp contrasts (climatic, thermal, edaphic, geologic, topographic and chemical), provide remarkable long-term experiments, small-scale natural laboratories, for analyzing ecologic-genetic population dynamics [28]. Since 1975, we have conducted several microgeographical studies in nature [36]. We employed thermal (high vs low temperature) and chemical (polluted vs clean areas) stresses in marine balanids, and conducted an extensive research program on inorganic and organic pollution and its effect on genetic diversity in marine organisms [29]. Likewise, we examined edaphic (terra rossa vs basalt soil; rock vs deep soil) and microclimatic (sun vs shade; high vs low solar radiation) stresses in wild barley, Hordeum spontaneum, wild emmer wheat, Triticum dicoccoides, and Aegilops peregrina. The conclusions of these microscale studies all point inferentially to natural selection as a major differentiating factor of qualitative and quantitative patterns of genetic diversity at single loci, but primarily at multilocus structures and genome organization.

2.3.1 Edaphic, climatic and topographical factors select allozymes and microsatellites at three microscale sites

We tested the effects of internal (genetic) and external (ecologic) factors on allelic diversity at 27 dinucleotide microsatellite (simple sequence repeat [SSR]) loci in three Israeli natural populations of Triticum dicoccoides from Ammiad, Tabigha, and Yehudiyya, north of the Sea of Galilee [23]. The results demonstrated that SSR diversity is correlated with the interaction of ecological and genetic factors. Soil-unique and soil-specific alleles were found in two edaphic groups dwelling on terra rossa and basalt soils across macro- and microgeographical scales (Fig. 3). Edaphic stresses may affect the probability of replication errors and recombination intermediates and thus control the diversity level and divergence of SSRs. The results may indicate that SSR diversity, largely noncoding and probably largely regulatory, is adaptive, channeled by natural selection and influenced by both internal and external factors and their interactions [24].

Similarly, we have shown that in wild emmer wheat, Triticum dicoccoides, natural selection causes significant genetic divergence at the single-locus, multilocus, and allele association (linkage disequilibria) structure of coding allozymes and largely noncoding microsatellites. This genetic differentiation was demonstrated for climatic [22,41] and topographic [22,41,45] ecological factors. Similar genetic differentiation was also demonstrated in wild barley at several microsites [47,48,52,53].

In sum: in wild cereals (wild barley and wild emmer wheat) studied spatiotemporally since 1975 at the Institute of Evolution, at the protein and DNA levels and at micro- and macrogeographical scales, we found significant climatic, soil, and topographic divergence at the foregoing four microsites, as well as in our macrogeographic studies in Israel and the Near East [38,43]. All studies illustrate massive nonrandom genetic divergence at single-, two-, and multilocus structures of genome organization, with specific and unique alleles and levels of genetic diversity correlated with ecological heterogeneity, niche breadth, and stress, but often negatively correlated with population size. Neither random drift nor gene flow can explain the levels and divergence of genetic diversity in wild cereals at the protein level or in the coding and noncoding DNA regions at these microscale sites. The data strongly suggest (after randomization tests) that natural selection overrides gene flow, maintains molecular polymorphisms in accordance with microscale divergent ecologies, and orients molecular evolution. How general across life as a whole, primarily in outcrossing plants and nonsedentary animals, is this pattern of genetic divergence found in wild cereals, which are predominantly selfers?

2.3.2 "Evolution Canyon": Evolution in action across phylogeny caused by climatic stresses

Biodiversity and genome evolution, and the relative importance of the forces driving evolution, are critically tested at "Evolution Canyon", Lower Nahal Oren, Mount Carmel, Israel, in a dynamic, ongoing microscale research program [35]. Our aim is to draw generalizations across life in genotypes and phenotypes, and to highlight controversial and unresolved problems of biological evolution. The opposite slopes of "Evolution Canyon" display dramatic physical and biotic contrasts at a microscale. Higher solar radiation (up to 600% more) on the South-Facing Slope (SFS) makes it warmer, drier, and spatiotemporally more heterogeneous and fluctuating than the North-Facing Slope (NFS), from which it is separated by only 100 m (at bottom) and 400 m (at top). Consequently, the SFS represents an Afroasian savanna park forest, whereas the NFS represents a dense Euroasian live oak maquis forest (Fig. 4). The tropical Afroasian warm and xeric SFS savanna displays a larger ecological heterogeneity, and higher stress for temperate terrestrial organisms. It is richer than the NFS in species of terrestrial taxa. By contrast, the temperate Euroasian cooler and mesic NFS is richer than the SFS in reproductively water-dependent lower plants and mushrooms, thus displaying global patterns locally, despite the possibility of mixing by easy migration across the canyon (see Fig. 3 in Nevo, 1997 [35]). By the autumn of 1999, we had identified more than 2000 species in an area of only 7000 m², many displaying differential qualitative or quantitative inter- and intraslope divergence.

As predicted, genetic diversity was higher on the ecologically more heterogeneous (wider-niche) and stressful SFS in 9 of 11 tested temperate model organisms (lichen, wild barley, 2 landsnails, earthworm, diplopod, 3 beetles and 2 rodents). In six species heterozygosity was negatively correlated with population size, i.e., population abundance was lower on the SFS while genetic diversity was higher (Pavlicek and Nevo, unpublished), negating the positive correlation between H and population size expected by neutrality theory [13]. We have shown in wild barley, Hordeum spontaneum, that RAPD and STS (sequence-tagged-site) PCR analyses [52] mirror allozyme interslope patterns, as do BARE-1 retrotransposon elements [11]. We hypothesized that microclimatic diversifying natural selection, overriding migration and stochasticity, appears here to be a major evolutionary driving force of genotypic and phenotypic evolution.

2.4 The nature of allozyme and DNA diversities

We have shown by critical laboratory experiments on marine model organisms that allozymic diversity is adaptive and responds rapidly to environmental inorganic and organic pollution stresses [1,29,32,34]. Our experiments demonstrated unequivocally that ecological stresses of thermal and chemical pollution affect allele frequencies and heterozygosity levels of marine organisms, corroborating the environmental theory of genetic diversity. The survivorship values of heterozygous and homozygous allozyme genotypes vary in accordance with the pollutant concentration. The broader the ecological niche, the higher the heterozygosity. At medium levels of pollution heterozygotes seem to be superior, whereas at high pollution levels specific homozygotes become superior. Our laboratory results on mercury pollution have been confirmed in the sea [42].

3 Theory

3.1 Maintenance of genetic diversity in nature

Explaining the maintenance of genetic diversity in natural populations has been a central problem of evolutionary genetics since the discovery of abundant protein polymorphisms in nature [20,25,26,31,36]. Heterosis alone is not a mechanism for maintaining many alleles segregating at a locus [21]. Hence the resort to spatiotemporal driving forces acting on multilocus structures.
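For orientation, the classical single-locus case that heterosis does handle can be sketched as follows. With two alleles and relative fitnesses 1 - s, 1, and 1 - t for the genotypes A1A1, A1A2, and A2A2, the standard textbook recursion converges to the stable frequency p* = t / (s + t). The selection coefficients, starting frequency, and function names are illustrative assumptions; the sketch does not reproduce the many-allele analysis of [21], which concerns how restrictive this mechanism becomes when many alleles segregate.

```python
def next_frequency(p, s, t):
    """One generation of viability selection with heterozygote advantage.

    Genotype fitnesses: A1A1 = 1 - s, A1A2 = 1, A2A2 = 1 - t.
    p is the frequency of allele A1 before selection.
    """
    q = 1.0 - p
    mean_fitness = p * p * (1.0 - s) + 2.0 * p * q + q * q * (1.0 - t)
    return p * (p * (1.0 - s) + q) / mean_fitness


def iterate(p0, s, t, generations):
    """Apply the recursion repeatedly, starting from frequency p0."""
    p = p0
    for _ in range(generations):
        p = next_frequency(p, s, t)
    return p


s, t = 0.2, 0.1                       # hypothetical selection coefficients
p_final = iterate(0.9, s, t, 2000)    # start far from the equilibrium
p_star = t / (s + t)                  # analytical two-allele equilibrium
```

Starting from either side of p*, the iterated frequency approaches the same interior value, which is what makes the two-allele polymorphism stable in this simple setting.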

Figure 3. Discriminant analysis between two soil types, basalt and terra rossa, of wild emmer wheat (Triticum dicoccoides), across three microsites (Yehudiyya, Tabigha, and Ammiad) north of the Lake of Galilee, Israel. This analysis is based on 3, 5, 10 and 21 microsatellite loci, classifying correctly 83, 88, 90, and 91% of plants into their original soil types [24].


Figure 4. "Evolution Canyon", Lower Nahal Oren, Mount Carmel, Israel, cross section. Note the plant formation on opposite slopes. The green lush "European", temperate, cool-mesic, north-facing slope (NFS), sharply contrasts with the open park forest of warm-xeric, tropical "Afroasian" savannah on the south-facing slope (SFS) [35].

3.2 Natural selection

Theoretically, spatial and temporal variations of selection ("diversifying selection") could maintain genetic polymorphisms, though the conditions of their applicability are strongly limited (see citations of Levene, Haldane, Karlin, Lande, Felsenstein, Hedrick, Hoekstra, and Maynard Smith in [36]). Spatial variation appeared more effective than temporal variation, though their coupled action could reinforce the maintenance of polymorphism. Most results on temporally varying selection analyzed the one-locus case. However, polymorphism maintenance may be reinforced in the case of two loci [6,14,15,18,57] or multilocus structures [12]. The selective mechanism is much more effective in promoting genetic diversity if carriers of the alternative alleles are able to select the niche in which their fitness is greatest [54].

3.3 Stabilizing selection in cyclical environments

In contrast to previous conclusions, we found in both haploids and diploids, in the case of an additive two-locus model, that stabilizing selection with a cyclically moving optimum may be an efficient factor protecting polymorphisms for linked loci additively affecting the selected trait [15,18,19]. We proved that, within the same class of fitness functions, nonequal gene action and/or a dominance effect for one or both loci may lead to local polymorphism stability with a substantial polymorphism-attracting domain. We have demonstrated [16,19] that a multilocus system subjected to stabilizing selection with a cyclically moving optimum can generate ubiquitous complex limiting behavior, including supercycles, T-cycles, and chaotic-like phenomena. This mode of multilocus dynamics far exceeds the potential for complex dynamics attainable under ordinary selection models, which result in simple behavior. It may represent a novel evolutionary mechanism increasing genetic polymorphism over long-term periods [15,16].

3.4 Selection versus genetic drift in small isolated populations

Remarkably, our findings of high-level heterozygosity in small (several dozen or a hundred individuals) desert isolates of the subterranean mole rats, Spalax ehrenbergi superspecies (2n=60), in the northern Negev [33] contradict both the Wrightian notion of genetic drift in small populations [56] and the extreme version of genetic drift, i.e., the neutral theory of molecular evolution [13]. The latter predicts a positive correlation between effective population size (Ne) and heterozygosity, H, while our findings demonstrated the opposite, i.e., the highest level of H in the smallest populations, or a negative correlation between H and population size. Current theoretical models predict fast gene fixation in small panmictic populations without selection, mutation, or gene inflow. Using simple multilocus models, we demonstrated that moderate stabilizing selection (with a stable or fluctuating optimum) for traits controlled by additive genes could oppose random fixation in such isolates during thousands of generations [50]. We also showed that in selection-free models polymorphism persists only for a few hundred generations, even under high mutation rates.

3.5 Genetic-ecological diversity and stress

The foregoing discussion suggests several explanations for the maintenance of genetic diversity subjected to ecological diversity and environmental stress. Spatial and temporal ecological variation, which predominate in nature, are of prime importance in maintaining genetic diversity in natural populations. This may be true because different genotypes display varying fitnesses in variable environments and stresses. Recombination frequencies and mutation rates tend to increase under stressful conditions [10,18]. Rates of evolutionary change are therefore enhanced in adverse environments, as we showed under controlled laboratory experiments in the case of mercury pollution [1] and under regional aridity stress across the Israeli physically stressful environment, and locally at "Evolution Canyon" because of high solar radiation on the South-Facing Slope [35]. Ecological heterogeneity and stress appear to cultivate genetic polymorphisms, particularly in dynamically cycling environments. Models of sexual reproduction as an adaptation to resist parasites [9], or abiotic stress, may also contribute to sex evolution, recombination and polymorphism. Finally, our simple model of genetic interaction between multiple species governed by abiotic and biotic selection for multilocus quantitative traits [17] opens wide horizons for the evolution of genetic diversity due to species dynamic interactions in nature.
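The neutral baseline against which the negative correlations between H and population size in Sections 2.2 and 3.4 are judged can be sketched with two standard textbook expressions; neither is the multilocus model of [50]. Under the infinite-alleles model, equilibrium heterozygosity is H = 4Neμ / (1 + 4Neμ), which rises with Ne. Under drift alone, with no mutation, selection, or gene inflow, heterozygosity decays as H_t = H_0 (1 - 1/(2Ne))^t. The mutation rate, population sizes, and function names below are illustrative assumptions.

```python
import math


def neutral_equilibrium_h(ne, mu):
    """Infinite-alleles equilibrium heterozygosity, theta / (1 + theta), theta = 4*Ne*mu."""
    theta = 4.0 * ne * mu
    return theta / (1.0 + theta)


def drift_only_h(h0, ne, generations):
    """Expected heterozygosity after pure drift: H0 * (1 - 1/(2*Ne)) ** t."""
    return h0 * (1.0 - 1.0 / (2.0 * ne)) ** generations


def heterozygosity_half_life(ne):
    """Generations needed for drift alone to halve the expected heterozygosity."""
    return math.log(0.5) / math.log(1.0 - 1.0 / (2.0 * ne))


mu = 1e-5                                         # hypothetical per-locus mutation rate
h_small = neutral_equilibrium_h(100, mu)          # small isolate
h_large = neutral_equilibrium_h(10000, mu)        # large population
half_life_small = heterozygosity_half_life(100)   # about 138 generations
```

In this sketch the neutral expectation is higher for the larger population, and an isolate of about a hundred individuals loses half of its heterozygosity through drift in roughly 140 generations, which is the same order of magnitude as the few hundred generations quoted above for selection-free models.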

4 Acknowledgements

This work was supported by the Israeli Discount Bank Chair of Evolutionary Biology and the Ancell-Teicher Research Foundation for Genetics and Molecular Evolution.

References

1. Baker, R., Lavie, B. and Nevo, E., Natural selection for resistance to mercury pollution. Experientia 41 (1985) pp. 697-699.
2. Bell, G., Selection. The Mechanism of Evolution. (Chapman and Hall, New York, 1997).
3. Bijlsma, R., and Loeschcke, V., Environmental Stress, Adaptation and Evolution. (Birkhauser Verlag, Basel, 1997).
4. Endler, J.A., Natural Selection in the Wild. (Princeton Univ. Press, New Jersey, 1986).
5. Fisher, R.A., The Genetical Theory of Natural Selection. (Clarendon, Oxford, 1930).
6. Gavrilets, S., and Hastings, A., Maintenance of multilocus variability under strong stabilizing selection. J. Math. Biol. 32 (1994) pp. 287-302.
7. Blank
8. Gillespie, J.H.L., The Causes of Molecular Evolution. (Oxford Univ. Press, Oxford, 1991).
9. Hamilton, W.D., Axelrod, R., and Tanese, R., Sexual reproduction as an adaptation to resist parasites (A review). PNAS 87 (1990) pp. 3566-3573.
10. Hoffmann, A.A., and Parsons, P.A., Evolutionary Genetics and Environmental Stress. (Oxford Univ. Press, Oxford, 1991).
11. Kalendar, R., Tanskanen, J., Immonen, S., Nevo, E. and Schulman, A.H., Genome evolution of wild barley (Hordeum spontaneum) by BARE-1 retrotransposon dynamics in response to sharp microclimatic divergence. PNAS 97 (2000) pp. 6603-6607.
12. Karlin, S., Principles of polymorphism and epistasis for multilocus systems. PNAS 76 (1979) pp. 541-545.
13. Kimura, M., The Neutral Theory of Molecular Evolution. (Cambridge Univ. Press, Cambridge, 1983).
14. Kirzhner, V.M., Korol, A.B. and Ronin, Y.I., The dynamics of linkage disequilibrium under temporal environmental fluctuations. Two-locus selection. Theor. Pop. Biol. 47 (1995) pp. 257-276.
15. Kirzhner, V.M., Korol, A.B. and Ronin, Y.I., Cyclical environmental changes as a factor maintaining genetic polymorphism. I. Two-locus haploid selection. J. Evol. Biol. 8 (1995) pp. 93-120.
16. Kirzhner, V.M., Korol, A.B. and Nevo, E., Complex dynamics of multilocus systems subjected to cyclical selection. PNAS 93 (1996) pp. 6532-6535.
17. Kirzhner, V.M., Korol, A.B. and Nevo, E., Abundant multilocus polymorphisms caused by genetic interaction between species on trait-for-trait bases. J. Theor. Biol. 198 (1999) pp. 61-70.
18. Korol, A.B., Preygel, I.A. and Preygel, S.L., Recombination Variability and Evolution. Algorithms of Estimation and Population Genetics Models. (Chapman and Hall, London, 1994).
19. Korol, A.B., Kirzhner, V.M., Ronin, Y.I. and Nevo, E., Cyclical environmental changes as a factor of maintaining genetic polymorphism. 2. Diploid selection for an additive trait. Evolution 50 (1996) pp. 1432-1441.
20. Lewontin, R.C., The Genetic Basis of Evolutionary Change. (Columbia Univ. Press, New York, 1974).
21. Lewontin, R.C., Ginzburg, L.R. and Tuljapurkar, S.D., Heterosis as an explanation for large amounts of genic polymorphism. Genetics 88 (1978) pp. 149-170.
22. Li, Y.C., Fahima, T., Beiles, A., Korol, A.B. and Nevo, E., Microclimatic stress and adaptive DNA differentiation in wild emmer wheat (Triticum dicoccoides). Theor. Appl. Genet. 98 (1999) pp. 873-883.
23. Li, Y.C., Roder, M.S., Fahima, T., Kirzhner, V.M., Beiles, A., Korol, A.B. and Nevo, E., Natural selection causing microsatellite divergence in wild emmer wheat at the ecologically variable microsite at Ammiad, Israel. Theor. Appl. Genet. 100 (2000) pp. 985-999.
24. Li, Y.C., Fahima, T., Korol, A.B., Peng, J., Roder, M.S., Kirzhner, V.M., Beiles, A. and Nevo, E., Microsatellite diversity correlated with ecological-edaphic and genetic factors in three microsites of wild emmer wheat in North Israel. Mol. Biol. Evol. 17 (2000) pp. 851-862.
25. Mitton, J.B., Selection in Natural Populations. (Oxford Univ. Press, Oxford, 1997).
26. Nevo, E., Genetic variation in natural populations: Patterns and theory. Theor. Pop. Biol. 13 (1978) pp. 121-177.
27. Nevo, E., Population genetics and ecology: The interface. In Evolution from Molecules to Men, ed. Bendall, D.S. (Cambridge Univ. Press, Cambridge, 1986) pp. 287-321.
28. Nevo, E., Adaptive significance of protein variation. In Protein Polymorphism: Adaptive and Taxonomic Significance, ed. Oxford, G.S. and Rollinson, D. (Academic Press, New York, 1983) pp. 239-282.
29. Nevo, E., Pollution and genetic evolution in marine organisms: Theory and practice. In Theory and Practice, ed. Dubinsky, B.Z. and Steinberger, Y. (Bar-Ilan Univ. Press, Ramat-Gan, 1986) pp. 841-848.
30. Nevo, E., Genetic resources of wild cereals and crop improvement: Israel, a natural laboratory. Isr. J. Bot. 35 (1986) pp. 255-278.
31. Nevo, E., Genetic diversity in nature: patterns and theory. Evol. Biol. 23 (1988) pp. 217-247.
32. Nevo, E., Molecular evolutionary genetics of isozymes: Patterns, theory and application. In Isozymes: Structure, Function and Use in Biology and Medicine, ed. Ogita, Z.-I. and Markert, C.L. (Wiley-Liss Inc., New York, 1990) pp. 701-742.
33. Nevo, E., Evolutionary theory and processes of active speciation and adaptive radiation in subterranean mole rats, Spalax ehrenbergi superspecies, in Israel. Evol. Biol. 25 (1991) pp. 1-125.
34. Nevo, E., Evolutionary significance of genetic diversity in nature: Environmental stress, pattern and theory. In Isozymes: Organization and Roles in Evolution, Genetics and Physiology, ed. Markert, C.L., Scandalios, J.G., Lim, H.A. and Serov, O.L. (World Scientific, New Jersey, 1994) pp. 267-296.
35. Nevo, E., Evolution in action across phylogeny caused by microclimatic stresses at "Evolution Canyon". Theor. Pop. Biol. 52 (1997) pp. 231-243.
36. Nevo, E., Molecular evolution and ecological stress at global, regional and local scales: The Israeli perspective. J. Exp. Zool. 282 (1998) pp. 95-119.
37. Nevo, E., and Beiles, A., Genetic parallelism of protein polymorphism in nature: Ecological test of the neutral theory of molecular evolution. Biol. J. Linn. Soc. 35 (1988) pp. 229-245.
38. Nevo, E., and Beiles, A., Genetic diversity of wild emmer wheat in Israel and Turkey: Structure, evolution and application in breeding. Theor. Appl. Genet. 77 (1989) pp. 421-455.
39. Nevo, E., Beiles, A., and Ben-Shlomo, R., The evolutionary significance of genetic diversity: Ecological, demographic and life history correlates. Lect. Notes Biomathem. 53 (1984) pp. 13-213.
40. Nevo, E., Beiles, A. and Krugman, T., Natural selection of allozyme polymorphisms: a microgeographic climatic differentiation in wild emmer wheat, Triticum dicoccoides. Theor. Appl. Genet. 75 (1988) pp. 529-538.
41. Nevo, E., Beiles, A. and Krugman, T., Natural selection of allozyme polymorphisms: a microgeographic differentiation by edaphic, topographical and temporal factors in wild emmer wheat Triticum dicoccoides. Theor. Appl. Genet. 76 (1988) pp. 737-752.
42. Nevo, E., Ben-Shlomo, R. and Lavie, B., Mercury selection of allozymes in marine organisms: Prediction and verification in nature. PNAS 81 (1984) pp. 1258-1259.
43. Nevo, E., Beiles, A. and Zohary, D., Genetic resources of wild barley in the Near East: Structure, evolution and application in breeding. Biol. J. Linn. Soc. 27 (1986) pp. 355-380.
44. Nevo, E., Filippucci, M.G., and Beiles, A., Genetic polymorphisms in subterranean mammals (Spalax ehrenbergi superspecies) in the Near East revisited: Patterns and theory. Heredity 72 (1994) pp. 465-487.
45. Nevo, E., Noy-Meir, I., Beiles, A., Krugman, T. and Agami, M., Natural selection of allozyme polymorphisms: A microgeographical spatial and temporal ecological differentiations in wild emmer wheat. Isr. J. Bot. 40 (1991) pp. 419-449.
46. Nevo, E., Filippucci, M.G., Redi, C., Simson, S., Heth, G. and Beiles, A., Karyotype and genetic evolution in speciation of subterranean mole rats of the genus Spalax in Turkey. Biol. J. Linn. Soc. 54 (1995) pp. 203-229.
47. Nevo, E., Beiles, A., Kaplan, D., Golenberg, E.M., Olsvig-Whittaker, L. and Naveh, Z., Natural selection of allozyme polymorphisms: A microsite test revealing ecological genetic differentiation in wild barley. Evolution 40 (1986) pp. 13-20.
48. Nevo, E., Brown, A.H.D., Zohary, D., Storch, N. and Beiles, A., Microgeographic edaphic differentiation in allozyme polymorphisms of wild barley (Hordeum spontaneum, Poaceae). Pl. Syst. Evol. 138 (1981) pp. 287-292.
49. Nevo, E., Ben-Shlomo, R., Beiles, A., Ronin, Y.I., Blum, S., and Hillel, J., Genomic adaptive strategies: DNA fingerprinting and RAPDs reveal ecological correlates and genetic parallelism to allozyme and mitochondrial DNA diversities in the actively speciating mole rats in Israel. In Gene Families: Structure, Function, Genetics and Evolution, ed. Holmes, R.H. and Lim, H.A. (World Scientific Pub., Singapore, 1996) pp. 55-70.
50. Nevo, E., Kirzhner, V.M., Beiles, A., and Korol, A.B., Selection versus random drift: Long-term polymorphism persistence in small populations (evidence and modelling). Phil. Trans. Roy. Soc. Lond. B 352 (1997) pp. 381-389.
51. Ohta, T., and Gillespie, J.H., Development of neutral and nearly neutral theories. Theor. Pop. Biol. 49 (1996) pp. 128-142.
52. Owuor, E.D., Fahima, T., Beiles, A., Korol, A.B. and Nevo, E., Population genetics response to microsite ecological stress in wild barley Hordeum spontaneum. Mol. Ecol. 6 (1997) pp. 1177-1187.
53. Owuor, E.D., Fahima, T., Beharav, A., Korol, A.B. and Nevo, E., RAPD divergence caused by microsite edaphic selection in wild barley. Genetica 105 (1999) pp. 177-192.
54. Taylor, C.E. and Powell, J.R., Habitat choice in natural populations of Drosophila. Genetics 85 (1977) pp. 681-695.
55. Van Valen, L., Morphological variation and width of ecological niche. Am. Nat. 99 (1965) pp. 377-390.
56. Wright, S., Evolution in Mendelian Populations. Genetics 16 (1931) pp. 97-159.
57. Zhivotovsky, L.A. and Feldman, M.W., On models of quantitative genetic variability: a stabilizing selection-balance model. Genetics 130 (1992) pp. 947-955.

NITRIC OXIDE RELATED ENZYMES AND CORONARY ARTERY DISEASE

XING LI WANG

Department of Genetics, Southwest Foundation for Biomedical Research, P.O. Box 760549, San Antonio, Texas, 78245-0549, USA, and Department of Medicine, University of New South Wales, Sydney, Australia.

Endothelial nitric oxide synthase (eNOS) is constitutively expressed and produces constant NO output responsible for the basal vascular tone and vascular wall remodelling. Reduced NO production contributes significantly to vascular diseases. Sequence variants in the eNOS gene were associated with coronary atherosclerosis, hypertension, myocardial infarction and stroke in some studies but not in others. While frequency distributions of the eNOS sequence variations differ considerably among different ethnic populations, the genotype-phenotype associations appear to be cigarette smoking dependent. We suggest that certain genotype related CAD risk is conditional on the coexistence of a specific adverse environmental factor.

1 Nitric Oxide: Double-Edged Sword Effects on Atherogenesis

Coronary artery disease (CAD) has been the leading cause of mortality and morbidity in most western nations and has recently emerged as the number one killer in many developing countries. Atherosclerosis, the pathology responsible for CAD, is a chronic process and often takes decades to develop before clinical manifestations present. However, atherogenesis may start early in life among individuals with genetically determined high susceptibility to CAD. For example, patients with familial hypercholesterolemia can have detectable fatty streaks during the first decade of life. Atherosclerotic lesions may also progress faster in subjects exposed to adverse environmental factors. For example, heavy cigarette smokers may develop clinical CAD in the third or fourth decade. However, it is also well known that not every smoker develops atherosclerosis and not all familial hypercholesterolemic patients end up with heart attacks. It is the coexistence of adverse environmental factors on a background of genetic susceptibility that determines the pathological and clinical course of atherosclerosis. While the extremes of the genetic or environmental risk spectrum may each be sufficient on their own to convert a normal artery to an atherosclerotic one, the combination of gene and environment explains the majority of the pathogenesis.

Nitric oxide (NO) is recognized as a key player in regulating physiological functions in many systems and has important implications for atherogenesis [1]. The biological effects of NO result either directly from reactions between NO and specific biological molecules through S-nitrosylation [2] or indirectly from reactive nitrogen oxide species through oxidation [3]. These direct or indirect effects can produce both physiological and pathological outcomes. Physiologically, NO binds to the heme cofactor of soluble guanylate cyclase, forming a metal-nitrosyl adduct; the activated enzyme catalyzes the conversion of GTP to cGMP, an important second messenger for diverse biological effects [4]. Once produced, NO is catabolized by reacting at near diffusion-limited rates with superoxide to form the powerful oxidant peroxynitrite, the end products of which are nitrite and nitrate (NOx). Peroxynitrite causes oxidative injury by oxidizing proteins, nonprotein thiols, deoxyribose and membrane phospholipids [3].

Biological NO is mainly produced enzymatically from L-arginine by a family of three NO synthases (NOS) [5, 6]: neuronal NOS (nNOS or NOS1), inducible NOS (iNOS or NOS2) and endothelial NOS (eNOS or NOS3). eNOS (gene locus 7q35-36), expressed mostly in endothelium and modestly in platelets [7, 8], plays a vital role in maintaining basal vascular NO production. Integrity of endothelial cells is crucial for appropriate eNOS production and enzyme activity. Constant NO production by eNOS mediates shear-stress-induced, endothelium-dependent vasodilatation and is essential for blood flow regulation, particularly coronary flow. NO inhibits vascular smooth muscle cell proliferation and platelet and monocyte adhesion, actions that are relevant to the early onset of atherogenesis [9, 10]. Vascular NO also has a significant role in vascular wall remodelling, which is important for restenosis [11-14]. Reduction in basal NO release may predispose to hypertension, thrombosis, vasospasm and atherosclerosis [15, 16], and restoration of NO activity can induce regression of pre-existing intimal lesions [17].

2 Nitric Oxide and Oxidative Stress

Over-production of NO, on the other hand, increases the accumulation of reactive oxygen species in the vascular wall, particularly peroxynitrite, which is detected abundantly in atherosclerotic lesions in the form of nitrotyrosine, and can initiate or accelerate atherogenesis [3, 18]. The common cause of NO over-production is inappropriate iNOS activation. Excessive NO imposes oxidative stress, exerts deleterious effects on functional endothelial cells and makes the vascular wall prothrombotic and pro-occlusive. However, under physiological conditions, the human body possesses a highly efficient detoxification system to counteract peroxynitrite toxicity [3, 19]. It includes superoxide dismutase (SOD), cysteine, methionine, urate and glutathione [3, 19]. SOD plays a central superoxide scavenging role, which preserves biological NO and prevents the generation of peroxynitrite [20-23]. Among the three SOD isoforms, copper- and zinc-containing SOD (Cu,Zn-SOD), manganese-containing SOD (Mn-SOD) and extracellular SOD (EC-SOD), EC-SOD is the principal enzymatic scavenger of superoxide in the vascular wall. More than 99% of the total EC-SOD is anchored to heparan sulfate proteoglycans in the interstitium of tissues, especially blood vessel walls, and less than 1% is present in the circulation in equilibrium between the plasma phase and the glycocalyx of the endothelium [24].

3 Interactive Effects of Cigarette Smoking and Nitric Oxide on Endothelial Dysfunction

Cigarette smoking is a significant risk factor for atherogenesis. Individuals who smoke have reduced endothelium-dependent vessel relaxation, accelerated atherosclerotic plaque development and premature CAD [25-29]. Despite extensive research, the actual molecular mechanisms for the atherogenic effect of cigarette smoking are still not clear. This is largely because more than 3500 chemicals are generated during cigarette smoking. Some of these chemicals may be further modified by the filtering effects of lung tissue and by interactions with blood components before they reach the arteries. In addition to carcinogens, cigarette smoke is a rich source of reactive oxygen species [30, 31]. It consumes endogenous antioxidants, exerts oxidative stress on biological systems and damages DNA [32]. Cigarette smoke not only contains a large amount of NO, it also affects endogenous NO production and impairs endothelium-dependent vasodilatation [33, 34]. Cigarette smoking has the potential to either up- or down-regulate eNOS and/or iNOS or their necessary cofactors, and consequently to alter NO production. It could also directly block the NO-initiated signal transduction pathways that mediate its biological effects.

Cigarette smoking has significant effects on circulating EC-SOD levels [35]. In 590 Caucasian patients, we showed that current smokers had the lowest circulating EC-SOD levels, non-smokers the highest and ex-smokers intermediate levels. The association remained after sex and CAD status were controlled for. We have further demonstrated that plasma EC-SOD decreases with age [36]. EC-SOD levels were 50% - 67% lower in adults (67.4 ± 15.8 µg/L in males and 67.5 ± 17.4 µg/L in females) than in infants aged between 1 and 5 years (100.9 ± 18.9 µg/L in male infants and 112.8 ± 28.0 µg/L in female infants, p < 0.0001) [36]. This reflects a continuous decline in our SOD-related antioxidant capacity, possibly caused by chronic exposure to oxidative stress, e.g. cigarette smoking.

4 Associations Between eNOS Polymorphisms and Vascular Disease

Sequence variations including single base changes, CA repeats and variable numbers of tandem repeats have been reported in the promoter, exons and introns of the eNOS gene [37-41]. However, studies of the associations between these polymorphisms and vascular diseases including coronary stenosis, myocardial infarction (MI), hypertension and stroke have not been consistent. We have explored three eNOS polymorphic markers in relation to coronary stenosis and MI.

Firstly, we investigated the distribution of the 27bp repeat polymorphism in intron 4 of the eNOS gene in 549 patients with, and 153 without, angiographically defined CAD in relation to cigarette smoking [42]. Significant CAD was defined as 50% or more luminal obstruction in major epicardial coronary arteries. The remaining patients had either normal coronary arteries or mild lesions (luminal obstruction < 50%). There was a significant interactive effect between cigarette smoking and the eNOS polymorphism on CAD risk. In smokers, but not in non-smokers, there was an excess of rare allele homozygous patients with severely stenosed arteries, compared with those with no or only mild stenosis. The odds ratio (OR) for the rare allele homozygous patients to have significant CAD was 1.44 (95% CI: 1.24 - 1.68). The presence of the rare allele was also associated with a history of MI. But these significant associations were only present among cigarette smokers.

In search of potential functional changes in linkage with the intron 4 polymorphism, we discovered a -786T→C base substitution in the promoter region of the eNOS gene (GenBank ID: D26607) by denaturing gradient gel electrophoresis (DGGE) [43]. We genotyped the base change in 160 healthy Australian Caucasian adults (81 males, 79 females). The frequencies of genotypes TT, TC and CC were 43.1%, 46.9% and 10.0% respectively, and the rare "C" allele frequency was 0.335. The polymorphism was in linkage disequilibrium with the 27bp repeat polymorphism (χ² = 141.830, df = 4, p = 0.0001).
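The rare allele frequency quoted for the healthy sample follows directly from the genotype distribution, since each CC homozygote carries two copies of the C allele and each TC heterozygote carries one. A minimal Python sketch of that arithmetic is given here. The function name `allele_frequencies` is ours, chosen for illustration, and the inputs are the rounded percentages reported for the healthy sample, so the result is only as precise as that rounding.

```python
def allele_frequencies(tt, tc, cc):
    """Return (T, C) allele frequencies from TT/TC/CC genotype counts or proportions."""
    total = tt + tc + cc
    if total <= 0:
        raise ValueError("genotype totals must be positive")
    # Each homozygote contributes two copies of its allele, each heterozygote one.
    c_freq = (2 * cc + tc) / (2 * total)
    return 1.0 - c_freq, c_freq


# Genotype percentages reported for the 160 healthy adults (TT, TC, CC).
t_freq, c_freq = allele_frequencies(tt=43.1, tc=46.9, cc=10.0)
print(f"T = {t_freq:.4f}, C = {c_freq:.4f}")
```

With the reported percentages the C allele frequency comes out at about 0.3345, in line with the 0.335 given in the text.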
In the same Australian Caucasian patient population studied for the intron 4 polymorphism, the frequencies of the TT, TC and CC genotypes were 38.0%, 46.9% and 15.2%, with a "C" allele frequency of 0.387, which were not different from those in the healthy population (χ² = 3.142, df = 2, p = 0.208). The rare "C" allele frequency also did not differ among patients with 0 (0.404), 1 (0.362), 2 (0.441) and 3 (0.371) significantly diseased vessels (χ² = 1.339, p = 0.461). This lack of association was not affected by cigarette smoking status.

We further examined the role of the 894G→T (Glu298→Asp) mutation in exon 7 of the eNOS gene in the same CAD population [44]. The frequencies of the GG, TG and TT genotypes were 46.5, 41.2 and 12.3% in patients with CAD, and were not significantly different from those in patients without CAD (42.1, 44.0 and 13.6%, p = 0.582). Nor was the mutation associated with MI (p = 0.469) or with the number of significantly stenosed vessels (p = 0.734). There was no interactive effect between cigarette smoking and the 894G→T mutation on CAD risk. This mutation was in linkage disequilibrium with the -786T→C polymorphism in the promoter region (χ² = 113.677, df = 4, p = 0.0001) and the 27bp repeat polymorphism in intron 4 (χ² = 25.842, df = 4, p = 0.0001). However, neither the 894G→T mutation nor the -786T→C polymorphism was associated with CAD risk in our population, with or without controlling for the effect of cigarette smoking.

5 Population Diversity in Associating eNOS Polymorphisms and Vascular Disease

Although reports of associations between vascular disease and eNOS polymorphisms are many, the findings are far from consistent. The associations between the 894G→T mutation in exon 7 and hypertension, MI, CAD and stroke were statistically significant in some populations but not in others (Table 1). It should be noted that there was a significant difference in the rare allele frequency of the 894G→T mutation between Japanese [45-49] and Caucasian populations [40, 50-54]. The rare "T" allele frequency in the Japanese populations (patients and controls) was between 0.050 and 0.210 with an average of 0.090. In contrast, the rare allele frequency in Caucasians was between 0.278 and 0.440 with an average of 0.390. This was 3 times more common than that in Japanese.

Table 1. Associations between the 894G→T (298Glu→Asp) Mutation in Exon 7 of the eNOS Gene and Vascular Disease in Different Populations.

Populations and N (disease/control): Japanese [46] 226/357; Japanese [45] 218/240, 187/223; Japanese [47] 285/607; Japanese [48] 113/100; Japanese [49] 549/513; White [50] 361/236; White [40] 531/610; White [51] 309/123; White [52] 95/479; White [44] 605/158; White [54] 762/357; White [53] 298/138, 249/183

Clinical end-points: CAD MI HT HT MI HT C. spasm HT Stroke MI HT CAD, CAD, MI CAD CAD CAD

Statistical findings: P > 0.05; P = 0.0085; P = 0.0017; P = 0.0013; P = 0.003; P > 0.05; P = 0.014; P > 0.05; P > 0.05; P = 0.009; P = 0.004; P > 0.05; P > 0.05; P > 0.05; P < 0.0001; P = 0.02

Rare "T" allele frequency: 0.05-0.103; 0.050-0.103; 0.056-0.120; 0.078-0.100; 0.090-0.210; 0.079-0.095; 0.370-0.390; 0.345-0.429; 0.350-0.440; 0.278; 0.320-0.360; Not reported; 0.312-0.478; 0.308-0.397

Like the 894G→T mutation, the published data on associations between the 27bp repeat polymorphism in intron 4 and vascular disease also lack consistency (Table 2). While the frequency distributions of the polymorphism were similar in Japanese (rare allele frequency: 0.100 - 0.140 with an average of 0.120) [46, 55-59], Koreans [60] and Caucasians (rare allele frequency: 0.134 - 0.144 with an average of 0.139) [42, 61], they were much lower than that in African Americans (0.26 - 0.36) [62].

Table 2. The 27bp Repeat Polymorphism at Intron 4 of the eNOS Gene and Vascular Disease in Different Populations.

Populations            N (disease/control)  Clinical end-points  Statistical findings  Rare allele frequency
Japanese [46]          226/357              CAD, MI              P > 0.05              0.109-0.124
Japanese [56]          455/550              CAD                  P = 0.007             0.100-0.140
Japanese [57]          127/91               Stroke               P > 0.05              0.132-0.138
Japanese [58]          123/120              HT                   P = 0.00027           0.037-0.114
Japanese [59]          103/122              CAD, MI              P = 0.125             0.078-0.130
Korean [60]            206/121              MI                   P = 0.009             0.124-0.136
African American [62]  185/110              MI                   P = 0.001             0.260-0.360
Turkish [63]           95/24                Stroke               P = 0.002             0.137-0.313
White [42]             401/148              CAD                  P = 0.0047            0.134-0.144
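The spread of findings in Table 2 can be made explicit by encoding the rows and counting how many reports reach conventional significance. The sketch below is illustrative only: the row order is our transcription of Table 2, the 0.05 threshold is an assumed convention, entries reported simply as P > 0.05 are stored as `None`, and `tally_by_population` is a hypothetical helper name.

```python
from collections import defaultdict

# Rows transcribed from Table 2: (population, end-point, reported P value).
# None marks entries reported only as "P > 0.05".
TABLE_2 = [
    ("Japanese", "CAD, MI", None),
    ("Japanese", "CAD", 0.007),
    ("Japanese", "Stroke", None),
    ("Japanese", "HT", 0.00027),
    ("Japanese", "CAD, MI", 0.125),
    ("Korean", "MI", 0.009),
    ("African American", "MI", 0.001),
    ("Turkish", "Stroke", 0.002),
    ("White", "CAD", 0.0047),
]

ALPHA = 0.05  # conventional significance threshold, assumed here


def tally_by_population(rows, alpha=ALPHA):
    """Return {population: (significant reports, total reports)}."""
    counts = defaultdict(lambda: [0, 0])
    for population, _endpoint, p_value in rows:
        counts[population][1] += 1
        if p_value is not None and p_value < alpha:
            counts[population][0] += 1
    return {population: tuple(pair) for population, pair in counts.items()}


summary = tally_by_population(TABLE_2)
for population, (significant, total) in summary.items():
    print(f"{population}: {significant} of {total} reports with P < {ALPHA}")
```

On this transcription, six of the nine reports fall below 0.05, two of them among the five Japanese studies.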

The -786T→C base substitution in the promoter region was reported to be associated with coronary spasm in a Japanese population after comparing genotype frequencies in 174 patients and 161 controls [41]. However, we and others have found that the -786T→C change was not associated with CAD in Caucasians [40, 43]. The rare "C" allele (0.335 - 0.387) in our population was about 10 times more frequent than that reported in the Japanese healthy group (0.035) and twice as frequent as in the Japanese patient group (0.160). The reasons for the population-related differences in allele frequencies remain to be explained. But we hypothesize that they could contribute to the inconsistent findings on associations between the eNOS polymorphisms and CAD in different populations, in addition to population-specific environmental factors. In the case of the -786T→C base change in the promoter region of eNOS, the observed population differences could reflect a population-specific gene-environment interaction. It is known that the incidence of CAD is lower in Japanese than in Caucasians. However, compared with Caucasians, cigarette smoking is more common among Japanese. If the detrimental effect of the -786T→C substitution is only activated by cigarette smoking, it could explain the finding that the association is significant in Japanese but not in Caucasians. This interactive effect could act as a natural selection force that makes the polymorphism much less frequent in Japanese. Alternatively, the low frequency of the polymorphism in Japanese could contribute to the low CAD incidence. In Caucasians, however, the observed non-significant associations between the polymorphisms and CAD could also arise because the majority of the Caucasians were rare allele carriers.

6 Effect of eNOS Polymorphisms on Gene Expression and Nitric Oxide Production

Compared with association studies, there are relatively few data on the functional significance of the reported eNOS sequence variants for gene expression. We have detected and quantified the contribution of the 27bp repeat polymorphism in intron 4 to circulating NOx levels. Approximately 25% of the variance in circulating NOx was linked to the 27bp repeat polymorphism in a healthy Australian Caucasian population [61]. Individuals with the rare allele of 4 repeats of the 27 bp unit had significantly higher NO levels (64.85 ± 7.79 µmol/L) than those with one (33.98 ± 3.47 µmol/L) or two (30.17 ± 3.07 µmol/L) common 5-repeat alleles under physiological conditions. In placental tissue, we have further demonstrated that rare 27bp allele carriers had lower eNOS mRNA and protein levels, but higher eNOS specific enzyme activities. While cigarette smoking dramatically depressed eNOS enzyme activity among rare allele carriers, it did not change the enzyme activity in common allele homozygotes [64]. In contrast, Tsukada et al found that the rare allele was associated with lower rather than higher NOx levels in a Japanese population [55]. While we have no explanation for this apparently paradoxical finding, it could reflect a genotype-dependent and cigarette-specific interactive effect on eNOS enzyme activity, and hence on NO production, as we observed at the tissue level. On the other hand, the levels of NOx were not significantly different among the -786T→C genotypes (TT: 35.9 ± 3.1 µmol/L, TC: 40.0 ± 2.9 µmol/L, CC: 37.7 ± 6.3 µmol/L) in the same Caucasian family.

We have also investigated the effects of the -786T→C base change on the transcriptional efficiency of the eNOS gene in a reporter vector [43]. The promoter activities of the "T" and "C" alleles, cloned upstream of the luciferase gene (pGL3 luciferase reporter vector) and expressed in HepG2 cells, were not different (relative RLU: 41824 ± 2185 versus 39648 ± 1385). In contrast, the same base substitution was reported to depress the promoter activity assessed by luciferase activity [41]. Both studies cloned a similar section of the eNOS promoter region with the same base change. The only difference between the two studies was that they cloned the DNA segment from a Japanese subject and we cloned ours from a Caucasian subject.

Recently, Tesauro et al demonstrated that the 865 G→T mutation was associated with an increased susceptibility to cleavage at the site of the mutation [65]. While the finding needs to be validated for its in vivo relevance, it is nevertheless consistent with our finding that the rare TT homozygotes had reduced eNOS enzyme activity in placental tissue [64]. Furthermore, Philip et al demonstrated an increased vascular response to phenylephrine in carriers of the rare "T" allele among 68 French patients undergoing coronary artery bypass or valve surgery [66].

7 Summary

While there is no doubt that vascular NO production plays a significant role in different types of vascular disease, the associations between DNA variants in the eNOS gene, the prime source of vascular NO, and vascular diseases are inconsistent. The same sequence variant can be "functional" in its association with vascular disease and gene expression in one population but completely silent in another. These inconsistent findings are further complicated by considerable differences in rare allele frequencies among ethnic populations, and by the extent of exposure to cigarette smoking, a potentially genotype-dependent modifier of eNOS expression. This inconsistency in genotype-phenotype associations has also been reported for many other genes, as reviewed in detail by Dipple and McCabe [67]. Variability in the severity of clinical phenotypes, environmental factors, independent allele contributions and gene-gene interactions have all been proposed as potential confounders and modifiers. We suggest that the observed associations between the eNOS polymorphisms and vascular diseases should be interpreted as population specific unless proven otherwise. It is high time that a properly designed study, perhaps a joint international effort, was conducted to control for confounding factors, i.e. ethnicity, cigarette smoking and clinical end-point.
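A study that takes cigarette smoking into account would typically estimate the genotype effect separately within each smoking stratum. The following Python sketch shows one common way to do this, an odds ratio with a Woolf (log-based) 95% confidence interval per stratum. The counts are hypothetical and invented purely for illustration; they are not taken from any study cited here, and the function name `odds_ratio_ci` is our own.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Woolf (log-based) confidence interval for a 2x2 table.

    a: carriers with disease,     b: carriers without disease,
    c: non-carriers with disease, d: non-carriers without disease.
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive")
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


# Hypothetical counts, invented for illustration only (not data from any study):
# (carriers with CAD, carriers without CAD,
#  non-carriers with CAD, non-carriers without CAD)
strata = {
    "smokers": (60, 20, 120, 80),
    "non-smokers": (30, 30, 100, 100),
}

for label, cells in strata.items():
    estimate, lower, upper = odds_ratio_ci(*cells)
    print(f"{label}: OR = {estimate:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

If the confidence interval excludes 1 in smokers but not in non-smokers, as with these made-up counts, the pattern is consistent with a smoking-dependent genotype effect; a formal test of interaction would still be needed.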

REFERENCES

1. Schmidt, H. H. & Walter, U. NO at work, Cell. 78 (1994) pp. 919-25.
2. Melino, G., Bernassola, F., Knight, R. A., Corasaniti, M. T., Nistico, G. & Finazzi-Agro, A. S-nitrosylation regulates apoptosis, Nature. 388 (1997) pp. 432-3.
3. Beckman, J. S. & Koppenol, W. H. Nitric oxide, superoxide, and peroxynitrite: the good, the bad, and ugly, Am J Physiol. 271 (1996) pp. C1424-37.
4. Denninger, J. W. & Marletta, M. A. Guanylate cyclase and the .NO/cGMP signaling pathway, Biochim Biophys Acta. 1411 (1999) pp. 334-50.
5. Marletta, M. A. Nitric oxide synthase: aspects concerning structure and catalysis, Cell. 78 (1994) pp. 927-30.
6. Cooke, J. P. & Dzau, V. J. Nitric oxide synthase: role in the genesis of vascular disease, Annu Rev Med. 48 (1997) pp. 489-509.
7. Sase, K. & Michel, T. Expression of constitutive endothelial nitric oxide synthase in human blood platelets, Life Sci. 57 (1995) pp. 2049-55.

8. Christopherson, K. S. & Bredt, D. S. Nitric oxide in excitable tissues: physiological roles and disease, J Clin Invest. 100 (1997) pp. 2424-9.
9. Quyyumi, A. A., Dakak, N., Andrews, N. P., Husain, S., Arora, S., Gilligan, D. M., Panza, J. A. & Cannon, R. O., 3rd. Nitric oxide activity in the human coronary circulation. Impact of risk factors for coronary atherosclerosis, J Clin Invest. 95 (1995) pp. 1747-55.
10. Rudic, R. D. & Sessa, W. C. Nitric Oxide in Endothelial Dysfunction and Vascular Remodeling: Clinical Correlates and Experimental Links, Am J Hum Genet. 64 (1999) pp. 673-677.
11. Arribas, S. M., Gonzalez, C., Graham, D., Dominiczak, A. F. & McGrath, J. C. Cellular changes induced by chronic nitric oxide inhibition in intact rat basilar arteries revealed by confocal microscopy, J Hypertens. 15 (1997) pp. 1685-93.
12. Novotna, J. & Herget, J. Exposure to chronic hypoxia induces qualitative changes of collagen in the walls of peripheral pulmonary arteries, Life Sci. 62 (1998) pp. 1-12.
13. Gibbons, G. H. & Dzau, V. J. The emerging concept of vascular remodeling, N Engl J Med. 330 (1994) pp. 1431-8.
14. Krams, R., Wentzel, J. J., Oomen, J. A., Schuurbiers, J. C., Andhyiswara, I., Kloet, J., Post, M., de Smet, B., Borst, C., Slager, C. J. & Serruys, P. W. Shear stress in atherosclerosis, and vascular remodelling, Semin Interv Cardiol. 3 (1998) pp. 39-44.
15. Oemar, B. S., Tschudi, M. R., Godoy, N., Brovkovich, V., Malinski, T. & Luscher, T. F. Reduced endothelial nitric oxide synthase expression and production in human atherosclerosis, Circulation. 97 (1998) pp. 2494-8.
16. Harrison, D. G. Cellular and molecular mechanisms of endothelial cell dysfunction, J Clin Invest. 100 (1997) pp. 2153-7.
17. Candipan, R. C., Wang, B. Y., Buitrago, R., Tsao, P. S. & Cooke, J. P. Regression or progression. Dependency on vascular nitric oxide, Arterioscler Thromb Vasc Biol. 16 (1996) pp. 44-50.
18. White, C. R., Brock, T. A., Chang, L. Y., Crapo, J., Briscoe, P., Ku, D., Bradley, W. A., Gianturco, S. H., Gore, J., Freeman, B. A. & et al. Superoxide and peroxynitrite in atherosclerosis, Proc Natl Acad Sci U S A. 91 (1994) pp. 1044-8.
19. Kroncke, K. D., Fehsel, K. & Kolb-Bachofen, V. Nitric oxide: cytotoxicity versus cytoprotection-how, why, when, and where?, Nitric Oxide. 1 (1997) pp. 107-20.
20. Huraux, C., Makita, T., Kurz, S., Yamaguchi, K., Szlam, F., Tarpey, M. M., Wilcox, J. N., Harrison, D. G. & Levy, J. H. Superoxide production, risk factors, and endothelium-dependent relaxations in human internal mammary arteries, Circulation. 99 (1999) pp. 53-9.
21. Isoherranen, K., Peltola, V., Laurikainen, L., Punnonen, J., Laihia, J., Ahotupa, M. & Punnonen, K. Regulation of copper/zinc and manganese superoxide dismutase by UVB irradiation, oxidative stress and cytokines, J Photochem Photobiol B. 40 (1997) pp. 288-93.
22. Johansson, M. H., Deinum, J., Marklund, S. L. & Sjoquist, P. O. Recombinant human extracellular superoxide dismutase reduces concentration of oxygen free radicals in the reperfused rat heart, Cardiovasc Res. 24 (1990) pp. 500-3.
23. Jackson, T. S., Xu, A., Vita, J. A. & Keaney, J. F., Jr. Ascorbate prevents the interaction of superoxide and nitric oxide only at very high physiological concentrations, Circ Res. 83 (1998) pp. 916-22.
24. Sandstrom, J., Karlsson, K., Edlund, T. & Marklund, S. L. Heparin-affinity patterns and composition of extracellular superoxide dismutase in human plasma and tissues, Biochem J. 294 (1993) pp. 853-7.
25. Poredos, P., Orehek, M. & Tratnik, E. Smoking is associated with dose-related increase of intima-media thickness and endothelial dysfunction, Angiology. 50 (1999) pp. 201-8.
26. Lekakis, J., Papamichael, C., Vemmos, C., Stamatelopoulos, K., Voutsas, A. & Stamatelopoulos, S. Effects of acute cigarette smoking on endothelium-dependent arterial dilatation in normal subjects, Am J Cardiol. 81 (1998) pp. 1225-8.
27. Blann, A. D., Kirkpatrick, U., Devine, C., Naser, S. & McCollum, C. N. The influence of acute smoking on leucocytes, platelets and the endothelium, Atherosclerosis. 141 (1998) pp. 133-9.
28. Nagy, J., Demaster, E. G., Wittmann, I., Shultz, P. & Raij, L. Induction of endothelial cell injury by cigarette smoke, Endothelium. 5 (1997) pp. 251-63.
29. Price, J. F., Mowbray, P. I., Lee, A. J., Rumley, A., Lowe, G. D. & Fowkes, F. G. Relationship between smoking and cardiovascular risk factors in the development of peripheral arterial disease and coronary artery disease: Edinburgh Artery Study, Eur Heart J. 20 (1999) pp. 344-53.
30. Pryor, W. A. & Stone, K. Oxidants in cigarette smoke. Radicals, hydrogen peroxide, peroxynitrate, and peroxynitrite, Ann N Y Acad Sci. 686 (1993) pp. 12-27; discussion 27-8.
31. Cross, C. E., van der Vliet, A. & Eiserich, J. P. Cigarette smokers and oxidant stress: a continuing mystery, Am J Clin Nutr. 67 (1998) pp. 184-5.
32. Muller, T. & Gebel, S. The cellular stress response induced by aqueous extracts of cigarette smoke is critically dependent on the intracellular glutathione concentration, Carcinogenesis. 19 (1998) pp. 797-801.
33. Sarkar, R., Gelabert, H. A., Mohiuddin, K. R., Thakor, D. K. & Santibanez-Gallerani, A. S. Effect of cigarette smoke on endothelial regeneration in vivo and nitric oxide levels, J Surg Res. 82 (1999) pp. 43-7.
34. Su, Y., Han, W., Giraldo, C., De Li, Y. & Block, E. R. Effect of cigarette smoke extract on nitric oxide synthase in pulmonary artery endothelial cells, Am J Respir Cell Mol Biol. 19 (1998) pp. 819-25.


35. Wang, X. L., Adachi, T., Sim, A. S. & Wilcken, D. E. Plasma extracellular superoxide dismutase levels in an Australian population with coronary artery disease, Arterioscler Thromb Vasc Biol. 18 (1998) pp. 1915-21.
36. Adachi, T., Wang, J. & Wang, X. L. Age-related change of plasma extracellular-superoxide dismutase, Clin Chim Acta. 290 (2000) pp. 169-178.
37. Marsden, P. A., Schappert, K. T., Chen, H. S., Flowers, M., Sundell, C. L., Wilcox, J. N., Lamas, S. & Michel, T. Molecular cloning and characterization of human endothelial nitric oxide synthase, FEBS Lett. 307 (1992) pp. 287-93.
38. Nakayama, T., Soma, M., Izumi, Y., Kanmatsuse, K. & Esumi, M. CA repeat polymorphism of the endothelial nitric oxide synthase gene in the Japanese, Hum Hered. 45 (1995) pp. 301-2.
39. Miyahara, K., Kawamoto, T., Sase, K., Yui, Y., Toda, K., Yang, L. X., Hattori, R., Aoyama, T., Yamamoto, Y., Doi, Y. & et al. Cloning and structural characterization of the human endothelial nitric-oxide-synthase gene, Eur J Biochem. 223 (1994) pp. 719-26.
40. Poirier, O., Mao, C., Mallet, C., Nicaud, V., Herrmann, S. M., Evans, A., Ruidavets, J. B., Arveiler, D., Luc, G., Tiret, L., Soubrier, F. & Cambien, F. Polymorphisms of the endothelial nitric oxide synthase gene - no consistent association with myocardial infarction in the ECTIM study, Eur J Clin Invest. 29 (1999) pp. 284-90.
41. Nakayama, M., Yasue, H., Yoshimura, M., Shimasaki, Y., Kugiyama, K., Ogawa, H., Motoyama, T., Saito, Y., Ogawa, Y., Miyamoto, Y. & Nakao, K. T-786→C mutation in the 5'-flanking region of the endothelial nitric oxide synthase gene is associated with coronary spasm, Circulation. 99 (1999) pp. 2864-70.
42. Wang, X. L., Sim, A. S., Badenhop, R. F., McCredie, R. M. & Wilcken, D. E. A smoking-dependent risk of coronary artery disease associated with a polymorphism of the endothelial nitric oxide synthase gene, Nat Med. 2 (1996) pp. 41-5.
43. Sim, A. S., Wang, J., Wilcken, D. & Wang, X. L. MspI polymorphism in the promoter of the human endothelial constitutive NO synthase gene in Australian Caucasian population, Mol Genet Metab. 65 (1998) pp. 62.
44. Cai, H., Wilcken, D. E. L. & Wang, X. L. The Glu298-Asp (894G-T) mutation at exon 7 of the endothelial nitric oxide synthase gene and coronary artery disease, Journal of Molecular Medicine. 146 (1999) pp. 511-514.
45. Miyamoto, Y., Saito, Y., Kajiyama, N., Yoshimura, M., Shimasaki, Y., Nakayama, M., Kamitani, S., Harada, M., Ishikawa, M., Kuwahara, K., Ogawa, E., Hamanaka, I., Takahashi, N., Kaneshige, T., Teraoka, H., Akamizu, T., Azuma, N., Yoshimasa, Y., Yoshimasa, T., Itoh, H., Masuda, I., Yasue, H. & Nakao, K. Endothelial nitric oxide synthase gene is positively associated with essential hypertension, Hypertension. 32 (1998) pp. 3-8.
46. Hibi, K., Ishigami, T., Tamura, K., Mizushima, S., Nyui, N., Fujita, T., Ochiai, H., Kosuge, M., Watanabe, Y., Yoshii, Y., Kihara, M., Kimura, K., Ishii, M. & Umemura, S. Endothelial nitric oxide synthase gene polymorphism and acute myocardial infarction, Hypertension. 32 (1998) pp. 521-6.
47. Shimasaki, Y., Yasue, H., Yoshimura, M., Nakayama, M., Kugiyama, K., Ogawa, H., Harada, E., Masuda, T., Koyama, W., Saito, Y., Miyamoto, Y., Ogawa, Y. & Nakao, K. Association of the missense Glu298Asp variant of the endothelial nitric oxide synthase gene with myocardial infarction, J Am Coll Cardiol. 31 (1998) pp. 1506-10.
48. Yoshimura, M., Yasue, H., Nakayama, M., Shimasaki, Y., Sumida, H., Sugiyama, S., Kugiyama, K., Ogawa, H., Ogawa, Y., Saito, Y., Miyamoto, Y. & Nakao, K. A missense Glu298Asp variant in the endothelial nitric oxide synthase gene is associated with coronary spasm in the Japanese, Hum Genet. 103 (1998) pp. 65-9.
49. Kato, N., Sugiyama, T., Morita, H., Nabika, T., Kurihara, H., Yamori, Y. & Yazaki, Y. Lack of evidence for association between the endothelial nitric oxide synthase gene and hypertension, Hypertension. 33 (1999) pp. 933-6.
50. Markus, H. S., Ruigrok, Y., Ali, N. & Powell, J. F. Endothelial nitric oxide synthase exon 7 polymorphism, ischemic cerebrovascular disease, and carotid atheroma, Stroke. 29 (1998) pp. 1908-11.
51. Lacolley, P., Gautier, S., Poirier, O., Pannier, B., Cambien, F. & Benetos, A. Nitric oxide synthase gene polymorphisms, blood pressure and aortic stiffness in normotensive and hypertensive subjects, J Hypertens. 16 (1998) pp. 31-5.
52. Cai, H., Wang, X. L., Colagiuri, S. & Wilcken, D. E. A common Glu298→Asp (894G→T) mutation at exon 7 of the endothelial nitric oxide synthase gene and vascular complications in type 2 diabetes, Diabetes Care. 21 (1998) pp. 2195-6.
53. Hingorani, A. D., Liang, C. F., Fatibene, J., Lyon, A., Monteith, S., Parsons, A., Haydock, S., Hopper, R. V., Stephens, N. G., O'Shaughnessy, K. M. & Brown, M. J. A Common Variant of the Endothelial Nitric Oxide Synthase (Glu(298)→Asp) Is a Major Risk Factor for Coronary Artery Disease in the UK, Circulation. 100 (1999) pp. 1515-1520.
54. Liyou, N., Simons, L., Friedlander, Y., Simons, J., McCallum, J., O'Shaughnessy, K., Davis, D. & Johnson, A. Coronary artery disease is not associated with the E298→D variant of the constitutive, endothelial nitric oxide synthase gene, Clin Genet. 54 (1998) pp. 528-9.
55. Tsukada, T., Yokoyama, K., Arai, T., Takemoto, F., Hara, S., Yamada, A., Kawaguchi, Y., Hosoya, T. & Igari, J. Evidence of association of the ecNOS gene polymorphism with plasma NO metabolite levels in humans, Biochem Biophys Res Commun. 245 (1998) pp. 190-3.
56. Ichihara, S., Yamada, Y., Fujimura, T., Nakashima, N. & Yokota, M. Association of a polymorphism of the endothelial constitutive nitric oxide synthase gene with myocardial infarction in the Japanese population, Am J Cardiol. 81 (1998) pp. 83-6.

Nitric Oxide Related Enzymes and Coronary Artery Disease

101

57. Yahashi, Y., Kario, K., Shimada, K. & Matsuo, M. The 27-bp repeat polymorphism in intron 4 of the endothelial cell nitric oxide synthase gene and ischemic stroke in a Japanese population, Blood Coagul Fibrinolysis. 9 (1998) pp. 405-9. 58. Uwabo, J., Soma, M., Nakayama, T. & Kanmatsuse, K. Association of a variable number of tandem repeats in the endothelial constitutive nitric oxide synthase gene with essential hypertension in Japanese, Am J Hypertens. 11 (1998) pp. 125-8. 59. Odawara, M., Sasaki, K., Tachi, Y. & Yamashita, K. Endothelial nitric oxide synthase gene polymorphism and coronary heart disease in Japanese NIDDM [letter], Diabetologia. 41 (1998) pp. 365-6. 60. Park, J. E., Lee, W. H., Hwang, T. H., Chu, J. A., Kim, S., Choi, Y. H., Kim, J. S., Kim, D. K., Lee, S. H., Hong, K. P., Seo, J. D. & Lee, W. R. Aging affects the association between endothelial nitric oxide synthase gene polymorphism and acute myocardial infarction in the Korean male population, Korean J Intern Med. 15 (2000) pp. 65-70. 61. Wang, X. L., Mahaney, M. C , Sim, A. S., Wang, J., Blangero, J., Almasy, L., Badenhop, R. B. & Wilcken, D. E. Genetic contribution of the endothelial constitutive nitric oxide synthase gene to plasma nitric oxide levels, Arterioscler Thromb Vase Biol. 17 (1997) pp. 3147-53. 62. Hooper, W. C , Lally, C , Austin, H., Benson, J., Dilley, A., Wenger, N. K., Whitsett, C , Rawlins, P. & Evatt, B. L. The relationship between polymorphisms in the endothelial cell nitric oxide synthase gene and the platelet GPIIIa gene with myocardial infarction and venous thromboembolism in African Americans, Chest. 116 (1999) pp. 880-6. 63. Akar, N., Akar, E., Cin, S., Deda, G., Avcu, F. & Yalcin, A. Endothelial nitric oxide synthase intron 4, 27 bp repeat polymorphism in Turkish patients with deep vein thrombosis and cerebrovascular accidents, Thromb Res. 94 (1999) pp. 63-4. 64. Wang, X. L., Sim, A. S., Wang, M. X., Murrell, G. A., Trudinger, B. & Wang, J. 
Genotype dependent and cigarette specific effects on endothelial nitric oxide synthase gene expression and enzyme activity, FEBS Lett. 471 (2000) pp. 4550. 65. Tesauro, M., Thompson, W. C , Rogliani, P., Qi, L., Chaudhary, P. P. & Moss, J. Intracellular processing of endothelial nitric oxide synthase isoforms associated with differences in severity of cardiopulmonary diseases: cleavage of proteins with aspartate vs. glutamate at position 298, Proc Natl Acad Sci U S A. 97 (2000) pp. 2832-5.

102

X. L Wang

66. Philip, I., Plantefeve, G., Vuillaumier-Barrot, S., Vicaut, E., LeMarie, C , Henrion, D., Poirier, O., Levy, B. I., Desmonts, J. M., Durand, G. & Benessiano, J. G894T polymorphism in the endothelial nitric oxide synthase gene is associated with an enhanced vascular responsiveness to phenylephrine, Circulation. 99 (1999) pp. 3096-8. 67. Dipple, K. M. & McCabe, E. R. Phenotypes of Patients with "Simple" Mendelian Disorders Are Complex Traits: Thresholds, Modifiers, and Systems Dynamics, Am J Hum Genet. 66 (2000) pp. 17291735.

PATHWAYS, COMPARTMENTATION AND GENE EVOLUTION

C. SCHNARRENBERGER¹ AND C.F. MARTIN²

¹Freie Universität Berlin, Institut für Biologie, Königin-Luise-Str. 12-16a, 14195 Berlin, Germany
E-mail: [email protected]

²Heinrich-Heine Universität Düsseldorf, Institut für Botanik III, Universitätsstr. 1, 40225 Düsseldorf, Germany
E-mail: [email protected]

A gene-for-gene phylogeny approach has been used to study the evolution of the central pathways of carbohydrate metabolism distributed between chloroplasts, the cytosol, mitochondria and glyoxysomes in higher plants: the Calvin cycle, glycolysis/gluconeogenesis, the tricarboxylic acid (TCA) cycle, and the glyoxylate cycle. In general, the gene trees support the view that the nuclear genes for Calvin cycle enzymes were acquired from the cyanobacterial progenitors of plastids and were transferred to the nucleus. But the genes for two Calvin cycle enzymes appear to have originated from α-proteobacterial (mitochondrial) progenitors, the encoded products having been routed to a new organelle subsequent to gene transfer. The genes for glycolytic/gluconeogenetic enzymes in the cytosol also seem to be acquisitions from organellar genomes. Most TCA cycle genes originated from α-proteobacterial progenitors, with two notable exceptions that seem to represent inheritances from archaebacteria. The genes for glyoxylate cycle enzymes show marked proteobacterial affinities, whereby some have originated through duplications of preexisting genes for mitochondrial isoenzymes. Overall, the trees underscore the role of gene transfer, gene duplication and gene diversification in the evolution of compartmentalized metabolism in higher plants.

1 Introduction

Genome sequencing has produced burgeoning amounts of data for enzymes of primary metabolism for representatives of several higher categories: archaebacteria, eubacteria, animals, fungi, and plants. In the past few years, our interest has focussed on understanding how individual genes for the enzymes specific to higher plant cell compartments arose during organismal evolution. Organismal evolution is mostly deduced from a series of morphological and biochemical characters as well as, more recently, through molecular phylogenetics, most notably involving small subunit rRNA sequence comparisons. In studying the evolution of compartmentalized metabolism, we have found it useful to consider the evolution not only of individual enzymes but also of pathways in toto, in order to obtain insights that would be unattainable through the consideration of single enzymes alone.

Pathways are generally considered as physiological units of function which accomplish the conversion of a substrate through a series of intermediates to a defined product. Conventional wisdom concerning pathway evolution would have us suppose that metabolic pathways were assembled from their constituent building blocks (enzymes) once during the ancient phases of cell evolution and that they were maintained and vertically inherited as intact units from one generation to the next, right up to today's organisms. This view, though appealing in many ways, has gone largely untested over the years.

Today, phylogenetic analyses can help to clarify the evolution of individual genes. By performing phylogenetic analyses on a gene-for-gene basis for a given pathway, we can retrace the footsteps of evolution for an entire pathway and thus draw inferences about how some biochemical pathways actually evolved.

For pathways specific to chloroplasts and mitochondria, endosymbiosis and gene transfer to the nucleus during the process of organelle genome reduction are very important issues, because they represent the transition from prokaryotic to eukaryotic genes and organisms. For such transitions, gene transfers through endosymbiotic events involving cyanobacteria in the case of chloroplasts and α-proteobacteria in the case of mitochondria are well known (see Figure 1). These organelles still possess some of their own original eubacterial DNA, yet most of it was transferred to the nucleus through as yet unknown mechanisms. The genes of the nuclear-cytoplasmic system are generally supposed to have originated from archaebacteria, as deduced from analyses of sequences of 18S rRNA and of proteins of the transcription and translation systems.

In this paper we present some of the conclusions that we have obtained while analyzing the evolution of pathways which are typical for various plant cell compartments. Among pathways of sugar metabolism, the Calvin cycle is specific for chloroplasts. Its enzymes are related to several homologues in the pathways of glycolysis/gluconeogenesis in the cytosol of plants and other eukaryotes. In a more recent study we have also analyzed the genes of the TCA cycle in mitochondria and of the glyoxylate cycle in microbodies and the cytosol.

2 Methods

Sequence analyses were performed with the GCG program package [5]. Amino acid sequences were aligned with ClustalW [27]. The alignments were refined by hand, eliminating amino acid positions which were not present in at least 40% of the sequences. Pairwise distances between sequences were estimated using the Dayhoff matrix option of protdist in PHYLIP [6]. Trees were constructed by the neighbor-joining method [23]. Reliability was estimated from 100 bootstrap samples.
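
The workflow described above (column filtering, pairwise distances, neighbor-joining, bootstrap resampling) can be illustrated with a minimal sketch. It is not the GCG/ClustalW/PHYLIP tool chain used for the analyses: the function names are hypothetical, the distance is a simple p-distance used here as a stand-in for the Dayhoff-matrix distance of protdist, and the neighbor-joining routine returns only a topology without branch lengths. The toy sequences are invented for illustration.

```python
import random
from itertools import combinations

def filter_columns(alignment, min_occupancy=0.4, gap="-"):
    """Keep alignment columns in which at least min_occupancy of the
    sequences carry a residue (the 40% rule from the text)."""
    seqs = list(alignment.values())
    keep = [i for i in range(len(seqs[0]))
            if sum(s[i] != gap for s in seqs) / len(seqs) >= min_occupancy]
    return {name: "".join(s[i] for i in keep) for name, s in alignment.items()}

def p_distance(a, b, gap="-"):
    """Fraction of differing residues over positions ungapped in both
    sequences (assumption: a stand-in for a Dayhoff-corrected distance)."""
    pairs = [(x, y) for x, y in zip(a, b) if x != gap and y != gap]
    if not pairs:
        raise ValueError("no shared ungapped positions")
    return sum(x != y for x, y in pairs) / len(pairs)

def distance_matrix(alignment):
    """Symmetric distance lookup keyed by (name, name)."""
    d = {(n, n): 0.0 for n in alignment}
    for a, b in combinations(alignment, 2):
        d[a, b] = d[b, a] = p_distance(alignment[a], alignment[b])
    return d

def neighbor_joining(names, d):
    """Neighbor-joining topology as nested tuples; the final three
    nodes are returned as an unrooted three-way join."""
    nodes, d = list(names), dict(d)
    while len(nodes) > 3:
        n = len(nodes)
        r = {a: sum(d[a, b] for b in nodes if b != a) for a in nodes}
        # Join the pair that minimizes the Q criterion.
        a, b = min(combinations(nodes, 2),
                   key=lambda p: (n - 2) * d[p] - r[p[0]] - r[p[1]])
        new = (a, b)
        for c in nodes:
            if c not in (a, b):
                d[new, c] = d[c, new] = 0.5 * (d[a, c] + d[b, c] - d[a, b])
        nodes = [c for c in nodes if c not in (a, b)] + [new]
    return tuple(nodes)

def bootstrap_alignments(alignment, n_replicates=100, seed=0):
    """Yield pseudo-alignments built by resampling columns with replacement."""
    rng = random.Random(seed)
    length = len(next(iter(alignment.values())))
    for _ in range(n_replicates):
        cols = [rng.randrange(length) for _ in range(length)]
        yield {n: "".join(s[i] for i in cols) for n, s in alignment.items()}

# Toy alignment (invented sequences, for illustration only).
toy = {"A": "MKV-LT", "B": "MKI-LT", "C": "MRIQ-S", "D": "-RI--S"}
filtered = filter_columns(toy)
tree = neighbor_joining(list(filtered), distance_matrix(filtered))
print(filtered, tree)
```

In a full analysis each bootstrap replicate would be passed through the distance and tree-building steps again, and the support for a branch would be the fraction of replicate trees that contain it.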

[Figure 1: diagram of a eukaryotic cell (nucleus, cytosol, microbodies) with lineages labeled Plants, Animals, Fungi, and Protists/Algae.]

Fig. 1. Model for gene origins and gene transfers during eukaryotic evolution. Bold lines indicate cell partners implicated in the evolution of eukaryotic cells. Solid lines indicate gene transfers. Dotted lines indicate intracellular routing and compartmentation of nuclear-encoded proteins. Triangles indicate organismic diversification and gene diversification within major phyla.

3 Results

3.1 The origin of the Calvin cycle genes

There are 11 nuclear-encoded genes and 1 chloroplast-encoded gene for the 13 reactions catalyzed in the Calvin cycle (aldolase and transketolase are bifunctional enzymes and Rubisco is encoded by both a nuclear and a plastidic gene). The trees of the nuclear-encoded genes support the view that the chloroplast enzymes Rubisco (L8S8 form), 3-P-glycerate kinase, NADP-glyceraldehyde-3-P DH, ribulose-5-P epimerase, transketolase, and P-ribulose kinase originated from cyanobacterial progenitor genes (for summary see [16]). This is deduced from the finding that the respective chloroplast protein sequences have the highest degree of identity with the cyanobacterial proteins. At least two genes (triose-P isomerase and fructose-1,6-bisPase) originate from α-proteobacteria because their sequence identity is much higher with the respective α-proteobacterial than with the cyanobacterial genes. They most likely replaced originally present cyanobacterial gene products. In the two instances of aldolase and sedoheptulose-1,7-bisPase the origin of the genes cannot yet be determined due to insufficient availability of outgroups (see also Figure 2).
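
The reasoning in this paragraph, namely assigning a gene to the donor lineage whose homologue it resembles most, can be sketched as a simple comparison. This is only a simplified illustration: the conclusions in the text rest on gene trees, the function name `infer_donor` and the `min_margin` cutoff are hypothetical, and the numbers in the example are placeholders rather than measured identities.

```python
def infer_donor(identities, min_margin=5.0):
    """Return the candidate donor lineage with the highest percent identity.

    identities: dict mapping a candidate donor lineage to the percent
        identity between the plant enzyme and that lineage's homologue.
    min_margin: hypothetical margin (percentage points); if the best and
        second-best lineages are closer than this, report "uncertain".
    """
    if len(identities) < 2:
        raise ValueError("need at least two candidate lineages")
    ranked = sorted(identities.items(), key=lambda kv: kv[1], reverse=True)
    (best, best_id), (_, runner_up_id) = ranked[0], ranked[1]
    if best_id - runner_up_id < min_margin:
        return "uncertain"
    return best

# Placeholder values for illustration only (not data from the study).
example = {"cyanobacteria": 70.0, "alpha-proteobacteria": 45.0}
print(infer_donor(example))
```

A near-tie is reported as uncertain rather than forced into one lineage, which mirrors the cases in the text where the origin cannot yet be determined.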

There is a further complication in the gene analyses. Fructose-1,6-bisPase and sedoheptulose-1,7-bisPase sequences belong to the same gene family. In eubacteria, from which all eukaryotic genes of fructose-1,6-bisPase evolved, there are only bifunctional enzymes, active with both substrates. However, there are separate genes encoding the two activities in higher plants. The sequence of the higher plant sedoheptulose-1,7-bisPase shows only very low, but significant, identity with bacterial and eukaryotic sequences, so that its origin cannot be evaluated at present [15].

[Figure 2: pathway diagram. Legend: enzymes encoded by genes of cyanobacterial origin; enzymes encoded by genes of α-proteobacterial origin; origin of genes yet uncertain.]

Fig. 2. Enzymes and isoenzymes of the Calvin cycle and of glycolysis/gluconeogenesis in the chloroplasts and in the cytosol of higher plants. The highest sequence similarity with cyanobacterial (Synechocystis) and α-proteobacterial (Ralstonia) homologues is indicated by different fill patterns of subunits for individual enzymes. FT, Phosphate translocator.

3.2 The problem of class I and class II genes of the Calvin cycle

There are many cases of so-called class I and class II enzymes, i.e. enzymes with the same catalytic activity that are encoded by genes sharing no significant sequence identity. Among them are Rubisco [7, 29], aldolase [21], glyceraldehyde-3-P dehydrogenase [3], and P-ribulokinase [7, 28, 29]. The occurrence of such proteins implies a functional convergence of non-related genes in some cases (aldolase) and residual similarity of extremely distantly related proteins in other cases (Rubisco). If two enzymes share more than 25% amino acid sequence identity, we consider this to be significant evidence for common ancestry. For comparison, randomly selected genes have about 18% sequence identity.

The most prominent example is Rubisco. The class I enzyme is hexadecameric (L8S8), the class II enzyme is homodimeric. The reaction mechanism of the two types of enzymes is very similar. Class I Rubisco is typical for cyanobacteria and eukaryotes. Class II Rubisco was originally found only in bacteria, but has recently also been discovered in eukaryotic dinoflagellates [17, 20]. Although the sequence identity is about 29% between the large subunit of class I Rubisco and the subunit of class II Rubisco, they behave like typical class I/II enzymes. There is no report of both classes of Rubisco in one and the same organism, not even in bacteria. It is also remarkable that the gene for the small subunit of the L8S8 Rubisco is usually encoded by nuclear DNA, but by plastidic DNA in the alga Cyanophora paradoxa [12].

The most curious example of class I and class II enzymes is found among aldolases. The reaction mechanism is very different for class I and class II enzymes [21]. Class I aldolases form a Schiff base as reaction intermediate and class II aldolases an enediol intermediate. Both classes of enzymes have been found in bacteria and eukaryotes (see Figure 3). Some eubacteria and some archaebacteria even possess both classes of enzymes. In eukaryotes, the coexistence of class I and class II aldolases in one organism is observed only in photosynthetic algae which have aldolases in different cellular compartments, i.e. in the plastids and in the cytosol [1, 8, 13, 26]. As for the evolution of class I aldolases, the chloroplast enzyme of higher plants can hardly descend from a cyanobacterial precursor gene because known cyanobacterial aldolases are all of the class II type and, thus, not related to class I aldolases [19]. However, Cyanophora paradoxa (Glaucocystophyta) has a cytosolic and a plastidic class II aldolase, indicating that an endosymbiotic transfer of a class II aldolase from cyanobacteria to eukaryotes took place in at least this lineage [8]. The other eukaryotic class II aldolases analyzed so far are the cytosolic enzymes of yeast and Euglena [19].
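
The 25% identity criterion stated earlier in this section can be made concrete with a short sketch. The function names are hypothetical, the sequences are invented, and counting identity only over positions that are ungapped in both sequences is an assumption, since the text does not specify the denominator.

```python
def percent_identity(a, b, gap="-"):
    """Percent identical residues between two aligned sequences,
    counted over positions that are ungapped in both (assumption)."""
    if len(a) != len(b):
        raise ValueError("sequences must come from the same alignment")
    pairs = [(x, y) for x, y in zip(a, b) if x != gap and y != gap]
    if not pairs:
        return 0.0
    return 100.0 * sum(x == y for x, y in pairs) / len(pairs)

def suggests_common_ancestry(a, b, threshold=25.0):
    """Apply the criterion from the text: more than 25% identity is
    taken as significant evidence for common ancestry."""
    return percent_identity(a, b) > threshold

# Invented toy sequences, for illustration only.
seq1 = "MKVLAAGIV"
seq2 = "MKILSAG-V"
print(percent_identity(seq1, seq2), suggests_common_ancestry(seq1, seq2))
```

Because unrelated sequences already share roughly 18% identity according to the text, values just above the threshold deserve caution, as the Rubisco case (about 29% between the two classes) illustrates.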

[Figure 3: diagram showing the aldolase classes (I, II, or both) found in Higher plants, Phaeophyte, Xanthophyceae, Centrales, Vertebrate, Cyanophora, Rhodophyte, Archaebacteria, Eubacteria, and Cyanobacteria, with chloroplasts from primary and from secondary endosymbiosis indicated.]

Fig. 3. Distribution of Class I and Class II aldolases in various taxa.

3.3 Multiple origin of chloroplast and cytosol isozymes

Plants contain two sets of isozymes of glycolysis and gluconeogenesis in the plastids and in the cytosol, respectively [4, 26]. Among the Calvin cycle enzymes there are five activities that are involved in glycolysis/gluconeogenesis in the cytosol and in the Calvin cycle in chloroplasts: 3-P-glycerate kinase, glyceraldehyde-3-P DH, triose-P isomerase, aldolase, and fructose-1,6-bisPase. The other enzymes of the Calvin cycle are present as single enzymes in the chloroplasts only [25]. The gene duplications giving rise to isozymes of glycolysis/gluconeogenesis occurred (a) already in bacterial evolution, as for glyceraldehyde-3-P DH [14], (b) in early eukaryotic evolution, as for aldolase [19], and (c) in early plant evolution, as for 3-P-glycerate kinase [3], triose-P isomerase [10, 22], and fructose-1,6-bisPase [16].

Aldolase isozymes provide another surprise. While a chloroplast and a cytosol aldolase of the class I type are present in higher plants, the glaucocystophytic alga Cyanophora paradoxa has two such isozymes of the class II type. Since class I and II aldolases do not have any sequence similarity and, more importantly, have different mechanisms, they can hardly have evolved from a common ancestor. Instead, one has to suppose that the chloroplast- and cytosol-specific isozymes developed twice independently in evolution. In a sequence tree the cytosolic enzymes of the class II type from fungi and Euglena gracilis cluster together and evolved from a bacterial subcluster different from that of the plastidic class II aldolase from Cyanophora paradoxa, which evolved from another bacterial cluster related to cyanobacteria (Schnarrenberger and Martin, unpublished). Therefore, the gene for the plastidic aldolase of Cyanophora paradoxa documents that a transfer of this gene from cyanobacteria to eukaryotes took place at least once in evolution.

3.4 Evolution of TCA cycle and glyoxylate cycle genes

The nine reactions of the TCA cycle reside inside the mitochondria. Because α-oxoglutarate DH showed homology to subunits of two other enzymes, we also included the branched-chain α-oxoacid DH and pyruvate DH in our analyses. Closely related to the TCA cycle is the glyoxylate cycle, consisting of five enzymes [11]. Two of the reactions are specific for the glyoxylate cycle (isocitrate lyase and malate synthase). The other three reactions (citrate synthase, aconitase, and malate DH) also occur in the TCA cycle and are catalyzed by isozymes. In eukaryotes four of the reactions are located inside the microbodies (termed glyoxysomes in plants) and one (aconitase) is located in the cytosol. The gene trees for the enzymes of the TCA cycle, pyruvate DH, and the glyoxylate cycle have not yet been published and would exceed available space here (Schnarrenberger and Martin, in preparation).

According to the gene trees, most of the TCA cycle genes evolved from α-proteobacterial progenitor genes; some, however, appear to originate from γ-proteobacterial progenitors. It should be noted that large α-proteobacterial genomes (such as Rhodobacter or Bradyrhizobium) are not yet available for comparison. An origin from archaebacteria is implied for NAD- and NADP-dependent isocitrate DH. Overall, a complete transfer of genes for the TCA cycle through a mitochondrial endosymbiotic process appears very likely, possibly with the exception of the two isocitrate dehydrogenase genes from archaebacteria.

The situation for glyoxylate cycle genes is very different. Some of the genes originate from γ-proteobacteria, others from gene duplications of eukaryotic nuclear genes encoding mitochondrial TCA cycle enzymes. Duplications of genes for mitochondrial enzymes may have taken place several times and independently in evolution, i.e. in early eukaryotic evolution or in early plant, early animal and early fungal evolution, respectively. Thus, channeling of gene products into microbodies has often taken place in evolution. There is no preference for such processes with respect to PTS1 signals (peroxisomal targeting signal 1, the C-terminal SKL motif for organelle import) or PTS2 signals (the N-terminal targeting sequence).

3.5 Class I and class II enzymes for the TCA cycle

There are also genes of the class I and class II type for the TCA cycle, as for the genes of the Calvin cycle (see above). This has been recognized for fumarase (two separate gene classes, with class I enzymes found in bacteria only). The present sample suggests that only the class II fumarase gene was inherited by eukaryotes from bacteria. A similar situation was recognized for isocitrate DH. In eukaryotes there is an NAD-dependent and an NADP-dependent isozyme, both being localized in mitochondria. The sequences for the two enzymes are not related and are typical class I and class II enzymes. Both genes were transmitted from bacteria to eukaryotes. Most surprising is the observation that the eubacterial sequences of the gene family for eukaryotic NAD-dependent isocitrate DH encode only NADP-specific enzymes. The tree for the eukaryotic NADP-dependent isocitrate DH has only a single archaebacterial sequence from which it originates. The mitochondrial NADP-dependent, but not the NAD-dependent, isocitrate DH has an additional isoenzyme in the chloroplasts. This is the result of a duplication of the mitochondrial gene of plants. Animals have an NADP-dependent isozyme in the cytosol, also the product of a duplication of the mitochondrial gene of animals.

3.6 Isozymes of the TCA cycle

There are many cases of compartment-specific isozymes for TCA cycle enzymes. Most abundant are the isozymes of malate DH. In plants there are genes encoding NAD-specific isozymes for mitochondria, the cytosol, microbodies, and chloroplasts [2, 30]. In chloroplasts there is an additional NADP-specific malate DH [9]. In fungi and animals there are only isozymes in mitochondria, microbodies and the cytosol. In a gene tree the mitochondrial, microbody and chloroplast sequences for the NAD-specific enzymes originate from one eubacterial root and form a large cluster, implying gene duplications for isozymes within eukaryotic evolution. Cytosolic isozymes evolved from a different bacterial root and show high identity to some other sequences of eubacteria. From this latter cluster the sequences for the NADP-dependent malate DH evolved. This is particularly surprising as the sequence of the NAD-dependent malate DH in chloroplasts is associated with the cluster of mitochondrial and microbody sequences.

Citrate synthase isozymes are found in mitochondria and in glyoxysomes of plants [24]. The gene for the glyoxysomal enzyme of plants evolved from the same eubacterial progenitor as that for the mitochondrial enzyme. In fungi the genes for the mitochondrial and microbody citrate synthase are all products of gene duplications within fungal evolution.

Finally, the genes for mitochondrial and cytosolic aconitase evolved from different eubacterial roots. While the cytosolic plant enzyme has been shown to have aconitase activity, the cytosolic protein of animals seems to be an iron-responsive element-binding protein.

3.7 Evolution of enzymes with heteromeric subunit structure

α-Ketoglutarate DH is composed of at least three different subunits. The subunit structure is very similar to that of pyruvate DH and branched-chain α-ketoacid DH. Subunit E3 is one and the same protein for all three enzymes. Subunit E2 determines the substrate specificity of the three enzymes. All these subunits belong to one gene family, but the sequences encoding the different substrate specificities form separate subfamilies. Each of these subfamilies has representatives that were transferred from eubacteria to eukaryotes. The E1 subunits are related only between α-ketoglutarate DH and branched-chain α-ketoacid DH, while the E1 subunit of pyruvate DH is not related to the other E1 subunits.

Finally, both succinate DH and succinyl-CoA synthase consist of two nonrelated subunits. The two subunits of succinyl-CoA synthase share homologies with different parts of ATP-citrate lyase, one with the N-terminal and the other with the C-terminal part. Therefore, the separation of the ATP-citrate lyase into two parts could have led to the creation of another enzyme, succinyl-CoA synthase, although the converse scenario is also imaginable.

4 Discussion

The gene trees for the individual genes of four pathways (Calvin cycle, glycolysis/gluconeogenesis, TCA cycle, and glyoxylate cycle) produced evidence for extensive gene transfer from eubacteria to eukaryotes and very little evidence for enzymes that were directly inherited from the archaebacterial contributor of eukaryotic genes. Extensive structural and functional diversification is also observed. This includes functional convergence of class I and II enzymes in several cases. Also, gene duplications for isozyme formation in different cellular compartments can easily be followed by gene tree analysis and give a much better understanding of the relationships among such isozymes. One can envisage a continuous process of gene duplications and new compartmentation variants during evolution, whereby selection has produced the patterns we observe today. The origin of compartment-specific isozymes is certainly not the product of a one-time event in evolution, and in different lineages pathways can show very different patterns of compartmentation.

Another aspect is that, while the majority of genes support the origin of chloroplast proteins from cyanobacterial progenitors and of mitochondrial proteins from α-proteobacterial progenitors, the glyoxysome itself represents a case in which a novel organelle that probably does not descend from a free-living eubacterium has secondarily imported a number of preexisting enzymes. In this context it should be pointed out that glycolysis, usually compartmented in the cytosol of eukaryotic cells, is almost exclusively located in the microbodies (glycosomes) of trypanosomes [18]. These genes are related to those of glycolysis in other eukaryotes and have been differently compartmented in this lineage through the addition of the respective targeting signals.

5 Acknowledgements

The authors thank the Deutsche Forschungsgemeinschaft for financial support.

References

1. Antia N.J., Comparative studies on aldolase activity in marine planktonic algae and their evolutionary significance. J. Phycol. 3 (1967) pp. 81-84.
2. Berkemeyer M., Scheibe R. and Ocheretina O., A novel, non-redox-regulated NAD-dependent malate dehydrogenase from chloroplasts of Arabidopsis thaliana. J. Biol. Chem. 273 (1998) pp. 27927-27933.
3. Brinkmann H. and Martin W., Higher plant chloroplast and cytosolic 3-phosphoglycerate kinases: A case of endosymbiotic gene replacement. Plant Mol. Biol. 30 (1996) pp. 65-75.
4. Dennis D.T. and Miernyk J.A., Compartmentation of nonphotosynthetic carbohydrate metabolism. Ann. Rev. Plant Physiol. 33 (1982) pp. 27-50.
5. Devereux J., Haeberli P. and Smithies O., A comprehensive set of sequence analysis programs for the VAX. Nucl. Acids Res. 12 (1984) pp. 387-395.
6. Felsenstein J., PHYLIP manual version 3.5c. (Distributed by the author, Department of Genetics, University of Washington, Seattle, 1993).
7. Gibson J.L. and Tabita F.R., The molecular regulation of the reductive pentose phosphate pathway in proteobacteria and cyanobacteria. Arch. Microbiol. 166 (1996) pp. 141-150.
8. Gross W., Bayer M.G., Schnarrenberger C., Gebhart U.B., Maier T.L. and Schenk H.E.A., Two distinct aldolases of class II type in the cyanoplasts and in the cytosol of the alga Cyanophora paradoxa. Plant Physiol. 105 (1994) pp. 1393-1398.
9. Hatch M.D. and Slack C.R., NADP-specific malate dehydrogenase and glycerate kinase in leaves and evidence for their location in chloroplasts. Biochem. Biophys. Res. Commun. 34 (1969) pp. 589-593.
10. Henze K., Schnarrenberger C., Kellermann J. and Martin W., Chloroplast and cytosolic triosephosphate isomerases from spinach: purification, microsequencing and cDNA cloning of the chloroplast enzyme. Plant Mol. Biol. 26 (1994) pp. 1961-1973.
11. Kornberg H.L. and Krebs H.A., Synthesis of cell constituents from C2-units by a modified tricarboxylic acid cycle. Nature 179 (1957) pp. 988-991.
12. Lambert D.H., Bryant D.A., Stirewalt V.L., Dubbs J.M., Stevens S.E. Jr. and Porter R.D., Gene map for the Cyanophora paradoxa cyanelle genome. J. Bacteriol. 164 (1985) pp. 659-664.
13. Marsh J.J. and Lebherz H.G., Fructose-bisphosphate aldolases: an evolutionary history. Trends Biochem. Sci. 17 (1992) pp. 110-113.
14. Martin W., Brinkmann H., Savona C. and Cerff R., Evidence for a chimaeric nature of nuclear genomes: Eubacterial origin of eukaryotic glyceraldehyde-3-phosphate dehydrogenase genes. Proc. Natl. Acad. Sci. USA 90 (1993) pp. 8692-8696.
15. Martin W., Mustafa A.-Z., Henze K. and Schnarrenberger C., Higher plant chloroplast and cytosolic fructose-1,6-bisphosphatase isozymes: Origins via duplication rather than prokaryote-eukaryote divergence. Plant Mol. Biol. 32 (1996) pp. 485-491.
16. Martin W. and Schnarrenberger C., The evolution of the Calvin cycle from prokaryotic to eukaryotic chromosomes: A case study of functional redundancy in ancient pathways through endosymbiosis. Curr. Genet. 32 (1997) pp. 1-18.
17. Morse D., Salois P., Markovic P. and Hastings J.W., A nuclear-encoded form II RuBisCO in dinoflagellates. Science 268 (1995) pp. 1622-1624.
18. Opperdoes F.R. and Michels P.A., The glycosomes of the Kinetoplastida. Biochimie 75 (1993) pp. 231-234.
19. Plaumann M., Pelzer-Reith B., Martin W. and Schnarrenberger C., Multiple recruitment of class-I aldolase to chloroplasts and eubacterial origin of eukaryotic class-II aldolases revealed by cDNAs from Euglena gracilis. Curr. Genet. 31 (1997) pp. 430-438.
20. Rowan R., Whitney S.M., Doolittle R.F. and Doolittle W.F., Rubisco in marine symbiotic dinoflagellates: form II enzymes in eukaryotic oxygenic phototrophs encoded by a nuclear multigene family. Plant Cell 8 (1996) pp. 539-553.
21. Rutter W.J., Evolution of aldolase. Fed. Proc. 23 (1964) pp. 1248-1257.
22. Schmidt M., Svendsen I. and Feierabend J., Analysis of the primary structure of the chloroplast isozyme of triosephosphate isomerase from rye leaves by protein and cDNA sequencing indicates a eukaryotic origin of its gene. Biochim. Biophys. Acta 1261 (1995) pp. 257-264.
23. Saitou N. and Nei M., The neighbor-joining method: a new method for reconstructing phylogenetic trees. Mol. Biol. Evol. 4 (1987) pp. 406-425.
24. Schnarrenberger C., Fitting K.-H., Tetour M. and Zehler H., Inactivation of the glyoxysomal citrate synthase from castor bean endosperm by 5,5'-dithiobis(2-nitrobenzoic acid) (DTNB). Protoplasma 103 (1980) pp. 299-307.
25. Schnarrenberger C., Flechner A. and Martin W., Enzymatic evidence indicating a complete oxidative pentose phosphate pathway in the chloroplasts and an incomplete pathway in the cytosol of spinach leaves. Plant Physiol. 108 (1995) pp. 609-614.
26. Schnarrenberger C., Gross W., Pelzer-Reith B., Wiegand S. and Jakobshagen S., The evolution of isozymes of sugar phosphate metabolism in algae. In Phylogenetic Changes in Peroxisomes of Algae, ed. by H. Stabenau and N.E. Tolbert (University of Oldenburg, 1992) pp. 310-329.
27. Thompson J.D., Higgins D.G. and Gibson T.J., CLUSTAL W: improving the sensitivity of progressive multiple-sequence alignment through sequence weighting, position-specific gap penalties and weight-matrix choice. Nucl. Acids Res. 22 (1994) pp. 4673-4680.
28. Tabita F.R., Molecular and cellular regulation of autotrophic carbon dioxide fixation in microorganisms. Microbiol. Rev. 52 (1988) pp. 155-189.
29. Tabita F.R., The biochemistry and molecular regulation of carbon dioxide metabolism in cyanobacteria. In The Molecular Biology of Cyanobacteria, ed. by D.A. Bryant (Kluwer, Dordrecht, 1994) pp. 437-467.
30. Yamazaki R.K. and Tolbert N.E., Malate dehydrogenase in leaf peroxisomes. Biochim. Biophys. Acta 178 (1969) pp. 11-20.

TOMATO CF GENES FOR RESISTANCE TO CLADOSPORIUM FULVUM

COLWYN M. THOMAS¹, MARK S. DIXON², AND JONATHAN D. G. JONES¹

¹Sainsbury Laboratory, John Innes Centre, Norwich Research Park, Colney Lane, Norwich NR4 7UH, UK

²School of Biological Sciences, University of Southampton, Southampton, SO16 7PX, UK

E-mail: colwyn.[email protected]

In many plant-pathogen interactions resistance to disease is determined by the interaction of plant resistance (R) genes and pathogen avirulence (Avr) genes. To determine the molecular basis of pathogen perception by plants, we study the interaction between tomato and the leaf mould pathogen Cladosporium fulvum. Tomato Cf genes confer resistance to C. fulvum infection through recognition of distinct Avr proteins. Cf proteins are predicted extracellular membrane-anchored leucine-rich repeat (LRR) glycoprotein receptors. Cf loci are genetically complex, and genes from the same locus encode proteins that are more than 90% identical. Most amino acid sequence differences correspond to putative solvent-exposed residues within a conserved structural motif of LRR proteins that determines recognition specificity. Sequence analysis of Cf gene loci has also provided insight into the molecular mechanisms that generate R gene diversity.

1 Introduction

Plants can activate localized and systemic defense mechanisms in response to pathogen infection [1]. Resistance to disease is often determined by the interaction of dominant plant resistance (R) genes and dominant pathogen avirulence (Avr) genes, the so-called "gene-for-gene" interaction. It has been postulated that R genes encode receptors for Avr gene products [2]. Plant defences are activated in cells targeted by the pathogen, and this is often manifested as the hypersensitive response (HR), a localized cell death resulting in the arrest of pathogen ingress [1]. This contrasts with the situation in mammals, where circulating defender cells migrate to the sites of pathogen attack. In mammals the elaboration of recognition capacity and resistance to an array of potential pathogens occurs somatically through the adaptive immune system and germinally in genes encoding the major histocompatibility complex (MHC) [3]. In plants, however, R gene sequence variation is transmitted exclusively through the germline. Analysis of many R gene loci, such as the tomato Cf loci [4,5,6], the L and M loci in flax [7], the Rp1 locus in maize [8], and various RPP loci in Arabidopsis [9], has shown that they are genetically complex. A major goal in plant pathology is to understand the molecular basis for pathogen perception by R proteins and to determine the underlying molecular mechanisms that generate R gene diversity and novel recognition specificities. This understanding may enable the rational design of R proteins that recognize defined pathogen molecules, and the deployment of durable disease resistance genes in agriculture.

Most isolated R genes appear to encode cytoplasmic proteins that contain a central nucleotide binding site (NB) domain and a C-terminal domain comprised of leucine-rich repeats (LRRs) that is thought to confer recognition specificity in these proteins [10]. This class is further subdivided into genes encoding either an N-terminal leucine zipper (LZ) domain (the LZ/NB/LRR class), or an N-terminal region showing homology to the cytoplasmic signalling domain of the Drosophila Toll and mammalian interleukin-1 receptors (TIR), the TIR/NB/LRR class [10]. Recently, homologies between plant NB/LRR proteins and proteins regulating apoptosis in mammals (Apaf-1) and worms (CED4) were identified [11]. Therefore, NB/LRR proteins may function as components of cellular complexes that activate plant defence responses in an Avr-dependent manner [12].

The first R gene to be isolated (of the "gene-for-gene" class) was the tomato Pto gene. Pto encodes a cytoplasmic serine/threonine protein kinase that confers resistance to Pseudomonas syringae strains that express avrPto [13]. Pto interacts physically with avrPto in yeast [14,15], consistent with the proposed receptor-ligand mechanism for pathogen perception in plants.

Another class of R genes includes Cf-2, Cf-4, Cf-5 and Cf-9 from tomato, which confer resistance to Cladosporium fulvum [16,17,18,19]. These genes encode extracytoplasmic membrane-anchored glycoproteins comprised predominantly of LRRs. The rice Xa-21 gene for resistance to Xanthomonas oryzae pv. oryzae encodes a similar protein but is distinguished from Cf proteins by an additional cytoplasmic serine/threonine protein kinase domain [20].

In order to determine the molecular basis for R protein perception of Avr products, we have isolated and characterised tomato Cf genes from two chromosomal loci [16-19].
Our results have identified sequences that determine recognition specificity in these proteins and provided insight into the molecular mechanisms that generate sequence variation at Cf loci.

2 The Tomato-Cladosporium fulvum Interaction

Cladosporium fulvum is a biotrophic fungus that causes leaf mould in tomato, its only known host. In compatible interactions (disease sensitive), fungal spores germinate on the abaxial surface of leaves and hyphae enter the leaf apoplast through stomata, where they grow intercellularly among the lower mesophyll cells. Spore-bearing structures, the conidiophores, emerge through the stomata 10 to 12 days post-infection, completing the life cycle of the fungus [21]. In incompatible tomato/C. fulvum interactions (disease resistant), hyphal development is arrested soon after penetration of the sub-stomatal cavity, according to the particular Cf genotype [21].


Crude preparations of fungal elicitors (isolated from leaves infected with C. fulvum) can also induce an HR when infiltrated into tomato leaves expressing the cognate Cf gene. Genes for the C. fulvum Avr4 and Avr9 proteins, which are recognized by the Cf-4 and Cf-9 tomato R proteins respectively, have also been cloned [22,23]. Both Avr4 and Avr9 are small cysteine-rich peptides. The structure of Avr9 (28 amino acids) is comprised of antiparallel β-strands cross-linked by disulphide bridges to form a cystine-knot protein [24]. The tertiary structure of Avr4 has not been determined, but it may also form a cystine-knot. This may provide a stable structure within the leaf apoplast for the presentation of specific interacting residues to their respective 'pathogenicity targets'. Each Avr gene can be expressed in transgenic tomato plants (that lack the cognate R gene), and its product secreted into the apoplastic space, by fusing it to an appropriate plant signal peptide sequence [19,25]. When transgenic plants expressing either Avr9 or Avr4 are crossed to tomato lines containing the corresponding Cf gene, F1 progeny undergo developmentally regulated seedling death. Avr9 and Avr4 can also be expressed in the form of recombinant Potato Virus X (PVX::Avr4 and PVX::Avr9). Inoculated tomato plants containing the cognate Cf gene exhibit leaf necrosis as a result of systemic virus infection and activation of an HR [19,26].

3 Tomato Cf Genes

In the breeding of cultivated tomato (Lycopersicon esculentum) several genes for resistance to C. fulvum have been introgressed from wild tomato species [27]. Cf-9 and Cf-2 were introgressed from L. pimpinellifolium, Cf-4 from L. hirsutum and Cf-5 from L. esculentum var. cerasiforme [27]. Cf-2, Cf-4, Cf-5 and Cf-9 (which recognize the C. fulvum Avr2, Avr4, Avr5 and Avr9 avirulence determinants) were introgressed into the disease sensitive tomato cultivar 'Moneymaker' (Cf0) to generate a series of near isogenic lines (NILs) containing single introgressed Cf genes (Cf2, Cf4, Cf5 and Cf9). Classical and RFLP mapping showed that these Cf genes are located at two complex loci. Cf-9 and Cf-4 are located on the short arm of tomato chromosome 1 [4,6], while Cf-2 and Cf-5 are located on the short arm of tomato chromosome 6 [5,6].

4 Cf Gene Isolation

Cf-9 was isolated by transposon tagging with the non-autonomous maize transposon Dissociation (Ds) that was introduced into Cf9 by transformation. A line containing Cf-9 and a genetically linked Ds was crossed to Cf-9 plants containing an Activator transposase source to induce Ds transposition. Ds-tagged alleles of Cf-9 were identified as survivors after testcrossing to a Cf0 transgenic line expressing Avr9 [18]. Gel blot analysis also showed that Cf-9 is a member of a linked multigene family whose members are polymorphic between the Cf4 and Cf9 NILs. The additional members of the Cf-9-homologous multigene family in Cf9 and Cf4 NILs were designated Hcr9-9 and Hcr9-4 genes respectively (for homologs of Cladosporium resistance gene Cf-9). A physical map of the Cf-9 locus was constructed from overlapping binary vector cosmid clones (Fig. 1).

The observation that Cf-4 maps at an identical chromosomal location to Cf-9 suggested it may be a homolog of Cf-9. Since members of the Hcr9 multigene family are polymorphic, a genetic strategy was employed to identify Cf-4. Cf-4/Cf-9 transheterozygous plants were testcrossed to Cf0 plants, so that all progeny should inherit either Cf-9 or Cf-4. In the case of allelic genes, disease sensitive recombinants could arise through rare Cf-4/Cf-9 intragenic recombination events. In the case of closely linked genes, recombinants lacking Cf-4 or Cf-9 (disease sensitive), or containing both R genes in cis configuration, could occur through recombination between intergenic sequences. Disease sensitive plants could also arise through unequal crossing over or chromosomal deletions. Testcross progeny were inoculated with C. fulvum race 5 (which expresses Avr4 and Avr9) to identify disease sensitive plants. Five disease sensitive individuals were identified after screening 7,400 testcross plants [27].
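The scale of the screen can be made concrete with a little arithmetic. The sketch below converts the counts reported above (five sensitive plants among 7,400 testcross progeny) into a per-plant frequency; the function name and output format are illustrative assumptions, not part of the original analysis.

```python
from fractions import Fraction


def recombinant_frequency(sensitive: int, screened: int) -> Fraction:
    """Return the fraction of testcross progeny scored as disease sensitive."""
    if screened <= 0 or not 0 <= sensitive <= screened:
        raise ValueError("need screened > 0 and 0 <= sensitive <= screened")
    return Fraction(sensitive, screened)


# Counts taken from the text: 5 sensitive plants in 7,400 testcross progeny.
freq = recombinant_frequency(5, 7400)
print(f"{float(freq):.2e} per plant, i.e. about 1 in {round(1 / freq)}")
```

Keeping the value as an exact fraction avoids rounding until the final display, which makes it easier to compare with frequencies quoted as "1 in N".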


Figure 1. Physical organization of Hcr9 and Hcr2 genes in different Cf haplotypes. The transcriptional polarity of Hcr9s and Hcr2s is indicated by arrows, together with their orientation relative to the centromere (Cen) and telomere (Tel) on chromosome 1 and chromosome 6. (a) Organization of Hcr9 genes in Cf0, Cf9 and Cf4 NILs. Hcr9s in each genotype are flanked by lipoxygenase genes (LoxL and LoxR). Sequences derived from the 3' exon of LoxL present in the Hcr9 intergenic regions are also shown. (b) Organization of Hcr2s in Cf0, Cf2 and Cf5 NILs. The physical contig encompassing Cf-5, Hcr2-5D and Hcr2-5B does not include Hcr2-5A, which was detected by DNA gel blot analysis [17], and a discontinuity is shown in the physical map.


From molecular analysis three Hcr9-4s were identified as Cf-4 candidates (Hcr9-4C, Hcr9-4D and Hcr9-4E). A physical map of the Cf-4 locus was constructed from binary vector cosmid clones using Cf-9 as a probe (Fig. 1). The identity of Cf-4 was determined by transformation of overlapping cosmids containing various combinations of the three candidate Cf-4 genes [19]. Transgenic Cf0 plants expressing Hcr9-4D (Cf-4) were resistant to infection by C. fulvum race 5 (which expresses the Avr4 avirulence determinant) but not to C. fulvum race 4 isolates, which can overcome Cf-4 mediated resistance [28]. Transgenic plants expressing Cf-4 also exhibited systemic necrosis when infected with PVX::Avr4 [19].

Cf-2 was isolated by positional cloning [16]. Cf-2 cosegregates with a molecular marker that identified polymorphic sequences between the NILs Cf0, Cf2 and Cf5 [29]. Two near identical genes (Cf-2.1 and Cf-2.2) that conferred Avr2-dependent resistance were identified by complementation experiments [16]. Molecular analysis of the Cf0, Cf2 and Cf5 NILs, using Cf-2 as probe, also revealed a genetically linked polymorphic multigene family [16,17]. Additional members of the Cf-2-homologous multigene family in the NILs Cf2 and Cf5 were designated Hcr2-2 and Hcr2-5 genes respectively (for homologs of Cladosporium resistance gene Cf-2).

Cf-5 was identified using a similar genetic strategy to that used to identify Cf-4. In 12,000 progeny of a Cf-2/Cf-5 × Cf0 testcross one disease sensitive plant was identified [16]. Analysis of the single Cf-2/Cf-5 recombinant identified several Hcr2-5s as candidates for Cf-5 [17]. Analysis of the Hcr2 composition of the disease sensitive recombinant confirmed it lacked both Cf-2.1 and Cf-2.2, as predicted. Restriction fragments corresponding to two Hcr2-5 genes (Hcr2-5C and Hcr2-5D) were also absent (Fig. 1) and some non-parental restriction fragments were also observed. The Hcr2-5C gene was identified as Cf-5, a gene conferring Avr5-dependent resistance to C. fulvum [17].

5 Chromosomal Organization of the Cf-4/Cf-9 and Cf-2/Cf-5 Loci

Ten to 15 Hcr9-specific fragments were identified on DNA gel blots of the Cf0, Cf4 and Cf9 NILs [18]. Several members of this family are polymorphic, demonstrating that some of them are present on the introgressed DNA [4] and must be genetically linked. Mapping studies have shown that the non-polymorphic members of the Hcr9 multigene family are located outside the region of introgressed DNA at several loci on the short arm of chromosome 1 [30].

Physical maps of the Cf-4 and Cf-9 loci were compared to the corresponding region in the disease sensitive Cf0 line [31]. Physical mapping and DNA sequence analysis of each haplotype demonstrated that the Hcr9s are tandemly duplicated transcription units in the Cf-4 and Cf-9 haplotypes [19,31]. In contrast, the disease sensitive Cf-0 haplotype contains a single homolog, Hcr9-0A (Fig. 1). The Cf-4 and Cf-9 haplotypes each contain five Hcr9 genes (Hcr9-9A to E, and Hcr9-4A to E) within a 36 kb physical interval (Fig. 1). The Hcr9s are flanked by convergently oriented lipoxygenase genes, LoxL and LoxR [31]. A number of these Hcr9s from the Cf-4 and Cf-9 haplotypes also confer resistance to C. fulvum through recognition of additional Avr determinants [31,32]. All Hcr9s consist of an uninterrupted open reading frame with a single intron in the 3' untranslated region [19,31]. A single intron is present at a similar location in Hcr2s, suggesting that Hcr9s and Hcr2s may be derived from a common ancestral gene.

DNA sequence analysis of the Hcr9 intergenic regions revealed extensive blocks of sequence homology upstream of Hcr9s, including sequences derived from the 3' exon of LoxL (Fig. 1) [31]. The composition and length of the Hcr9 intergenic blocks vary between intergenic regions, indicative of sequence rearrangements during their evolution. Within the Cf-4 and Cf-9 haplotypes the intergenic regions are largely distinct with respect to their sequence block composition. Hcr9 intergenic regions containing similar, but non-identical, sequence block patterns are present in the Cf-4 and Cf-9 haplotypes but are non-syntenic. Furthermore, DNA sequence analysis of eleven Hcr9s [31] revealed that Hcr9s within the Cf-4 and Cf-9 haplotypes contained regions of extensive DNA sequence homology. Therefore, Hcr9s appear to be patchworks of different homolog sequences, suggesting that frequent intragenic sequence exchange has occurred during their evolution (Fig. 1).

Physical maps of the Cf-2 and Cf-5 loci and the corresponding locus in the Cf0 NIL were also compared (ref). The Cf-2 haplotype contains tandemly duplicated copies of Cf-2 (Cf-2.1 and Cf-2.2), and a third gene (Hcr2-2A) that encodes a protein of the same class as Cf-2.1 and Cf-2.2 (Fig. 1). The three genes are located within a 30 kb region [16]. Analysis of the Cf-5 locus identified four tandemly oriented Hcr2-5s (Fig. 1) [17].

6 Recombination at Cf Gene Loci

Molecular analysis of the five Cf-4/Cf-9 disease sensitive recombinants identified two distinct recombinant classes based upon their Hcr9 composition [19]. In the first class, four disease sensitive plants contained Hcr9-4A and Hcr9-4B from the Cf4 parental line and Hcr9-9E from the Cf9 parent (Fig. 2), confirming that these plants were generated by recombination at the Cf-4/Cf-9 locus. The other recombinant retained Hcr9-9D and Hcr9-9E from the Cf9 parental line and Hcr9-4A from the Cf4 parent (Fig. 2). The locations of recombination break points in the five recombinants were determined by DNA sequence analysis, which confirmed their chimeric and independent nature (Fig. 2). All five crossovers occurred within Hcr9 intergenic regions and were generated through chromosomal mispairing and unequal crossing over, i.e. each recombinant chromosome retained three Hcr9 genes at the locus compared to five in the parental lines.

DNA sequence analysis of the single Cf-2/Cf-5 recombinant showed that it was also generated through chromosomal mispairing and unequal crossing over. However, in this recombinant crossing over occurred within Hcr2 coding sequences to generate a recombinant Hcr2 (Fig. 2). The region of recombination was located to a 107 bp interval encoding part of domains A and B in the Cf-2.2 and Hcr2-5B genes [17].

7 Cf Genes Encode Extracellular Membrane-Anchored LRR Glycoproteins

Cf-2, Cf-4, Cf-5 and Cf-9 were predicted to encode plasma membrane-anchored extracellular LRR glycoproteins [16,18], and this was recently confirmed for Cf-9 [33]. This is consistent with their proposed role as extracellular receptors for C. fulvum Avr products. Cf proteins are predicted to contain a number of structural domains (Fig. 3) but are comprised mainly of extracellular LRRs. All Hcr9s appear to encode proteins with 27 LRRs, with the exception of Cf-4 and Hcr9-4B, which encode proteins with 25 and 23 LRRs respectively [19,31]. The number of LRRs in Hcr2s ranges from 25 in Hcr2-2A to 38 in the Cf-2 proteins [17]. Cf-9 encodes a protein of 863 amino acids comprised mainly of LRRs of approximately 24 amino acids (27 in total) that show extensive homology to the consensus sequence for extracellular LRR proteins (LxxLxxLxxLxLxxNxLxGxIPxx, where x = any amino acid) [34]. The Cf-2.1 and Cf-2.2 genes encode proteins of 1112 amino acids which differ by only three amino acids at their C-termini [16]. Cf-2 proteins are considerably larger than Cf-9 and contain 38 LRRs that, in contrast to those of Cf-9, are of a uniform 24 amino acid length [16]. Also, in domain C1, a sequence of 20 LRRs is comprised predominantly of two distinct alternating LRR types (Fig. 3). Cf proteins also contain a number of potential targets for N-glycosylation. With the exception of LRR consensus residues, the N-terminal portions of Cf-9 and Cf-2 display little amino acid sequence homology. However, significant levels of amino acid homology are observed in their C-termini in a region encompassing the last nine LRRs and domains D, E, F and G [10,16].
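To illustrate what "homology to the consensus" means in practice, the following sketch scores a 24-residue repeat against the consensus quoted above by counting matches at its ten conserved positions. The scoring scheme, the set of residues accepted at 'L' positions and both example repeats are assumptions made for illustration; they are not taken from the Cf sequences or from the cited analyses.

```python
CONSENSUS = "LxxLxxLxxLxLxxNxLxGxIPxx"  # from the text; x = any amino acid
ALIPHATIC = set("AVLI")  # assumption: residues tolerated at 'L' positions


def consensus_score(repeat: str, consensus: str = CONSENSUS) -> float:
    """Fraction of conserved consensus positions matched by one repeat."""
    if len(repeat) != len(consensus):
        raise ValueError("repeat and consensus must have the same length")
    conserved = [(i, c) for i, c in enumerate(consensus) if c != "x"]
    hits = 0
    for i, expected in conserved:
        residue = repeat[i].upper()
        # 'L' positions accept any aliphatic residue; others must match exactly.
        if residue == expected or (expected == "L" and residue in ALIPHATIC):
            hits += 1
    return hits / len(conserved)


# Hypothetical repeats, invented for this example.
perfect = "LGNLTSLEELDLSYNKLSGSIPSE"
one_mismatch = "LGNLTSLEELDLSYNKFSGSIPSE"  # F replaces L at position 17
print(consensus_score(perfect), consensus_score(one_mismatch))
```

A fractional score is used rather than a strict pattern match because real repeats vary in length and rarely satisfy every conserved position.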

8 Cf Protein Structure

Part of the plant extracytoplasmic LRR consensus (xxLxLxx, where L = leucine or any aliphatic amino acid, and x = any amino acid) corresponds to a structural motif in LRR proteins that forms a repeated β-strand/β-turn structure. In this motif the conserved side chains of leucine residues (and other aliphatic amino acids) project into the hydrophobic core of the protein and perform a structural role. The β-strands of successive LRRs interact to form an extensive β-sheet. The side chains of interstitial residues are solvent-exposed and form an extensive ligand binding surface, as proposed for porcine ribonuclease inhibitor (PRI), which binds ribonuclease A [34,35]. This structural motif appears to be conserved in LRR proteins and probably functions in mediating protein-protein interactions [34]. Sequence variation in the solvent-exposed amino acids of this motif is predicted to affect recognition specificity in LRR proteins.
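The distinction between core and solvent-exposed positions can be sketched in code. The example below assumes that the xxLxLxx motif occupies residues 8 to 14 of a 24-residue repeat aligned to the extracellular LRR consensus, and sorts the differences between two aligned repeats into "exposed" (the x positions of the motif) and "other". The motif placement, the function names and both repeat sequences are illustrative assumptions.

```python
MOTIF_START, MOTIF_END = 7, 14  # 0-based slice; assumes motif = residues 8-14
EXPOSED_OFFSETS = (0, 1, 3, 5, 6)  # the 'x' positions within xxLxLxx


def motif(repeat: str) -> str:
    """Return the seven residues assumed to form the beta-strand/beta-turn."""
    return repeat[MOTIF_START:MOTIF_END]


def classify_differences(a: str, b: str) -> dict:
    """Sort residue differences between two aligned repeats by position class."""
    if len(a) != len(b):
        raise ValueError("repeats must be aligned and of equal length")
    result = {"exposed": [], "other": []}
    for pos, (x, y) in enumerate(zip(a, b)):
        if x == y:
            continue
        in_motif = MOTIF_START <= pos < MOTIF_END
        exposed = in_motif and (pos - MOTIF_START) in EXPOSED_OFFSETS
        # Positions are reported 1-based, as in the text.
        result["exposed" if exposed else "other"].append((pos + 1, x, y))
    return result


# Hypothetical aligned repeats, invented for this example.
repeat_a = "LGNLTSLEELDLSYNKLSGSIPSE"
repeat_b = "LGNLTSLKYLDLSHNKLSGSIPSQ"
print(motif(repeat_a))
print(classify_differences(repeat_a, repeat_b))
```

Applied across all repeats of two homologs, a tally of this kind is one way to ask whether variation is concentrated in the putative ligand binding surface.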


Figure 2. Chromosomal mispairing and unequal crossing over at the Cf-4/Cf-9 and Cf-2/Cf-5 loci. Misalignment of Cf4 and Cf9 chromosomes generated two distinct disease sensitive recombinant classes. (A) In four recombinants (class I) crossing over occurred within a 3.0 kilobase interval upstream of Hcr9-9E in the Cf9 chromosome, and Hcr9-4C in the Cf4 chromosome. The Hcr9 composition of class I disease sensitive plants is shown. The Hcr9 composition of the reciprocal "double resistant" recombinant chromosome is also shown to illustrate the variation in Hcr9 copy number. (B) Recombination in the single class II individual was mapped to a different location. This recombinant also retained three Hcr9s. The Hcr9 composition of the reciprocal "double resistant" recombinant chromosome is also shown. (C) Chromosomal mispairing at the Cf-2/Cf-5 locus deduced from molecular analysis of a single disease sensitive recombinant. Crossing over occurred within the coding sequences of Cf-2.2 and Hcr2-5B. The Hcr2 composition of the disease sensitive recombinant and the reciprocal "double resistant" chromosome are shown. In both recombinant chromosomes novel Hcr2 genes (Hcr2*) were generated.
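The copy-number consequences of unequal crossing over can be reproduced with a minimal sketch that treats each haplotype as an ordered list of genes. The gene order (with Cf-9 as the third and Cf-4 as the fourth gene) is assumed from Fig. 1, the class I breakpoints come from the caption above, and the class II breakpoints are inferred here from the genes that recombinant retained; the function itself is a hypothetical illustration.

```python
# Assumed gene order at the locus, taken from Fig. 1.
CF9 = ["Hcr9-9A", "Hcr9-9B", "Cf-9", "Hcr9-9D", "Hcr9-9E"]
CF4 = ["Hcr9-4A", "Hcr9-4B", "Hcr9-4C", "Cf-4", "Hcr9-4E"]


def unequal_crossover(left, right, left_break, right_break):
    """Join `left` genes upstream of `left_break` to `right` from `right_break` on.

    Each breakpoint lies in the intergenic region just upstream of the named gene.
    """
    i = left.index(left_break)
    j = right.index(right_break)
    return left[:i] + right[j:]


# Class I: crossover upstream of Hcr9-4C (Cf4) and Hcr9-9E (Cf9).
class1_sensitive = unequal_crossover(CF4, CF9, "Hcr9-4C", "Hcr9-9E")
class1_reciprocal = unequal_crossover(CF9, CF4, "Hcr9-9E", "Hcr9-4C")

# Class II: breakpoints inferred from the retained genes (an assumption).
class2_sensitive = unequal_crossover(CF4, CF9, "Hcr9-4B", "Hcr9-9D")

print(class1_sensitive)
print(len(class1_reciprocal), class1_reciprocal)
print(class2_sensitive)
```

Under these assumptions both sensitive chromosomes carry three Hcr9s and lack Cf-4 and Cf-9, while the reciprocal class I product carries seven genes including both resistance genes.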


Figure 3. Sequences of Cf-4 and Cf-5 to illustrate amino acids that distinguish them from Cf-9 and Cf-2 proteins respectively. Both sequences are divided into domains to emphasize structural similarities with Cf-9 and Cf-2 [10]. Domain A is a predicted signal peptide sequence; domain B, the predicted mature N-terminus, shows homology to the N-terminal sequences in a number of plant LRR receptor-like kinases, and to PGIPs [10]. Domains C1 and C3 contain LRRs of approximately 24 amino acids that show homology to the consensus sequence for extracellular LRR proteins (shown boxed and aligned below each sequence). Domain C2 bisects domains C1 and C3 and shows poor homology to the LRR consensus, as does domain D. Domains E, F and G are proposed to anchor and orientate the protein within the plasma membrane. Domain E contains a high proportion of negatively charged acidic residues. Domain F is rich in aliphatic amino acids and is predicted to form a membrane-spanning alpha helix. Domain G comprises a short cytoplasmic domain rich in positively charged amino acids. In both sequences residues in the predicted LRR structural motif (xxLxLxx) are delimited by vertical lines. Variant amino acids are shown in bold and underlined. Deletions in Cf-4 and Cf-5 relative to Cf-9 and Cf-2 are indicated by dots. The 14 alternating type A and type B LRRs in Cf-5 are also indicated.

In PRI, the LRRs are comprised of alternating 28- and 29-amino acid units that form α/β coil structures [34]. Cf proteins are unlikely to adopt α/β coil structures since they contain residues that are not common in α-helices [10,16,27]. Alternative protein folds would also facilitate parallel stacking of the β-strand/β-turn structural motif. Recently, a model for the structure of plant extracellular LRR proteins was proposed [36]. Plant polygalacturonase inhibiting proteins (PGIPs) bind fungal polygalacturonases (PGs) that degrade plant cell walls during infection. The LRR consensus sequence of PGIPs is similar to that of Cf proteins [10] and they are likely to adopt similar tertiary structures. In PGIPs, the LRRs also form a parallel β-sheet structure with an extensive ligand binding surface. Furthermore, the recognition specificity of PG binding by PGIPs was shown to reside in a number of solvent-exposed residues of the LRR structural motif [36].

9 Recognition Specificity of Cf Proteins

Cf-4 encodes a protein of 806 amino acids with 25 LRRs that is 91.5% identical to Cf-9 (Fig. 3). The reduction in length compared to Cf-9 is due to a 10 amino acid deletion near the mature N-terminus of domain B and a larger deletion of 46 amino acids comprising two complete LRRs (Fig. 3). A total of 67 amino acids distinguish Cf-4 from Cf-9, of which six are located within domain A, the putative signal peptide sequence [19]. Cf-5 consists of 968 amino acids and contains a total of 32 LRRs, six fewer than in Cf-2 proteins. The overall level of sequence identity between Cf-5 and Cf-2 proteins is 90% [17]. In Cf-2 proteins, 20 of the 34 LRRs in domain C1 are of two distinct types, type A or type B (Fig. 3). With the exception of a block of four consecutive type B LRRs in the middle of this domain, the repeats exhibit a strict alternating pattern [16]. In Cf-5, fourteen consecutive LRRs in domain C1 exhibit a strictly alternating pattern of type A and type B repeat units (Fig. 3).

The amino acids that distinguish Cf-4 from Cf-9 and Cf-5 from Cf-2 are not distributed at random (Fig. 3). The variant amino acids in Cf-4 are located in the N-terminal half of the protein, delimiting a domain that determines Avr4 recognition specificity. In Cf-4, 33 of the 57 variant amino acids in LRRs are located within the conserved LRR motif (xxLxLxx), of which 32 correspond to putative solvent-exposed residues (Fig. 3). A number of amino acids distinguishing Cf-5 from Cf-2 are located in the C-terminal half of the protein, but the majority are also located in the N-terminal half (Fig. 3). In Cf-5, 56 of the 88 variant amino acids within LRRs, which must confer Avr5 recognition specificity, also correspond to interstitial residues (Fig. 3). Therefore, amino acids within the N-terminal half of Cf proteins contain the critical sequence determinants that confer recognition specificity.

A comparison of eleven predicted Hcr9 sequences from the Cf-0, Cf-4 and Cf-9 haplotypes showed that most sequence variation was observed in sequences encoding the solvent-exposed amino acids of the LRR structural motif in the 16 N-terminal LRRs. A number of hypervariable amino acid sequence positions were also identified [31]. Analysis of the corresponding Hcr9 coding sequences in this region revealed a higher ratio of nonsynonymous to synonymous nucleotide substitutions than in the remainder of the Hcr9 coding sequence [31]. This demonstrated that there has been selection for sequence diversification in this region, consistent with the proposal that it encodes a potentially variable ligand recognition domain in Cf proteins. Most of the sequences in Cf proteins, at least at their N-termini, appear to play a structural role, and only a small proportion of critical sequence differences are required to generate a distinct recognition specificity. Amino acid differences between Cf proteins from the same locus in the conserved structural motif of their N-terminal LRRs, and variation in LRR number, provide an explanation for the molecular basis of recognition specificity in these proteins.
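The idea behind comparing nonsynonymous and synonymous substitutions can be illustrated with a deliberately crude sketch. It counts, codon by codon, whether a single-nucleotide difference between two aligned, gap-free coding sequences changes the encoded amino acid under the standard genetic code. This is raw counting only: it does not normalise by the number of synonymous and nonsynonymous sites as the methods behind the published ratios do, and the function name and example sequences are invented for illustration.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def count_substitutions(seq_a: str, seq_b: str) -> dict:
    """Classify codon differences between two aligned coding sequences."""
    if len(seq_a) != len(seq_b) or len(seq_a) % 3 != 0:
        raise ValueError("sequences must be aligned, gap-free and in frame")
    counts = {"synonymous": 0, "nonsynonymous": 0, "multiple": 0}
    for i in range(0, len(seq_a), 3):
        codon_a, codon_b = seq_a[i:i + 3].upper(), seq_b[i:i + 3].upper()
        n_diff = sum(x != y for x, y in zip(codon_a, codon_b))
        if n_diff == 0:
            continue
        if n_diff > 1:
            # Several changes in one codon: the mutational path is ambiguous.
            counts["multiple"] += 1
        elif CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
            counts["synonymous"] += 1
        else:
            counts["nonsynonymous"] += 1
    return counts


# Hypothetical nine-nucleotide sequences, invented for this example.
print(count_substitutions("CTTGATAAC", "CTCGAAAAA"))
```

An excess of amino acid-changing differences in one region relative to another, after proper normalisation, is the kind of signal interpreted in the text as diversifying selection.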

10 Discussion

10.1 Gene organisation and recombination at Cfloci The Cf-4/Cf-9 and Cf-2/Cf-5 loci are comprised of tandemly duplicated homologus genes (Fig. 1). Molecular analysis of R gene loci such as RPP5 and RPP1 in Arabidopsis [37,38], N in tobacco [39], Xa-21 in rice [40], M in flax [41], Pto and 12 in tomato [13,42], and Dm3 in lettuce [43,44], has shown they are also comprised of genetically linked multigene families. This type of gene organization was predicted from genetic analysis ofR gene loci to account for the generation of novel R genes through intragenic recombination between tandemly duplicated homologous genes [27]. Molecular analysis of Cf loci has demonstrated that unequal crossing over and/or gene conversion have played an important role in their evolution (31). The presence of two near identical copies of Cf-2 [16] is probably the result of unequal crossing over. Also Cf-9 and Cf-4 coding sequences differ by one nucleotide in 1057 bases at their 3' ends and are identical for a further 5.2 kilobases downstream. This suggests that one of these genes was generated from an unequal crossover or gene conversion event [19,31]. One potential consequence of tandemly duplicated homologous genes is sequence homogenization through frequent intragenic sequence exchange as a result of chromosomal mispairing. This would be undesirable at R gene loci where generation of sequence variants is required to combat constantly changing pathogen populations. At R gene loci, such as the maize Rpl locus, meiotic instability is apparent due to chromosomal mispairing and unequal crossing over [8]. Genetic

Tomato CF Genes for Resistance to Cladosporium fulvum

127

experiments have shown this occurs infrequently at the Cf-9 locus. In plants homozygous for Cf-9, the level of Hcr9 mispairing resulting in loss of Cf-9 function was less than 1 in 22,000 gametes [31]. It was suggested that the unique sequence block composition of Hcr9 intergenic regions may suppress mispairing at meiosis [31]. In Cf-4/Cf-9 transheterozygous plants the loss of Cf-4 and Cf-9 function was significantly higher (Fig. 2). Therefore, sequence differences between haplotypes may increase the frequency of mispairing and unequal crossover events.

10.2 Cf gene evolution

As in studies on the evolution of the mammalian MHC locus [3], the relative contribution of intraallelic recombination, gene conversion, unequal crossing over and the accumulation of point mutations in generating R gene diversity is controversial. Classical genetic experiments suggested that intraallelic sequence exchange at the flax L locus [47] or chromosomal mispairing and unequal crossing over between homologs at the maize Rp1 locus [8] could generate R gene variation. This has been verified by molecular analyses of the L and Rp1 loci [45,46]. At some R gene loci a "birth and death" model of R gene evolution has been proposed [47], based on a model for evolution of the mammalian MHC locus [48]. In this model, expansion or contraction of a multigene family is due to unequal crossing over. Evolution of alleles proceeds by sequence exchange exclusively between orthologous members of a gene family, together with random mutation and selection. In the Cf-4 and Cf-9 haplotypes, however, there is evidence for extensive intergenic sequence exchange between Hcr9 paralogs, which generated recombinant Hcr9s with novel LRR composition. This is not consistent with the proposal that no sequence exchange occurs between gene paralogs.
Hcr9s have also accumulated multiple point mutations that generate sequence diversity in the solvent-exposed residues of their LRRs, and this appears to be another important source of sequence variation. The relative contribution of these processes to the generation of R gene diversity can only be determined by further analysis of R gene loci. Hcr2s contain variable numbers of LRRs, possibly due to unequal crossover events, and this may be an important mechanism to generate novel recognition specificities [16,17]. Hcr9s contain similar numbers of LRRs, and most sequence variation occurs in nucleotides encoding putative solvent-exposed amino acids of the conserved LRR structural motif. Sequence comparisons of several NB-LRR R gene families such as RPP1, RPP5 and Dm3 have revealed similarly high ratios of nonsynonymous to synonymous substitutions in sequences encoding the corresponding amino acids of their LRRs [38,43,44,49]. These sequences within LRRs have undergone diversifying selection, whereas sequences encoding putative signalling domains (NB, TIR and LZ) appear to have undergone purifying selection.
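The contrast between nonsynonymous and synonymous substitutions described above can be illustrated with a deliberately naive codon-by-codon count. The sketch below is only an illustration and is not the estimator used in the cited studies, which normalise the counts by the numbers of synonymous and nonsynonymous sites. The function name `count_codon_changes` and the two short sequences are hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def count_codon_changes(seq_a, seq_b):
    """Count codons that differ silently (synonymous) or change the amino acid.

    Both inputs must be aligned, in-frame coding sequences of equal length.
    Returns a (synonymous, nonsynonymous) tuple of differing-codon counts.
    """
    if len(seq_a) != len(seq_b) or len(seq_a) % 3 != 0:
        raise ValueError("sequences must be aligned, in frame and equal in length")
    synonymous = nonsynonymous = 0
    for i in range(0, len(seq_a), 3):
        codon_a = seq_a[i:i + 3].upper()
        codon_b = seq_b[i:i + 3].upper()
        if codon_a == codon_b:
            continue
        if CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
            synonymous += 1
        else:
            nonsynonymous += 1
    return synonymous, nonsynonymous


# Hypothetical toy alignment: CTT->CTC is silent (Leu), GAA->GAT changes Glu to Asp.
print(count_codon_changes("CTTGAAAAA", "CTCGATAAA"))
```

An excess of amino-acid-changing differences over silent ones in a region, after proper normalisation, is the kind of signal interpreted in the text as diversifying selection.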


Population geneticists have proposed that in natural populations, where plants and their pathogens coevolve, disease is kept in check by "balancing polymorphisms" at R gene loci, i.e. no R gene allele is present at high frequency since it may be overcome by a novel pathogen variant [50]. Numerous examples of this have been reported in agriculture, where crop R gene monocultures eventually succumb to pathogen infection. Overdominance, or heterozygote advantage, contributes to the maintenance of sequence polymorphism at the human MHC locus. In a number of wild Lycopersicon species self-pollination is limited by a gametophytic self-incompatibility (SI) system, and this would also promote the mixing of haplotypes that might generate new Cf gene variants through intragenic recombination mechanisms [27]. Overdominance cannot explain polymorphism at the R gene loci of inbreeding species such as Arabidopsis. Therefore, it was proposed that frequency-dependent selection may account for the maintenance of sequence polymorphism at the R gene loci of inbreeding species [49,51].

10.3 Cf protein function

The most likely ligands for Cf proteins are their cognate C. fulvum Avr proteins. However, we have failed to demonstrate physical interaction between Avr9 and Cf-9 or Avr4 and Cf-4 proteins in vitro, suggesting that other factors are required for Avr perception. Studies using labeled Avr9 have identified a high-affinity binding site (HABS) in tomato plasma membranes irrespective of their Cf-9 genotype [52]. This HABS may represent the extracellular 'pathogenicity target' of Avr9. It is possible that Cf-9, in conjunction with a signalling partner protein, may recognize the Avr9/HABS complex to induce a plant defense response. A paradigm for Cf protein function was suggested by recent studies on three genes (CLV1, CLV2 and CLV3) that condition the "clavata" mutant phenotype in Arabidopsis [53,54,55]. CLV1 is an Xa21-like extracellular LRR receptor kinase and CLV2 is a Cf-like protein.
CLV1 and CLV2 are proposed to form a heterodimer that is activated upon binding of a peptide ligand (CLV3), resulting in phosphorylation of the kinase domain of CLV1 and association with downstream factors, a kinase-associated protein phosphatase (KAPP) and a GTPase (Rop). We are currently attempting to identify plasma membrane proteins that interact with epitope-tagged Cf-9 [33]. Our analysis has shown that Cf proteins comprise two functional domains: a variable N-terminal domain that determines recognition specificity, and a conserved C-terminal region that includes part of domain C1 and domains C2, C3, D, E, F and G (Fig. 3) [16,18,19]. Our analysis demonstrated selection for sequence conservation in the C-terminal region. In the absence of a substantial cytoplasmic, or other obvious signalling, domain we proposed that this region of Cf proteins may interact with another transmembrane protein to transduce a signal after ligand binding that activates the plant defense response [10,16,19]. We suggested that Cf-4 and Cf-9, which have identical C-termini, may interact with a common signalling partner,


as would Cf-2 and Cf-5 [16,27]. However, we recently identified, by mutational analysis, a gene (Rcr3) that is required specifically for Cf-2 function [56]. Since Rcr3 is not required for Cf-5 function, the Rcr3 gene product cannot be a common interacting partner of Cf-2 and Cf-5. We have suggested that Rcr3 may be the extracellular 'pathogenicity target' for Avr2, forming a complex that is recognized by Cf-2 to trigger an Avr2-dependent plant defense response. Alternatively, Rcr3 may be a Cf-2-specific signalling partner or a downstream component of a Cf-2-specific signalling pathway [56]. Our understanding of the nature, genome organization and evolution of R gene loci has increased substantially in recent years. A number of Avr9-dependent biochemical responses in tobacco Cf-9 plants and cell cultures have also been defined. These include stimulation of ion fluxes [57], production of active oxygen species [58], and activation of two mitogen-activated protein kinases [59] and a calcium-dependent protein kinase [60]. The next major challenge is to determine the molecular mechanism of Avr perception by Cf proteins, and to dissect the signal transduction pathway that activates plant defences.

11 Acknowledgements

We thank David Jones, Martin Parniske and all of our colleagues for their contributions to this work and for their many useful suggestions and discussions. Research in the Sainsbury Laboratory is funded by the Gatsby Charitable Foundation.

References

1. Hammond-Kosack, K.E. and Jones, J.D.G., Ann. Rev. Plant Physiol. Plant Mol. Biol. 48 (1997) pp. 575-607.
2. Staskawicz, B.J., Ausubel, F.M., Baker, B.J., Ellis, J.G., and Jones, J.D.G., Science 268 (1995) pp. 661-667.
3. Hughes, A.L. and Yeager, M., Ann. Rev. Genet. 32 (1998) pp. 415-435.
4. Balint-Kurti, P.J., Dixon, M.S., Jones, D.A., Norcott, K.A., and Jones, J.D.G., Theor. Appl. Genet. 88 (1994) pp. 691-700.
5. Dickinson, M.J., Jones, D.A., and Jones, J.D.G., Mol. Plant-Microbe Interact. 6 (1993) pp. 341-347.
6. Jones, D.A., Dickinson, M.J., Balint-Kurti, P.J., Dixon, M.S., and Jones, J.D.G., Mol. Plant-Microbe Interact. 6 (1993) pp. 348-357.
7. Ellis, J., Lawrence, G., Ayliffe, M., Anderson, P., Collins, N., Finnegan, J., Frost, D., Luck, J., and Pryor, T., Ann. Rev. Phytopathol. 35 (1997) pp. 271-291.
8. Hulbert, S.H., Ann. Rev. Phytopathol. 35 (1997) pp. 293-310.
9. Holub, E.B., In The Gene-For-Gene Relationship in Plant-Parasite Interactions, eds. I.R. Crute, E.B. Holub, J.J. Burdon (New York, USA: CAB International, 1997), pp. 5-26.
10. Jones, D.A., and Jones, J.D.G., Adv. Botanical Res. Incorporating Adv. Plant Pathol. 24 (1997) pp. 89-167.
11. van der Biezen, E.A. and Jones, J.D.G., Curr. Biol. 8 (1998) pp. 226-227.
12. van der Biezen, E.A. and Jones, J.D.G., Trends Biochem. Sci. 23 (1998) pp. 454-456.
13. Martin, G.B., Brommonschenkel, S.H., Chunwongse, J., Frary, A., Ganal, M.W., Spivy, R., Wu, T., Earle, E.D., and Tanksley, S.D., Science 262 (1993) pp. 1432-1436.
14. Scofield, S.R., Tobias, C.M., Rathjen, J.P., Chang, J.H., Lavalle, D.T., Michelmore, R.W., and Staskawicz, B.J., Science 274 (1996) pp. 2063-2065.
15. Tang, X., Frederick, R.D., Zhou, J., Halterman, D.A., Jia, Y., and Martin, G.B., Science 274 (1996) pp. 2060-2063.
16. Dixon, M.S., Jones, D.A., Keddie, J.S., Thomas, C.M., Harrison, K., and Jones, J.D.G., Cell 84 (1996) pp. 451-459.
17. Dixon, M.S., Hatzixanthis, K., Jones, D.A., Harrison, K., and Jones, J.D.G., Plant Cell 10 (1998) pp. 1915-1925.
18. Jones, D.A., Thomas, C.M., Hammond-Kosack, K.E., Balint-Kurti, P.J., and Jones, J.D.G., Science 266 (1994) pp. 789-793.
19. Thomas, C.M., Jones, D.A., Parniske, M., Harrison, K., Balint-Kurti, P.J., Hatzixanthis, K., and Jones, J.D.G., Plant Cell 9 (1997) pp. 2209-2224.
20. Song, W.-Y., Wang, G.-L., Chen, L.-L., Kim, H.-S., Pi, L.-Y., Holsten, T., Gardner, J., Wang, B., Zhai, W.-X., Zhu, L.-H., Fauqet, C., and Ronald, P., Science 270 (1995) pp. 1804-1806.
21. Hammond-Kosack, K.E., and Jones, J.D.G., Mol. Plant-Microbe Interact. 7 (1994) pp. 58-70.
22. Joosten, M.H.A.J., Cozijnsen, T.J., and de Wit, P.J.G.M., Nature 367 (1994) pp. 384-386.
23. van den Ackerveken, G.F.J.M., van Kan, J.A.L., and de Wit, P.J.G.M., Plant J. 2 (1992) pp. 359-366.
24. Vervoort, J., van den Hooven, H.W., Berg, A., Vossen, P., Vogelsang, R., Joosten, M.H.A.J., and de Wit, P.J.G.M., FEBS Lett. 404 (1997) pp. 153-158.
25. Hammond-Kosack, K.E., Harrison, K., and Jones, J.D.G., Proc. Natl. Acad. Sci. USA 91 (1994) pp. 10445-10449.
26. Hammond-Kosack, K.E., Staskawicz, B.J., Jones, J.D.G., and Baulcombe, D.C., Mol. Plant-Microbe Interact. 8 (1995) pp. 181-185.
27. Thomas, C.M., Dixon, M.S., Parniske, M., Golstein, C. and Jones, J.D.G., Phil. Trans. R. Soc. Lond. B 353 (1998) pp. 1413-1424.
28. Joosten, M.H.A.J., Vogelsang, R., Cozijnsen, T.J., Verberne, M.C., and de Wit, P.J.G.M., Plant Cell 9 (1997) pp. 367-379.
29. Dixon, M.S., Jones, D.A., Hatzixanthis, K., Ganal, M.W., Tanksley, S.D., and Jones, J.D.G., Mol. Plant-Microbe Interact. 8 (1995) pp. 200-206.
30. Parniske, M., Wulff, B.H., Bonnema, G., Thomas, C.M., Jones, D.A. and Jones, J.D.G., Mol. Plant-Microbe Interact. 12 (1999) pp. 93-102.
31. Parniske, M., Hammond-Kosack, K.E., Golstein, C., Thomas, C.M., Jones, D.A., Harrison, K., Wulff, B.B.H., and Jones, J.D.G., Cell 91 (1997) pp. 821-832.
32. Takken, F.L.W., Thomas, C.M., Joosten, M., Golstein, C., Westerink, N., Hille, J., Nijkamp, H.J.J., De Wit, P. and Jones, J.D.G., Plant J. 20 (1999) pp. 279-288.
33. Piedras, P., Rivas, S., Droge, S., Hillmer, S. and Jones, J.D.G., Plant J. (2000) in press.
34. Kobe, B., and Deisenhofer, J., Trends Biochem. Sci. 19 (1994) pp. 415-421.
35. Kobe, B., and Deisenhofer, J., Nature 374 (1995) pp. 183-186.
36. Leckie, F., Mattei, B., Capodicasa, C., Hemmings, A., Nuss, L., Aracri, B., DeLorenzo, G., and Cervone, F., EMBO J. 18 (1999) pp. 2352-2363.
37. Parker, J.E., Coleman, M.J., Szabo, V., Frost, L.N., Schmidt, R., van der Biezen, E., Moores, T., Dean, C., Daniels, M.J., and Jones, J.D.G., Plant Cell 9 (1997) pp. 879-894.
38. Botella, M.A., Parker, J.E., Frost, L.N., Bittner-Eddy, P.D., Beynon, J.L., Daniels, M.J., Holub, E.B. and Jones, J.D.G., Plant Cell 10 (1998) pp. 1847-1860.
39. Whitham, S., Dinesh-Kumar, S.P., Choi, D., Hehl, R., Corr, C., and Baker, B., Cell 78 (1994) pp. 1101-1115.
40. Song, W.-Y., Wang, G.-L., Gardner, J., Holsten, and Ronald, P., Plant Cell 9 (1997) pp. 1279-1287.
41. Anderson, P.A., Lawrence, G.J., Morrish, B.C., Ayliffe, M.A., Finnegan, E.J., and Ellis, J.G., Plant Cell 9 (1997) pp. 641-651.
42. Ori, N., Eshed, Y., Paran, I., Prestig, G., Aviv, D., Tanksley, S., Zamir, D. and Fluhr, R., Plant Cell 9 (1997) pp. 521-532.
43. Meyers, B.C., Chin, D.B., Shen, K.A., Sivaramakrishnan, S., Lavelle, D.O., Zhang, Z., and Michelmore, R.W., Plant Cell 10 (1998) pp. 1817-1832.
44. Meyers, B.C., Shen, K.A., Rohani, P., Gaut, B.S. and Michelmore, R.W., Plant Cell 10 (1998) pp. 1833-1846.
45. Ellis, J.G., Lawrence, G.J., Luck, J.E. and Dodds, P.N., Plant Cell 11 (1999) pp. 495-506.
46. Collins, N., Drake, J., Ayliffe, M., Sun, Q., Ellis, J., Hulbert, S. and Pryor, T., Plant Cell 11 (1999) pp. 1365-1376.
47. Michelmore, R.W. and Meyers, B.C., Genome Res. 8 (1998) pp. 1113-1130.
48. Nei, M., Gu, X. and Sitnikova, T., Proc. Natl. Acad. Sci. USA 94 (1997) pp. 7799-7806.
49. Noel, L., Moores, T.L., Van der Biezen, E.A., Parniske, M., Daniels, M.J., Parker, J.E. and Jones, J.D.G., Plant Cell 11 (1999) pp. 2099-2111.
50. Leonard, K.J., In The Gene-for-gene relationship in plant-parasite interactions, eds. Crute, I.R., Holub, E.B., Burdon, J.J. (CAB International, Wallingford UK, 1997), pp. 211-230.
51. Stahl, E.A., Dwyer, G., Mauricio, R., Kreitman, M. and Bergelson, J., Nature 400 (1999) pp. 667-671.
52. Kooman-Gersmann, M., Honee, G., Bonnema, G., and de Wit, P.J.G.M., Plant Cell 8 (1996) pp. 929-938.
53. Clark, S.E., Williams, R.W. and Meyerowitz, E.M., Cell 89 (1997) pp. 575-585.
54. Jeong, S., Trotochaud, A.E. and Clark, S.E., Plant Cell 11 (1999) pp. 1925-1933.
55. Fletcher, L.C., Brand, U., Running, M.P., Simon, R. and Meyerowitz, E.M., Science 283 (1999) pp. 1911-1914.
56. Dixon, M.S., Golstein, C., Thomas, C.M., van der Biezen, E.A. and Jones, J.D.G., Proc. Natl. Acad. Sci. USA (2000) in press.
57. Blatt, M.R., Grabov, A., Brearley, J., Hammond-Kosack, K.E. and Jones, J.D.G., Plant J. 19 (1999) pp. 453-462.
58. Piedras, P., Hammond-Kosack, K.E., Harrison, K. and Jones, J.D.G., Mol. Plant-Microbe Interact. 11 (1998) pp. 1155-1166.
59. Romeis, T., Piedras, P., Zhang, S.Q., Klessig, D.F., Hirt, H. and Jones, J.D.G., Plant Cell 11 (1999) pp. 273-287.
60. Romeis, T., Piedras, P. and Jones, J.D.G., Plant Cell 12 (2000) in press.

GENE EXPRESSION AND INTERMOLECULAR FORCES IN ESTROGEN/RECEPTOR BINDING

QINGXUAN CHEN¹, STUART ADLER², AND FREDERICK SWEET³

¹Institute of Developmental Biology, Academia Sinica, P.O. Box 2707, Beijing 100080, China E-mail: qingxuanchen@yahoo.com

²Department of Physiology, Southern Illinois University School of Medicine, Carbondale, Illinois, USA E-mail: [email protected]

³Department of Obstetrics and Gynecology, Washington University School of Medicine, 4911 Barnes-Jewish Hospital Plaza, St. Louis, Missouri 63110, USA E-mail: [email protected]

Nitrogen heterocyclic estrogen analogs first synthesized in our laboratory [1] competitively inhibit 17β-hydroxysteroid dehydrogenase from human placenta [2]. The 16,17-fused ring pyrazole estrogen analog I forms strong steroid-to-protein hydrogen bonds at the enzyme's steroid-binding site. Key analogs were similarly tested with estrogen receptors in a gene expression system, employing luciferase reporter genes in HeLa cells transfected with human estrogen receptor-α (ERα) or rat estrogen receptor-β (ERβ). Comparing the ERα or ERβ receptor complexes of the analogs, as mediators of gene expression, with those of estradiol or a pure antiestrogen, ICI-164,384, showed the analogs to be weak estrogens but not antiestrogens.

1 Introduction

Biosynthesis, transport and nuclear receptor-mediated hormone action of estrogens are driven by the intermolecular binding of a steroid with an appropriate protein [3]. The direction and velocity of steroid metabolism, distribution in the body via steroid-binding globulins in the blood, and the power of an estrogen to promote hormone effects must, therefore, all be influenced by the strength of intermolecular binding forces in the interaction between steroid and protein [4]. It seemed reasonable to suppose that gaining quantitative insight into the fundamental nature of the intermolecular forces determining steroid-protein interactions would allow us to design and synthesize more powerful estrogens and antiestrogens. Thus, over several years, our laboratory worked on defining the amino acid structures and topography at the steroid-binding region of enzymes involved in estrogen biosynthesis. We initially studied the influence of hydrogen bonding on the affinity between estradiol and an enzyme isolated from human placenta, 17β-hydroxysteroid dehydrogenase [1,2]. A series of synthetic heterocyclic analogs (I to V) from estrone


was tested as competitive inhibitors of the enzyme. The data from these experiments are in Table 1. Comparing the Ki values for enzyme inhibition reveals that the pyrazole-derived steroid I, with groups at both the steroid A- and D-rings capable of hydrogen bonding, is a more potent competitive inhibitor of 17β-hydroxysteroid dehydrogenase by over two orders of magnitude than the non-hydrogen-bonding isoxazole VI. That is, in VI the 3-O-methyl ether and isoxazole groups do not form hydrogen bonds.

Table 1: Heterocyclic Estrone Analogs Inhibit 17β-Hydroxysteroid Dehydrogenase

COMPOUND    EFFECT OF 3-OR    Ki-VALUE (µM)    RELATIVE AFFINITY(a)
I           R = H             4.08             100
II          R = CH3           12.75            32
III         R = H             9.50             43
IV          R = CH3           24.0             17
V           R = H             69.4             6
VI          R = CH3           424.5            0.96

(a) Relative affinity = 100 x (4.08)/Ki
[The STRUCTURE column of chemical drawings is not reproduced.]
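The footnote formula can be checked directly against the tabulated values. The short sketch below recomputes the relative affinity from each Ki value in the row order of Table 1; the variable and function names are illustrative only, and the reference value 4.08 µM is the Ki of compound I as given in the footnote.

```python
# Ki values (micromolar) in the row order of Table 1, with the relative
# affinities reported alongside them in the table.
KI_VALUES_UM = [4.08, 12.75, 9.50, 24.0, 69.4, 424.5]
REPORTED_RELATIVE_AFFINITY = [100, 32, 43, 17, 6, 0.96]

KI_REFERENCE_UM = 4.08  # Ki of compound I, the reference in the footnote


def relative_affinity(ki_um, ki_reference_um=KI_REFERENCE_UM):
    """Relative affinity = 100 x (reference Ki) / Ki, as in the table footnote."""
    return 100.0 * ki_reference_um / ki_um


for ki, reported in zip(KI_VALUES_UM, REPORTED_RELATIVE_AFFINITY):
    print(f"Ki = {ki:7.2f} uM  computed = {relative_affinity(ki):7.2f}  reported = {reported}")

# Fold difference in Ki between the fifth row (69.4 uM) and the first (4.08 uM).
fold_difference = KI_VALUES_UM[4] / KI_VALUES_UM[0]
print(f"fold difference: {fold_difference:.1f}")
```

The same ratio of Ki values underlies the statement, made later in the text, that the affinity of I for the enzyme is about 17 times that of V.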

Inhibition experiments with human placental 17β-hydroxysteroid dehydrogenase confirmed that ligand-protein hydrogen bonding could measurably shift the position of equilibrium binding (i.e., affinity) between an estrogen and the steroid-binding region at the catalytic site of the estradiol-specific enzyme. It therefore seemed worthwhile to examine similar effects in a system in which equilibrium formation of a corresponding steroid-protein complex by estradiol and estrogen receptor promotes gene expression. Such a system appeared capable of permitting measurements of the quantitative relationship between ligand-protein hydrogen bonding and gene expression via estrogen hormone action.

2 Methods

Compounds I and V (Table 1) were prepared as previously described [1]. Stock solutions of each compound were prepared in ethanol and then serially diluted with steroid-free cell culture medium to provide the desired concentrations. These compounds were incubated with HeLa cells that had been transfected with human or rat estrogen receptor. Estradiol was from Sigma Chemical Co. in St. Louis, MO, and ICI-164,384 was a gift from Imperial Chemical Industries in Macclesfield, U.K. Assays for gene activation utilized the Vit2-P36L luciferase reporter plasmid, containing two copies of a 26 bp estrogen response element (ERE), as previously described by us [5,6]. Gene expression in this system depends entirely on binding of the estrogen receptor protein to specific acceptor sites in the DNA. Furthermore, DNA binding can only occur after the estrogen receptor is firmly complexed with an estrogenic steroid (e.g., estradiol) and the resulting conformation of the estrogen-receptor complex enables it to bind at specific DNA EREs. Thus the amount of luciferase expressed is a direct function of the estrogen-receptor complex bound to the specific DNA acceptor sites. The day following hormone treatment, the cells were harvested and lysed, and the lysate was assayed for luciferase activity in a Monolight 2010 luminometer (Analytical Luminescence Laboratories, San Diego, CA) [5].

3 Gene Expression or Inhibition Due to Estrogen Binding with Receptors

3.1 Small changes in functional groups on estradiol produce big changes in receptor binding and gene expression

Recent experiments showed that gene expression in HeLa cells transfected with human estrogen receptor-α (ERα) was promoted by the obligatory binding of estradiol or estradiol 3-O-methyl ether to ERα (Figure 1) [5]. Clearly, the difference in affinity for ERα between estradiol 3-O-methyl ether and natural estradiol must have been due to replacing the 3-OH group by the 3-O-methyl ether group. Thus the 3-OH promotes strong steroid-to-protein intermolecular hydrogen bonding while the 3-OCH3 group does not. To further test this hydrogen bonding hypothesis, in the present study we performed similar gene expression experiments with ERα-transfected human HeLa cells and the synthetic steroids I and V (Table 1). These two steroids were selected because they were earlier found to act as competitive inhibitors of the human placental enzyme 17β-hydroxysteroid dehydrogenase. The measured Ki values for I and V are a function of their relative ability to form steroid-protein intermolecular hydrogen bonds.

[Figure 1: plot not reproduced. Horizontal axis: Concentration (M).]

Figure 1. Gene expression produced by estradiol 3-O-methyl ether (A) versus estradiol (O) with human estrogen receptor-α (after Adler et al., Reference [5]).

3.2 Comparing compound I for estrogenic and antiestrogenic activity with estradiol (estrogen) and ICI-164,384 (antiestrogen)

Figure 2 summarizes results from testing I, both as an estrogen and as an antiestrogen. When increasing concentrations of I alone were incubated with HeLa cells transfected with human estrogen receptor, expression of luciferase was measured from the increase in luminescence resulting from the corresponding increase in luciferase enzyme activity. Compound I is approximately five orders of magnitude weaker as an estrogen than the natural estrogen estradiol under similar conditions (Figure 2). To test I as an antiestrogen, increasing concentrations of I were co-incubated with estradiol at a fixed concentration of 10⁻¹¹ M. At concentrations of 10⁻¹² M through 10⁻⁷ M, compound I did not reduce the expression of luciferase promoted by estradiol binding to form a complex with ERα (Figure 2). However, co-incubating ICI-164,384 and estradiol (each at 10⁻¹⁰ M) with the HeLa cells caused gene expression of luciferase to be almost completely inhibited. These results indicate that compound I is merely a weak estrogen and is devoid of antiestrogenic properties in the range of concentrations evaluated.
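To make the phrase "five orders of magnitude weaker" concrete, the sketch below uses a simple one-site saturation model in which the response is half-maximal at a concentration EC50. This model and both EC50 values are assumptions chosen only for illustration; they are not measurements from this study, and the function names are hypothetical.

```python
import math

# Hypothetical half-maximal concentrations (molar), chosen only so that the
# two ligands differ by five orders of magnitude; not measured values.
EC50_ESTRADIOL_M = 1e-11
EC50_COMPOUND_I_M = 1e-6


def fractional_response(concentration_m, ec50_m):
    """Fraction of the maximal response under a one-site saturation model."""
    return concentration_m / (ec50_m + concentration_m)


def potency_gap_in_orders(ec50_weak_m, ec50_reference_m):
    """Number of orders of magnitude separating two half-maximal concentrations."""
    return math.log10(ec50_weak_m / ec50_reference_m)


gap = potency_gap_in_orders(EC50_COMPOUND_I_M, EC50_ESTRADIOL_M)
print(f"potency gap: {gap:.1f} orders of magnitude")

# At a concentration where the reference ligand gives a half-maximal response,
# the weaker ligand gives almost none.
for name, ec50 in [("estradiol", EC50_ESTRADIOL_M), ("compound I", EC50_COMPOUND_I_M)]:
    print(name, fractional_response(1e-11, ec50))
```

Under this model, a ligand that is weaker by five orders of magnitude needs a concentration 100,000 times higher to reach the same fractional response.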


[Figure 2: plot not reproduced. Horizontal axis: Concentration (M).]

Figure 2. Gene expression promoted by the steroid 16,17-pyrazole derivative I in human HeLa cells transfected with estrogen receptor (ERα). Compound I (O, 10⁻⁸ to 10⁻⁴ M) is shown compared as an estrogen with estradiol (A, 10⁻¹⁴ M and above). Antiestrogen testing was done by co-incubating 10⁻¹² to 10⁻⁷ M of I (•) with 10⁻¹¹ M of estradiol (X). The pure antiestrogen ICI-164,384 (0, 10⁻¹⁰ M) was the control.

3.3 Comparing compound V for estrogenic and antiestrogenic activity with estradiol (estrogen) and ICI-164,384 (antiestrogen)

Results for V (or 16-(hydroxymethylene)estrone) were similar to those obtained for I when the steroid was tested as an estrogen or antiestrogen with HeLa cells transfected with the luciferase reporter and estrogen receptor (Figure 3). Based on the relative degree of gene expression from binding of V to ERα, the steroid binds to the estrogen receptor with approximately one order of magnitude greater affinity than does I. This is the reverse of the order of the corresponding Ki values from competitive inhibition of the enzyme 17β-hydroxysteroid dehydrogenase: the relative affinity of I for the enzyme is about 17 times greater than that of V (Table 1).


[Figure 3: plot not reproduced. Horizontal axis: Concentration (M).]

Figure 3. Gene expression promoted by the steroid 16,17-isoxazole derivative V in human HeLa cells transfected with ERα. Compound V (•) was compared as an estrogen with estradiol (A) and I (O). Similarly, 16-(hydroxymethylene)estrone (X), the synthetic precursor of both I and V [1], was tested as an estrogen. Results are not shown from testing its antiestrogen activity under conditions similar to those for I (see Figure 2).

4 Discussion

When the heterocyclic pyrazole group is fused with the D-ring in estrone, it imparts strong hydrogen bonding properties to that part of the steroid molecule. This causes it to bind with increased affinity to certain steroid-binding proteins, as for example in compound I (Table 1). The sterically similar isoxazole group is chemically different from pyrazole; it does not hydrogen bond. The isoxazole group in V increases the hydrophobic characteristics of the D-ring relative to estradiol. These differences can explain why compound I is a much more powerful competitive inhibitor of human placental 17β-hydroxysteroid dehydrogenase than V, as shown in earlier experiments [1]. When I was compared to V with respect to estrogen receptor binding in the present gene expression studies, the hydrophobic isoxazole ring in V appeared to cause a reversal, relative to I, in the order of binding affinities for the enzyme and the estrogen receptor. Alternatively, we must conclude that, when D-ring analogs of estradiol interact with estrogen receptors, hydrogen bonding near the steroid D-ring does not significantly affect receptor binding.

5 Acknowledgments

We thank Mary Ann Mallon for technical assistance. This work was supported by grants from the National Institutes of Health in the United States R03 CA70515 (SA) and 5R01 DK15708 (FS), and from the Chinese Academy of Sciences (QC). This project was also supported in part by the State Basic Research Development Program (973), G200016107.

References

1. Sweet F., Boyd J., Medina O., Konderski L. and Murdock G.L., Hydrogen bonding in steroidogenesis: Studies on new heterocyclic analogs of estrone that inhibit human estradiol 17β-dehydrogenase. Biochem & Biophys Res Commun 180 (1991) pp. 1057-1063.
2. Murdock G.L., Warren J.C. and Sweet F., Human placental estradiol 17β-dehydrogenase: Evidence for inverted substrate orientation ("wrong-way" binding) at the active site. Biochemistry 27 (1988) pp. 4452-4458.
3. Sweet F. and Murdock G.M., Affinity labeling of hormone-specific proteins. Endocrine Reviews 8 (1987) pp. 154-174.
4. Anstead G.M., Carlson K.E. and Katzenellenbogen J.A., The estradiol pharmacophore: ligand structure-estrogen receptor binding affinity relationships and a model for the receptor binding site. Steroids 62 (1997) pp. 268-303.
5. Meyer C.Y., Lutfi H.G. and Adler S., Transcriptional regulation of estrogen-responsive genes by non-steroidal estrogens: Doisynolic and allenolic acid. J Steroid Biochem 62 (1997) pp. 477-489.
6. Adler S., Waterman M.L., He X. and Rosenfeld M.G., Steroid receptor-mediated inhibition of rat prolactin gene expression does not require the receptor DNA-binding domain. Cell 52 (1988) pp. 685-695.
7. Ramkumar T. and Adler S., Differential positive and negative transcriptional regulation by tamoxifen. Endocrinology 136 (1995) pp. 536-542.

PROBING FOR THE BASIS OF THE LOW ACTIVITY OF THE ORIENTAL VARIANT OF LIVER MITOCHONDRIAL ALDEHYDE DEHYDROGENASE

BAOXIAN WEI AND HENRY WEINER

Department of Biochemistry, 1153 Biochemistry Building, Purdue University, West Lafayette, Indiana 47907-1153, USA E-mail: [email protected]

Many Asian people possess a variant form of liver mitochondrial aldehyde dehydrogenase in which a lysine replaces a glutamate at position 487 of the 500 amino acid homotetrameric enzyme. From the three-dimensional structure of the enzyme, it appeared that residue 487 interacts with two arginine residues, 264 in the same subunit and 475 in a different one. We used site-directed mutagenesis to probe why the Oriental variant has a high Km for NAD and a low specific activity. The results show that these interactions are not the sole reason for the altered properties of the Oriental variant.

1 Introduction

Abusive consumption of alcohol is a problem that exists in all populations. Though investigators have tried to explain why some individuals consume intoxicating amounts of alcoholic beverages, no definitive explanation has been presented. In contrast, it has been found that there are populations who, for non-religious reasons, do not consume alcohol or, if they do, consume it at very low levels compared to other members of the community. It turns out that these people cannot efficiently metabolize ethanol to acetate [1,2]. In the normal metabolic pathway for alcohol, ingested ethanol is converted in the liver to acetaldehyde by the action of cytosolic alcohol dehydrogenase [3]. The acetaldehyde in turn is converted into acetate, catalyzed by liver mitochondrial aldehyde dehydrogenase (ALDH) [4]. Acetate is then utilized by liver or other tissue. Both dehydrogenases use NAD as the electron acceptor, producing one mole of NADH per mole of compound oxidized. The cell must convert these NADH molecules back to NAD [5]. A person might not convert ethanol to acetate for a variety of reasons. These would include decreased activity of either dehydrogenase or impaired ability to regenerate NAD from NADH. All three of these systems have been studied, and it appears that the major reason for some populations to avoid drinking alcohol-containing beverages is that they are deficient in an active form of liver mitochondrial ALDH [6]. Though there are many isozymes of the enzyme in liver, it appears that the mitochondrial form is primarily responsible for acetaldehyde


oxidation [7]. Apparently, it is the aversion to acetaldehyde that causes people to choose not to drink alcohol-containing beverages. Many people from the Orient, particularly of Chinese, Japanese and Korean ancestry, "flush" after drinking ethanol. That is, their faces tend to become reddish [1,2]. Enzyme analysis was performed on many of these individuals and they were found to be lacking the active form of liver mitochondrial ALDH. Later it was shown that these people possessed a variant of the enzyme that differed in just one of the 500 amino acids from the active form. At residue 487, a lysine replaced the glutamate that is found in the active form of the 500 amino acid homotetrameric enzyme [6]. It was assumed that the Oriental variant of the enzyme was inactive, since investigators could not detect any catalytic activity when performing the gel assays normally used to study ALDH. When the tools of molecular biology became available, our laboratory cloned and expressed the active form of the human ALDH [8]. We then mutated the cDNA so that it would code for the Oriental variant and expressed it in E. coli. Unexpectedly, the mutant enzyme after purification was found to possess catalytic activity. Its specific activity was 10% of that of the non-Oriental form of the enzyme, while the Km for NAD increased to nearly 7700 µM compared with 30 µM for the highly active enzyme [9]. Thus, under either physiological conditions or even standard assay conditions, the Oriental enzyme would exhibit so little activity that it would appear to be inactive. When some Asian people who were deficient in ALDH were genotyped, it was found that they possessed genes coding for both the 487 glutamate active variant and the 487 lysine low-activity form [10]. We later showed that these people express both forms of the enzyme, indicating that the Oriental subunit was dominant over the glutamate form [11].
Crabb's laboratory demonstrated, using HeLa cells, that the Oriental variant caused a decrease in activity of the glutamate-containing enzyme [12]. This was the first proof that the Oriental variant was dominant in a heterotetramer. Before we knew the structure of the enzyme, we postulated that the reason for the high Km of the Oriental enzyme could be that the positive nicotinamide ring of NAD was located near the glutamate at position 487. The binding of NAD would become difficult when the glutamate residue became a lysine. Consistent with this proposal was our finding that placing a neutral glutamine (Q) at position 487 produced a low-Km, high-activity mutant [9]. This shows that it was not the absence of the glutamate residue (E), but the presence of the positively charged lysine (K), that caused the Oriental variant to become inactive.
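The kinetic constants quoted above explain why the variant appears inactive in ordinary assays. The sketch below applies the Michaelis-Menten rate law, v = Vmax [S] / (Km + [S]), to both forms of the enzyme. Two assumptions are made for illustration only: the 10% specific activity is treated as the ratio of maximal velocities, and the NAD concentration of 500 µM is a hypothetical assay value, not one taken from the text.

```python
def michaelis_menten_rate(substrate_um, km_um, vmax_relative):
    """Michaelis-Menten rate v = Vmax * [S] / (Km + [S]), in relative units."""
    return vmax_relative * substrate_um / (km_um + substrate_um)


# Km values for NAD from the text; Vmax expressed relative to the active
# enzyme, assuming the reported 10% specific activity reflects Vmax.
ACTIVE_ENZYME = {"km_um": 30.0, "vmax_relative": 1.0}
VARIANT_ENZYME = {"km_um": 7700.0, "vmax_relative": 0.10}

NAD_ASSAY_UM = 500.0  # hypothetical NAD concentration, for illustration only

rate_active = michaelis_menten_rate(NAD_ASSAY_UM, **ACTIVE_ENZYME)
rate_variant = michaelis_menten_rate(NAD_ASSAY_UM, **VARIANT_ENZYME)

print(f"active enzyme:  {rate_active:.4f} of its Vmax")
print(f"variant enzyme: {rate_variant:.4f} of the active enzyme's Vmax")
print(f"variant / active: {rate_variant / rate_active:.4f}")
```

Because the NAD concentration in this example is far below the Km of the variant, its rate falls well under 1% of that of the active enzyme, even though its maximal velocity is 10%.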

[Figure 1: structure images not reproduced. Labels: R475 in B, E487 in A, R264 in A; A-subunit, B-subunit.]

Figure 1. Human liver mitochondrial aldehyde dehydrogenase. Panel A shows one subunit of the homotetrameric enzyme. Residue 487 is a glutamate (E) in the active form of the enzyme and is a lysine in the Oriental variant. Also shown are two arginine (R) residues. Panel B shows a dimer of aldehyde dehydrogenase. One of the arginines that interacts with residue 487 is found in the same subunit (R264) while the other (R475) is found in a different subunit. For purpose of illustration, one subunit is shown as a ribbon. Panel C shows the tetrameric arrangement. Two pairs of dimers make up the tetrameric enzyme.

In 1997 the three-dimensional structure of the corresponding beef liver enzyme was solved by Thomas Hurley's laboratory [13]. It was found, much to our surprise, that residue 487 was not located near the nicotinamide ring as we had postulated. Instead, this residue was located on the surface of the subunit and formed salt bonds with two different arginine residues (R). One was located at position 264 in the same subunit and the other at position 475 in a different subunit. This structural arrangement is presented in Figure 1, along with an illustration of just one subunit of the enzyme. The enzyme actually is a dimer of dimers. That is, two subunits interact as shown in panel B to form a dimer, and two such dimers form the tetrameric enzyme shown in panel C. The important interactions with respect to the Oriental variant take place between subunits that form one of the dimer pairs. Since the altered properties of the Oriental variant of ALDH were not a result of the lysine at position 487 directly interfering with the binding of NAD, we undertook an investigation of the importance of the interaction between it and the two arginine residues. Site-directed mutagenesis was employed. Some of the results are presented in this chapter.

2 Methods

Single mutations were created by use of oligonucleotides and polymerase chain reaction techniques. The mutant colonies were selected by double-stranded DNA sequencing with a thermocycler sequencing kit. Double and triple mutants were constructed by exchanging the cDNA fragments containing the single mutations with the corresponding fragments of the native or E487K cDNA from the pT7-7 plasmid. All mutant forms of the enzyme were purified as described previously [14]. The purity of the enzymes was checked by SDS-polyacrylamide gel electrophoresis using the Coomassie Blue staining procedure. The final protein concentration was determined with a Bio-Rad protein assay kit with bovine serum albumin as a standard. Dehydrogenase activity assays were performed by measuring the rate of increase in NADH fluorescence in 100 mM sodium phosphate (pH 7.4) at 25 °C with an Aminco filter fluorometer. Concentrations of NAD were 1-10 mM for the native and the different mutant enzymes. The propionaldehyde concentration was 14 µM. All kinetic measurements were performed at least three times, and the mean values were used for calculations or plots. Kinetic parameters were obtained with the MicroMath Scientist program. Detailed descriptions of the methods and materials can be found in our recent publication [15].

3 Results and Discussion

We previously demonstrated that the recombinantly expressed version of the Oriental variant of human liver mitochondrial aldehyde dehydrogenase was active but had a very high Km for NAD. It bound NAD poorly but bound NADH well [9]. Thus it appears that the enzyme has difficulty in interacting with the positive charge of the nicotinamide ring of NAD. The properties of this enzyme, as well as those of the E487Q mutant, are presented in Table 1. The latter construct, mentioned above, was studied so that we could determine whether it was the loss of the glutamate residue or the presence of the lysine that caused the enzyme to possess altered properties. From the data, it is apparent that it was the presence of the positively charged lysine, and not the loss of the negatively charged glutamate, that impaired the enzyme.


From the structure shown in Figure 1 it was apparent, so we thought, that the disruption of the interactions between the glutamate at position 487 and the arginines at positions 264 and 475 was the cause of the altered properties of the Oriental variant. To test this, a number of mutations to the enzyme were made. These included changing the arginines so that there would be no positive charge repulsions between the groups, as well as trying to restore the salt bonds by introducing a negative charge in place of the positive arginines. First, simple controls were made, which included changing the arginines in the native enzyme. All the data are summarized in Table 1. The arginine at position 264 does not seem to be important for the activity of the enzyme. This is not the case with the arginine at position 475. The mere removal of this arginine caused the enzyme to have altered properties. Thus, it was not possible for us to determine how important the interactions between residues 487 and 475 were. It is of possible interest to note that the glutamate at position 487 and the arginine at position 475 are not conserved in all the members of the ALDH family. However, every form of the enzyme that has a glutamate at 487 has an arginine at position 475. This interaction, then, is critical for the enzyme. It was our hope to be able to explain why the Oriental variant of ALDH had a low activity and a high Km for NAD. Based on structural information, it appeared that the interaction between the lysine at position 487 in the Oriental variant and arginine 264 in the same subunit, or arginine 475 in the other subunit of the dimer pair, was critical. The mutational approach did not allow us to unravel this interesting question, because any change made to the arginine at position 475 caused the enzyme to have altered properties.
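One of the altered properties caused by changes at arginine 475 is cooperativity in NAD binding, reported in Table 1 as a Hill coefficient greater than 1.0. As a minimal sketch of what that coefficient measures, the code below generates noise-free rates from the Hill equation, v = Vmax [S]^n / (K^n + [S]^n), and recovers n as the slope of a Hill plot, log(v / (Vmax - v)) against log [S]. The parameter values are illustrative (loosely modelled on the R475Q row of Table 1), the NAD concentrations are assumptions, and this is not the fitting procedure used by the authors, who relied on the MicroMath Scientist program.

```python
import math

def hill_rate(s, vmax, k_half, n):
    """Rate predicted by the Hill equation at substrate concentration s."""
    return vmax * s**n / (k_half**n + s**n)

def hill_slope(concs, rates, vmax):
    """Least-squares slope of log10(v / (Vmax - v)) versus log10([S])."""
    xs = [math.log10(s) for s in concs]
    ys = [math.log10(v / (vmax - v)) for v in rates]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

# Illustrative parameters only (assumptions, not fitted data).
VMAX = 120.0        # min^-1
K_HALF_UM = 850.0   # µM NAD giving half-maximal rate
N_TRUE = 1.8        # Hill coefficient used to simulate the data

nad_um = [100.0, 250.0, 500.0, 1000.0, 2500.0, 5000.0]  # assumed NAD concentrations
rates = [hill_rate(s, VMAX, K_HALF_UM, N_TRUE) for s in nad_um]

n_est = hill_slope(nad_um, rates, VMAX)
print(f"estimated Hill coefficient: {n_est:.2f}")
```

For a non-cooperative enzyme the same plot has a slope of 1.0, which is the value listed in Table 1 for the native enzyme and for the mutants in which only R264 was changed. Real data carry experimental error, and Vmax must itself be estimated, so a nonlinear fit is normally preferred over this linearization.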
Thus, in spite of knowing the three-dimensional structure of the enzyme, we still cannot explain how the one amino acid replacement caused the Oriental variant to have such different properties.

Table 1. Kinetic properties of various mutants of human liver mitochondrial ALDH.

Mutant                    Km NAD (µM)   kcat (min⁻¹)   Hill coefficientᵇ (n)
E487ᵃ (native enzyme)     37            200            1.0
E487Q                     90            120            1.0
E487, R475Q               850           120            1.8
E487, R264Q               60            125            1.0
E487, R475Q, R264Q        1300          nd             1.6
E487, R475E               1300          18             2.0
E487, R264E               740           12             1.0
E487, R475E, R264E        16000         4              ndᶜ
E487K                     7400          16             1.0
E487K, R475Q              1500          27             1.5
E487K, R264Q              1400          27             1.0
E487K, R264Q, R475Q       1500          nd             1.4
E487K, R475E              3200          40             1.5
E487K, R264E              700           31             1.0
E487K, R264E, R475E       7700          7              1.4

ᵃ E487 is the native ALDH possessing a glutamate at position 487. E487K is the Oriental variant ALDH in which glutamate 487 was replaced by a lysine. The other mutants, except for E487Q, represent double or triple mutants that were designed to test for the restoration of activity by reestablishing salt bonds between the residue at position 487 and those at positions 264 and 475.
ᵇ The Hill coefficient is a measure of subunit interaction. Mutation of R475 caused the enzyme to exhibit cooperativity in coenzyme binding, as indicated by values greater than 1.0.
ᶜ The activity was too low to determine a Hill coefficient accurately.

4 Acknowledgements

This work was supported in part by a grant from the National Institute on Alcohol Abuse and Alcoholism (AA05812).

References

1. Wolff, P.H., Ethnic differences in alcohol sensitivity. Science 175 (1972) pp. 449-50.
2. Harada, S., Agarwal, D.P. and Goedde, H.W., Aldehyde dehydrogenase deficiency as cause of facial flushing reaction to alcohol in Japanese [letter]. Lancet 2 (1981) pp. 982.
3. Crabb, D.W., Bosron, W.F. and Li, T.K., Ethanol metabolism. Pharmacol Ther 34 (1987) pp. 59-73.
4. Svanas, G.W. and Weiner, H., Use of Cyanamide to Determine Localization of Acetaldehyde Metabolism in Rat Liver. Alcohol 2 (1985) pp. 111-115.
5. Lehninger, A.L., Phosphorylation coupled to oxidation of dihydrodiphosphopyridine nucleotide. J. Biol. Chem. 190 (1951) pp. 345-359.
6. Ikawa, M., Impraim, C.C., Wang, G. and Yoshida, A., Isolation and Characterization of Aldehyde Dehydrogenase Isozymes from Usual and Atypical Human Livers. J. Biol. Chem. 258 (1983) pp. 6282-6287.
7. Svanas, G.W. and Weiner, H., Aldehyde Dehydrogenase Activity as the Rate-limiting Factors for Acetaldehyde Metabolism in Rat Liver. Arch. Biochem. Biophys. 236 (1985) pp. 36-46.
8. Zheng, C.-F., Wang, T.T. and Weiner, H., Cloning and Expression of the Full-length cDNAs Encoding Human Liver Class 1 and Class 2 Aldehyde Dehydrogenase. Alcohol. Clin. Exp. Res. 17 (1993) pp. 828-831.
9. Farres, J., Wang, X., Takahashi, K., Cunningham, S.J., Wang, T.T. and Weiner, H., Effects of Changing Glutamate 487 to Lysine in Rat and Human Liver Mitochondrial Aldehyde Dehydrogenase. A Model to Study Human (Oriental Type) Class 2 Aldehyde Dehydrogenase. J. Biol. Chem. 269 (1994) pp. 13854-13860.
10. Crabb, D.W., Edenberg, H.J., Bosron, W.F. and Li, T.-K., Genotypes for Aldehyde Dehydrogenase Deficiency and Alcohol Sensitivity. The Inactive ALDH2(2) Allele is Dominant. J. Clin. Invest. 83 (1989) pp. 314-316.
11. Wang, X., Sheikh, S., Saigal, D., Robinson, L. and Weiner, H., Heterotetramers of Human Liver Mitochondrial (class 2) Aldehyde Dehydrogenase Expressed in E. coli. A Model to Study the Heterotetramers Expected to be Found in Oriental People. J. Biol. Chem. 271 (1996) pp. 31172-31178.
12. Xiao, Q., Weiner, H. and Crabb, D.W., The mutation in the mitochondrial aldehyde dehydrogenase (ALDH2) gene responsible for alcohol-induced flushing increases turnover of the enzyme tetramers in a dominant fashion. J Clin Invest 98 (1996) pp. 2027-32.
13. Steinmetz, C.G., Xie, P.G., Weiner, H. and Hurley, T.D., Structure of Mitochondrial Aldehyde Dehydrogenase: the Genetic Component of Ethanol Aversion. Structure 5 (1997) pp. 701-711.
14. Jeng, J.J. and Weiner, H., Purification and Characterization of Catalytically Active Precursor of Rat Liver Mitochondrial Aldehyde Dehydrogenase Expressed in Escherichia coli. Arch. Biochem. Biophys. 289 (1991) pp. 214-222.
15. Wei, B., Ni, L., Hurley, T.D. and Weiner, H., Cooperativity in nicotinamide adenine dinucleotide binding induced by mutations of arginine 475 located at the subunit interface in the human liver mitochondrial class 2 aldehyde dehydrogenase. Biochemistry 39 (2000) pp. 5295-302.

S RNASES AND SELF AND NON-SELF POLLEN RECOGNITION IN FLOWERING PLANTS

YONGBIAO XUE¹, HAIYANG CUI, ZHAO LAI, WENSHI MA, LIZHI LIANG, HUIJUN YANG, AND YANSHENG ZHANG

Laboratory of Plant Genetics and Developmental Biology, Institute of Developmental Biology, The Chinese Academy of Sciences, Beijing 100080, China (¹Author for correspondence)

Email: ybxue@public3.bta.net.cn

Self-incompatibility (SI) is an important intraspecific reproductive barrier to prevent self-fertilization in flowering plants. In many cases, SI is controlled by a single multi-allelic locus, the S locus. Molecular analyses of self-incompatible species of the Solanaceae, Scrophulariaceae and Rosaceae have shown that a class of ribonucleases encoded by the S locus, known as S RNases, determines the stylar expression of SI but not its pollen expression. A different gene (the pollen S gene) is thought to control pollen expression of SI. Here, we present some progress made towards molecular cloning of the pollen S in Antirrhinum using two approaches, S locus-directed transposon tagging and map-based cloning. Possible pathways by which S RNases interact with the pollen S gene product to achieve self and non-self pollen recognition are discussed.

1 Introduction

Fertilization in flowering plants involves several cell-cell recognition events. Pollen grains first adhere to the stigma surface, then germinate and grow intercellularly within the transmitting tissues of the style, and finally deliver sperm cells into a structure called the ovule, where they fuse with the egg cell and the central cell to form the embryo and endosperm, respectively. However, in many species of angiosperms, self-fertilization is prevented by an intraspecific reproductive barrier known as self-incompatibility (SI) [4]. In many cases, SI is controlled by a single multi-allelic locus, termed the S locus; this is also referred to as monolocus SI. There are also di- or multi-locus SI systems. The number of S alleles at an S locus is quite large, reaching hundreds in some species [4], and the S locus is the most highly multi-allelic locus described in plants so far, similar to the animal MHC (major histocompatibility complex) and fungal mating-type loci. Based on the floral morphs (style lengths and anther levels) of self-incompatible species, SI is divided into homomorphic and heteromorphic types [4]. In the former, floral morphs are the same between individuals with different S genotypes, whereas in a heteromorphic SI species individuals with different S genotypes display different floral morphs. Homomorphic SI is the predominant form and can be further classified into gametophytic and sporophytic types based on the mode of genetic control of the pollen SI phenotype [4].

2 Gametophytic Self-Incompatibility and S RNases

In gametophytic self-incompatibility (GSI), the haploid S genotype of pollen determines its phenotype. When the S allele carried by pollen matches either of the stylar ones, pollen tube growth is arrested within the style after germination. Species from the Solanaceae, Rosaceae and Scrophulariaceae often show GSI. Biochemical studies on GSI began in the 1950s, when Lewis identified pollen antigens related to S alleles in the evening primrose (Oenothera organensis) [14]. But it was not until 1986 that Anderson et al. [2] were able to clone the first gene product encoded by the S locus, known as S locus glycoproteins (SLGs), from Nicotiana alata. So far, over 100 related SLG genes have been isolated from several species of the Solanaceae, Rosaceae and Scrophulariaceae [1, 11, 20, 23, 24]. Based on DNA sequence analysis, the SLGs showed high homology to a class of ribonucleases in fungi (Rh and T2) [10] and contained similar active sites. Because of that, they were referred to as self-incompatibility (S) ribonucleases (RNases) [17]. S RNases are usually 250 amino acids in length and contain 5 conserved (C1-C5) and 2 hypervariable (HVa and HVb) regions [9] (Figure 1). C1, C4 and C5 are hydrophobic domains and may be related to the formation of the spatial structure of S RNases; C2 and C3 are hydrophilic enzymatic sites. However, recent results showed that the C4 domain is not well conserved in the S RNases from the Scrophulariaceae and Rosaceae [20, 24]. The expression of S RNases is developmentally regulated, and they are mainly located in the extracellular matrix of the style, distributed below the top third of the style length, where self-pollen tube growth arrest occurs [3, 24]. Using pollen RNA labeled in vivo with 32P, McClure et al. [18] demonstrated that the labeled RNA remained intact after outcrossing and was degraded after selfing, suggesting that S RNases function selectively in vivo and only degrade RNA after selfing.
This result also indicated that the S RNases possibly inhibit self-pollen tube growth through a cytotoxic effect. In fact, removal of the ribonuclease activity of S RNases in Petunia inflata led to the loss of their ability to reject self-pollen [8]. Further, transgenic results from P. inflata using both loss-of-function and gain-of-function approaches provided direct evidence that S RNases are responsible for the stylar expression of SI [12].
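The genetic rule of GSI stated at the start of this section, that a pollen grain is rejected when its single S allele matches either allele of the style, can be written as a short sketch. The allele labels (S1, S2, ...) and function names below are hypothetical and serve only to illustrate the rule; they do not model the underlying biochemistry.

```python
def gsi_pollen_accepted(pollen_allele, style_genotype):
    """GSI rule: haploid pollen is rejected if its S allele matches either stylar allele."""
    return pollen_allele not in style_genotype

def compatible_pollen(pollen_parent, style_parent):
    """Sorted list of pollen S alleles from a diploid parent that the style accepts."""
    return sorted(
        allele for allele in set(pollen_parent)
        if gsi_pollen_accepted(allele, style_parent)
    )

# Selfing: both pollen types match a stylar allele, so none is accepted.
print(compatible_pollen(("S1", "S2"), ("S1", "S2")))
# One shared allele: only the pollen carrying the unshared allele is accepted.
print(compatible_pollen(("S1", "S2"), ("S1", "S3")))
# No shared alleles: both pollen types are accepted.
print(compatible_pollen(("S1", "S2"), ("S3", "S4")))
```

The three calls correspond to a fully incompatible, a half-compatible and a fully compatible cross, respectively.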

[Figure 1 schematic, N- to C-terminus: NH2 - C1 - C2 - HVa - HVb - C3 - C4 - C5 - COOH]

Figure 1. Schematic illustration of S RNase organization. C1-C5: conserved domains. HVa and HVb: hypervariable domains.


It is still not clear how S RNases accomplish pollen recognition. The transgenic results showed that the pollen product of the S locus (pollen S) is not an S RNase, because down-regulation of the S RNase affected only the stylar SI phenotype and not that of the pollen [12]. Recent results have demonstrated that the HV regions determine S RNase recognition specificity [15], though the involvement of other regions cannot be excluded [26]. However, Matton et al. [16] showed that a single amino acid change within the HV region between two S RNases converted their respective pollen recognition specificity. The current models of S RNase action postulate that the pollen component of the S locus encodes a receptor or an inhibitor for S RNases [5, 19]. In the receptor model, pollen S produces a membrane-bound receptor which specifically internalizes the self S RNase and causes self-pollen destruction. In the inhibitor model, the expression of pollen S leads to production of an S ribonuclease inhibitor which inhibits the activity of all S RNases except the self S RNase, which then arrests the growth of self-pollen tubes. Currently, both models lack molecular or biochemical evidence. However, the inhibitor model could explain the generation of pollen-part mutations obtained previously [13] and the result that GSI diploid plants normally become self-compatible after being made tetraploid [1]. In Antirrhinum, a pollen-part mutant obtained through transposon mutagenesis experiments was likely produced by an S allele duplication [25]. Recently, Golz et al. [7] studied pollen-part mutants in Nicotiana alata generated through irradiation and clearly showed that some of them resulted from S allele duplication. In these cases, two different S alleles are present in the pollen, two different S RNase inhibitors would be generated and, therefore, the S RNase activities encoded by either S allele would be inhibited.
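The logic of the inhibitor model can be sketched as a small set computation. In this hypothetical encoding, each pollen S allele contributes an inhibitor that blocks every S RNase except the one of its own allele, so a stylar S RNase stays active in the pollen tube only if no inhibitor present can block it. The allele labels and function names are illustrative assumptions, and the sketch represents the model as described above, not an established mechanism.

```python
def active_rnases(pollen_alleles, style_alleles):
    """Stylar S RNases that remain active inside the pollen tube under the inhibitor model.

    The inhibitor made by pollen allele a blocks every S RNase except that of allele a,
    so an S RNase r escapes inhibition only if every pollen allele equals r.
    """
    return {r for r in style_alleles if all(a == r for a in pollen_alleles)}

def pollen_tube_grows(pollen_alleles, style_alleles):
    """Pollen tube growth continues only if no stylar S RNase remains active."""
    return not active_rnases(pollen_alleles, style_alleles)

style = {"S1", "S2"}
print(pollen_tube_grows({"S1"}, style))        # self pollen: S1 RNase stays active
print(pollen_tube_grows({"S3"}, style))        # non-self pollen: both RNases inhibited
print(pollen_tube_grows({"S1", "S2"}, style))  # duplicated S allele: both RNases inhibited
print(pollen_tube_grows(set(), style))         # no functional inhibitor: all RNases active
```

The third call reproduces the behaviour attributed to pollen-part mutants carrying an S allele duplication, which grow in a self style. The last call shows that, in this encoding, pollen lacking any functional inhibitor would be rejected by every style.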
Consequently, pollen carrying two different S alleles would overcome the action of the S RNases produced in the self-style and allow normal pollen tube growth to occur. However, it is possible that inactivation of the inhibitor encoded by pollen S could be lethal to pollen tube growth because of the inability to inhibit any RNases from the style. If this were the case, it would be difficult to clone the pollen S gene through conventional transposon mutagenesis approaches. Thus, strategies that select specifically for gametophytic lethal mutations, or more physical approaches, might be useful for cloning the pollen component of the S locus.

3 Approaches to Clone Pollen S Gene in Antirrhinum

Considering the possibility that pollen S inactivation might lead to gametophytic lethality, indirect approaches should be adopted to clone it. In Antirrhinum, the following two approaches are being actively pursued in order to clone the pollen S.

3.1 S locus-directed transposon mutagenesis

DNA transposons are known to transpose to linked loci more frequently than to unlinked ones. Therefore, linked transposons are more useful for mutating target genes. Further, if a transposon is close enough to a known target gene, the polymerase chain reaction (PCR) can be applied to screen a large pool of individuals for transposon insertions into the target gene, using gene- and transposon-specific primers. This approach, known as site-directed transposon tagging, is widely used in plant functional genomics. In order to implement this approach, Antirrhinum plants carrying functional S alleles and an S locus-linked transposon were obtained (Xue et al., unpublished data). Molecular genetic analysis showed that the transposon was highly mobile, with a germinal excision rate of about 30%. The plants also carried a non-functional S allele, Sc, which allows a large number of progeny to be generated by selfing. It is possible to screen directly for insertions into the S RNase gene, because inactivation of the S RNase does not lead to lethality but to self-compatibility [12]. Currently, we are screening for transposon insertions into an S RNase gene. Once the transposon has jumped within or close to the S RNase gene, it will be further mobilized to mutagenise the S locus and, in particular, to screen for gametophytic lethal mutations. The mutations will be further analyzed to determine whether they affect pollen expression of SI.

3.2 Map-based approach

Genetic and molecular evidence indicates that the S locus is highly polymorphic [6]. Recent developments in DNA marker and cloning technology, e.g., bulk segregant analysis, AFLP (amplified fragment length polymorphism) and bacterial artificial chromosomes (BACs), have made a map-based cloning approach to the pollen S possible. In this approach, BAC contigs covering the S locus are constructed. Fine mapping of the S locus can be done using a large number of AFLP markers linked to the S locus and a population segregating for S alleles, allowing the determination of its physical limits. Genes with the signatures expected of the pollen S gene, e.g., polymorphism and pollen-specific expression, can then be analyzed in detail to determine whether they encode the pollen S gene product. Up to now, we have made a BAC library of a self-incompatible Antirrhinum line with the genotype S2S4, and a BAC clone (ca. 64 kb) containing a complete S2 RNase gene has been obtained. Molecular analysis of this clone has identified over 10 genes tightly linked to the S2 RNase gene (Lai et al., unpublished data).

4 Perspectives

Genes encoding S RNases from several plant families have been isolated and their roles in determining stylar expression of SI demonstrated. However, a complete picture of GSI is still missing because the identity of the pollen product of the S locus remains elusive. Major efforts are needed to clone this gene. It appears that several approaches, including those described for Antirrhinum, are applicable for reaching this goal. The recent identification of a pollen determinant of sporophytic SI in Brassica is very revealing [21, 22]. Cloning the pollen S gene from a GSI system will certainly lead to more insights into how flowering plants accomplish self and non-self pollen recognition.

5 Acknowledgements

Financial support by the National Natural Science Foundation of China (grant nos. 39670387 and 39825103), the National Climbing Programme of the Ministry of Science and Technology of China and the Chinese Academy of Sciences is gratefully acknowledged. We are also grateful to Drs E. S. Coen and R. Carpenter of the John Innes Centre, UK, for their generous help.

References

1. Ai, Y., Singh, A., Coleman, C.E., Ioerger, T.R., Kheyr-Pour, A., and Kao, T.H., Self-incompatibility in Petunia inflata: Isolation and characterization of cDNAs encoding three S-allele-associated proteins, Sex. Plant Reprod. 3 (1990) pp. 130-138.
2. Anderson, M. A., Cornish, E. C., Mau, S.-L., Williams, E. G., Hogart, R., Akinson, A., Bonig, I., Grego, B., Simpson, R., Roche, P. J., Haley, J. D., Penshow, J. D., Niall, H. D., Tregear, G. W., Coghlan, J. P., Crawford, R. J. and Clarke, A. E., Cloning of cDNA for a stylar glycoprotein associated with expression of self-incompatibility in Nicotiana alata, Nature 321 (1986) pp. 38-44.
3. Cornish, E. C., Pettitt, J.M., Bonig, I., and Clarke A.E., Developmentally controlled expression of a gene associated with self-incompatibility in Nicotiana alata, Nature 329 (1987) pp. 99-102.
4. de Nettancourt, D., Incompatibility in Angiosperms: Monographs on Theoretical and Applied Genetics, Vol. 3. (1977) (Heidelberg: Springer-Verlag).
5. Dodds, P.N., Clarke A. E., Newbigin, E., A molecular perspective on pollination in flowering plants, Cell 85 (1996) pp. 141-144.
6. Ebert, P. R., Anderson, M. A., Bernatzky, R., Altshuler, M., Clarke, A. E., Genetic polymorphism of self-incompatibility in flowering plants, Cell 56 (1989) pp. 255-262.
7. Golz, J. F., Su, V., Clarke, A. E., Newbigin, E., A molecular description of mutations affecting the pollen component of the Nicotiana alata S locus, Genetics 195 (1999) pp. 1123-1135.
8. Huang, S., Clark, A.G., Kao, T.-h., Ribonuclease activity of Petunia inflata S proteins is essential for rejection of self-pollen, Plant Cell 6 (1994) pp. 1021-1028.
9. Ioerger, T. R., Clark, A.G., Kao, T.-h., Polymorphism at the self-compatibility locus in Solanaceae predates speciation, Proc. Natl. Acad. Sci. USA 87 (1990) pp. 9732-9735.
10. Kawata, Y., Sakiyama, F. and Tamaaoki, H., Amino-acid sequence of ribonuclease T2 from Aspergillus oryzae, Eur. J. Biochem. 176 (1988) pp. 683-697.
11. Kaufmann, H., Salamini, F., and Thompson, R.D., Sequence variability and gene structure at the self-incompatibility locus of Solanum tuberosum, Mol. Gen. Genet. 226 (1991) pp. 457-466.
12. Lee, H.-S., Huang, S., Kao, T.-h., S-proteins control rejection of incompatible pollen in Petunia inflata, Nature 367 (1994) pp. 560-563.
13. Lewis, D., Structure of the incompatibility gene. III. Types of spontaneous and induced mutations, Heredity 5 (1951) pp. 399-414.
14. Lewis, D., Serological reactions of pollen incompatibility substances, Proc. Roy. Soc. (Lond.) B 140 (1952) pp. 127-135.
15. Matton, D. P., Maes, O., Laublin, G., Xike, Q., Bertrand, C., Morse, D., Cappadocia, M., Hypervariable domains of self-incompatibility RNases mediate allele-specific pollen recognition, Plant Cell 9 (1997) pp. 1757-1766.
16. Matton, D. P., Luu, D. T., Xike, Q., Laublin, G., O'Brien, M., Maes, O., Morse, D., Cappadocia, M., Production of an S RNase with dual specificity suggests a novel hypothesis for the generation of new S alleles, Plant Cell 11 (1999) pp. 2087-2097.
17. McClure, B. A., Haring, V., Ebert, P.R., Anderson M.A., Simpson, R.J., Sakiyama, F., Clarke, A.E., Style self-incompatibility gene products of Nicotiana alata are ribonucleases, Nature 342 (1989) pp. 955-957.
18. McClure, B. A., Gray, J.E., Anderson, M.A., Clarke, A.E., Self-incompatibility in Nicotiana alata involves degradation of pollen rRNA, Nature 347 (1990) pp. 757-760.
19. Ren, D., Zhang, Y., Xue, Y., Molecular controls of self-incompatibility, Adv. Plant Sci. 1 (1998) pp. 95-106 (In Chinese).
20. Sassa, H., Nishio, T., Kowyama, Y., Hinano, H., Koba, T., Ikehashi, H., Self-incompatibility (S) alleles of the Rosaceae encode members of a distinct class of the T2/S ribonuclease superfamily, Mol. Gen. Genet. 250 (1996) pp. 547-557.
21. Schopfer, C. R., Nasrallah, M. E., Nasrallah, J. B., The male determinant of self-incompatibility in Brassica, Science 286 (1999) pp. 1697-1700.
22. Takayama, S., Shiba, H., Iwano, M., Shimosato, H., Che, F. S., Kai, N., Watanabe, M., Suzuki, G., Hinata, K., Isogai, A., The pollen determinant of self-incompatibility in Brassica campestris, Proc. Natl. Acad. Sci. USA 97 (2000) pp. 1920-1925.
23. Xu, B., Mu, J., Nevins, D.L., Grun, P., Kao, T.-h., Cloning and sequencing of cDNAs encoding two self-incompatibility associated proteins in Solanum chacoense, Mol. Gen. Genet. 224 (1990) pp. 341-346.
24. Xue, Y., Carpenter, R., Dickinson, H.G., Coen, E.S., Origin of allelic diversity in Antirrhinum S locus RNase, Plant Cell 8 (1996) pp. 805-814.
25. Xue, Y., Carpenter, R., Dickinson, H.G., Coen, E.S., Mutational analysis of the self-incompatibility locus in Antirrhinum, submitted to Heredity (2000).
26. Zurek, D., Mou, B., Beecher, B., McClure B., Exchanging sequence domains between S-RNases from Nicotiana alata disrupts pollen recognition, Plant J. 1 (1997) pp. 797-808.

THE ROLES OF CARBONIC ANHYDRASE ISOZYMES IN CANCER

W. RICHARD CHEGWIDDEN¹, IAN M. SPENCER², AND CLAUDIU T. SUPURAN³

¹Lake Erie College of Osteopathic Medicine, 1858 West Grandview Boulevard, Erie, PA 16509, U.S.A. Email: [email protected]

²Division of Biomedical Sciences, Sheffield Hallam University, Sheffield, S1 1WB, UK.

³Università degli Studi, Laboratorio di Chimica Inorganica e Bioinorganica, Via Gino Capponi 7, I-50121, Florence, Italy.

Specific sulphonamide inhibitors of carbonic anhydrase also inhibit the growth, and suppress the invasion, of certain types of cancer cells in culture, suggesting potential for cancer therapy. Inhibition of cell growth may be mediated through a reduction in the provision of bicarbonate, by the cytosolic CA II and mitochondrial CA V isozymes, for the synthesis of nucleotides and other cell components. It is hypothesized that suppression of invasive properties may be mediated through inhibition of the cancer-associated, cell surface isozymes, CA IX and CA XII, resulting in a less acidic extracellular pH. CA IX may be a useful marker for renal clear cell and cervical carcinomas and a valuable adjunct to PAP screening. CA XII may be a useful marker for colorectal tumours.

1 Introduction

The zinc metalloenzyme carbonic anhydrase (CA: EC 4.2.1.1) catalyses the reversible hydration of carbon dioxide to bicarbonate (CO2 + H2O ↔ HCO3- + H+) and is seemingly ubiquitous in nature. This reaction is fundamental to so many processes involving gas, ion or fluid transfer, pH control, and production of acid and bicarbonate that the enzyme is perhaps unique in its consequent involvement in numerous physiological functions. These include respiration and acid/base regulation, bone development and function, production of gastric acid, bile and accompanying pancreatic secretions, formation of cerebrospinal fluid and aqueous humour, and production of bicarbonate for a range of metabolic processes. Additional roles have also been suggested in the gastrointestinal and reproductive systems, in muscle, and in molecular signaling. (For a review of CA functions see [3].)

Of the three gene families (α, β and γ) of carbonic anhydrases, only the α-family has been shown to be present in mammals. It is known to be expressed in fourteen different isoforms of varying properties and tissue and cellular distribution. Ten of these possess CO2-hydration activity (CA I - VII, CA IX, CA XII and CA XIV), whilst three others, designated CA-related proteins (CA-RP VIII, CA-RP X and CA-RP XI), are probably inactive, since each lacks one or more of the three histidine residues that appear to be essential for zinc-binding. CA XIII has been identified only from expressed sequence tags (ESTs) derived from a mouse mammary gland cDNA library, but is presumed to be active on the basis of the translated sequence. Two receptor-type transmembrane tyrosine phosphatases also possess CA-related domains in their extracellular regions (CA-RP(RPTPβ) and CA-RP(RPTPγ)). Among the active isozymes, CA I, II, III and VII are cytosolic, CA V is mitochondrial and CA VI is secreted (in saliva). CA IV is extracellular, anchored to the membrane by a glycophosphatidylinositol (GPI) tail, whilst CA IX, XII and XIV, in common with CA-RP(RPTPβ) and CA-RP(RPTPγ), are extracellular domains of larger transmembrane proteins. A summary of the subcellular location and major known sites of expression of all the mammalian CA isoforms is given in Table 1. (For reviews see [3,12].)

A common feature of almost all α-CA isozymes investigated thus far is that they are strongly and, it would appear, specifically inhibited by certain heterocyclic and aromatic sulphonamides, such as acetazolamide, methazolamide and ethoxzolamide. The sole exception to this is CA III, which is only moderately inhibited. Use of these sulphonamides has played a major part in providing the evidence linking carbonic anhydrase activity to various physiological functions. The relative effectiveness of these inhibitors against a number of mammalian CA isozymes is shown in Table 2. In general, the sequence of effectiveness as inhibitors is ethoxzolamide > methazolamide > acetazolamide against all α-CAs examined. All the α-CA isozymes examined have been shown to possess additional general esterase activity, to which no physiological significance has been ascribed hitherto [37]. In recent years a range of evidence has accumulated suggesting that the active isozymes CA II, V, IX and XII may be involved in oncogenesis and tumour growth or invasion.
The activity of intracellular CA isozymes may be required to support the enhanced rate of biosynthetic processes characteristic of cancer cells, whilst the activity of extracellular isozymes may be involved in the creation of an extracellular milieu that is more conducive to cell invasion. Finally, circumstantial evidence has also prompted speculation that the gene encoding CA-RP(RPTPγ) may be a tumour suppressor gene. It is also a tenable supposition that both CA-RP(RPTPβ) and CA-RP(RPTPγ) may be involved in the control of oncogenesis or cell growth by a mechanism mediated through the dephosphorylation of tyrosine.
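The general order of inhibitor effectiveness stated above (ethoxzolamide > methazolamide > acetazolamide) can be checked against the Ki values of Table 2, bearing in mind that a lower Ki means a more effective inhibitor. The sketch below is only an illustration. It assumes that the four methazolamide values in the table belong to the four human isozymes, so the murine CA V entry is left out, and the dictionary layout and function name are hypothetical.

```python
# Ki values in nM, transcribed from Table 2.
# Assumption: the four methazolamide values map to the four human isozymes.
KI_NM = {
    "Human CA I":   {"acetazolamide": 200.0, "methazolamide": 10.0, "ethoxzolamide": 1.0},
    "Human CA II":  {"acetazolamide": 7.0, "methazolamide": 7.0, "ethoxzolamide": 0.5},
    "Human CA III": {"acetazolamide": 3e5, "methazolamide": 1e5, "ethoxzolamide": 5e4},
    "Human CA IV":  {"acetazolamide": 66.0, "methazolamide": 33.0, "ethoxzolamide": 13.0},
    "Murine CA V":  {"acetazolamide": 60.0, "ethoxzolamide": 5.0},
}

EXPECTED_ORDER = ["ethoxzolamide", "methazolamide", "acetazolamide"]

def follows_general_order(ki_by_inhibitor):
    """True if Ki does not decrease along EXPECTED_ORDER, skipping missing values."""
    values = [ki_by_inhibitor[name] for name in EXPECTED_ORDER if name in ki_by_inhibitor]
    return all(lower <= higher for lower, higher in zip(values, values[1:]))

results = {isozyme: follows_general_order(ki) for isozyme, ki in KI_NM.items()}
for isozyme, ok in results.items():
    print(f"{isozyme}: {'consistent' if ok else 'inconsistent'} with the general order")
```

Under that assumption every isozyme is consistent with the stated order, with methazolamide and acetazolamide tied for human CA II. The same data show why CA III is described as only moderately inhibited: its Ki values are several orders of magnitude higher than those of the other isozymes.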

Table 1. Expression and distribution of mammalian carbonic anhydrase isozymes.

ISOZYME           SUB-CELLULAR LOCALIZATION                MAJOR KNOWN SITES OF TISSUE EXPRESSION
CA I              cytoplasmic                              red blood cell, intestine
CA II             cytoplasmic                              ubiquitous
CA III            cytoplasmic                              red muscle, adipose tissue
CA IV             membrane-bound (extra-cellular)          kidney, lung, gut, brain, eye, widespread in capillary endothelium
CA VA             mitochondrial                            liver (also skeletal muscle, kidney)
CA VB             mitochondrial                            widespread (except liver)
CA VI             secreted                                 saliva
CA VII            cytoplasmic                              brain, salivary gland, lung, probably widespread at low levels
CA-RP VIII        unknown                                  brain (esp. Purkinje cells) (widespread at lower level)
CA IX             transmembrane (extra-cellular domain)    various tumours, gastric mucosa
CA-RP X           unknown                                  brain (also pineal gland, placenta)
CA-RP XI          secreted                                 brain
CA XII            transmembrane (extra-cellular domain)    widespread, especially colon, kidney, prostate
CA XIII¹          unknown                                  unknown
CA XIV            transmembrane (extra-cellular domain)    widespread, especially kidney, heart
CA-RP (RPTPβ)     transmembrane (extra-cellular domain)    central & peripheral nervous system
CA-RP (RPTPγ)     transmembrane (extra-cellular domain)    brain, lung² (widespread in mouse)

¹ To date CA XIII has been identified only from ESTs (expressed sequence tags) from a mouse cDNA library.
² Human tissue distribution has not been fully investigated.

Table 2. Inhibition of some mammalian CA isozymes by sulphonamides. Values are Ki (nM).

| CA isozyme | Acetazolamide | Methazolamide | Ethoxzolamide |
|---|---|---|---|
| Human CA I | 200 | 10 | 1 |
| Human CA II | 7 | 7 | 0.5 |
| Human CA III | 3 × 10⁵ | 1 × 10⁵ | 5 × 10⁴ |
| Human CA IV | 66 | 33 | 13 |
| Murine CA V | 60 | | 5 |

Data are from [24] for the human isozymes and from [11] for the murine isozyme.

2 CA Isozymes and Cell Growth

Bicarbonate, not CO2, is the true substrate for early carboxylation steps in biosynthetic pathways which involve biotin-dependent carboxylases or carbamoyl phosphate synthetases. Use of specific sulphonamide inhibitors has furnished evidence that carbonic anhydrase activity is required for provision of bicarbonate for gluconeogenesis [9,29], lipogenesis [6,39,40] and ureogenesis [29,10]. It may be that a low flux through these pathways can be accommodated by the uncatalyzed rate of bicarbonate provision, whilst the catalytic activity of carbonic anhydrase is necessary to sustain a higher level of metabolic flux [6]. Such a high level of flux occurs in cancer cells, where the enhanced rate of cell proliferation necessitates a higher level of synthesis of nucleotides and other cell components, such as membrane lipids, than that required in normal cells. Inhibition of the growth of human cancer cells in culture by specific carbonic anhydrase inhibitors was first reported by Chegwidden and Spencer [5], who drew attention to the potential of CA inhibitors in cancer therapy. Table 3 shows the effectiveness of three different sulphonamide inhibitors in inhibiting the growth in culture of two different cell lines: U937, a line established from a diffuse, histiocytic, human lymphoma, and Raji, a line of lymphoblast-like cells established from a Burkitt lymphoma. The relative effectiveness of the three inhibitors in inhibiting cell growth correlates well with their effectiveness as inhibitors of CA activity (Table 2). This supports the premise that the effects observed do, indeed, result specifically from inhibition of carbonic anhydrase. Although the GI50 values obtained are much higher than the Ki values for purified CA isozymes, the inhibitor concentrations present in the cell, and especially in the mitochondrion, are likely to be much lower than those in the medium.


Acetazolamide, especially, does not readily penetrate the cell membrane [25] and, although ethoxzolamide is the most lipophilic of the three inhibitors employed, a serum protein has been identified to which it binds strongly [8]. Consequently its effectiveness may be much reduced by the fetal calf serum present in the culture medium.

Table 3. Inhibition of cell growth by sulphonamides. Values are GI50 (µM).

| Cell line | Acetazolamide | Methazolamide | Ethoxzolamide |
|---|---|---|---|
| U937 | 230 | 20 | 0.5 |
| Raji | 230 | 18 | 0.4 |

Cells were cultured, in the presence of a range of sulphonamide concentrations, in RPMI 1640 medium containing fetal calf serum (10%), glutamine (4 mmol/l) and penicillin and streptomycin (100 µg/ml). They were incubated at 37°C under an atmosphere of 5% CO2 for a period of two days. Cell viability was confirmed by trypan blue exclusion. GI50 is the concentration of sulphonamide that inhibited cell growth by 50% as measured over the two day period.

Chegwidden and Spencer [5] hypothesized that the inhibition of cell growth may be attributed to inhibition of nucleotide synthesis. This may be a consequence of sulphonamide inhibition of either the cytosolic isozyme, CA II, or the mitochondrial isozyme, CA V, or indeed, of both of these isozymes. Inhibition of cytosolic CA would limit bicarbonate availability for the cytosolic isozyme of carbamoyl phosphate synthetase (CPS II) that catalyses the first step of de novo pyrimidine synthesis.

$$\text{glutamine} + \mathrm{HCO_3^-} + 2\,\mathrm{ATP} \xrightarrow{\text{CPS II}} \text{carbamoyl phosphate} + \text{glutamate} + 2\,\mathrm{ADP} + \mathrm{P_i}$$

This enzyme forms part of a multi-enzyme complex which appears to specifically direct cytoplasmically produced carbamoyl phosphate into pyrimidine synthesis. However, there is evidence that carbamoyl phosphate produced in the mitochondrion by the mitochondrial isozyme, CPS I, may also be used in pyrimidine synthesis [31].

$$\mathrm{NH_4^+} + \mathrm{HCO_3^-} + 2\,\mathrm{ATP} \xrightarrow{\text{CPS I}} \text{carbamoyl phosphate} + 2\,\mathrm{ADP} + \mathrm{P_i}$$

Inhibition of the mitochondrial CA isozyme, CA V, would also reduce availability of bicarbonate for CPS I.
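The comparison drawn in this section between inhibitor potency against purified isozymes (Table 2) and growth inhibition (Table 3) can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of the original study: the numbers are copied from Tables 2 and 3, only the human CA I and CA II rows are used, and the function names are illustrative assumptions.

```python
# Ki values (nM) copied from Table 2; GI50 values (uM) copied from Table 3.
KI_NM = {
    "Human CA I": {"acetazolamide": 200.0, "methazolamide": 10.0, "ethoxzolamide": 1.0},
    "Human CA II": {"acetazolamide": 7.0, "methazolamide": 7.0, "ethoxzolamide": 0.5},
}
GI50_UM = {
    "U937": {"acetazolamide": 230.0, "methazolamide": 20.0, "ethoxzolamide": 0.5},
    "Raji": {"acetazolamide": 230.0, "methazolamide": 18.0, "ethoxzolamide": 0.4},
}
# Order of increasing potency stated in the text.
ORDER = ["acetazolamide", "methazolamide", "ethoxzolamide"]


def follows_potency_order(values):
    """True if the values never increase along ORDER (a lower value means a more potent drug)."""
    seq = [values[drug] for drug in ORDER]
    return all(a >= b for a, b in zip(seq, seq[1:]))


def gi50_over_ki(cell_line, isozyme):
    """Ratio GI50/Ki for each drug, after converting GI50 from uM to nM."""
    return {
        drug: GI50_UM[cell_line][drug] * 1000.0 / KI_NM[isozyme][drug]
        for drug in ORDER
    }


if __name__ == "__main__":
    for name, row in {**KI_NM, **GI50_UM}.items():
        print(name, "follows the stated order:", follows_potency_order(row))
    for drug, ratio in gi50_over_ki("U937", "Human CA II").items():
        print(f"{drug}: GI50 is about {ratio:.0f} times Ki")
```

The ratios make the point of the text concrete: GI50 exceeds Ki by roughly three to four orders of magnitude, which is the gap the authors attribute to poor membrane penetration and serum protein binding.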


The hypothesis that cell growth is limited by inhibition of nucleotide synthesis is supported by our observation that, when the culture medium was supplemented with nucleotide precursors (hypoxanthine and thymidine), neither acetazolamide (up to 5 mM) nor methazolamide (up to 0.2 mM) significantly inhibited the growth of U937 or Raji cells, and ethoxzolamide (up to 5 µM) caused only slight inhibition [7]. An intra-mitochondrial source of bicarbonate is also required for pyruvate carboxylase, which catalyses the carboxylation of pyruvate to oxaloacetate.

$$\text{pyruvate} + \mathrm{HCO_3^-} + \mathrm{ATP} \xrightarrow[\text{biotin}]{\text{pyruvate carboxylase}} \text{oxaloacetate} + \mathrm{ADP} + \mathrm{P_i}$$

This is an early step in the production of aspartate, glutamine and glycine, all of which are precursors of purines and pyrimidines. However, this is unlikely to be relevant in cell culture, since these amino acids are supplied in the medium. Nonetheless, reduction of mitochondrial oxaloacetate levels may well contribute to cell growth inhibition through a different mechanism. Oxaloacetate is required for the transport of acetyl groups out of the mitochondrion (as citrate) into the cytoplasm, where they serve as substrate for the synthesis of lipids, required as membrane components. There is a wealth of evidence that sulphonamide inhibitors of CA also inhibit lipogenesis [23,39,40]. Metabolic labelling experiments have demonstrated that the mitochondrial isozyme, CA V, plays a role in lipogenesis by providing bicarbonate ions for the production of oxaloacetate [23]. However, the possibility that cytosolic CA plays an additional role in lipogenesis, by providing bicarbonate for acetyl CoA carboxylase, cannot be excluded.

3 CA Isozymes and Tumour Invasion

The recent discovery of two α-CA isozymes (CA IX and CA XII) that are associated with, although not exclusive to, certain types of tumours suggests that these isozymes may be involved in oncogenesis, cell growth or tumour invasion [35,45,49]. Since each of these two isozymes exists as an extracellular domain of a larger transmembrane protein, it is a tenable hypothesis that, by hydrating carbon dioxide, they produce a more acidic extracellular environment, which may facilitate the invasive and migratory properties of cancer cells. There is evidence that the invasive properties of some types of cancer cells are enhanced at more acidic pH [26]. It is now well established that the extracellular pH of solid tumours (pHe) is often lower than in normal tissues, whilst their intracellular pH (pHi) is generally higher. A lower pHe appears to promote invasiveness, whilst a higher pHi is likely to give a competitive advantage over normal cells for growth [42]. By non-invasive techniques, pHe values in solid tumours of 0.2 to 0.5 pH units lower than those in normal tissues have been measured [41,48]. Since each of these isozymes constitutes part of a transmembrane protein, the possibility cannot be excluded that their role in cancer may, alternatively or additionally, be mediated through ligand binding and signal transduction.

Beyond their association with cancer cells, several further pieces of evidence imply the involvement of CA IX and CA XII in cancer. Firstly, transfection of cultured NIH 3T3 fibroblasts, which do not normally express CA IX, with a vector containing the CA IX gene changes both their morphology and growth characteristics. They replicate faster, in a less controlled manner, and are less dependent on growth factors [33]. Secondly, in certain renal carcinoma cells in which both CA IX and CA XII are over-expressed, these isozymes are both down-regulated by the product of the von Hippel-Lindau (VHL) tumour suppressor gene [13]. Most renal carcinomas of the clear cell type are caused by inactivation of this gene [18]. Moreover, the human CA IX and CA XII genes have been mapped to chromosomal regions (bands 17q21.2 and 15q22, respectively) that appear to be amplified in a number of human cancers [30].

Suppression of the invasion of four renal carcinoma cell lines by the specific CA inhibitor acetazolamide (1-10 µM) has been demonstrated recently in vitro [32]. However, the pattern of different CA isozymes detected in each cell line did not clearly indicate the identity of the specific isozymes involved. The only cell line in which CA IX was expressed showed the least suppression, and only CA II was expressed strongly in all four cell lines [32]. The authors concluded that the effect of acetazolamide was most likely attributable to the inhibition of the cytoplasmic CA II and/or CA XII.
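The pHe differences of 0.2 to 0.5 units quoted in this section can be expressed as a change in hydrogen ion concentration. The sketch below is an illustrative calculation only, relying on the definition pH = -log10[H+]; the function name is an assumption made for the example.

```python
def proton_fold_change(delta_ph):
    """Fold increase in [H+] that corresponds to a pH decrease of delta_ph units.

    Follows from pH = -log10([H+]): lowering pH by d multiplies [H+] by 10**d.
    """
    return 10.0 ** delta_ph


if __name__ == "__main__":
    for delta in (0.2, 0.5):
        print(f"pHe lower by {delta} units: [H+] higher by a factor of "
              f"{proton_fold_change(delta):.2f}")
```

On this basis the reported range corresponds to an extracellular hydrogen ion concentration roughly 1.6 to 3.2 times that of normal tissue.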

4 Inactive CA Isoforms and Oncogenesis

Two inactive α-carbonic anhydrases, CA-RP(RPTPβ) and CA-RP(RPTPγ), form part of the extracellular domain of two receptor-type protein tyrosine phosphatases (RPTPβ and RPTPγ). These protein tyrosine phosphatases (PTPs) are transmembrane proteins with an extracellular ligand-binding region connected to a cytoplasmic tyrosine phosphatase domain [16,19]. It is considered that the CA domains are involved in signal transduction [1,36], and mutations in this domain of RPTPγ may result in loss of response to external ligands [43,47]. The importance of tyrosine phosphorylation by protein tyrosine kinases in controlling such processes as cell growth and differentiation is well established, and the transient nature of signaling by phosphorylation requires the additional action of PTPs for its regulation. Since hyperphosphorylation of protein tyrosine residues can cause malignant transformation, inactivation of PTPs may also be oncogenic. The gene encoding RPTPγ has been mapped to a chromosomal region that is frequently deleted in certain renal cell and lung carcinomas, and as a consequence it is considered to be a candidate tumour suppressor gene for these cancers [14,17]. The expression of RPTPβ is much narrower and seems to be confined to the CNS. Its CA domain binds a number of ligands and is intimately involved in intracellular signaling and cell growth.

5 CA Isozymes as Tumour Markers

No firm data were produced to support the use of any CA isozyme as a tumour marker until the discovery of CA IX and CA XII. CA IX was originally identified as a tumour-associated cell surface antigen in HeLa cells, a line established from a human carcinoma of the cervix. At that time it was not known that the antigenic protein possessed an active carbonic anhydrase domain, and it was named MN protein. Initial studies demonstrated the expression of MN protein/CA IX in human carcinomas of ovary, endometrium and cervix, but not in the corresponding normal tissues [49]. No expression was observed in normal human heart, lung, kidney, prostate, peripheral blood, brain, placenta and muscle, but message was detected in liver and pancreas [27]. Since it appears to be strongly expressed in most dysplastic and neoplastic cervical tissues, CA IX may well prove to be an important new biomarker [2,20]. There is also strong preliminary evidence that CA IX expression may prove to be a valuable adjunct to cytological diagnosis in improving the discrimination of significant lesions in Papanicolaou (Pap) screening [22]. CA IX is widely expressed in renal cell carcinomas (RCCs), especially those of the clear cell type, but not in normal kidney nor in benign kidney lesions such as cysts, adenomas and oncocytomas. Consequently it has been suggested as a potential biomarker for certain renal cell carcinomas [21,27,28]. Recent evidence suggests that CA IX may also serve as a useful marker of cell proliferation in colorectal neoplasms [38]. The picture is less clear, however, in some other tissues. Whilst CA IX expression is abundant in normal gastric mucosa, it is reduced or absent in gastric tumours [34]. Loss of CA IX expression may also be correlated with progression from dysplasia to adenocarcinoma in Barrett's oesophagus [46], and one report also correlates it with several adverse prognostic features in cervical carcinoma [2].
In contrast to the situation with CA IX, the CA XII transcript has been detected in a wide range of normal human tissues, with high levels in kidney, colon, pancreas and prostate [13,45]. Although increased expression of CA XII has been observed in some renal cell carcinomas (RCC) of the clear cell type [13] and in certain cell lines derived from human lung carcinoma [44,45], currently there is no additional data to support the use of CA XII as a tumour biomarker in these tissues. However, there is evidence that this isozyme may be of value in the histopathological diagnosis of colorectal tumours. Furthermore, CA XII expression appears to be directly correlated to grade of dysplasia, suggesting the possibility of prognostic application [15].

6 Concluding Comments

Specific sulphonamide inhibitors of carbonic anhydrase clearly inhibit growth in culture and suppress invasiveness of certain cancer cell lines. The mechanisms of inhibition may be different for each process and have yet to be firmly established. Identification of the specific CA isozymes involved will facilitate the synthesis of more potent isozyme-selective inhibitors which may prove effective in cancer therapy. Work is currently in progress in our laboratories on the synthesis of new CA inhibitors with potential for cancer therapy, and on the effects of CA inhibitors on the growth of both cancer cells in culture and of human tumours implanted into immunodeficient mice.

References

1. Barnea G., Silvennoinen O., Shaanan B., Honegger A.M., Canoll P.D., D'Eustachio P., Morse B., Levy J.B., LaForgia S., Huebner K., Musacchio J.M., Sap J. and Schlessinger J., Identification of a carbonic anhydrase-like domain in the extracellular region of RPTPγ defines a new subfamily of receptor tyrosine phosphatases. Mol. Cell. Biol. 13 (1993) pp. 1497-1506.
2. Brewer C.A., Liao S.Y., Wilczynski S.P., Pastorekova S., Pastorek J., Zavada J., Kurosaki T., Manetta A., Berman M.L., DiSaia P.J. and Stanbridge E.J., A study of biomarkers in cervical carcinoma and clinical correlation of the novel biomarker MN. Gynecol. Oncol. 63 (1996) pp. 337-344.
3. Chegwidden W.R. and Carter N.D., Introduction to the carbonic anhydrases. In: The Carbonic Anhydrases: New Horizons, ed. by Chegwidden W.R., Carter N.D. and Edwards Y.H. (Birkhauser Verlag, Basel, Switzerland, 2000) pp. 13-28.
4. Chegwidden W.R., Carter N.D. and Edwards Y.H. (eds), The Carbonic Anhydrases: New Horizons. (Birkhauser Verlag, Basel, Switzerland, 2000).
5. Chegwidden W.R. and Spencer I.M., Sulphonamide inhibitors of carbonic anhydrase inhibit the growth of human lymphoma cells in culture. Inflammopharmacology 3 (1995) pp. 231-239.
6. Chegwidden W.R. and Spencer I.M., Carbonic anhydrase provides bicarbonate for de novo lipogenesis in the locust. Comp. Biochem. Physiol. 115B (1996) pp. 247-254.
7. Chegwidden W.R. and Spencer I.M., unpublished data.

8. Dodgson S.J., Inhibition of mitochondrial carbonic anhydrase: a discrepancy examined. J. Appl. Physiol. 63 (1987) pp. 2134-2141.
9. Dodgson S.J. and Forster R.E. II, Inhibition of CA V decreases glucose synthesis from pyruvate. Arch. Biochem. Biophys. 251 (1986) pp. 198-204.
10. Dodgson S.J. and Forster R.E. II, Carbonic anhydrase: inhibition results in decreased urea production by hepatocytes. J. Appl. Physiol. 60 (1986) pp. 646-652.
11. Heck R.W., Tanhauser S.M., Manda R., Tu C-K., Liapis P.J. and Silverman D.N., Catalytic properties of mouse carbonic anhydrase V. J. Biol. Chem. 269 (1994) pp. 24742-24746.
12. Hewett-Emmett D. and Tashian R.E., Functional diversity, conservation and convergence in the evolution of the α-, β- and γ-carbonic anhydrase gene families. Mol. Phylogenet. Evol. 5 (1996) pp. 50-77.
13. Ivanov S.V., Kuzmin I., Wei M-H., Pack S., Geil L., Johnson B.E., Stanbridge E.J. and Lerman M.I., Down-regulation of transmembrane carbonic anhydrases in renal cell carcinoma cell lines by wild-type von Hippel-Lindau transgenes. Proc. Natl. Acad. Sci. USA 95 (1998) pp. 12596-12601.
14. Kastury K., Ohta M., Lasota J., Moir D., Dorman T., LaForgia S., Druck T. and Huebner K., Structure of the human receptor tyrosine phosphatase gamma gene (PTPRG) and relation to the familial RCC t(3;8) chromosome translocation. Genomics 32 (1996) pp. 225-235.
15. Kivela A., Parkkila S., Saarnio J., Karttunen T.J., Kivela J., Parkkila A-K., Waheed A., Sly W.S., Grubb J.H., Shah G., Tureci O. and Rajaniemi H., Expression of a novel transmembrane carbonic anhydrase isozyme XII in normal human gut and colorectal tumors. Am. J. Pathol. 156 (2000) pp. 577-584.
16. Krueger N.X. and Saito H., A human transmembrane protein-tyrosine-phosphatase, PTPζ, is expressed in brain and has an N-terminal receptor domain homologous to carbonic anhydrases. Proc. Natl. Acad. Sci. USA 89 (1992) pp. 7417-7421.
17. Laforgia S., Morse B., Levy J., Barnea G., Cannizzaro L.A., Li F., Nowell P.C., Boghosian-Sell L., Glick J., Weston A., Harris C.C., Drabkin H., Patterson D., Croce C.M., Schlessinger J. and Huebner K., Receptor protein-tyrosine phosphatase γ is a candidate tumor suppressor gene at human chromosome region 3p21. Proc. Natl. Acad. Sci. USA 88 (1991) pp. 5036-5040.
18. Latif F., Tory K., Gnarra J., Yao M., Duh F.M., Orcutt M.L., Stackhouse T., Kuzmin I., Modi W., Geil L., Identification of the von Hippel-Lindau disease tumour suppressor gene. Science 260 (1993) pp. 1317-1320.
19. Levy J.B., Canoll P.D., Silvennoinen O., Barnea G., Morse B., Honegger A.M., Huang J.T., Cannizzaro L.A., Park S.H., Druck T., Huebner K., Sap J., Ehrlich M., Musacchio J.M. and Schlessinger J., The cloning of a receptor-type tyrosine phosphatase expressed in the central nervous system. J. Biol. Chem. 268 (1993) pp. 10573-10581.
20. Liao S.Y., Brewer C., Zavada J., Pastorek J., Pastorekova S., Manetta A., Berman M.L., DiSaia P.J. and Stanbridge E.J., Identification of the MN antigen as a diagnostic biomarker of cervical intraepithelial squamous and glandular neoplasia and cervical carcinomas. Am. J. Pathol. 145 (1994) pp. 598-609.
21. Liao S.Y., Aurelio O.N., Jan K., Zavada J. and Stanbridge E.J., Identification of the MN/CA9 protein as a reliable diagnostic biomarker of clear cell carcinoma of the kidney. Cancer Res. 57 (1997) pp. 2827-2831.
22. Liao S.Y. and Stanbridge E.J., Expression of MN/CA9 protein in Papanicolaou smears containing atypical glandular cells of undetermined significance is a diagnostic biomarker of cervical dysplasia and neoplasia. Cancer 88 (2000) pp. 1108-1121.
23. Lynch C.J., Fox H., Hazen S.A., Stanley B.A., Dodgson S.J. and Lanoue K.F., Role of hepatic carbonic anhydrase in de novo lipogenesis. Biochem. J. 310 (1995) pp. 197-202.
24. Maren T.H. and Conroy C.W., A new class of carbonic anhydrase inhibitor. J. Biol. Chem. 268 (1993) pp. 26233-26239.
25. Maren T.H. and Sanyal G., The activity of sulphonamides and anions against the carbonic anhydrases of animals, plants and bacteria. Annu. Rev. Pharmacol. Toxicol. 23 (1983) pp. 439-459.
26. Martinez-Zaguilan R., Seftor E.A., Seftor R.E., Chu Y.W., Gillies R.J. and Hendrix M.J., Acidic pH enhances the invasive behaviour of human melanoma cells. Clin. Exp. Metastasis 14 (1996) pp. 176-186.
27. McKiernan J.M., Buttyan R., Bander N.H., Stifelman M.D., Katz A.E., Chen M.W., Olsson C.A. and Sawczuk I.S., Expression of the tumor-associated gene MN: a potential biomarker for human renal cell carcinoma. Cancer Res. 57 (1997) pp. 2362-2365.
28. McKiernan J.M., Buttyan R., Bander N.H., de la Taille A., Stifelman M.D., Emanuel E.R., Bagiella E., Rubin M.A., Katz A.E., Olsson C.A. and Sawczuk I.S., The detection of renal carcinoma cells in the peripheral blood with an enhanced reverse transcriptase-polymerase chain reaction assay for MN/CA9. Cancer 86 (1999) pp. 492-497.
29. Metcalfe H.K., Monson J.P., Drew P.J., Iles R.A., Carter N.D. and Cohen R.D., Inhibition of gluconeogenesis and urea synthesis in isolated rat hepatocytes by acetazolamide. Biochem. Soc. Trans. 13 (1985) p. 255.
30. Mitelman F., Chromosome 15 and Chromosome 17. In: Catalog of Chromosome Aberrations in Cancer, ed. by Mitelman F., Johansson B. and Mertens F., 5th ed. (Wiley, New York, 1994) pp. 2485-2619 and 2748-2930.
31. Natale P.J. and Tremblay G.C., On the availability of intramitochondrial biosynthesis of pyrimidines. Biochem. Biophys. Res. Commun. 37 (1969) pp. 512-517.

32. Parkkila S., Rajaniemi H., Parkkila A-K., Kivela J., Waheed A., Pastorekova S., Pastorek J. and Sly W.S., Carbonic anhydrase inhibitor suppresses invasion of renal cancer cells in vitro. Proc. Natl. Acad. Sci. USA 97 (2000) pp. 2220-2224.
33. Pastorek J., Pastorekova S., Callebaut I., Mornon J.P., Zelnik V., Opavsky R., Zat'ovicova M., Liao S., Portetelle D., Stanbridge E.J., Zavada J., Burny A. and Kettmann R., Cloning and characterization of MN, a tumor-associated protein with a domain homologous to carbonic anhydrase and a putative helix-loop-helix DNA binding segment. Oncogene 9 (1994) pp. 2877-2888.
34. Pastorekova S., Parkkila S., Parkkila A.K., Opavsky R., Zelnik V., Saarnio J. and Pastorek J., Carbonic anhydrase IX, MN/CA IX: analysis of stomach complementary DNA sequence and expression in human and rat alimentary tracts. Gastroenterology 112 (1997) pp. 398-408.
35. Pastorekova S., Zavadova Z., Kostal M., Babusikova O. and Zavada J., A novel quasi-viral agent, MaTu, is a two-component system. Virology 187 (1992) pp. 620-626.
36. Peles E., Nativ M., Campbell P.L., Sakurai T., Martinez R., Lev S., Clary D.O., Schilling J., Barnea G., Plowman G.D., Grumet M. and Schlessinger J., The carbonic anhydrase domain of receptor tyrosine phosphatase β is a functional ligand for the axonal cell recognition molecule contactin. Cell 82 (1995) pp. 251-260.
37. Pocker Y. and Sarkanen S., Carbonic anhydrase: Structure, catalytic versatility and inhibition. Adv. Enzymol. 47 (1978) pp. 149-274.
38. Saarnio J., Parkkila S., Parkkila A.K., Haukipuro K., Pastorekova S., Pastorek J., Kairaluoma M.I. and Karttunen T.J., Immunohistochemical study of colorectal tumors for expression of a novel transmembrane carbonic anhydrase, MN/CA IX, with potential value as a marker of cell proliferation. Am. J. Pathol. 153 (1998) pp. 279-285.
39. Spencer I.M., Hargreaves I. and Chegwidden W.R., Effect of the carbonic anhydrase inhibitor acetazolamide on lipid synthesis in the locust. Biochem. Soc. Trans. 16 (1988) pp. 973-974.
40. Spencer I.M., Dawson M. and Chegwidden W.R., The role of carbonic anhydrase in biosynthetic processes. Isozyme Bulletin 27 (1994) p. 42.
41. Stubbs M., McSheehy P.M. and Griffiths J.R., Causes and consequences of acidic pH in tumours: a magnetic resonance study. Adv. Enzyme Regul. 39 (2000) pp. 13-30.
42. Stubbs M., McSheehy P.M., Griffiths J.R. and Bashford C.L., Causes and consequences of tumour acidity and implications for treatment. Mol. Med. Today 6 (2000) pp. 15-19.
43. Sun H. and Tonks K., The coordinated action of protein tyrosine phosphatases and kinases in cell signaling. Trends Biochem. Sci. 19 (1994) pp. 480-485.
44. Torczynski R.M. and Bolton A.P., U.S. Patent 5,589,579 (1996).
45. Türeci O., Sahin U., Vollmar E., Siemar S., Gottert E., Seitz G., Parkkila A-K., Shah G.N., Grubb J.H., Pfreundschuh M. and Sly W.S., Human carbonic anhydrase XII: cDNA cloning, expression and chromosomal localization of a carbonic anhydrase gene that is overexpressed in some renal cancers. Proc. Natl. Acad. Sci. USA 95 (1998) pp. 7608-7613.
46. Turner J.R., Odze R.D., Crum C.P. and Resnick M.B., MN antigen expression in normal, preneoplastic and neoplastic esophagus: a clinicopathological study of a new cancer-associated biomarker. Hum. Pathol. 28 (1997) pp. 740-744.
47. Wary K.K., Lou Z., Buchberg A.M., Siracusa L.D., Druck T., LaForgia S. and Huebner K., A homozygous deletion within the carbonic anhydrase-like domain of the Ptprg gene in murine L-cells. Cancer Res. 53 (1993) pp. 1498-1502.
48. Webb S.D., Sherratt J.A. and Fish R.G., Alterations in proteolytic activity at low pH and its association with invasion: a theoretical model. Clin. Exp. Metastasis 17 (1999) pp. 397-407.
49. Zavada J., Zavadova Z., Pastorekova S., Ciampor F., Pastorek J. and Zelnik V., Expression of MaTu-MN protein in human tumor cultures and in clinical specimens. Int. J. Cancer 54 (1993) pp. 268-274.

BIOCHIP AND MINIATURIZATION

JIAN ZHANG, WAN-LI XING, YU-XIANG ZHOU, AND JING CHENG

Biochip R&D Center, Biology Department, Tsinghua University, Beijing, 100084, China
E-mail: jcheng@tsinghua.edu.cn

A typical bioanalytical system usually consists of three classical steps, i.e., sample preparation, chemical reaction and detection. The total integration of these three steps has for many years been the dream of both academic researchers and entrepreneurs. The marriage between molecular biology and the semiconductor industry brings hope to the scientific community for the first time. This presentation will describe the efforts towards the construction of a microchip-based total analytical system, or laboratory-on-a-chip. Progress made on microscale separation and isolation of cells, DNA amplification (PCR or strand displacement amplification) in microchips, and detection of specific sequence information on chips (via either chip-based capillary electrophoresis or electronic hybridization) will be presented.

1. Introduction

Barely a decade has passed since the first DNA microarray was fabricated, yet great progress has been made and the field has attracted a great deal of attention. Thousands or even tens of thousands of spots can be manufactured on a 1 cm² microarray surface. The molecules on these spots can be of many different types, such as DNA, peptides, proteins, cells or even tissue. The DNA microarray is so far the most commonly used. Through hybridization with target DNA, a large quantity of biological information can be obtained. Besides the wealth of data they generate, biochips have many other merits, including small size, low sample consumption, freedom from contamination and fast analysis.

The rapid development of biochip research has been driven mainly by two factors: the great demand for efficient tools from the Human Genome Project (HGP) and the desire of pharmaceutical companies for new drug discovery. A good example is chip-based capillary electrophoresis, which saves a great deal of time in DNA sequencing compared with the traditional electrophoretic method. Other relevant technologies, especially micro-electro-mechanical systems (MEMS) technology, have pushed biochip research forward quickly and have enabled researchers to construct biochip-based micro total analytical systems (µTAS), or so-called lab-on-a-chip systems.

A bioanalytical procedure usually consists of three conventional steps: sample preparation, biochemical reaction and detection. Integrating all of them in one small system is the dream of many researchers and entrepreneurs. In a complete µTAS, heaters, detectors and microfluidic devices, including micropumps and microvalves, are all integrated with biochips. Traditional instruments cannot carry out all three steps in an integrated manner. Moreover, they are large and expensive, which has hindered their wide use. Portable instruments made with biochips may efficiently overcome these difficulties. Before all three steps are integrated on one chip, the feasibility of each step on a chip can be studied separately.
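To give a feel for the spot densities mentioned in the introduction, the following sketch estimates the centre-to-centre spacing of spots on a square grid. It is a back-of-the-envelope illustration only; the square-grid layout and the function name are assumptions made for the example, not a description of any particular array.

```python
import math


def spot_pitch_um(n_spots, area_cm2=1.0):
    """Centre-to-centre spot spacing (um) for n_spots on a square grid.

    Assumes a square array area and an equal number of spots per row and column.
    """
    side_um = math.sqrt(area_cm2) * 1e4  # 1 cm = 10,000 um
    spots_per_side = math.sqrt(n_spots)
    return side_um / spots_per_side


if __name__ == "__main__":
    for n in (1_000, 10_000, 40_000):
        print(f"{n} spots on 1 cm^2: pitch of about {spot_pitch_um(n):.0f} um")
```

Under these assumptions, ten thousand spots on 1 cm² implies a pitch of about 100 µm, which indicates the scale at which spotting and detection must operate.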

Figure 1. Silicon micropost-type filter designs and filter chips. (A) Offset array of simple microposts (13 × 20 µm, spaced 7 µm apart) set across a 500-µm-wide × 20-µm-deep silicon channel. (B) Array of complex microposts (73 µm wide) separated by 7-µm-wide tortuous channels spaced 30 µm apart and set across a 500-µm-wide × 5.7-µm-deep silicon channel fabricated using conventional wet etching procedures. (C) Filter chip with three test channels containing different designs of flow deflector and serial filters. (D) Isolation of 5.78-µm-diameter latex microspheres by a post-type filter (5 µm channels between 73-µm-wide posts set across a 500-µm-wide, 5.7-µm-deep channel). (E) Comb-type filter formed from an array of 120 posts (175 µm long × 18 µm wide) separated by 6-µm channels set across a 3-mm-wide × 13-µm-deep silicon channel. (F) New methylene blue-stained WBCs isolated by comb-type filter (cells released from the front surface (upper) of the filter by reversing the flow through the filter). From Wilding et al. [1]. With permission.

2. Sample Preparation

Sample preparation has proven to be the most difficult of the three steps. It includes isolating target cells from a mixture and obtaining DNA, RNA or protein from the target cells. To separate different types of cells, two methods have been attempted, i.e., microfiltration and dielectrophoresis.


The filters possess different shapes and sizes. To isolate white blood cells (WBCs) from red blood cells (RBCs), the size of the filter is critical. The filter type shown in Figure 1A was fabricated on a silicon substrate with 7 µm gaps between adjacent posts. To reduce the flow of material through the filter bed and aid the capture of large cells, filters with more complex shapes were also designed (Figure 1B). The filter spacing was designed to be larger than the average diameter of an RBC and smaller than that of a WBC. However, when the filters were used with blood, no WBCs were captured, and both WBCs and RBCs passed through. The main reason for this failure is that cells are deformable, so an average diameter, which implicitly treats them as rigid spheres, is not a reliable measure of whether they will pass through a gap. A new comb-type filter was therefore designed, as shown in Figures 1E and F. It consists of a series of closely positioned narrow pillars aligned in one row. The spacing between two neighboring pillars is 3.2 µm or 5 µm. Experiments showed that the 3.2-µm filter was the more efficient. The shortcoming of the comb-type filter is that it saturates easily, resulting in a low recovery yield of cells. A weir-type filter was then fabricated and proved to be more effective (Figure 2). A narrow gap of 3 µm was formed between the dam on the chip and the glass cover. Under a certain applied pressure, RBCs could pass through the gap, leaving the WBCs retained on the surface of the dam [1,2].

Figure 2. Weir-type silicon microfilter.

Dielectrophoresis has been widely used in the isolation of cells [3, 4, 8]. An experiment separating E. coli from blood cells using this technology was performed. Figure 3 shows an array of five-by-five electrodes that can be addressed individually. In Figure 3A the electrodes are addressed in a checkerboard mode, in which each electrode's polarity is opposite to that of its adjacent electrodes. Figure 3B shows the computer-simulated electric field distribution. Impelled by the dielectrophoretic force, blood cells and bacterial cells move towards different field zones. Figure 4 shows that the separation result agreed with the calculation: blood cells accumulated in the field minima (red) and bacterial cells in the field maxima (white). A square-wall mode was also tested, in which the electrodes forming one square have a polarity different from that of the electrodes in the neighboring squares. The studies indicate that the checkerboard pattern gives a more distinct separation. After isolation, the cells were lysed by applying high-voltage pulses. The lysate was then digested with proteinase K to degrade contaminating proteins. The electrophoresis result shown in Figure 5 indicates that the lysis process caused no obvious damage to the nucleic acids.
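The two electrode addressing modes can be sketched in a few lines of Python. This is only an illustration, not the authors' control software: the function names are hypothetical, polarities are represented as +1 and -1, and the square-wall pattern is assumed here to mean concentric square rings of alternating polarity, which the text does not spell out.

```python
def checkerboard(rows=5, cols=5):
    """Polarity grid in which every electrode is opposite in sign to its
    horizontal and vertical neighbours (checkerboard addressing)."""
    return [[1 if (r + c) % 2 == 0 else -1 for c in range(cols)]
            for r in range(rows)]


def square_wall(rows=5, cols=5):
    """Assumed reading of the square-wall mode: electrodes on the same
    concentric square ring share a polarity, and adjacent rings differ."""
    grid = []
    for r in range(rows):
        row = []
        for c in range(cols):
            # Ring 0 is the outer border, ring 1 the next square inwards, ...
            ring = min(r, c, rows - 1 - r, cols - 1 - c)
            row.append(1 if ring % 2 == 0 else -1)
        grid.append(row)
    return grid


def neighbours_opposite(grid):
    """True if every pair of horizontally or vertically adjacent
    electrodes has opposite polarity."""
    rows, cols = len(grid), len(grid[0])
    for r in range(rows):
        for c in range(cols):
            if r + 1 < rows and grid[r][c] == grid[r + 1][c]:
                return False
            if c + 1 < cols and grid[r][c] == grid[r][c + 1]:
                return False
    return True


def show(grid):
    """Render a polarity grid as rows of '+' and '-' characters."""
    return "\n".join(" ".join("+" if v > 0 else "-" for v in row)
                     for row in grid)


if __name__ == "__main__":
    print(show(checkerboard()))
    print()
    print(show(square_wall()))
```

In the checkerboard grid every electrode differs from all of its direct neighbours, whereas in the assumed square-wall grid electrodes along a ring share a polarity, so the two modes place the boundaries between opposite polarities in different locations.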

Figure 3. (A) Checkerboard addressing of the five-by-five electrodes. (B) Corresponding computer models of the alternating current electric field distribution. From Cheng et al. [3]. With permission.

Figure 4. (A) Separation result of checkerboard format (field maxima are at white-shaded areas and field minima are at the red-shaded areas). (B) Checkerboard format at the completion of the washing process. From Cheng et al. [3]. With permission.

Microfiltration and dielectrophoresis each have their own advantages and disadvantages. Microfiltration is suitable only for separating cells of different sizes, and when the cell types change, the filter dimensions must be changed as well. At present only large cells can be isolated, because filters with smaller features are difficult to fabricate. Dielectrophoresis, which depends on the dielectrophoretic characteristics of the cells, can discriminate between cells of various sizes. However, the appropriate frequency, at which one kind of cell is subjected to a positive dielectrophoretic force and the other to a negative one, has to be obtained empirically. The two methods are expected to be integrated for the separation of various kinds of cells.


Figure 5. Agarose gel analysis of nucleic acids released from E. coli by electronic lysis. Lane 1: λ DNA Hind III digest marker; lane 6: φX174 Hae III digest marker. Lane 2: supercoiled plasmid pCR2.1; lane 3: the corresponding linear plasmid DNA. Lane 4: the electronic lysate with, and lane 5: without, RNase treatment. From Cheng et al. [3]. With permission.

3. Biochemical Reaction

Biochemical reactions are of various kinds, such as chemical labeling, receptor-ligand binding, reverse transcription and nucleic acid amplification. PCR carried out on a chip is discussed here as a representative example.

Figure 6. Microfabricated silicon-glass chip used for PCR (reaction volume 12 µl, surface area 210 mm²). From Cheng et al. [5]. With permission.

Silicon-glass chips for nucleic acid amplification were fabricated using wet etching procedures and anodic bonding (Figure 6). First a sink with a depth of 115 µm was etched, followed by thermal growth of a silicon dioxide layer with a thickness of 2000 Å. A glass cover was then bonded to the chip, forming a reaction cell.


To amplify DNA on chip, Taq DNA polymerase was mixed with C. jejuni bacterial DNA. Ten silicon-glass chips were filled with 12 µl of PCR reaction mixture containing 200 µM each dNTP, 0.6 µM each primer and 1.2 ng C. jejuni DNA. The reaction mixture was initially heated to 94 °C for 1 min and then cycled 28 times: 15 s at 94 °C, 1 min at 55 °C and 1 min at 72 °C. A final extension was performed at 72 °C for 10 min. The on-chip amplifications were run in parallel with reactions in tubes, using the same mixture and the same thermal cycling conditions. Products generated from both chips and tubes were examined by gel electrophoresis. The result indicates that the yield of chip PCR was equivalent to that of PCR in tubes [5]. Apart from simple PCR reactions, other amplifications have also been accomplished on chips. These include multiplex PCR, single-step reverse transcriptase (RT)-PCR, ligase chain reaction (LCR), and strand displacement amplification (SDA) [6].

4. Detection

The detection schemes used in biochip-based nucleic acid analysis are of two types. One is based on the detection of separated nucleic acid molecules, and the other, the so-called microarray, is based on examining the hybridization between immobilized oligonucleotide probes and target DNA molecules. Capillary electrophoresis (CE) chips are used to sequence DNA or examine DNA polymorphisms. Compared with traditional electrophoretic methods, analysis on CE chips is about 10 to 100 times faster, while its reproducibility is roughly equivalent [7]. Various types of CE chips have been developed using different materials [9, 10]. Microarrays can be used for gene expression studies, mutation research and resequencing. Probe DNA molecules are pre-immobilized on the chip, and target DNA is then introduced and hybridized with the probes. Most of the chips developed so far are of the passive type; however, active chips (using all types of forces) are starting to catch up. As an example of bioelectronic active chips, arrays of individually addressable electrodes are fabricated on a silicon substrate. After the probes have been immobilized on the chip and the targets introduced, the electrodes are positively biased. Because DNA molecules carry negative charges, the positive electric field draws the target molecules to the probes more easily and quickly, and hybridization is greatly accelerated. When the hybridization reaction is complete, the electrodes are negatively biased, so that mismatched target DNA molecules are pushed away and washed off. By controlling the intensity of the electric field, different stringencies can be obtained [3, 6].

5. Integration

After each of the three steps had been proven functional, a complete system integrating all three was constructed. As an example, a chip with 100 electrodes is used for sample preparation. By applying alternating electric signals, target cells can be isolated from the mixture by the dielectrophoretic force. When the separation is accomplished, high-frequency pulses are applied to the electrodes and the cells are lysed. A ceramic heater is attached to the backside of the chip. By switching between three different resistors, the chip's temperature can be set at 60 °C, 90 °C or 95 °C when powered by an 11 V D.C. supply. This satisfies the temperature conditions of strand displacement amplification (SDA). Connected in tandem is another chip with 25 electrodes for electrically controlled hybridization. The fluidic control unit consists of a pump and a series of valves that operate under programmed commands. A battery-driven laser emitter (2 mW) with the wavelength set at 635 nm is used to induce fluorescence. A CCD camera is employed for fluorescence imaging (Figure 7).

Figure 7. The prototype of the sample-to-answer portable lab-on-a-chip system.

Using the system, Micrococcus lysodeikticus was separated from whole blood, then lysed by applying high-frequency pulses and deproteinated using proteinase K. Employing the SDA method, 81 base pairs of the Salmonella enterica spaQ gene in the digested product were amplified. The amplicon was then denatured, introduced into the second chip and hybridized with pre-immobilized probes. The hybridization result was detected by the CCD camera. The entire process took approximately one hour.


References:
1. Wilding, P., Cheng, J., et al., Integrated Cell Isolation and Polymerase Chain Reaction Analysis Using Silicon Microfilter Chambers. Analytical Biochemistry 257 (1998) pp. 95-100.
2. Cheng, J., et al., Sample preparation in microstructured devices. Topics in Current Chemistry 194 (1998) pp. 215-231.
3. Cheng, J., et al., Preparation and hybridization analysis of DNA/RNA from E. coli on microfabricated bioelectronic chips. Nature Biotech. 16 (1998) pp. 541-546.
4. Cheng, J., et al., Isolation of cultured cervical carcinoma cells mixed with peripheral blood cells on a bioelectronic chip. Anal. Chem. 70 (1998) pp. 2321-2326.
5. Cheng, J., et al., Chip PCR. II. Investigation of different PCR amplification systems in microfabricated silicon-glass chips. Nucleic Acids Res. 22 (1996) pp. 380-385.
6. Cheng, J., et al., Fluorescent imaging of cells and nucleic acids in biochips. SPIE 3600 (1999) pp. 23-28.
7. Cheng, J., et al., Degenerate oligonucleotide primed-polymerase chain reaction and capillary electrophoretic analysis of human DNA on microchip-based devices. Anal. Biochem. 257 (1998) pp. 101-105.
8. Wang, X.B., Huang, Y., Becker, F.F. and Gascoyne, P.R.C., A unified theory of dielectrophoresis and travelling-wave dielectrophoresis. Appl. Phys. 27 (1994) pp. 1571-1574.
9. Becker, F.F., Wang, X.B., et al., Separation of human breast cancer cells from blood by differential dielectric affinity. Proc. Nat. Academ. Sci. (USA) 29 (1995) pp. 860-864.
10. McCormick, R.M., et al., Microchannel Electrophoretic Separations of DNA in Injection-Molded Plastic Substrates. Anal. Chem. 69 (1997) pp. 2626-2630.

FUNCTIONAL GENOMICS: A PLATFORM FOR THE DISCOVERY OF NEW THERAPIES

DALIA COHEN

Novartis Pharmaceuticals Corporation, 556 Morris Avenue, Summit, New Jersey, 07901, USA
E-mail: dalia.cohen@pharma.novartis.com

Functional genomics can be described as scientific and technological approaches which are being applied to bridge genomic research with the discovery and development of disease-relevant therapeutic targets. These scientific approaches can offer significant opportunities in the search for causal and disease modifying therapies to better treat society's most outstanding medical needs. The use of functional genomic approaches is demonstrated in our recent efforts to elucidate cellular events leading to tumor cell cycle arrest in response to the inhibition of histone deacetylase, an important regulator of gene transcription.

1 Comprehensive Functional Genomics Platforms

Large scale DNA sequencing in the public and private sectors will enable the elucidation of a detailed genetic and physical map of the human genome. Furthermore, completion of the human genome sequence, containing an estimated 100,000 genes [22, 3, 6], should identify molecular targets for disease-modifying therapies that are novel and designed to satisfy unmet medical needs. In the 1970s, relatively few characterized pharmacological proteins such as enzymes and receptors were available to researchers. However, advances in molecular biology in the 1980s resulted in an exponential growth in the number of genes and gene products available as potential drug targets. Currently, marketed drugs interact with about 400 genes or gene products, and the estimated number of genes important for disease predisposition, onset and progression ranges from 3,000 to 10,000. Therefore, many novel genes and proteins can still be identified as potential targets for pharmaceutical research and development. Among the classical drug targets are enzymes, receptors, channels, DNA, growth factors and hormones. Currently marketed drugs for the treatment of a number of diseases can be found in the list assembled by Drews and Ryser [5]. For example, for the treatment of nervous system disorders, eight channels, twelve enzymes and one hundred and fifteen receptors are the targets for drugs on the market. For the treatment of neoplastic diseases, drug targets include five DNA sequences, twenty enzymes, six factors and seven receptors. For the treatment of inflammation, one channel, nineteen enzymes and twenty-six receptors are common drug targets.


To identify novel disease-relevant targets that can be exploited for therapeutic discovery, new scientific avenues and technologies are being explored. Functional genomics approaches are being applied to link genomic research with the process of drug discovery and development. These approaches encompass technological platforms ranging from in silico biology (computational biology) to the study of whole organisms. Computational biology, encompassing data analysis and interpretation [9], is of major importance in the discovery of potential drug targets. New gene family members, such as secreted factors, orphan receptors, GPCRs, kinases, phosphatases and proteases that are associated with diseases, can be identified using informatics tools. In addition, genes can be linked to chromosomal positions associated with specific disorders. Furthermore, using bioinformatics tools to identify gene homologues that are evolutionarily conserved is likely to give insight into gene function. In the molecular biology arena, key technologies for the elucidation of potential drug targets have been developed. High-throughput measurement of particular mRNAs within the pool of cellular messages can be achieved using several approaches based on differential display, reverse transcriptase PCR and DNA arrays [12, 13, 18]. These technologies allow the measurement of differential gene expression in healthy versus diseased tissues, or in drug-treated versus control cells. Proteomics approaches enable the analysis of differential protein expression, post-translational modification and protein regulation [15]. Studying the proteome of a cell is an important companion to gene expression studies, since there is often an insufficient correlation between the level of expression of different genes and the relative abundance of the corresponding proteins.
Furthermore, a gene does not directly encode the post-translational modifications of its protein; therefore the complete structure(s) of an individual protein cannot be determined by reference to its gene sequence alone. Moreover, these proteins form cellular networks comprising numerous signalling pathways in a living cell, which may be altered in disease. Proteomics technologies are likely to identify the components of these altered pathways. In the cell biology arena, gene function is assessed using high-throughput cell-based assays. For example, expressing cDNAs, antisense oligonucleotides and peptide libraries in cells and then selecting biologically relevant phenotypes allows identification of the genes mediating those phenotypes. In addition, in situ and immuno-hybridization methodologies and specific antisense sequences are widely used to determine and validate gene function. Finally, yeast, C. elegans, Drosophila and mouse represent some of the most important experimental systems for understanding gene function and are being used for in vivo gene profiling experiments [14]. In addition, comparative genetics studies and the opportunity to genetically manipulate homologous genes in these organisms can identify important components of gene function and give valuable clues as to their potential role as disease mediators in humans. These technologies and approaches need to be used in an interactive manner in order to successfully assign gene function, place individual genes into biological pathways, predict the events initiating the disease process, and screen and optimise therapeutic leads. Moreover, the ability to apply functional genomics approaches together with extensive knowledge and availability of in vitro and in vivo model systems for studying disease pathophysiology, and to integrate functional genomics technologies with more established scientific disciplines (e.g. protein chemistry, biochemistry, pharmacology, physiology), is a major strength of the pharmaceutical industry. To complement the in-house efforts in functional genomics, Novartis has ongoing external collaborations with Celera (genome information and databases), Incyte (gene-chip technology and the LifeSeq database), Affymetrix (gene-chip technology), Protana (proteomics), and Rigel (high-throughput target discovery and validation). In addition, Novartis is a member of the Wellcome Trust / Industrial Consortium to generate a Single Nucleotide Polymorphism (SNP) map for the public domain.

2 Studying Transcription Regulation and Cell Cycle Control Employing Functional Genomics Approaches

The application of functional genomics is demonstrated in our recent efforts to elucidate cellular events leading to tumor cell cycle arrest in response to the inhibition of histone deacetylase, an important regulator of gene transcription. Histone acetylation is a key regulatory mechanism thought to modulate gene expression by altering the accessibility of transcription factors to DNA. Histone deacetylases (HDACs) repress gene transcription, and their enzymatic activity is inhibited by trapoxin, a microbially derived cyclotetrapeptide. We [17] have demonstrated that treatment of human tumor cells with trapoxin induces the mRNA and protein levels of the p21 gene, an inhibitor of cyclin-dependent protein kinases. Furthermore, changes in the transcription of a small subset of genes that regulate the cell cycle were also observed. To monitor additional genes with altered transcription in response to trapoxin in human lung tumor cells, DNA microarrays were used. Selective modulation, greater than 3-fold, was observed for 32 out of ~7000 genes, and the results were confirmed by real-time PCR. These included the p21WAF1 and gelsolin genes previously shown to be trapoxin inducible [7]. These genes are currently being evaluated for their role in the cell cycle and proliferation of tumor cells. To study further the function and regulation of HDAC1, we searched for novel cellular factors that interact with HDAC1 using a yeast two-hybrid screen [2]. A large N-terminal region of HDAC1, from amino acids 53 to 285 out of a total of 482 residues, was used as the bait to search for interacting cellular factors in a HeLa cDNA library. A human gene was identified that demonstrated a specific interaction with HDAC1. The gene encodes a polypeptide that is about 30% identical to the S. pombe protein hus1+p (hus for hydroxyurea sensitive), and it was therefore named human Hus1 (hHus1). Hus1 homologues have also been identified in mouse, C. elegans and Drosophila [4]. S. pombe hus1+p was reported to be a checkpoint rad protein that, together with five other known rad proteins, relays a signal from DNA damage or replication block to downstream effectors [10, 16]. This results in a G2/M growth arrest in cells suffering DNA damage or replication block. The interaction between HDAC1 and hHus1 was characterized in vitro and in vivo. The putative region of HDAC1 that interacts with hHus1 encompasses amino acids 53 to 240. In transfected cells, immunoprecipitation of tagged HDAC1 precipitated co-expressed tagged hHus1. Furthermore, tagged HDAC1 was found to co-immunoprecipitate with rad9, one of the checkpoint rad proteins. The finding that hHus1 interacts with rad1 and rad9 [11, 19, 21] suggests the existence of a functional complex between HDAC1, hHus1, rad1 and rad9. The HDAC1-rad9 interaction might be stabilized by hHus1, which could act as a bridge between HDAC1 and rad9. Our finding that HDAC1 interacts with G2/M checkpoint rad proteins suggests an involvement of HDAC1 in cell cycle regulation. Interestingly, bioinformatics analysis indicated that both hHus1 [1] and rad1 [20] might contain the so-called PCNA motif responsible for the trimerization and DNA binding of the proliferating cell nuclear antigen (PCNA), a processivity factor for DNA polymerase [8]. This analysis suggests that checkpoint rad proteins employ a mechanism similar to that of PCNA binding to DNA. The interaction of Hus1 with HDAC1 could lead to chromatin structure modifications that facilitate DNA repair.

3 Conclusion

In conclusion, the genomics revolution is now entering a new phase whereby the pioneering efforts to map and sequence the human genome, and the enormous wealth of data they have generated, are being converted into information on gene and protein function in normal and disease states. The progress of functional genomics will focus pharmaceutical research towards disease relevant targets and provide a starting point for discovery of causal and disease modifying therapies to address society's most outstanding medical needs.


References
1. Aravind, L., Walker, D. R., and Koonin, E. V. Conserved domains in DNA repair proteins and evolution of repair systems. Nucleic Acids Res 27 (1999) pp. 1223-1242.
2. Cai, R., Yan-Neale, Y., Cueto, M., Xu, H. and Cohen, D. Interaction between the human histone deacetylase 1 and components of the G2/M checkpoint machinery. Proceedings of the American Association for Cancer Research 41 (2000) p. 808.
3. Cohen, D., Chumakov, I., and Weissenbach, J. A first-generation physical map of the human genome. Nature 366 (1993) pp. 698-701.
4. Dean, F. B., Lian, L., and O'Donnell, M. cDNA cloning and gene mapping of human homologs for Schizosaccharomyces pombe rad17, rad1, and hus1 and cloning of homologs from mouse, Caenorhabditis elegans, and Drosophila melanogaster. Genomics 54 (1998) pp. 424-436.
5. Drews, J., and Ryser, S. Nature Biotechnology (1996).
6. Fields, C., Adams, M. D., White, O., and Venter, J. C. How many genes in the human genome? Nat. Genet. 7 (1994) pp. 345-346.
7. Fischer, D. D., Sambucetti, L. C., Kwon, P., Xu, H., Hall, J., Buxton, F. and Cohen, D. Histone deacetylase inhibition selectively modulates gene expression through chromatin acetylation. Proceedings of the American Association for Cancer Research 41 (2000) p. 808.
8. Gulbis, J. M., Kelman, Z., Hurwitz, J., O'Donnell, M., and Kuriyan, J. Structure of the C-terminal region of p21(WAF1/CIP1) complexed with human PCNA. Cell 87 (1996) pp. 297-306.
9. Kingsbury, D. T. Bioinformatics in drug discovery. Drug Development Research 41 (1997) pp. 120-128.
10. Kostrub, C. F., al-Khodairy, F., Ghazizadeh, H., Carr, A. M., and Enoch, T. Molecular analysis of hus1+, a fission yeast gene required for S-M and DNA damage checkpoints. Mol Gen Genet 254 (1997) pp. 389-399.
11. Kostrub, C. F., Knudsen, K., Subramani, S., and Enoch, T. Hus1p, a conserved fission yeast checkpoint protein, interacts with Rad1p and is phosphorylated in response to DNA damage. EMBO J 17 (1998) pp. 2055-2066.
12. Liang, P., and Pardee, A. B. Differential display of eukaryotic messenger RNA by means of the polymerase chain reaction. Science 257 (1992) pp. 967-970.
13. Lillie, J. Probing the genome for new drugs and targets with DNA arrays. Drug Development Research 41 (1997) pp. 160-172.
14. Mushegian, A. R., Garey, J. R., Martin, J., and Liu, L. X. Large-scale taxonomic profiling of eukaryotic model organisms: A comparison of orthologous proteins encoded by the human, fly, nematode, and yeast genomes. Genome Research 8 (1998) pp. 590-598.
15. Pandey, A., and Mann, M. Proteomics to study genes and genomes. Nature 405 (2000) pp. 837-846.
16. Russell, P. Checkpoints on the road to mitosis. Trends Biochem Sci 23 (1998) pp. 399-402.
17. Sambucetti, L. C., Fischer, D. D., Zabludoff, S., Kwon, P. O., Chamberlin, H. A., Trogani, N., Xu, H., and Cohen, D. Histone deacetylase inhibition selectively alters the activity and expression of cell cycle proteins leading to specific chromatin acetylation and antiproliferative effects. J. Biol. Chem. 49 (1999) pp. 34940-34947.
18. Shiue, L. Identification of candidate genes for drug discovery by differential display. Drug Development Research 41 (1997) pp. 142-159.
19. St Onge, R. P., Udell, C. M., Casselman, R., and Davey, S. The human G2 checkpoint control protein hRAD9 is a nuclear phosphoprotein that forms complexes with hRAD1 and hHUS1. Mol Biol Cell 10 (1999) pp. 1985-1995.
20. Thelen, M. P., Venclovas, C., and Fidelis, K. A sliding clamp model for the Rad1 family of cell cycle checkpoint proteins. Cell 96 (1999) pp. 769-770.
21. Volkmer, E., and Karnitz, L. M. Human homologs of Schizosaccharomyces pombe rad1, hus1, and rad9 form a DNA damage-responsive protein complex. J Biol Chem 274 (1999) pp. 567-570.
22. Watson, J. D. The human genome project: past, present and future. Science 248 (1990) pp. 44-51.

A NOVEL MATHEMATICAL ANALYSIS OF HUMAN LEUKOCYTE ANTIGEN (HLA) POLYMORPHISM

BINGJIAN FENG, DEJING PAN, SHANGWU CHEN, ZHEN YE, AND ANLONG XU

Department of Biochemistry, Zhongshan University, Guangzhou, 510275, CHINA
Email: [email protected]

In 1970, Wu and Kabat proposed an algorithm to calculate the variability of a specific site, defined as the number of different amino acids at a given position divided by the frequency of the most common amino acid at that site. This algorithm has since been widely applied to MHC and TCR systems to understand their polymorphism and its relationship to diseases. Considering that the Wu-Kabat index and its modified version are not sensitive enough to the polymorphism contributed by rare members of a set of entities, and are excessively sensitive to the one or two most common members, we propose a new algorithm to evaluate the variability of a given site with greatly improved accuracy. This new index is applied to HLA-DRB1 sequences to gain further understanding of this gene, predicting that residues 9 to 13 and residues 31 and 33 may correlate with antibody binding of the HLA-DRB molecule, which is well documented.

1 Introduction

Class II molecules of the Human Leukocyte Antigen (HLA) are cell surface α and β heterodimeric proteins that present peptides derived from extracellular antigens. These molecules play a central role in tissue compatibility and autoimmune diseases as well as in the immune response against cancer and infectious diseases [1]. Among the HLA class II loci (DR, DP and DQ), the most polymorphic is HLA-DRB, which exhibits both allelic and haplotypic polymorphism. The former is manifested by the great number of alleles discovered to date: according to recent data from the IHWS database (January 1999), the numbers of alleles in HLA class II genes are up to 260 for DRB, 2 for DRA, 38 for DQB1, 20 for DQA1, 82 for DPB1 and 13 for DPA1. The latter is manifested by the existence of variable numbers of expressed as well as nonexpressed DRB genes [2]. In 1970, Wu and Kabat [3] proposed an algorithm to calculate the variability of a specific site, defined as the number of different amino acids at a given position divided by the frequency of the most common amino acid at that site. This algorithm has since been widely applied to MHC and TCR systems to understand their polymorphism and its relationship to diseases. Noting that the Wu-Kabat algorithm pays too much attention to the most common amino acid, Jores et al. [4] modified it to improve its sensitivity. Their diversity index is defined as the number of distinguishable amino acid pairs occurring at a given position divided by the frequency of the most common amino acid pair at that position. Although it is better than the Wu-Kabat index, it accounts for only the two most common amino acids and so is not sensitive enough to the polymorphism contributed by the minor ones. At the same time, its numerator is too sensitive to rarely occurring amino acids at a specific site. This causes variability to be underestimated in some cases and overestimated in others. Here we propose a new algorithm to calculate the variability index of a given site with greatly improved accuracy. The new algorithm is applied to HLA-DRB1 sequences to gain further understanding of this gene.

2 Material and Methods

2.1 Sequence data source

All of the HLA class II gene sequences, including 213 HLA-DRB1 alleles, 19 DRB3, 7 DRB4, 13 DRB5, 3 DRB6, 2 DRB7, 1 DRB2, 1 DRB8 and 1 DRB9, were downloaded from the IHWS (International Histocompatibility Workshop) database (www.anthonynolan.com/HIG/). The sequence file contains all the exons but no introns of the alleles. Exon 2, encoding residues 6 to 94 of the first domain of the DRB molecule, is the most polymorphic region of the DRB locus and is closely related to peptide binding and antibody recognition. For this reason, we designed a program to extract the second-exon segment from the sequence file provided by IHWS. These segments were input into another program, "Polysis", to analyze their polymorphism with the new algorithm as well as the Wu-Kabat algorithm. All programs were written in C++ and run on a Pentium PC.

2.2 Variability index algorithm

Here we describe the derivation of the algorithm for nucleic acid sequence data. For amino acid sequence data some modifications are needed (not shown here). Suppose that the numbers of nucleotides at a certain position are A: $X_1$, T: $X_2$, C: $X_3$, G: $X_4$, with $n = 4$. Then

$$\bar{X} = \frac{X_1 + X_2 + X_3 + X_4}{4}$$

$$\text{Standard Deviation} = \sqrt{\frac{(X_1-\bar{X})^2 + (X_2-\bar{X})^2 + (X_3-\bar{X})^2 + (X_4-\bar{X})^2}{4}} = \sqrt{\frac{X_1^2 + X_2^2 + X_3^2 + X_4^2 - 4\bar{X}^2}{4}}$$

$$\because\ \tfrac{1}{4}(X_1 + X_2 + X_3 + X_4)^2 \le X_1^2 + X_2^2 + X_3^2 + X_4^2 \le (X_1 + X_2 + X_3 + X_4)^2$$

$$\therefore\ 0 \le \text{Standard Deviation} \le \sqrt{3\bar{X}^2}$$

As shown above, the range of the standard deviation (SD) at a specific site is determined solely by the sample size at that position. Because the sample sizes are not always the same throughout the sequence, the ranges of the SDs may differ. The SDs cannot be compared between sites until they are transformed into another value by this formula:

$$\text{AdjustedDeviation} = \frac{SD - \mathrm{Min}(SD)}{\mathrm{Max}(SD) - \mathrm{Min}(SD)} = \frac{\text{StandardDeviation}}{\sqrt{3\bar{X}^2}}$$

According to this formula, the value ranges from 0 to 1: 0 indicates that the four bases are equally distributed, and 1 represents the most conservative case (only one nucleotide appears). Finally, the adjusted deviation is subtracted from 1 to reverse the meaning of the value:

$$\text{VariabilityIndex} = 1 - \text{AdjustedDeviation} = 1 - \frac{\text{StandardDeviation}}{\sqrt{3\bar{X}^2}}$$

The adjusted deviation, which is defined as the standard deviation divided by its maximum value $\sqrt{3\bar{X}^2}$, has a fixed value range. This characteristic is critical to the analysis of HLA-DRB sequences because many data are unavailable at the boundaries of the second exon, so the sample sizes are not uniform throughout the sequence.

2.3 Chi square test

The chi square test can be used to examine the relationship between the distributions of subjects on two categorical variables. The null hypothesis is that the distribution of subjects on one variable is independent of that on the other. A contingency table is set up as follows:

              Class 1                  Class 2                  Total
Category 1    Count of subjects = A    Count of subjects = C    A+C
Category 2    Count of subjects = B    Count of subjects = D    B+D
Total         A+B                      C+D                      A+B+C+D

In this table, the left margin and the upper margin contain the categories into which the subjects can be classified according to two distinct criteria, or categorical variables, respectively. The value of each cell in the table is the observed number of subjects. The right margin and the bottom margin hold the sums of the cell values of each row or column. The first step of a chi square test of independence is to compute the expected number of subjects in each cell under the null hypothesis. The general formula for the expected cell frequency is:

$$e_{ij} = \frac{n_{i+}\, n_{+j}}{n}$$

where $e_{ij}$ is the expected frequency for the cell in the ith row and the jth column, $n_{i+}$ is the total number of subjects in the ith row, $n_{+j}$ is the total number of subjects in the jth column, and $n$ is the total number of subjects in the whole table. The next step is to subtract the expected cell frequency from the observed cell frequency. This difference gives the amount of deviation or error for each cell. These values are then squared and divided by the expected cell frequency for each cell. The chi square statistic is computed by summing the values given by the last step. The formula is:

$$X_P^2 = \sum_i \sum_j \frac{(o_{ij} - e_{ij})^2}{e_{ij}}$$

where $o_{ij}$ is the observed frequency of the cell in the ith row and the jth column and $e_{ij}$ is the expected cell frequency. The subscript P indicates that this is a Pearson chi-square statistic.

In this paper, a chi square test is performed on each site throughout the second exon of HLA-DRB1. All the DRB1 sequences are divided into 13 categories according to their serotypes. The numbers of As, Ts, Cs and Gs are counted for each site. Thus, there are 13 rows and 4 columns in the contingency table. The test result can be reported as follows: the differences in the distribution of As, Ts, Cs or Gs between serotypes are significant or not significant.
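The steps above (expected frequencies from the margins, then the sum of squared deviations divided by the expected frequencies) can be sketched as follows. The function name `pearson_chi_square`, the example counts, and the choice to skip rows and columns whose total is zero are assumptions made for this illustration; the paper does not say how empty columns (a nucleotide that never appears at a site) were handled. The sketch returns only the statistic and the degrees of freedom; obtaining a P value would require a chi-square distribution table or a statistics library.

```python
def pearson_chi_square(table):
    """Return (statistic, degrees_of_freedom) for a contingency table.

    `table` is a list of rows of observed counts. Rows and columns whose
    total is zero are ignored (an assumption of this sketch).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    if n == 0:
        raise ValueError("empty table")
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence: row total * column total / n.
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                statistic += (observed - expected) ** 2 / expected
    used_rows = sum(1 for t in row_totals if t > 0)
    used_cols = sum(1 for t in col_totals if t > 0)
    dof = (used_rows - 1) * (used_cols - 1)
    return statistic, dof


# Hypothetical counts of A, T, C, G at one site for three serotype groups.
site_counts = [
    [12, 0, 1, 0],
    [0, 9, 0, 1],
    [11, 1, 0, 0],
]
print(pearson_chi_square(site_counts))
```

For the analysis described in the text, the same function would be applied once per site, with one row per serotype group.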

3 Results and Discussion

3.1 Comparison of the new algorithm with the Wu-Kabat's algorithm

The Wu-Kabat formula is

$$VI_{Wu} = \frac{\text{Number of different nucleotides at a position}}{\text{Proportion of the most commonly appearing nucleotide at this position}}$$

and its shortcomings are easy to see. First, the formula is too sensitive to its numerator, the "number of different nucleotides at a position". At conserved DNA positions the denominator is close to 1, so the appearance of a rare nucleotide will double or triple the final VI value, although biologically one or two nucleotides in a data set of tens, hundreds or even thousands of nucleotides do not contribute that much to the polymorphism at the position. Second, the calculation takes into account only the most commonly appearing nucleotide, regardless of the distribution of the others. As a result, the final VI score is prone to underestimate the polymorphism. The new algorithm proposed in this paper, on the other hand, stems from a widely used measure of variability. It takes all four nucleotides into consideration and is thought to be more accurate than the old one.

Here we design a computer simulation program to compare the two methods. In this simulation, 50,000 collections of the characters A, T, C and/or G are generated. In each collection, the total number of characters (sample size) and the proportions of A, T, C and G are randomly defined. The variability index of each collection is then calculated by the new algorithm (denoted VInew) as well as by the Wu-Kabat algorithm (denoted VIWU), and the two are plotted against each other. In Fig. 1 we can find 3 "knife" shaped clusters of dots. It can be proved that, from the upper knife to the bottom one, all the dots in a single knife have the same number of different nucleotides, from 4 to 2 respectively. When only one kind of nucleotide appears in a data set, VInew is 0 and VIWU is 1, so the corresponding point in the figure is the single dot (0,1).

The two shortcomings of the Wu-Kabat method are demonstrated in this figure. In the left part of the figure, which represents conserved DNA sequence positions, the dots cluster into three horizontal lines, corresponding to 4, 3 and 2 different nucleotides, showing that the appearance of rare nucleotides influences VIWU by doubling or tripling it. That the lines are horizontal can be attributed to the insensitivity of VIWU to rare nucleotides: although their proportions vary and change the overall polymorphism, these changes do not affect VIWU. In addition, most of the dots lie below the diagonal, which shows that the Wu-Kabat method commonly underestimates the real polymorphism of sequences.
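A small simulation in the spirit of the comparison described above can be sketched as follows. It is not the authors' program: the function names, the random seed, the number of collections (5,000 rather than 50,000) and the way random proportions are drawn (three random cut points over a random sample size) are all assumptions made for illustration. The new index is computed as 1 minus SD divided by Max(SD), using the population standard deviation of the four counts.

```python
import math
import random


def vi_new(counts):
    """New index: 1 - SD / Max(SD) over the four nucleotide counts."""
    mean = sum(counts) / 4
    sd = math.sqrt(sum((x - mean) ** 2 for x in counts) / 4)
    return 1 - sd / (math.sqrt(3) * mean)


def vi_wu_kabat(counts):
    """Wu-Kabat index: number of different nucleotides / share of the commonest."""
    kinds = sum(1 for x in counts if x > 0)
    top_share = max(counts) / sum(counts)
    return kinds / top_share


def random_collection(rng, max_size=1000):
    """Draw a random sample size and split it into four counts (assumed scheme)."""
    size = rng.randint(1, max_size)
    cuts = sorted(rng.randint(0, size) for _ in range(3))
    return [cuts[0], cuts[1] - cuts[0], cuts[2] - cuts[1], size - cuts[2]]


rng = random.Random(0)
points = []
for _ in range(5000):
    counts = random_collection(rng)
    points.append((vi_new(counts), vi_wu_kabat(counts)))

# Conserved positions: each added rare nucleotide moves the Wu-Kabat value
# to roughly the next integer, while the new index changes only slightly.
for counts in ([1000, 0, 0, 0], [999, 1, 0, 0], [998, 1, 1, 0], [997, 1, 1, 1]):
    print(counts, round(vi_new(counts), 4), round(vi_wu_kabat(counts), 4))
```

Plotting `points` as a scatter plot (new index on the horizontal axis) should give a picture comparable in kind to the figure discussed in the text, although the exact shape depends on the sampling scheme assumed here.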

[Figure 1 plot: scatter of dots; horizontal axis "New algorithm", from 0 to 1.]

Figure 1: Comparison of the new algorithm with the Wu-Kabat algorithm. Variability indices of 50,000 collections of nucleotide sets randomly created by computer simulation are calculated by both methods and plotted against each other. In nucleotide sets with a biased distribution of nucleotides, which represent relatively conserved DNA positions and are displayed in the left-most part of the graph, the Wu-Kabat coefficients fall into three narrow ranges of values close to 2, 3 and 4, respectively. This can be explained by the sensitivity of the Wu-Kabat coefficient to the number of different nucleotides. When a position is not very conserved, the Wu-Kabat algorithm commonly gives a lower output than the new algorithm, which can be explained by the failure of the Wu-Kabat coefficient to detect the polymorphism contributed by rare nucleotides.

3.2 Polymorphism analysis of HLA-DRB1 exon 2

The variability index within each serotype and that of all available sequences cumulatively are calculated as shown in Fig. 2. From this figure, we can find 3 super variable regions in exon 2, spanning base pairs 12-35, 64-97 and 186-209, which correspond to amino acids 9-16, 26-37 and 67-74 respectively. The figure also suggests that a region spanning base pairs 156-166, corresponding to amino acids 57-60, is variable.

[Figure 2 plot: one row of variability index bars per serotype, labeled 01 to 16, with a bottom row labeled TOTAL.]

Figure 2: Variability Index of Each Serotype. From top to bottom, the variability indices of serotypes HLA-DRB1*01 to HLA-DRB1*16 are shown in order. The bottom row of the figure shows the cumulative variability index of 213 alleles. As shown in the figure, there are 3 super variable regions in exon 2 of HLA-DRB1; the left region (indicated by the vertical box) is the most variable, while it is conserved within each serotype. The height of each horizontal box ranges from 0 to 1.

In contrast with the DRB1 sequences as a whole, the left-most super variable region (labeled by the box in the figure), and only this region, is conserved within most of the serotypes. This indicates that the distribution of the four kinds of nucleotides in this region is somewhat correlated with the serotype the allele belongs to. One interpretation of these findings is that the residues in this region are antibody epitopes, or that they can influence the conformation of other antibody epitopes, so that the replacement of some residues in this region may alter recognition by some antibodies. In order to predict the shared antibody epitopes or other residues related to antibody recognition, we perform a Pearson chi square test on each amino acid position of the DRB1 second exon. In this test, DRB1 alleles are divided into 13 groups according to the serological typing result that is indicated in their nomenclature. The number of each kind of amino acid at a given position is counted for each group. Therefore, there are 13 rows and 20 columns in the test table. The chi-square test on this table is carried out with the following null hypothesis: the distribution of amino acids is not correlated with groups or serotypes, hence it is the same within and between groups. The chi square statistic of each position is computed and plotted in Figure 3.

[Figure 3 plot: vertical lines along the residue positions of exon 2; the horizontal axis is numbered by position.]

Figure 3: Chi-square test graph of HLA-DRB1 exon 2. A vertical line in the graph indicates a position whose statistical P value is < 0.01.

The test result indicates that residues in the first and second super variable regions are clearly correlated with serological classification. Among them, residues 11, 13, 30, 37 and 74 give a significant test statistic (p=0.01), indicating their strong correlation with antibody recognition. At this point, we cannot safely conclude whether these residues are antibody epitopes or whether they change the conformation of other epitopes, except for residues 11 and 13: their side chains were proposed to project into the peptide binding groove, where they contact the bound peptide and are not accessible to antibodies that recognize the DRB1 molecule from below the binding groove. On the other hand, antibody recognition of residues 10 and 12 is well documented [6]. Residues 11 and 13 therefore possibly affect the conformation of the neighboring residues. Considering that residues 11 and 13 are among the residues that closely bind peptides, binding of a peptide to the HLA-DRB molecule may alter recognition by these antibodies.

4 Acknowledgment

This work is partially supported by a "Youth Elite Fund" of National Science Fund Council No. 39725007 (China).

References

1. Erik Thorsby. Invited Anniversary Review: HLA Associated Diseases. Human Immunology 53 (1997) pp. 1-11.
2. Goran Andersson, Leif Andersson, Dan Larhammar, Lars Rask, Sunna Sigurdardottir. Simplifying genetic locus assignment of HLA-DRB genes. Immunology Today 15(2) (1994) pp. 58-62.
3. Jerry H. Brown, Theodore S. Jardetzky, Joan C. Gorga, Lawrence J. Stern, Robert G. Urban, Jack L. Strominger, Don C. Wiley. Three-dimensional structure of the human class II histocompatibility antigen HLA-DR1. Nature 364 (1993) pp. 33-39.
4. Klohe E, Fu XT, Ballas M, Karr RW. HLA-DR beta chain residues that are predicted to be located in the floor of the peptide-binding groove contribute to antibody-binding epitopes. Hum Immunol 37(1) (1993) pp. 51-8.
5. Rita Jores, Pedro M. Alzari, Tommaso Meo. Resolution of hypervariable regions in T-cell receptor β chains by modified Wu-Kabat index of amino acid diversity. Proc. Natl. Acad. Sci. USA 87 (1990) pp. 9138-9142.
6. Wu, T.T. & Kabat, E.A. An analysis of the sequences of the variable regions of Bence Jones proteins and myeloma light chains and their implications for antibody complementarity. J. Exp. Med. 132 (1970) pp. 211.
7. Fu XT, Drover S, Marshall WH, Karr RW. HLA-DR residues accessible under the peptide-binding groove contribute to polymorphic antibody epitopes. Hum Immunol 43(4) (1995) pp. 243-50.

CHARACTERIZATION OF A NEW TISSUE-SPECIFIC MUTATION OF THE YELLOW GENE WHICH SUPPORTS TRANSVECTION

JI-LONG CHEN*1 AND Jm Liu1, KATHRYN HUISINGA2 AND PAMELA GEYER2, JAMES MORRIS3, AND C.-TING WU3

1 Institute of Developmental Biology, Chinese Academy of Sciences, Beijing 100080, China.
2 Department of Biochemistry, University of Iowa, IA 52242, USA
3 Department of Genetics, Harvard Medical School, Boston, MA 02115, USA.
* Corresponding author, E-mail: [email protected]

Transvection is an allelic interaction mediated by homologous chromosome pairing. Such interactions can modify gene expression. One gene that displays transvection effects is yellow of Drosophila melanogaster, as indicated by intra-allelic complementation between certain alleles. Transvection at yellow was originally documented using the y2 mutation, which is caused by the insertion of a gypsy retrotransposon. To understand the molecular mechanisms involved in transvection and to investigate whether gypsy can enhance transvection at yellow, we screened a collection of tissue-specific yellow mutations. We identified one allele that, like y2, is mutant in wing and body pigmentation. Furthermore, we found that this allele can transvect with both y2 and yellow null alleles (ynull) carrying promoter disruptions. To determine the molecular basis for these unusual complementation properties, we characterized this new allele. Our results demonstrate that this mutation is caused by a large insertion in the 5' regulatory region. Molecular characterization shows that the insert does not contain a gypsy element, suggesting that this transposable element is not required for transvection at yellow. In addition, our results indicate that both cis and trans enhancers may play a role in transvection.

1 Introduction

The yellow gene of Drosophila melanogaster encodes a protein that is responsible for the pigmentation of cuticle structures. Wild-type expression of the gene causes a blackish-brown pigmentation in the wing, body, bristle and tarsal claw cuticle. Mutations in the gene change the pigmentation to a yellow color. The gene contains four tissue-specific enhancers. The wing enhancer and body enhancer, which control pigmentation in the wings and body, are located upstream of the yellow promoter. In addition, two other enhancers located downstream of the promoter control pigmentation in the bristles and tarsal claws [1]. It has been reported that certain yellow mutants show intra-allelic complementation [2-4]. In other words, the levels of yellow gene expression in some tissues of the progeny are higher than those found in either parent. This effect is known as transvection. Transvection was first noted by Ed Lewis in 1954 in studies of the bithorax complex [5]. Since that time, a number of genes, including white, brown, sex combs reduced and eyes absent, have been shown to exhibit transvection [6-9]. The potential to transvect may be a feature of many genes in Drosophila. Moreover, related phenomena in Drosophila and other organisms, such as meiotic transvection in fungi, gene silencing in plants and transactivation of Igf2 in mouse, have also been observed [10-17]. Thus, insights into the mechanism of transvection may provide a widespread understanding of the molecular basis of gene regulation and chromosome organization.

Initial characterization of transvection at the yellow gene involved the y2 allele. The y2 mutation is a consequence of the insertion of the gypsy element 700 bp upstream of the transcription start site, between the wing and body enhancers and the promoter. Gypsy carries the Su(Hw) binding region, which, when bound by the Su(Hw) protein, prevents the wing and body enhancers from interacting with the promoter [3]. The y2 allele complements some ynull alleles, but fails to complement other ynull alleles. It was found that the complementing ynull alleles all contained non-functional promoters, while the non-complementing ynull alleles contained intact promoter regions [3-4]. In the early studies of transvection at yellow it was suggested that the gypsy element may play a role in transvection [3]. Recently we determined that gypsy was not essential for transvection at yellow [18]. However, the results from previous studies suggested that gypsy might enhance transvection, since complementation with a non-gypsy yellow allele, y82f29, was not as strong as that seen with y2 [18].

To better understand the molecular basis for transvection at yellow, we examined whether other tissue-specific mutations could support transvection similarly to, but independently of, y2 and the gypsy element. The results presented here show the structure and complementation pattern of a new, non-gypsy tissue-specific yellow mutation, y2374, that supports transvection independently of y2. We also present evidence indicating that both cis and trans enhancers may play a role in transvection.

2 Materials and Methods

2.1 Drosophila stocks

The y2347 flies were obtained from the Uman region of Russia by Dr. Golobovsky (gift from Dr. M. M. Green of the University of California, Davis). The y3c3 flies were produced by ethyl methane sulfonate (EMS) treatment of y1 flies [18]. The y1#8 flies (a P-element derivative of y76d28) were obtained from Dr. M. M. Green. The y1#8, y3c3, y1 and y2 alleles were previously characterized [3,18]. Flies carrying yellow mutations were maintained on standard cornmeal-molasses medium at 25°C. Transvection was determined by crossing certain yellow alleles as shown in Table 1. The intragenic complementation between these yellow alleles was tested by scoring the pigmentation in the resulting progeny. Pigmentation phenotypes in wings and body were determined by examining 2-3 day old flies on a scale of 1 to 5. On this scale, 1 represents the null or nearly null state, and 5 represents the completely wild-type state.

Table 1. Complementation data in yellow alleles. Pigmentation in wings (W) and body (B) was scored as described in Materials and Methods. The scores shown in this table were obtained in the laboratories of P.K.G. and J.L.C., but differ slightly in some cases from scores obtained in the laboratory of C.-T.W., which may reflect culture conditions.

Crosses    y2347       y2          y82f29      y1#8        y3c3        y1
           W, B        W, B        W, B        W, B        W, B        W, B
y2347      1-2, 1-2    3, 3        1-2, 1-2    4, 4        4, 4        1, 1
y2         3, 3        1-2, 1-2    1-2, 1-2    4, 4        4, 4        1, 1
y82f29     1-2, 1-2    1-2, 1-2    1-2, 1-2    2-3, 2-3    2-3, 1      1, 1
y1#8       4, 4        4, 4        2-3, 2-3    1, 1        1, 1        1, 1
y3c3       4, 4        4, 4        2-3, 1      1, 1        1, 1        1, 1
y1         1, 1        1, 1        1, 1        1, 1        1, 1        1, 1

2.2 Analysis of y2374

Genomic DNA was isolated from y2374 flies. A genomic library was constructed in the Lambda DASH II Vector (Stratagene). The library was screened with yellow genomic sequences using standard techniques [19]. Eight positive phages were isolated from the library. DNA containing the y2374 allele was subcloned into pBluescript (Stratagene). The structure of y2374 was determined by restriction mapping and sequence analysis.

2.3 In situ hybridization

In situ hybridization was performed as described by Lim [20]. A DNA probe containing the gypsy element was labeled with Biotin-16-dUTP by nick-translation. Chromosome squashes were prepared from salivary gland cells and hybridized with the probe at 37°C for 18 hr. Detection of hybridization signals was performed after incubation with an avidin-biotinylated peroxidase mixture, followed by a second incubation with DAB.

3 Results and Discussion

3.1 Complementation properties of y2374 allele

y2374 is a tissue-specific yellow mutation that shows the same phenotype as the y2 allele. In other words, y2374 flies have mutant pigmentation in the wings and body but a wild-type phenotype in the bristles and tarsal claws. To evaluate the degree of complementation of y2374, intragenic complementation tests were conducted. In these studies, y2374 flies were crossed to flies carrying the yellow null alleles y1#8, y3c3 and y1 (Table 1). These three yellow null alleles are completely mutant in all tissues. The results from these experiments showed that the y2374 allele complemented the yellow null alleles y1#8 and y3c3, but not y1 (Table 1). Flies that were heterozygous for y2374 and y1#8 or y3c3 exhibited pigmentation in wings and body at level 4, which was comparable to the level observed in the heterozygotes y2/y1#8 or y2/y3c3. Surprisingly, complementation also occurred between y2374 and y2, giving pigmentation at level 3. As expected, y2347 failed to complement y1 (Table 1).

Early studies of transvection at the yellow gene of Drosophila melanogaster were conducted using the y2 mutation. It was shown that when y2 was paired with certain yellow null alleles that contain non-functional and/or structurally altered promoters, complementation occurred in the wing and body tissues. However, y2 failed to complement when paired with a yellow null allele that contains a functional promoter [3]. It was determined previously that both y1#8 and y3c3 contain deletions of the promoter region, while y1 contains a functional promoter (Figure 1D-1F) [3, 18]. The mutant phenotype associated with y1 is the result of a 1 bp change in the initiation codon. The results described above indicate that y2347 exhibits the same pattern and a similar level of complementation as y2. These data support a proposed model accounting for transvection [3]. According to this model, for transvection to occur the yellow null allele must contain a non-functional promoter.

3.2 Gypsy element does not participate in the regulation of transvection

To understand the molecular basis for the tissue-specific phenotype of the y2374 mutation, we carried out a structural analysis of the yellow gene associated with this mutation. A lambda library was constructed and eight clones containing the mutant yellow locus were selected. Restriction mapping and sequence analysis showed that the wing and body enhancers as well as the coding region of the yellow gene are intact, suggesting that the y2347 mutation is not due to disruption of the sequences of yellow. The molecular analysis determined that the y2347 mutation is caused by a large insertion located 121 bp upstream of the transcription start site of the yellow gene (Figure 1A). The precise size of the insert is unclear. The insertion is flanked by 1.4 kb defective Hobo transposable elements. In addition to the Hobo elements, the inserted DNA contains telomeric associated sequences (TAS) (Figure 1A). Thus, two possibilities exist for explaining the mutant pigmentation in the wings and body of y2347 flies. One could envision that communication between the upstream enhancers and the promoter is blocked by the large insertion, so that the wing and body enhancers cannot act on the promoter effectively from their distant location. Alternatively, the y2347 mutation is caused by effects of the sequences inserted into the gene, which may include a repressor such as TAS [21]. However, we do not favor the second suggestion since we have no evidence for it. We suggest here that the mutant pigmentation of this new yellow allele is caused by a block of communication between the upstream enhancers and the promoter.

In the early studies of transvection at yellow it was found that at least one allele of each pair-wise complementing combination contained a full or deleted copy of the gypsy element [3]. Recently we identified a non-gypsy yellow allele, y82f29. Although y82f29 complemented certain yellow null alleles such as y1#8, complementation with this non-gypsy allele was not as strong as with y2 [18]. The issue then arose as to whether gypsy could enhance the intra-allelic complementation. To determine whether gypsy was involved in the complementation at yellow, we examined the y2374 allele by in situ hybridization with a gypsy probe. No gypsy element was observed in the y2374 allele, whereas the y2 control showed this element. These results provide evidence that the DNA inserted into the yellow gene does not contain gypsy. The fact that y2374 exhibits strong complementation similar to y2 demonstrates that gypsy and the Su(Hw) binding region may not participate in the regulation of transvection at yellow.

3.3 y2374 allele supports a proposed model that both cis and trans enhancers may be involved in the intra-allelic complementation at yellow

Previous studies suggested that transvection at the y2 allele might have two components, a cis and a trans component [3,18]. The trans component was the idea that when the complementing allele contained a non-functional promoter, the enhancers of this allele were released from their obligation in cis and allowed to activate the y2 promoter in trans. In addition to enhancer action in trans, the cis enhancers on y2 were also believed to act on their own promoter by enhancer bypass of the gypsy insulator. It was proposed that enhancer bypass resulted from the topology of paired yellow alleles [18]. In this paper, we used a yellow null allele, y3c3. It was previously shown that y3c3 contains a 3.6 kb deletion that removes the upstream body enhancer, the promoter, and the first exon (Figure 1E). Interestingly, it was found that y2374 can strongly complement y3c3, giving pigmentation in wings and body at level 4 (Table 1). These results indicate that when y2374 is paired with y3c3, the body enhancer of y2374 must act in cis. We suggest that the cis effect of transvection at y2374 is a result of the structural changes in the promoter region of the complementing allele. When y2374 is paired with y3c3, these structural changes could alter the topology of the y2374 promoter, allowing the enhancers on y2374 better access to their own promoter.

The y82f29 allele is also a tissue-specific mutation, which is caused by a 4.1 kb deletion that removes the body and wing enhancers and 2.3 kb of upstream region (Figure 1C) [18]. As a consequence of this deletion, y82f29 flies exhibit a phenotype identical to that of flies with the y2347 allele. However, only the y2347 allele showed strong complementation similar to y2, while y82f29 exhibited a lower level of complementation than y2 or y2374 against both y1#8 and y3c3 (Table 1). These data suggest that complementation between y2374 or y2 and certain yellow null alleles may involve both cis and trans effects of the enhancer elements, giving rise to near wild-type pigmentation in the wings and body of y2374/ynull or y2/ynull flies. Complementation with y82f29 results only from the enhancers of ynull acting in trans on the promoter located in the homologous chromosome, so that the phenotype of y82f29/ynull flies is intermediate between wild type and ynull [18].

In addition to complementing ynull alleles, y2374 also showed an intermediate level of complementation when paired with the y2 allele (Table 1). Since both y2374 and y2 contain large inserts, pairing between these alleles may change the structural conformation of the gene to give the enhancers better access to the promoter, thereby allowing the enhancers to activate the promoter and increase the level of transcription. Additionally, sequences present in y2374 may influence the insulating ability of gypsy, making it a weaker insulator. Diminished insulating ability may allow the wing and body enhancers of y2 to overcome the gypsy insulator block and activate the y2 promoter, resulting in intermediate pigmentation of the wing and body cuticles.

4 Acknowledgment

We thank Ms. R. Roseman (University of Iowa) for her skilled technical assistance. This work was supported by grants from the Chinese Academy of Sciences and "863" program and NNSFC (grant No. 39870388) to J.L.C., and an NIH grant to P.K.G.

References

1. Geyer, P. K., and Corces, V. G., Separate regulatory elements are responsible for the complex pattern of tissue-specific and developmental transcription of the yellow locus in Drosophila melanogaster. Genes & Devel. 1 (1987) pp. 996-1004.
2. Green, M. M., Complementation at the yellow locus in Drosophila melanogaster. Genetics 46 (1961) pp. 1385-1388.
3. Geyer, P. K., Green, M. M., and Corces, V. G., Tissue-specific transcriptional enhancers may act in trans on the gene located in the homologous chromosome: the molecular basis of transvection in Drosophila. EMBO J. 9 (1990) pp. 2247-2256.
4. Morris, J. R., Chen, J. L., Filandrinos, S. T., Dunn, R. C., Fisk, R., Geyer, P. K., and Wu, C.-T., An analysis of transvection at the yellow locus of Drosophila melanogaster. Genetics 151 (1999) pp. 633-651.
5. Lewis, E. B., The theory and application of a new method of detecting chromosomal rearrangements in Drosophila melanogaster. Am. Nat. 88 (1954) pp. 225-239.
6. Jack, J. W., and Judd, B. H., Allelic pairing and gene regulation: A model for the zeste-white interaction in Drosophila melanogaster. Proc. Natl. Acad. Sci. USA 76 (1979) pp. 1368-1372.
7. Henikoff, S., and Dreesen, T. D., Trans-inactivation of the Drosophila brown gene: Evidence for transcriptional repression and somatic pairing dependence. Proc. Natl. Acad. Sci. USA 86 (1989) pp. 6704-6708.
8. Pattatucci, A. M., and Kaufman, T. C., The homeotic gene sex combs reduced of Drosophila melanogaster is differentially regulated in the embryonic and imaginal stages of development. Genetics 129 (1991) pp. 443-461.
9. Leiserson, W. M., Bonini, N. M., and Benzer, S., Transvection at the eyes absent gene of Drosophila. Genetics 138 (1994) pp. 1171-1179.
10. Gelbart, W. M., Synapsis-dependent allelic complementation at the decapentaplegic gene complex in Drosophila melanogaster. Proc. Natl. Acad. Sci. USA 79 (1982) pp. 2636-2640.
11. Kapoun, A. M., and Kaufman, T. C., Regulatory regions of the homeotic gene proboscipedia are sensitive to chromosomal pairing. Genetics 140 (1995) pp. 643-658.
12. Donaldson, K. D., and Karpen, G. H., Trans-suppression of terminal deficiency-associated position effect variegation in a Drosophila minichromosome. Genetics 145 (1997) pp. 325-337.
13. Pal-Bhadra, M., Bhadra, U., and Birchler, J. A., Cosuppression in Drosophila: gene silencing of Alcohol dehydrogenase by white-Adh transgenes is polycomb dependent. Cell 90 (1997) pp. 479-490.
14. Aramayo, R., and Metzenberg, R. L., Meiotic transvection in fungi. Cell 86 (1996) pp. 103-113.
15. Matzke, M. A., and Matzke, A. J. M., Homology-dependent gene silencing in transgenic plants: what does it really tell us? Trends Genet. 11 (1995) pp. 1-3.
16. Goldsborough, A. S., and Kornberg, T. B., Reduction of transcription by homologue asynapsis in Drosophila imaginal discs. Nature 381 (1996) pp. 807-810.
17. Sun, F. L., Dean, W. L., Kelsey, G., Allen, N. D., and Reik, W., Transactivation of Igf2 in a mouse model of Beckwith-Wiedemann Syndrome. Nature 389 (1997) pp. 809-815.
18. Morris, J. R., Chen, J. L., Geyer, P. K., and Wu, C.-T., Two modes of transvection: enhancer action in trans and bypass of a chromatin insulator in cis. Proc. Natl. Acad. Sci. USA 95 (1998) pp. 10740-10745.
19. Sambrook, J., Fritsch, E. F., and Maniatis, T., Molecular cloning: A laboratory manual (Cold Spring Harbor Lab. Press, NY, 1989).
20. Lim, J. K., In situ hybridization with biotinylated DNA. Drosophila Information Service 72 (1993) pp. 73-76.
21. Karpen, G. H., and Spradling, A. G., Analysis of subtelomeric heterochromatin in the Drosophila minichromosome Dp 1187 by single P element insertional mutagenesis. Genetics 132 (1992) pp. 737-753.

MHC CLASS II SUPPRESSION BY TROPHOBLAST cDNAs

GRAEME L HAMMOND, DIVAKAR MANDAPATI, JAVIER DAVILA, MICHAEL A COADY, AND ALFRED LM BOTHWELL

Yale University School of Medicine, Department of Surgery and Section of Immunobiology, 333 Cedar Street, New Haven, CT 06510 USA
E-mail: [email protected]

Trophoblasts synthesize dominant negative transacting factors that suppress expression of MHC genes. We report our initial experience in cloning a class II suppressor. A human trophoblast cDNA library was transfected into HeLa cells that were then stimulated with IFN-γ. Low class II expressors were isolated by flow cytometry. Individual low expressors were expanded by limiting dilution. cDNA from clones that remained class II negative contained a 0.48 kb non-coding RNA. Cross species activity was tested in murine B cells. This showed 80 to 90% suppression of class II antigen and mRNA.

1 Introduction

Trophoblasts are placental endothelial cells that form a barrier between maternal and fetal blood. These cells are not recognized by the maternal immune systems. A persistent finding in trophoblasts is the complete absence of all MHC class I and II surface antigens, except HLA-G [1]. The absence of MHC antigens on trophoblasts is thought to be one mechanism that prevents recognition and rejection of the fetal allograft. Although the mechanism for protecting fetal tissues from attack by a desperate immune system is undoubtedly very complex [2], there are aspects of the phenomenon that are readily apparent. For example, other placental cells, and of course fetal cells, either express or can be induced to express MHC antigens. Since these cells all arise from the same ovum, a control mechanism must exist, that when activated is capable of extinguishing MHC expression in all cells. We previously reported that when stable hybrid fusions were created between trophoblasts and B cells, expression of all classical MHC antigens was extinguished at the transcriptional level and that the mechanism of action may involve blockade of the CIITA promoter [3]. If this mechanism could be unraveled and adapted to cells that normally express MHC antigens, a model might be created that could eliminate T cell recognition of tissues. This could theoretically have implications for developing transgenic animals to serve as organ donors or cell cultures for gene or drug delivery systems. We here report the cloning of a trophoblast class II suppressor cDNA.

G. L. Hammond et al.

2 Method

2.1 Generation of trophoblast cDNA library

Poly(A)+ RNA was prepared with the PolyATtract System (Promega) from the trophoblast cell line JAR. JAR cell cDNAs were prepared with oligo-dT primers, size-selected, and cloned using standard methods [4] with directional adapters into the mammalian expression vector pSH4-hph, which provides an SV40 promoter, poly-A addition signal, and hygromycin resistance [5]. The cDNAs were expanded in E. coli, and large plasmid preparations with two CsCl bandings produced stock solutions for transfection. Size-selected sublibraries were prepared with 0.5- to 4-kilobase (kb) and 4- to 23-kb inserts. The JAR large-insert expression library consisted of 1.7 × 10^5 independent clones, and it was used in the experiments described here. A clone of the cervical carcinoma cell line HeLa, designated clone 6, that expressed HLA-DR antigen after stimulation with recombinant human IFN-γ (Boehringer-Mannheim, Indianapolis, IN) was isolated by limiting dilution culture and expanded for further use.

2.2 Library transfection and screening

The cDNA expression library was transfected by the calcium phosphate method [6] into HeLa clone 6 cells. Approximately 2x10^ stable transfectants resistant to 150 µg/ml hygromycin (Boehringer-Mannheim) were screened from 4 transfections over several months. Multiple rounds of selection were performed by IFN-γ challenge (200 U/ml for 2 days) followed by sterile sorting by flow cytometry of live, lightly trypsinized cells, gating on the lowest 10% of the range of HLA-DR antigen staining. The HLA-DR mAb L243 was used, with non-immune mouse IgG2a (Sigma, St. Louis, MO) as a negative control. Untransformed, IFN-γ-treated HeLa cells served as positive control cells. mAb binding was detected with R-phycoerythrin-goat anti-mouse IgG secondary Ab (Molecular Probes, Eugene, OR) on a FACS IV instrument (Becton-Dickinson, Bedford, MA). The isolation of antigen-negative cells was completed by cloning through limiting dilution and screening subcultures grown in chamber slides (Nunc, Rochester, NY) by immunocytochemistry with L243 mAb.
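The gating step can be illustrated with a minimal Python sketch. It assumes that "the lowest 10% of the range" means a cutoff placed at 10% of the distance between the weakest and strongest staining intensities on a linear scale. The function names and the intensity values are hypothetical and are not taken from the original sorting software or data.

```python
def low_expressor_gate(intensities, fraction=0.10):
    """Cutoff marking the lowest `fraction` of the observed staining range."""
    lowest, highest = min(intensities), max(intensities)
    return lowest + fraction * (highest - lowest)


def select_low_expressors(intensities, fraction=0.10):
    """Keep the cells whose staining falls at or below the gate cutoff."""
    cutoff = low_expressor_gate(intensities, fraction)
    return [value for value in intensities if value <= cutoff]


# Hypothetical per-cell HLA-DR fluorescence values (arbitrary units).
staining = [5.0, 12.0, 48.0, 95.0, 105.0, 3.0, 77.0]
cutoff = low_expressor_gate(staining)
sorted_out = select_low_expressors(staining)
print(f"gate cutoff: {cutoff:.1f}, cells kept: {sorted_out}")
```

Flow cytometry data are often displayed on a logarithmic axis; if the gate was set on such an axis, the same calculation would be applied to log-transformed intensities.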

2.3 Plasmid rescue and sequencing

Polymerase chain reaction (PCR) primers were prepared that amplified sequences between the promoter and the poly-A signal of the expression vector, including pSH4-1 (5'-GATGTTGCCTTTACTTCTAGGCCT-3'). Amplification was performed over 30

MHC Class II Suppression by Trophoblast cDNAs


cycles of 1 min at 94°C, 2 min at 55°C, and 3 min at 72°C in a thermal cycler (Perkin Elmer, Norwalk, CT) with 1 U of Taq DNA polymerase (Gibco BRL). PCR products were cloned into the pCR3 vector and grown in TOP10F' E. coli according to the manufacturer's instructions (Invitrogen, Carlsbad, CA). The CMV promoter in pCR3 drives mammalian expression, and the vector provides neomycin resistance. Plasmids were purified on Qiagen tips (Qiagen Corp., Valencia, CA). Supercoiled pCR3 plasmid was prepared by ligation with no cDNA insert in the TA-cloning site. Restriction endonuclease mapping of 9 cDNA clones was performed to establish the orientation of the cDNA. Constructs were introduced into HeLa clone 6 cells by lipofection or by calcium phosphate-mediated transfection. Stable secondary transfectants were selected with G418-containing medium at 300-600 u/ml G418 (Gibco BRL). Double-stranded plasmid DNA was sequenced by the fluorescent cycle sequencing method with an Applied Biosystems 373A DNA Sequencer. Primers used for sequencing were pSH4-1, pSH4-2, the T7 promoter primer (5'-TAATACGACTCACTATAGGG-3'), and the internal primers 5'-GTGTGATCTGAAAACCCTGCTTGG-3', 5'-AGACTACTTCCCCATACATGCG-3', and 5'-CCATACAGAGCAACATACCAGTAC-3'.

2.4 Cross species transfection

To test for cross-species activity and functional effect, the rescued plasmid was subcloned into the mammalian expression vector pREP4, which encodes the hygromycin resistance gene and allows replication within B cells as an extrachromosomal episome. The pREP4-cDNA expression constructs were then transfected by lipofection into CH27 and A20 murine class II+ B cells. This liposome-mediated transfection was carried out on approximately 10^6 cells in 6-well plates. The cells were treated for 4 hours with the lipid-DNA complex made with 6 µg of DNA and 24 µL of lipofectin (GIBCO BRL). Fetal bovine serum was added (10%) and the cells were incubated overnight. After a 2-day nonselective culture, cells were selected in Bruff's medium containing 500 µg/mL hygromycin. Transfected cells were acetone fixed at multiple days of selection and stained, along with untransfected control cells, for class II antigen using the avidin-biotin-peroxidase technique. Additional untransfected CH27 and A20 cells were stained with antibodies against the B-cell antigen CD45 as a positive control and with normal mouse IgG2a as a negative control. Immunocytochemical experiments were repeated several times to ensure internal consistency. The stably transfected clones were then screened by flow cytometry. RNA was then isolated and Northern blots were probed for I-Ad mRNA. Changes in mRNA concentration seen by Northern analysis were quantitated by densitometry. The presence of the plasmid in the transfected cells was confirmed by RT-PCR.

3 Results

Figure 1 shows a typical example of positive and negative controls and of class II expression after three sorting cycles following transfection with the trophoblast library. Figure 2 shows one step in the process of cloning HeLa cells by limiting dilution. Following stimulation by IFN-γ and staining with L243, some cells can be seen that are positive, slightly negative, and markedly negative for class II expression.

Figure 1. FACS scans of the negative control (top), the positive control stained with L243 (class II, middle), and cells stained for class II after three sorting cycles (bottom).


Figure 2. Cytoprep shows varying degrees of class II suppression after sterile sorting of cDNA library-transfected HeLa cells following IFN-γ treatment.

When clones that were class II negative by light microscopy were picked, plasmid rescue was carried out by the method described. The plasmid from cells containing activity had the laboratory number T10AC9. The sequence of the identified cDNA was as follows (the number at the end of each row gives the position of its last base):

AAGCGGCGAGGTGCCTTTACTACATGTGTGATCTGAAAACCCTGCTTGGTTCTGAGCTGC 60
GTCTAWGAATTGGTAAAGTAATACCAATG... 120
TTTCACWGAAATTTTAAAAATCATG... 180
GGWCCATTTITTAAATGTTTAAAAATATGTTGACATGGTAGTTCAGTTCWAA 240
ACTTGGGGATGATGCAAACAAWACTGTCGWGGGATTTAGAGTGTATTAGTCACGCATG 300
TATGGGGAAGTAGTCTCGGGTATGCTGTTGTGAAATTGAAACTGTAAAAGTAGATGGTTG 360
AAAGTACTGGTATGTTGCTCTGTATGGTAAGAACTAATTCTGTTACGTCATGTACATAAT 420
TACTAATCACTTTTCWCCC 480
A 481
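As a consistency check, the internal sequencing primers listed in the Method can be located on the cDNA sequence above. The sketch below is a minimal Python illustration. The fragment is assembled from the last ten bases of the row ending at position 300 and the two rows ending at positions 360 and 420, so the offset of 290 is an assumption based on the printed row numbering. The function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def locate_primer(template, primer):
    """Return (1-based start, strand) of a primer on the template, or None."""
    pos = template.find(primer)
    if pos != -1:
        return pos + 1, "+"
    pos = template.find(reverse_complement(primer))
    if pos != -1:
        return pos + 1, "-"
    return None


# Bases 291-420 of the printed T10AC9 sequence (assumed numbering).
OFFSET = 290
fragment = (
    "GTCACGCATG"
    "TATGGGGAAGTAGTCTCGGGTATGCTGTTGTGAAATTGAAACTGTAAAAGTAGATGGTTG"
    "AAAGTACTGGTATGTTGCTCTGTATGGTAAGAACTAATTCTGTTACGTCATGTACATAAT"
)

primers = {
    "internal primer 2": "AGACTACTTCCCCATACATGCG",
    "internal primer 3": "CCATACAGAGCAACATACCAGTAC",
    "T7 promoter primer": "TAATACGACTCACTATAGGG",
}

for name, primer in primers.items():
    hit = locate_primer(fragment, primer)
    if hit is None:
        print(f"{name}: not found in this fragment")
    else:
        start, strand = hit
        print(f"{name}: cDNA position {OFFSET + start}, strand {strand}")
```

Under these assumptions, both internal primers anneal to the minus strand, which is consistent with their use for sequencing back toward the 5' end of the insert. The T7 promoter primer is vector-derived and is not expected in the cDNA.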

Figure 3 shows the effects of T10AC9 transfection into CH27 (murine) B cells. The transfected human trophoblast cDNA down-regulated constitutive expression of MHC class II antigen in murine B cells, with uniform suppression achieved at 30 days of selection. The untransfected and positive control cells demonstrated homogeneous dark cytoplasmic and plasma membrane staining. The negative controls exhibited slight detectable staining.


Figure 3. Antibody staining of untransfected and stably transfected CH27 cells at 30 days of selection (120x). Untransfected cells demonstrate the presence of class II antigen (top left). Class II antigen is significantly reduced but not absent, both intracellularly and on the cell surface, in cells transfected with T10AC9 (top right). Cells stained for CD45 (bottom left) and with normal mouse purified Ig (bottom right) are shown for comparison.

As shown in Figure 4, three transfected A20 clones showed marked suppression of I-Ad antigen expression compared with the wild type (WT) positive control. As shown in Figure 5, analysis of Northern blots from two clones (9.1 and 9.2) demonstrated that I-Ad mRNA synthesis was suppressed by 84% and 93%.

4 Discussion

MHC class II molecules play a central role in both humoral and cellular immune responses by presenting peptide antigens to CD4+ T cells. The molecules are heterodimeric transmembrane proteins consisting of α and β chains. In man, there are three class II antigens, HLA-DR, -DP, and -DQ. Constitutive and inducible expression is controlled at the level of transcription by a highly conserved promoter region consisting of four cis-acting DNA sequences [10], referred to as the S (W or Z), X, X2 and Y boxes, that either directly or indirectly interact with trans-acting regulatory proteins to up-regulate transcription. Tight regulation of MHC class II transcription controls quantitative variations in cell surface expression of MHC class II antigens and thus plays a key role in modulating and controlling the immune response [11]. Normal expression of HLA class II genes requires the promoter binding proteins RFX, X2BP, NF-X and NF-Y as well as the associated protein RFXAP and the CIITA transactivator protein for transcription to be initiated.

[Figure 4 panel labels: Neg. Control Class II; A20 Wild Type; A20 T10AC9-pCR3 Clone #1; A20 T10AC9-pCR3 Clone #3; A20 T10AC9-pCR3 Clone #5]

Figure 4. Top panel is the class II antigen positive control, and the bottom three panels are A20 clones transfected with T10AC9 and stained for class II antigen, showing a high degree of, but not complete, suppression of class II expression.

It is clear from this and other studies [3,12,13] that a deficiency or alteration of functional CIITA is the single key to class II suppression in trophoblasts, since all other nuclear factors must be present before the gene can be transcribed. Accordingly, in order to understand class II suppression in trophoblasts, one must understand CIITA regulation. Since CIITA constructs initiate transcription in constitutively non-expressing cells, mechanisms such as inhibition of CIITA nuclear translocation or a dependency of CIITA activation on missing GTP-binding proteins are not factors, and other inhibitors of CIITA transcription such as IL-1β [14] and TGF-β [15] were not present in JAR cell cultures.

[Figure 5 lanes: WT, Clone 1, Clone 2; band: I-Ad mRNA]

Figure 5. Total cellular RNA extracted from WT and two A20 clones transfected with T10AC9. RNA was electrophoresed and probed for I-Ad mRNA, and radiographs were scanned, showing suppression of I-Ad mRNA synthesis in T10AC9-transfected cells.
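The percentage suppression derived from densitometry of the Northern blot can be expressed as one minus the ratio of the clone band intensity to the wild-type band intensity. The Python sketch below illustrates that arithmetic only. The band intensities are hypothetical values chosen to reproduce the reported 84% and 93%, and the optional background subtraction is an assumption; the source does not describe how the scans were normalized.

```python
def percent_suppression(control, sample, background=0.0):
    """Percent reduction of a band relative to the control band."""
    control_net = control - background
    sample_net = sample - background
    if control_net <= 0:
        raise ValueError("control band must exceed background")
    return 100.0 * (1.0 - sample_net / control_net)


# Hypothetical band intensities in arbitrary densitometry units.
wild_type = 1000.0
clones = {"Clone 1": 160.0, "Clone 2": 70.0}

for name, intensity in clones.items():
    value = percent_suppression(wild_type, intensity)
    print(f"{name}: {value:.0f}% suppression of I-Ad mRNA")
```

In practice, band intensities are usually also normalized to a loading control before comparison; that step is omitted here because the source does not mention one.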

How T10AC9 may affect CIITA is only speculative. The following statements regarding the cloning of T10AC9 can be made:

1. It has no effect on class I expression.
2. Since trophoblasts are constitutively class II suppressed and are unresponsive to IFN-γ or secondary messenger pathway stimulation [16], T10AC9 may operate independently of the IFN-γ, Jak-STAT pathway.
3. It never completely suppresses class II antigen expression.
4. Its activity is more easily reproducible in murine B cells (A20, CH27) than in human B cells (Raji, UC).
5. Its effect is variable and appears to be dependent upon factors that, at present, are not understood.

The following statements about the structure of T10AC9 can also be made:

1. Using the Zuker program for RNA folding, no obvious resemblance to ribozyme structure can be found.
2. Screening the gene bank database for cDNA expressed sequence tag (EST) clones showed no clones complementary to the T10AC9 sequence, which indicates that there is little likelihood of the RNA acting as an antisense.
3. Although there are several short runs of pyrimidine bases, there are no known natural examples of triple helix interactions, so the likelihood of transcriptional inhibition by triplex formation is small.
4. There are no apparent promoter sequences in T10AC9; therefore, it is unlikely that the RNA is acting as a decoy.
5. All of the above make a very strong case for T10AC9 being a regulatory RNA.


Further gene bank database searches for ESTs reveal that T10AC9 has 100% homology with a 0.481 kb segment of a 14.7 kb non-coding transcript of the Multiple Endocrine Neoplasia-1 (MEN-1) gene. The MEN-1 gene is suspected of encoding a tumor suppressor factor [17]. The sequence is also conserved, with 85% homology over 194 bp of a 0.464 kb EST from goat lactating mammary tissue [18].

Under normal circumstances, the CIITA gene is controlled entirely by differential activation of multiple promoters of a single transactivator gene [19]. There are four independent promoters (pI, pII, pIII and pIV) that are active in different cell types. Each promoter is upstream of a separate and distinct non-coding first exon that is linked to the identical CIITA structural gene. This arrangement allows for easy detection of the CIITA promoter that is active in various cell types. For example, pI and pIII are active in constitutive class II expressors, with pI found in dendritic cells and pIII found in B cells. The pIV promoter is active in class II inducible cells and itself contains four cis-regulatory elements: NF-GMa, an IFN-γ activation site (GAS), an E box and an IFN-γ regulatory factor site (IRF-1). No cell line or tissue has yet been identified in which the CIITA type II promoter is active [19]. All the promoter regions are theoretical targets for suppressing molecules. Also, the NF-GMa site in the pIV promoter may itself be a down-regulatory element, since mutants of this element increase synthesis of a β-globin reporter gene construct to three times the WT level [20].

The mechanism of CIITA suppression in trophoblasts is undoubtedly quite complex. If T10AC9 is a regulatory RNA, it may be responsible for orchestrating the position of a number of proteins that together produce CIITA suppression. For example, protein binding to the GAS element, the E box and the IRF-1 binding site is necessary for STAT-1 activation of CIITA [20], and removal of any one of these would block IFN-γ activation of CIITA. However, it might be lethal to block the Jak-STAT pathway, since IFN-γ is required for activation of other essential genes such as GBP [21], iNOS [22], IRF-1 and USF [20]. Indeed, trophoblasts have IFN-γ receptors [23]. Constitutive class II expression does not require IFN-γ, yet class II genes can be 80-90% suppressed by T10AC9. As seen by its effect on both HeLa and B cells, T10AC9 may act in some way on pI, pIII and pIV. Accordingly, many possible mechanisms for CIITA blockade in trophoblasts can be postulated, such as blocking proteins being guided to promoter regions or activating proteins being ushered away from promoters. Obviously, considerably more work is required to ascribe a specific functional role to T10AC9 in class II suppression in trophoblasts. However, there is strong evidence of non-coding RNAs guiding proteins to DNA target sequences. The RNA transcripts may hybridize to their genomic counterparts and direct regulatory proteins to specific sites [24].

5 Acknowledgements

This work is supported by a grant from the Bugher Foundation.

References

1. Pazmany L, Mandelboim O, Vales-Gomez M, Davis DM, Reyburn HT and JL Strominger. Protection from natural killer cell-mediated lysis by HLA-G expression on target cells. Science 274 (1996) pp. 792.
2. Xu C, Mao D, Holers VM, Palanca B, Cheng AM and H Molina. A critical role for murine complement regulator Crry in fetomaternal tolerance. Science 287 (2000) pp. 498.
3. Coady MA, Mandapati D, Arunachalam B, Jensen K, Maher SE, Bothwell ALM, and GL Hammond. Dominant negative suppression of major histocompatibility complex genes occurs in trophoblasts. Transplantation 67 (1999) pp. 1461.
4. Sambrook J, Fritsch EF and T Maniatis. Molecular Cloning. A laboratory manual. (Cold Spring Harbor, New York: Cold Spring Harbor Lab Press, 1989) pp. 1.
5. Vasavada HA, Ganguly S, Chorney M, Mathur R, Shukla H, Swaroop A and SM Weissman. pSH4: A mammalian expression vector. Nucl Acids Res 18 (1990) pp. 3668.
6. Kriegler M. Gene Transfer and Expression. A laboratory manual. (Stockton Press, New York, 1990).
7. Chang C-H, Fontes JD, Peterlin M and RA Flavell. Class II transactivator (CIITA) is sufficient for the inducible expression of major histocompatibility complex class II genes. J Exp Med 180 (1994) pp. 136.
8. Coady MA, Mandapati D, Chlosta W, Braxton J, Nelson PJ, Peyman J and GL Hammond. The effect of silencer cDNA on B-cell class II MHC expression. Surgical Forum XLVII (1996) pp. 395.
9. Riley JL, Westerheide SD, Price JA, Brown JA and JM Boss. Activation of class II MHC genes requires both the X box region and the class II transactivator (CIITA). Immunity 2 (1995) pp. 533.
10. Benoit C and D Mathis. Regulation of major histocompatibility complex class II genes: X, Y and other letters of the alphabet. Ann Rev Immunol 8 (1990) pp. 681.
11. Watanabe Y and CO Jacob. Regulation of MHC class II antigen expression. J Immunol 146 (1991) pp. 899.
12. Morris AC, Riley JL, Fleming WH and JM Boss. MHC class II gene silencing in trophoblast cells is caused by inhibition of CIITA expression. Am J Reprod Immunol 40 (1998) pp. 185.
13. Murphy SP and TB Tomasi. Absence of MHC class II antigen expression in trophoblast cells results from a lack of class II transactivator (CIITA) gene expression. Mol Reprod Dev 51 (1998) pp. 1.
14. Rohn W, Tang LP, Dong Y and EN Benveniste. IL-1 beta inhibits IFN-gamma-induced class II MHC expression by suppressing transcription of the class II transactivator gene. J Immunol 162(2) (1999) pp. 886.
15. Piskurich JF, Wang Y, Linhoff MW, White LC and JP Ting. Identification of distinct regions of 5' flanking DNA that mediate constitutive, IFN-gamma, STAT1, and TGF-beta-regulated expression of the class II transactivator. J Immunol 160(1) (1998) pp. 233.
16. Peyman JA, Nelson PJ and GL Hammond. HLA-DR genes are silenced in human trophoblasts and stimulation of signal transduction pathways does not circumvent interferon-γ unresponsiveness. Transpl Proc 24 (1992) pp. 470-471.
17. Guru SC, Agarwal SK, Manickam P, Olufemi S-E, Crabtree JS, Weisemann JM, Kester MB, Kim YS, Wang Y, Emmert-Buck MR, Liotta LA, Spiegel AM, Boguski MS, Roe BA, Collins FS, Marx SJ, Burns L and SC Chandrasekharappa. A transcript map for the 2.8-Mb region containing the multiple endocrine neoplasia type I locus. Genome Res 7 (1997) pp. 725.
18. Le Provost F, Lepingle A and P Martin. A survey of the goat genome transcribed in the lactating mammary gland. Mamm Genome 7(9) (1996) pp. 657.
19. Muhlethaler-Mottet A, Otten LA, Steimle V and B Mach. Expression of MHC class II molecules in different cellular and functional compartments is controlled by differential usage of multiple promoters of the transactivator CIITA. EMBO J 16 (1997) pp. 2851.
20. Muhlethaler-Mottet A, Di Berardino W, Otten LA and B Mach. Activation of the MHC class II transactivator CIITA by interferon-γ requires cooperative interaction between Stat1 and USF-1. Immunity 8 (1998) pp. 157.
21. Briken V, Ruffner H, Schultz U, Schwarz A, Reis LF, Strehlow I, Decker T and P Staeheli. Interferon regulatory factor 1 is required for mouse Gbp gene activation by gamma interferon. Mol Cell Biol 15 (1995) pp. 975.
22. Kamijo R, Harada H, Matsuyama T, Bosland M, Gerecitano J, Shapiro D, Le J, Koh SI, Kimura T, Green SJ et al. Requirement for transcription factor IRF-1 in NO synthase induction in macrophages. Science 263 (1994) pp. 1612.
23. Peyman JA and GL Hammond. Localization of IFN-γ receptor in first trimester placenta to trophoblasts but lack of stimulation of HLA-DRA, -DRB, or invariant chain mRNA expression by IFN-γ. J Immunol 149 (1992) pp. 2675.
24. Wassenegger M, Heimes S, Riedel L and HL Sänger. RNA-directed de novo methylation of genomic sequences in plants. Cell 76 (1994) pp. 567.

NULL ACTIVITY MUTATION OF PHENOLOXIDASE IN DROSOPHILA MELANOGASTER

NOBUHIKO ASADA*,1, NOBUKO KAWAMOTO1, AND TAKASHI HATTA2

1 Biological Laboratory, Faculty of Science, Okayama University of Science, Okayama 700-0005, Japan
E-mail: [email protected]

2 Research Institute of Technology, Okayama University of Science, Okayama 703-8232, Japan

The null activity mutants of phenoloxidase, MoxGM95 and Dox-3KD95, in Drosophila melanogaster were characterized by native polyacrylamide gel electrophoresis. The cDNA fragment coding for MoxGM95 was determined by RT-PCR using 10 g of third-instar larvae as the starting material. cDNA sequencing showed that the minimum size of the translated region was 1,011 bp and that the number of deduced amino acid residues was 337. A 74-base insertion sequence, including a stop codon, was found at the catalytic site of prophenoloxidase. The functional significance of this mutation is discussed.

1 Introduction

Invertebrates have an open circulatory system and have defense mechanisms to prevent blood loss. Prophenoloxidase is activated after wounding, induced by β-1,3-glucan and lipopolysaccharides, as part of a non-self recognition system [1]. In Drosophila melanogaster, phenoloxidase occurs as the precursors prophenoloxidase A1 and A3, coded by Mox and Dox-3, which are located on the second chromosome at 79.6 and 53, respectively. Prophenoloxidase is activated rapidly, within one second, by several detergents, including alcohols [2,3]. MoxGM95 and Dox-3KD95 are isozyme variants lacking phenoloxidase activity, isolated from native populations from Gomel and Krasnodar, in the former Soviet Union, respectively. This chapter presents Drosophila phenoloxidase, with special reference to the functional and deleterious effects of the null activity mutant MoxGM95.

2 Native Polyacrylamide Gel Electrophoresis

In this article, the term 'activation' is used after Asada et al. [2]. Polyacrylamide gel electrophoresis in 5% gel and determination of the phenoloxidase activity in the gel were performed after Asada et al. [2]. All chemicals were reagent grade. A typical zymogram (ca. 0.5 µg protein/lane) is shown in Fig. 1. Prophenoloxidase in the gel was activated with 50% 2-propanol and then stained with 20 mM L-tyrosine or L-dopa. In the prepupal extract of Oregon-R (wild type strain), a single band of phenoloxidase, corresponding to Mox, was revealed in the gel incubated with 20 mM L-tyrosine (lane 1); two bands corresponding to Mox and Dox-3 were resolved with 20 mM L-dopa (lane 4). In MoxGM95, no band was detected in the gel incubated with either 20 mM L-tyrosine (lane 2) or 20 mM L-dopa (lane 5), while in Dox-3KD95, no band corresponding to Dox-3 was detected in the gel (lanes 3 and 6). Visible marker genes linked as c Mox wt and rdo Dox-3 pr had no effect on the electrophoresis.

[Figure 1: zymogram, lanes 1-6]

Figure 1. Zymogram after native polyacrylamide gel electrophoresis of extracts from homogenates of Drosophila melanogaster. Prophenoloxidase was activated with 50% 2-propanol and then reacted with 10 mM L-tyrosine (lanes 1-3) or 20 mM L-dopa (lanes 4-6). Lanes 1 and 4: Mox (wild type); lanes 2 and 5: MoxGM95; lanes 3 and 6: Dox-3KD95.

3 RT-PCR and DNA Sequencing

The RNA was extracted from 10 g of late third-instar larvae according to Chomczynski and Sacchi [4]. The cDNA was synthesized using oligo(dT) primer (Gibco BRL Life Technologies Inc.) after Gubler and Hoffman [5]. The oligonucleotide primers for the polymerase chain reaction (PCR, [6]) were designed based on the sequences of five arthropod species, namely Anopheles gambiae [7], Bombyx mori [8], Drosophila melanogaster [9], Hyphantria cunea [10] and Manduca sexta [11,12]. Primers (0.2 µmol each) are given in Fig. 2. Numbers show the approximate locations of the primers used in this study. The PCR was run for 35 cycles, each of which consisted of denaturation at 94 °C for 1 min, annealing at 55-60 °C for 1 min, and extension at 72 °C for 2 min. The first PCR product was used for the secondary "nested" PCR experiment. The PCR products obtained were amplified again by PCR with the same pair of primers before the analysis of the cDNA sequence. The PCR products were electrophoresed and stained with ethidium bromide. DNA sequencing was performed with an automatic DNA sequencer (Model 377, PE Applied Biosystems). The 10 partially overlapping PCR products were subjected to DNA sequence analyses. From these 10 products, the minimum size of the translated region was 1,011 bp. The amino acid sequence was deduced from the cDNA sequence, and the number of amino acid residues was 337 in MoxGM95. The insert sequence comprised 74 bases; it coded for the amino acid residues VSWARRLES and was then terminated by the stop codon TGA (Fig. 3). Numbers at the margin correspond to positions in the MoxGM95 cDNA. Alignment of the amino acid residues of Mox and MoxGM95 revealed two spontaneous mutations, 12M-L and 291V-E (Fig. 4).
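The relation between the 1,011 bp translated region and the 337 deduced residues, and the way an in-frame stop codon truncates the product, can be illustrated with a minimal Python sketch. It assumes the standard genetic code. The input fragment is illustrative only: ten codons resembling the start of the coding region, followed by a TGA added to demonstrate termination. It is not the deposited sequence, and the function name is hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(cdna, start=0):
    """Translate from `start` until the first stop codon or the end."""
    residues = []
    for i in range(start, len(cdna) - 2, 3):
        aa = CODON_TABLE[cdna[i:i + 3]]
        if aa == "*":  # stop codon ends the polypeptide
            break
        residues.append(aa)
    return "".join(residues)


# Illustrative fragment: ten codons followed by the stop codon TGA.
fragment = "ATGACTAACACGGATCTGAAAGCCTTGGAATGA"
print(translate(fragment))

# 1,011 translated bases correspond to 1,011 / 3 codons.
print(1011 // 3, 1011 % 3)
```

The same loop shows why an insertion carrying an in-frame TGA yields a shortened protein: translation stops at the first stop codon regardless of how much coding sequence follows.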

[Figure 2 schematic: primer locations [1] to [11] along the Mox cDNA from 5' to 3', with the copper-binding sites marked Cu]

No.    Name    Length   Sequence (5' to 3')
[1]    5'      30 mer   CCATGACTAACACGGATCTGAAAGCCTTGG
[2]    5F-2    20 mer   GTTGCCCATCGCCCGGATAC
[3]    5R-1    20 mer   GTATCCGGGCGATGGGCAAC
[4]    Cu5'    34 mer   CaT/CCaTTGGCBCTGGCBTT/CTG/CGTC/G/TTBCCC
[5]    Cu3'    29 mer   GTGCCfiC/GCG/TGTMHUKSaTa/C/G/TGGATC
[6]    SH-2    18 mer   GCCBGTGCOUiTGATGGC
[7]    Cu5-2   17 mer   GCTGGCGTGATCGCGTG
[8]    Cu3-2   19 mer   GCATAGGACTGGCGATGCC
[9]    3F-1    20 mer   CGCCTCGTTCACCCATCTCC
[10]   3R-1    20 mer   GGAGATGGGTGAACGAGGCG
[11]   3'      29 mer   CGGTCTTGGCCATATTGCCAAAGGCACCG

Figure 2. Oligonucleotide primers for PCR of Mox. Cu above the open square indicates the copper-binding site.

[Figure 3: aligned nucleotide sequences of the MoxGM95 and wild type (w) cDNAs]

Figure 3. Minimum nucleotide sequence of the MoxGM95 cDNA. The nucleotide sequence was determined by RT-PCR.

[Figure 4: aligned amino acid sequences of Mox and MoxGM95]

Figure 4. The predicted MoxGM95 gene protein sequence. The MoxGM95 gene protein of 337 amino acid residues, deduced from the cDNA sequence of PCR-amplified products, is presented.

4 Functional Aspect of MoxGM95

The prophenoloxidase protein in Drosophila melanogaster has two copper-binding sites, Cu(A) and Cu(B), as its enzyme catalytic domain, and has high sequence homology among several arthropods. The cDNA sequence and the amino acid residues of MoxGM95 were compared with each other and with the corresponding wild type, Mox, information. The insertion in MoxGM95 is located at the putative catalytic site of the phenoloxidase protein, between the copper-binding sites of Mox, and the Cu(B) site could be absent in MoxGM95 owing to abnormal biosynthesis of the prophenoloxidase protein. Southern analyses were performed on genomic DNAs digested with eight restriction endonucleases: Bam HI, Eco RI, Eco RV, Hin dIII, Pst I, Sac I, Sau 3A, and Xho I (Asada et al., unpublished data). It is possible that, owing to the insertion sequence in MoxGM95, the catalytic site of phenoloxidase has no affinity for the substrate(s) because the protein is incomplete and cannot alter its conformation after activation of the prophenoloxidase molecule.


It appears that MoxGM95 is a stop codon mutation, in agreement with genetic predictions that phenoloxidase acts as an indispensable protein to maintain life in Drosophila [13]. Further analyses of the precise structure and nature of the MoxGM95 and Dox-3KD95 sequences will shed more light on the mechanisms of the isozyme.

5 Acknowledgments

We are grateful to Drs. S. Tomino, Okayama University of Science, X. G. Xiong, Chinese Academy of Sciences, and L. I. Korochkin, Russian Academy of Sciences, for their valuable advice throughout the study. We thank Messrs. T. Hanai, R. Okumura, and M. Ohta, Okayama University of Science, for their technical assistance. The nucleotide sequence reported in this paper has been submitted to the DDBJ/EMBL/GenBank database with the accession number AB041265.

References

1. Ashida, M., and Yamazaki, H.I., In: Molting and Metamorphosis (Japan Scientific Societies Press/Springer-Verlag, Tokyo, Berlin, 1990).
2. Asada, N., Fukumitsu, T., Fujimoto, K., and Masuda, K., Insect Biochem. Molec. Biol. 23 (1993) pp. 515-520.
3. Asada, N., J. Exp. Zool. 282 (1998) pp. 28-31.
4. Chomczynski, P., and Sacchi, N., Anal. Biochem. 162 (1987) pp. 156-159.
5. Gubler, U., and Hoffman, B.J., Gene 25 (1983) pp. 263-269.
6. Saiki, R.K., Scharf, S., Faloona, F., Mullis, K.B., Horn, G.T., Erlich, H.A., and Arnheim, N., Science 230 (1985) pp. 1350-1354.
7. Jiang, H., Wang, Y., Congcong, M., and Kanost, M.R., Insect Biochem. Molec. Biol. 27 (1997) pp. 693-699.
8. Kawabata, T., Yasuhara, Y., Ochiai, M., Matsuura, S., and Ashida, M., Proc. Natl. Acad. Sci. USA 92 (1995) pp. 7774-7778.
9. Fujimoto, K., Okino, N., Kawabata, S., Iwanaga, S., and Ohnishi, E., Proc. Natl. Acad. Sci. USA 92 (1995) pp. 7769-7773.
10. Park, D.-S., Shin, S.W., Kim, M.G., Park, S.S., Lee, W.-J., Brey, P.T., and Park, H.-Y., Insect Biochem. Molec. Biol. (1997) pp. 983-992.
11. Hall, M., Scott, T., Sugumaran, M., Soderhall, K., and Law, J., Proc. Natl. Acad. Sci. USA 92 (1995) pp. 7764-7768.
12. Jiang, H., Wang, Y., Korochkina, S.E., Benes, H., and Kanost, M.R., Insect Biochem. Molec. Biol. 27 (1997) pp. 693-699.
13. Asada, N., Kawamoto, N., and Sezaki, H., Dev. Comp. Immunol. 23 (1999) pp. 535-543.

MOLECULAR INFORMATION FUSION FOR METABOLIC NETWORKS

RALF HOFESTÄDT, MATTHIAS LANGE, AND UWE SCHOLZ

Bioinformatics / Medical Informatics, Institute of Technical and Business Information Systems, Otto-von-Guericke-University Magdeburg, P.O. Box 41 20, 39016 Magdeburg, Germany
E-mail: {hofestae|mlange|uscholz}@iti.cs.uni-magdeburg.de

Today molecular information systems are available that integrate different molecular database systems. For example, the electronic information system KEGG represents the biochemical pathways and allows access to different database systems, but these show only a static representation of the molecular data and knowledge. The next important step is to implement molecular information systems that integrate different molecular database systems with analysis tools. In this paper we present an Integrative Molecular Information System for the simulation of metabolic networks.

1 Introduction

The architecture of our molecular information system allows the information fusion based on different database systems. For the simulation of metabolic networks we use the kernel of our simulation environment Metabolika which enables the interactive simulation of biochemical networks [6]. The idea of the MARGBench project [4] is to connect the simulation kernel with the WWW data sources using the database integration software. Therefore, molecular knowledge can be transferred into analytical metabolic rules - the language of Metabolika. Based on that integration software the simulation of metabolic processes is available. The configuration of Metabolika is represented by the actual metabolite concentrations. Metabolika allows the calculation of all possible configurations (derivation tree) based on the selected metabolic knowledge (biochemical scenario) and the start configuration.

2 Modelling of Metabolic Networks

The availability of the rapidly increasing volume of molecular data on genes, proteins and metabolic pathways enhances our capability to study cell behaviour. To understand the molecular logic of cells, we must be able to analyse metabolic processes and gene networks in qualitative and quantitative terms. Therefore, modelling and simulation are important methods.

2.1 Rule-based model

The kernel of our system is an inference mechanism that can be interpreted as a rule-based system [6]. Our model is an extension of the Chomsky type-0 grammar. By defining a global rule, this formalization allows the representation of genetic, biosynthetic, and cell communication processes. We extend this discrete model with abstract concentration rates (integer values), which is realized by using multi-sets for the representation of metabolites. Metabolites are molecular structures or substance concentrations. Enzymes are proteins which catalyze biochemical reactions, whereas inducers and repressors are metabolites which are able to accelerate or to slow down (prevent) biochemical reactions. In our model the biochemical concentration of a cell is a mixture of these components. With these definitions, the abstract metabolism is given by the current cell state (configuration) and the specific metabolic reaction rules. The basic unit of our system is therefore the metabolic rule, a formal construction which is able to describe different metabolic reactions. In this section we present the basic structure of the rule-based system.

In a biochemical reaction, a substance or a concentration (S) is transformed into a product or a concentration (P). This metabolic reaction can be influenced by a concentration of inhibitor metabolites (I) and/or a concentration of enhancer metabolites (E). In terms of formal languages and grammars, the substance or the substance concentration can be interpreted as the left side of the rule and the metabolic product as the right side of the metabolic rule (Figure 1).

Figure 1. Metabolic Rule: S (substrate), P (product), I (inhibitor), E (enhancer) and p (rule probability)

Inhibitors reduce the flux and enhancers increase it. The actual influence of I and E depends on the concentration of these elements in the current configuration. Furthermore, each element of the sets I and E represents a specific function with two parameters: the first is the specific threshold of the metabolite and the second is its reaction behaviour (kinetics). For the reaction behaviour, the user of our simulation environment can choose between three different functions: hyperbolic, sigmoid, and linear. The literature shows that metabolites are characterized by their specific reaction behaviour; sigmoid and hyperbolic behaviour seem to be quite common.

A 5-tuple (S, P, E, I, p) with p ∈ [0,1]_Q and the multi-sets S, P, E, I is called a metabolic rule. p is called the rule probability, S (substance) a set of preconditions, P (product) a set of postconditions, E (enhancer) a set of catalyzed conditions, and I (inhibitor) a set of inhibitor conditions. Based on the metabolic rule we are able to define the basic model. G = (Z, R, z0) is called a metabolic system. Z is a set of configurations, z0 is called the start configuration, and R is a set of metabolic rules, which is called the metabolic rule set.

The first step in using the rule-based model is to represent the metabolic knowledge in the specific rule-based system. In the case of biochemical reactions we can translate metabolic pathways directly. In a graphical representation of metabolic pathways [11], every edge (sub graph) represents a metabolic rule as follows: the left node (father) of this sub graph, from which the edge goes out, is the S multi-set, and the right node is the P multi-set. Enhancers and inhibitors are indicated by circles or numbers pointing to that edge.

Figure 2. Part of the urea cycle from the KEGG system (KEGG map "Urea cycle and metabolism of amino groups")

Using our prototype BioBench, this metabolic pathway, which is shown in Figure 2, can be picked up from the KEGG system [12] located in Japan and will be translated directly into the language of Metabolika. The following example illustrates the translation process of the urea cycle from the KEGG-system into the language of the simulation tool. In this case the EC numbers are replaced by the full enzyme names (e. g. 2.1.3.3 is replaced by Ornithine carbamoyltransferase).

• rule r1: ({Carbamoyl-Phosphate, Ornithine}, {Citrulline}, {Ornithine carbamoyltransferase}, ∅, 1.0)
• rule r2: ({Citrulline, Aspartate}, {L-Arginino-succinate}, {Argininosuccinate synthase}, ∅, 1.0)
• rule r3: ({L-Arginino-succinate}, {Fumarate, Arginine}, {Argininosuccinate lyase}, ∅, 1.0)
• rule r4: ({Arginine}, {Ornithine, Urea}, {Arginase}, ∅, 1.0)
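As an illustration only, the four urea-cycle rules above can be written down as plain data. The sketch below is not the Metabolika rule language: the class name MetabolicRule, its field names, the helper rule() and the use of Python's Counter as a multi-set are all assumptions made for this example.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class MetabolicRule:
    """A metabolic rule (S, P, E, I, p); the field names are illustrative."""
    name: str
    substrates: Counter   # S: multi-set of preconditions
    products: Counter     # P: multi-set of postconditions
    enhancers: Counter    # E: multi-set of catalyzed conditions
    inhibitors: Counter   # I: multi-set of inhibitor conditions
    probability: float    # p: rule probability in [0, 1]


def rule(name, s, p, e, i=(), prob=1.0):
    """Hypothetical helper: build a rule from plain lists of metabolite names."""
    return MetabolicRule(name, Counter(s), Counter(p), Counter(e), Counter(i), prob)


# The four rules exactly as listed in the text (empty inhibitor set, p = 1.0).
UREA_RULES = [
    rule("r1", ["Carbamoyl-Phosphate", "Ornithine"], ["Citrulline"],
         ["Ornithine carbamoyltransferase"]),
    rule("r2", ["Citrulline", "Aspartate"], ["L-Arginino-succinate"],
         ["Argininosuccinate synthase"]),
    rule("r3", ["L-Arginino-succinate"], ["Fumarate", "Arginine"],
         ["Argininosuccinate lyase"]),
    rule("r4", ["Arginine"], ["Ornithine", "Urea"], ["Arginase"]),
]

for r in UREA_RULES:
    print(r.name, sorted(r.substrates), "->", sorted(r.products), "p =", r.probability)
```

A Counter is used because it stores integer multiplicities, which matches the abstract (integer) concentrations of the model.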

Beyond biochemical reactions, we can also model the processes of gene regulation (micro-pathways) and the processes of cell communication. For this purpose, the BioBench server allows access to the TRANSFAC database [5].

Figure 3. Model of the gene regulation process (legend: P = Promoter, S = Structure Gene, Op = Operator, T = Terminator Gene, E = Enzyme, R = Regulator Gene)

The following rule set shows how the gene regulation process (see Figure 3) can be translated into the language of Metabolika:

• rule r_a: (∅, {Repressor}, {RNA-Polymerase, ATP, P1}, ∅, p)
• rule r_b: ({A}, {Inductor}, ∅, ∅, p)
• rule r_c: ({Inductor, Repressor}, {Inductor-Repressor}, ∅, ∅, p)
• rule r_d: ({Inductor-Repressor}, {E1, E2}, {RNA-Polymerase, ATP, P2}, ∅, p)
• rule r_e: ({A}, {A'}, {E1}, ∅, p)
• rule r_f: ({A'}, {A''}, {E2}, ∅, p)

Using metabolic rules, the modelling of cell communication processes (see Figure 4) is simple. A metabolite enters (leaves) the cell if only the P (S) component of the corresponding rule is not empty. In addition, the inhibitor and enhancer components of the metabolic rule allow the simulation of receptor effects.

Figure 4. The abstract cell communication process

• rule r_s: ({s}, ∅, ∅, ∅, 1.0)
• rule r_p: (∅, {p}, ∅, ∅, 1.0)

Metabolic networks are therefore represented by a set of metabolic rules. The processing mechanism of our model is as follows. A rule is called activated if the elements (concentrations) of its S component are elements (concentrations) of the current configuration. Any activated rule r can go into action. The action of r modifies the current cellular state of the metabolic system: all elements of the current cellular state which are elements of the substance set (S) of rule r are eliminated, and all elements of the product component (P) are added. The action of rule r is thus a substitution of P for S, which can produce a new configuration.

Example: Consider the rule set of the urea pathway and let (Carbamoyl-Phosphate, Ornithine, Ornithine carbamoyltransferase) be the current cell state. Only the first rule is activated, and it will go into action because its probability value is 1.0. The action of that rule consumes the Carbamoyl-Phosphate and Ornithine molecules and produces a Citrulline molecule. This biochemical reaction is catalyzed by the enzyme Ornithine carbamoyltransferase.

The one-step derivation of a metabolic system is defined by the (quasi) simultaneous action of all activated rules. We therefore consider the set of all activated rules and determine two new sets: the before-set and the after-set. The before-set includes all before elements of the activated rules; the after-set is defined analogously. Using these sets, the one-step derivation can be interpreted as an addition and subtraction of concentrations.

Example: Consider again the rule set of the urea pathway, and let the current cell state be (Carbamoyl-Phosphate, Ornithine, Ornithine carbamoyltransferase, Citrulline, Aspartate, Argininosuccinate synthase, Arginine, Arginase). Here the first (r1), second (r2) and last rule (r4) are activated. Therefore, different one-step derivations can be produced (non-deterministic rule system).
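The activation test and the action of rules described above can be sketched in a few lines. This is a minimal sketch under stated assumptions, not the Metabolika kernel: the function names are invented, enhancers, inhibitors and probabilities are left out because activation as defined above depends only on the S component, and the reading that any non-empty subset of the activated rules may fire together is an assumption about how the non-determinism arises.

```python
from collections import Counter
from itertools import combinations

# Urea-cycle rules as (name, S, P) triples; E, I and p are omitted here.
RULES = [
    ("r1", Counter(["Carbamoyl-Phosphate", "Ornithine"]), Counter(["Citrulline"])),
    ("r2", Counter(["Citrulline", "Aspartate"]), Counter(["L-Arginino-succinate"])),
    ("r3", Counter(["L-Arginino-succinate"]), Counter(["Fumarate", "Arginine"])),
    ("r4", Counter(["Arginine"]), Counter(["Ornithine", "Urea"])),
]


def activated(config, rules=RULES):
    """Return the rules whose S multi-set is contained in the configuration."""
    return [r for r in rules if all(config[m] >= n for m, n in r[1].items())]


def act(config, fired):
    """Apply the given rules at once: subtract the before-set, add the after-set."""
    before = sum((r[1] for r in fired), Counter())
    after = sum((r[2] for r in fired), Counter())
    result = Counter(config)
    result.subtract(before)
    result.update(after)
    return +result  # unary plus drops entries whose count is not positive


def one_step_derivations(config):
    """Yield (rule names, new configuration) for each non-empty subset of activated rules."""
    active = activated(config)
    for k in range(1, len(active) + 1):
        for subset in combinations(active, k):
            yield [r[0] for r in subset], act(config, subset)


# The two cell states used in the examples of the text.
state1 = Counter(["Carbamoyl-Phosphate", "Ornithine", "Ornithine carbamoyltransferase"])
state2 = Counter(["Carbamoyl-Phosphate", "Ornithine", "Ornithine carbamoyltransferase",
                  "Citrulline", "Aspartate", "Argininosuccinate synthase",
                  "Arginine", "Arginase"])

print([r[0] for r in activated(state1)])
print(dict(act(state1, activated(state1))))
print([r[0] for r in activated(state2)])
print(len(list(one_step_derivations(state2))))
```

Rules that compete for the same substrate molecule are not handled; the sketch simply drops counts that fall to zero or below.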
Each action can be interpreted as an independent event. Therefore, the probability of each one-step derivation can be calculated from the absolute probability values of all activated and deactivated rules. In our simulation system this is done by multiplying these values. One-step derivations inductively produce complex derivation trees of configurations.

Based on the theory of the rule-based modelling of metabolic processes we developed the simulation shell Metabolika [6]. Metabolika allows the integrative simulation of biochemical networks including cell communication processes. Metabolika is implemented in C and runs on a SUN Sparc workstation. Its main parts are the rule editor and the configuration editor/browser.

2.2 Theoretical aspects

The set of the reachable configurations is an infinite set, and the set of all derivations is enumerable. Moreover, the set of all configurations is not decidable [7]. The use of concentrations (multi-sets) is the main reason for the undecidability. Taken literally, this result would imply that no interesting question can be solved in the research field of biotechnology. In practice, however, biochemical systems are restricted. In our model we can restrict the depth and width of the derivation process. Important questions then become decidable, and we have to discuss the complexity of the derivation algorithm. If we restrict the derivation depth, the language L(G, i) is decidable. L(G, i) is the set of all configurations which can be produced from the start configuration by the application of up to i derivation steps. Hence, for a metabolic system with the generation depth i, it is decidable whether a configuration k is a member of L(G, i). Because of the exponential complexity of the derivation process, this question cannot be solved in practice if i is high, and the calculation of a complete derivation tree is then not feasible. Therefore, when using our simulation tool we have to restrict the derivation depth and/or width.

3 Information Fusion

The presence of numerous informational and programming resources on gene networks, metabolic processes, gene expression regulation, etc., described above, raises an acute problem of data integration and suitable access. The goal of such integration is to create a virtual information environment that gives access to the significant information through simultaneous exploration of many databases available via the Internet. Effective possibilities for database integration are provided by World Wide Web technology. One of the most developed technologies for WWW integration of molecular databases is the Sequence Retrieval System (SRS). It is based on local copies of each component database, which have to be provided in a text-based format. The results of a query are sets of WWW links, through which the user can navigate. Up to now, several hundred molecular biology databases have been integrated under SRS [3]. However, within this approach, data fusion is still a task of the user. We also do not find real data fusion: data for one real-world object (e.g. an enzyme) coming from two different databases (e.g. KEGG and BRENDA [14]) is represented twice, by different WWW page objects. Therefore, research groups try to integrate molecular databases on a higher level than the SRS approach. To this end, they apply results of current database research, e.g. federated database systems, data warehousing architectures or data mining techniques [1]. Many bioinformatics problems require

1. access to data sources that are high in volume, highly heterogeneous and complex, constantly evolving and geographically dispersed,
2. solutions that involve multiple carefully sequenced steps,
3. information to be passed smoothly between the steps,
4. an increasing amount of computation and
5. an increasing amount of visualisation.

BioKleisli (see [2]) is an advanced technology designed to handle the first three requirements directly. In particular, BioKleisli provides the high-level query language CPL, which can be used to express complicated transformations across multiple data sources in a clear and simple way. In addition, while BioKleisli does not handle the last two requirements directly, it is capable of distributing computation to appropriate servers and initiating visualisation programs. The idea of our project is to present a virtual laboratory for the analysis of molecular processes (diseases). Therefore, we integrated different specific database systems which represent molecular and medical knowledge, together with a simulation environment (see [9, 4] for more information).

4 The Biomedical Workbench

As mentioned before, the analysis of existing systems shows that, on the one hand, many database systems which contain data about biochemical reactions are available. On the other hand, powerful analysis tools, e.g. simulation tools, exist in this domain. We built an integrated molecular information system called the Biomedical Workbench (BioBench) [9]. The idea of this system is to present a virtual laboratory for the analysis of molecular processes. For that reason we integrated different database systems which represent molecular and medical knowledge. We call this integration information fusion. It gives us the possibility to detect equal data in various databases. The graphical user interface gives the user access to a compact local information system. For the modelling and simulation of metabolic processes, the specific biochemical knowledge is identified by using these database systems. As the next step, this knowledge is transferred automatically into analytical metabolic rules, the language of Metabolika. The simulation of these biochemical reactions is produced by the kernel of Metabolika, and the results of the simulation are visualized by a special visualization component (VisTool).

The BioBench prototype integrates three different databases: KEGG (see [12]), MDDB (see [8]) and parts of the TRANSFAC database (see [5]). This integration is based on fixed, hard-coded special adapters for the access to the different systems. These adapters transfer the data of the component systems into a unique form. The basis for the unique representation form is a global data model which integrates the models of the component systems. Figure 5 shows the architecture of the prototype.

[Figure 5 diagram, from top to bottom: Graphical User Interface (runs on a WWW client) with Search-Tool, Information Viewer, Metabolika User Interface, Metabolika Rule-Editor and Visualisation Tool; Virtual Integrated Interface (Sun Web-Server and RMI); Transfac, MDDB, KEGG, GP-DB and Metabolika adapters; KEGG Database and Global Pathways DB]

Figure 5. Architecture of the BioBench prototype
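The adapter idea described above can be illustrated with a small sketch. Everything concrete in it is assumed for illustration and is not taken from BioBench: the two source record layouts, the field names, the GlobalEnzyme class, the placeholder disease entry, and the choice of the EC number as the key for detecting equal data.

```python
from dataclasses import dataclass, field


@dataclass
class GlobalEnzyme:
    """Entry of a hypothetical global data model, keyed by EC number."""
    ec_number: str
    names: set = field(default_factory=set)
    pathways: set = field(default_factory=set)
    diseases: set = field(default_factory=set)
    sources: set = field(default_factory=set)


def pathway_adapter(record):
    """Map a made-up pathway-database record onto the global model."""
    return GlobalEnzyme(record["ec"], {record["enzyme"]}, {record["map"]},
                        set(), {"pathway-db"})


def disease_adapter(record):
    """Map a made-up disease-database record onto the global model."""
    return GlobalEnzyme(record["EC"], {record["name"]}, set(),
                        {record["disease"]}, {"disease-db"})


def fuse(entries):
    """Merge entries that describe the same enzyme (same EC number)."""
    fused = {}
    for e in entries:
        g = fused.setdefault(e.ec_number, GlobalEnzyme(e.ec_number))
        g.names |= e.names
        g.pathways |= e.pathways
        g.diseases |= e.diseases
        g.sources |= e.sources
    return fused


# Two records for the same enzyme in two differently structured sources.
from_pathways = {"ec": "2.1.3.3", "enzyme": "Ornithine carbamoyltransferase",
                 "map": "Urea cycle and metabolism of amino groups"}
from_diseases = {"EC": "2.1.3.3", "name": "Ornithine carbamoyltransferase",
                 "disease": "placeholder disease entry"}

fused = fuse([pathway_adapter(from_pathways), disease_adapter(from_diseases)])
for ec, entry in fused.items():
    print(ec, sorted(entry.names), sorted(entry.sources))
```

Each adapter hides one source layout, so the fusion step only ever sees the global form; this is also why a change in a source interface forces a change in exactly one adapter.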

So far, the main application of the BioBench prototype has been the analysis of metabolic pathways. One component database system, the MDDB, stores information about metabolic diseases. These diseases are caused by genetic defects, with the result that a particular enzyme of the metabolism cannot be synthesised. Consequently, a particular biochemical reaction cannot be catalyzed because of the missing enzyme, and substrates cannot be transformed into products. With the help of BioBench and the included simulation tool it is possible to analyze this break in a metabolic pathway and to look for alternative pathways. Furthermore, based on the BioBench prototype, we are developing a new system called MARGBench.

In summary, the biggest advantage of BioBench is its ability to generate the input data for the simulation tool automatically. The user does not need to type in the simulation parameters manually. Nevertheless, the fixed architecture of the system is a disadvantage. The adapters are hard-coded software modules: if the access interface of any component database is modified, the corresponding adapter must be reimplemented. Moreover, only one simulation tool is linked in, and this link is fixed.

5 Logical Integration of Molecular Databases

In the field of molecular biology, a steady stream of data is generated and researchers are collecting their knowledge in fast-growing databases. These databases are distributed around the world. Several of them are accessible through the Internet, which is the basis for wide and public use. For their work, biologists need a wide range of these molecular-biological data, so it is necessary to look at as many databases as possible. That raises the question of how to obtain the data. The first, manual way is to browse the databases interactively. With this approach it is not really possible to make the data available to a computer program. It is useful for investigation, but getting these data as input into a computer program, e.g. via "cut and paste" or typing, is not very workable. Another approach is computer-supported data access. It requires an additional software layer that provides the mechanisms for access and delivery of merged data. The biological tools then do not access the several databases directly; rather, they get the data from a special software layer called the Database Integration Server. This raises the following requirements:

• complex declarative queries
• standardized software interface
• user defined data views
• transparent merging of databases
• solving the several kinds of database heterogeneity
• transparent physical database access

To fulfil the above requirements, four approaches are proposed [10]:

1. Hypertext navigation (see [3])
2. Data warehouse (see [13])
3. Multi database queries (see [2])
4. Federated databases (see [1])

In connection with the research project Modelling and Animation of Regulative Gene Networks, an architecture has been designed which realizes a logical database integration based on the concept of federated databases. The system architecture (see WWW address: http://wwwiti.cs.uni-magdeburg.de/~mlange/BioDataServer/) has been implemented as a prototype called BioDataServer [4]. Workable Internet access to the molecular-biological databases is the main prerequisite for database integration. Several problems have to be solved here:

• different interfaces (e.g. CGI, JDBC)
• different query languages (SQL, OQL, non-standardized)
• different data presentations (HTML, flat files, database objects)
• different data structures (static, dynamic)

To hide the access heterogeneity, the BioDataServer uses adapters for the physical data access. For each data source a special adapter exists which handles the data retrieval. In the case of an HTML-based data source, the adapter accesses the specific URL and parses the resulting HTML page. Current work studies the possibilities of semiautomatic generation of adapters (see [4]). To obtain a complete and wide spectrum of data it is advisable to access as many databases as possible. Queries are therefore distributed to each relevant database. The information on how to distribute the queries is stored in an integrated user scheme. This scheme is relational and defines the source for each attribute. On the basis of these schemes, the BioDataServer accesses the related attributes at the specific databases. Furthermore, an automatic mechanism for merging the attributes from the various databases is necessary. This is the task of the integration layer of the BioDataServer and can be solved using mathematical set operations.

The premise for accessing the Database Integration Server from computer programs is the definition of an interface. Because the server should be accessible through the Internet, a communication protocol and a query language must be specified. Many database systems today support SQL as the query language, which is based on the relational model and is standardised. Nearly all commercial database systems support SQL, and it is thereby established worldwide. This was the reason for supporting SQL in the BioDataServer. In the field of interfaces for remote database access, different techniques have been established, e.g. JDBC and ODBC. ODBC is currently only supported by Microsoft platforms. Therefore the BioDataServer offers a JDBC driver which provides standardized database access to JAVA applications. Consequently, any JAVA platform can access the BioDataServer through JAVA programs.
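The integrated user scheme and the set-based merging described above can be sketched as follows. This is an illustration under assumptions, not the BioDataServer implementation, which is a JAVA application accessed through SQL and JDBC: the scheme layout, the source names db_a and db_b, the triple representation and all data values are made up.

```python
# Hypothetical integrated user scheme: each attribute of a virtual relation
# names the component database that supplies it.
USER_SCHEME = {
    "name": "db_a",
    "pathway": "db_a",
    "disease": "db_b",
}

# Stand-ins for adapter results: each source returns a set of
# (key, attribute, value) triples. The contents are made up.
SOURCES = {
    "db_a": {
        ("2.1.3.3", "name", "Ornithine carbamoyltransferase"),
        ("2.1.3.3", "pathway", "Urea cycle"),
        ("3.5.3.1", "name", "Arginase"),
        ("3.5.3.1", "pathway", "Urea cycle"),
    },
    "db_b": {
        ("3.5.3.1", "name", "Arginase"),
        ("3.5.3.1", "disease", "placeholder disease entry"),
    },
}


def distribute(attributes):
    """Group the requested attributes by the source named in the scheme."""
    plan = {}
    for attr in attributes:
        plan.setdefault(USER_SCHEME[attr], set()).add(attr)
    return plan


def query(attributes):
    """Fetch each attribute from its source and merge the results by set union."""
    merged = set()
    for source, attrs in distribute(attributes).items():
        merged |= {t for t in SOURCES[source] if t[1] in attrs}
    return merged


plan = distribute(["name", "disease"])
result = query(["name", "disease"])
print(plan)
for triple in sorted(result):
    print(triple)
```

Because the scheme assigns every attribute to exactly one source, the duplicate name entry in db_b is never fetched and the union contains no conflicting values. A scheme that allowed several sources per attribute would need an explicit conflict rule.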
To fulfil the extended requirements for a universally applicable database integration server, a new architecture was developed and has been implemented as a JAVA application. The main advantages of this BioDataServer are

• the transparent physical database access,
• dynamic building of a new virtual, logical integrated database,
• standardized access interface,
• client-server capability and
• the platform independency.

In summary, this kind of database integration is a step towards the standardized integration of worldwide distributed molecular databases and the related software tools.

6 Summary and Outlook

First, this paper gave a short description of the current situation in the field of bioinformatics. On the one hand, a huge amount of data in heterogeneous systems is available. On the other hand, more and more powerful analysis tools are emerging. One main research interest is the integration of the databases and the analysis tools. Following this line, we illustrated the theory of our rule-based approach, which is implemented in the Metabolika system. An example showed the simulation possibilities of this tool. Furthermore, the idea of information fusion and our prototype BioBench were described. Finally, an approach for the logical integration of molecular databases was presented in detail and some advantages of this integration were illustrated. The most frequently introduced concepts of information fusion are implemented in our MARGBench prototype. In our current work we are connecting the pieces of our system. After extensive and intensive tests we plan to make our system available on the WWW. More information about the prototype and our project is available at: http://wwwiti.cs.uni-magdeburg.de/iti_bm/marg/.

7 Acknowledgements

This work is supported by the German Research Council (DFG).

References
1. Conrad S., Federated Database Systems: Concepts of Data Integration. Springer-Verlag, Berlin/Heidelberg, 1997. (In German).
2. Davidson S. B., Overton C., Tannen V. and Wong L., BioKleisli: a digital library for biomedical researchers. International Journal on Digital Libraries, 1 (1997) pp. 36-53.
3. Etzold T., Ulyanow A. and Argos P., SRS: Information Retrieval System for Molecular Biology Data Banks. Methods in Enzymology, 266 (1996) pp. 114-128.
4. Freier A., Hofestädt R., Lange M. and Scholz U., Integration, Modellierung und Simulation metabolischer Wirknetze. Preprint 13, Fakultät für Informatik, Universität Magdeburg, 1999. (In German).
5. Heinemeyer T., Chen X., Karas H., Kel A. E., Kel O. V., Liebich I., Meinhardt T., Reuter I., Schacherer F. and Wingender E., Expanding the TRANSFAC database towards an expert system of regulatory molecular mechanisms. Nucleic Acids Research, 27(1) (1999) pp. 318-322.
6. Hofestädt R. and Meinecke F., Interactive Modelling and Simulation of Biochemical Networks. Computers in Biology and Medicine, 25(3) (1995) pp. 321-334.
7. Hofestädt R., Theorie der regelbasierten Modellierung des Zellstoffwechsels. Aachen: Shaker, 1996. (In German).
8. Hofestädt R., Prüß M., Scholz U. and Urban H., The Metabolic Diseases Database (MDDB) - A Molecular Database Toolkit for the Detection of Inborn Errors. In O. Zimmermann and D. Schomburg, editors, Proceedings of the German Conference on Bioinformatics (GCB '98), Köln, October 7-10, 1998.
9. Hofestädt R. and Scholz U., Information Processing for the Analysis of Metabolic Pathways and Inborn Errors. BioSystems, 47(1-2) (1998) pp. 91-102.
10. Karp P. D., A Strategy for Database Interoperation. Journal of Computational Biology, 2(4) (1995) pp. 573-586.
11. Michal G., Biochemical Pathways. Heidelberg: Spektrum Akademischer Verlag, 1999.
12. Ogata H., Goto S., Kazushige S., Fujibuchi W., Bono H. and Kanehisa M., KEGG: Kyoto Encyclopedia of Genes and Genomes. Nucleic Acids Research, 27(1) (1999) pp. 29-34.
13. Ritter O., Comprehensive Genome Information Systems. In S. Suhai, editor, Theoretical and Computational Methods in Genome Research (Plenum Press, New York, 1997) pp. 177-184.
14. Schomburg D., Schomburg I., Chang A. and Bänsch C., BRENDA the Information System for Enzymes and Metabolic Information. In R. Giegerich, R. Hofestädt, T. Lengauer, W. Mewes, D. Schomburg, M. Vingron, and E. Wingender, editors, Proceedings of the German Conference on Bioinformatics (GCB '99), Hannover, October 4-6, pages 226-227, 1999.

INTRON-SIZE AND EXON POLYMORPHISMS IN THE MOUSE TISSUE-NONSPECIFIC ALKALINE PHOSPHATASE GENE

NILS FROHLANDER AND JOSE LUIS MILLAN

Department of Medical Genetics, University of Umeå, Umeå, S-901 87 Sweden and The Burnham Institute, 10901 North Torrey Pines Road, La Jolla, CA 92037, USA

Restriction analysis of genomic DNA from five mouse strains revealed previously unknown variations of the mouse TNAP gene, i.e. point mutations and intron-size differences. Our cloning and sequencing of 17.5 kb of the Balb/c TNAP gene allowed for a detailed analysis of these polymorphisms. Comparing the exonic sequences with previously published cDNAs, we found amino acid substitutions compatible with protein polymorphisms observed electrophoretically. Most variations, however, are due to intronic sequence polymorphisms, including variable length simple sequence repeats.

1 Introduction

Alkaline phosphatases (APs) are members of a multi-gene family that includes four genes in humans and three active genes in mice. These genes are classified according to their ubiquitous or restricted expression as tissue-nonspecific (TNAP) or tissue-specific AP genes, i.e., the intestinal AP, placental AP and germ cell AP genes in humans and the embryonic AP and intestinal AP genes in the mouse. While the human, rat and mouse TNAP genes extend over approximately 40-50 kb of DNA, the structure of the tissue-specific genes is very compact, with each gene composed of less than 5 kb of DNA [3,4]. The human tissue-specific APs, particularly the placental AP and the germ cell AP genes, display a high degree of genetic polymorphism [3]. In contrast, no genetic variability has been reported for any of the murine AP genes. However, we encountered unexpected difficulties while attempting to target the mouse TNAP (mTNAP) gene in the process of generating a mouse model of infantile hypophosphatasia [6]. The fact that several homologous recombination constructs failed to produce the expected targeted events prompted us to investigate more closely the structure of the mTNAP gene. In this paper we have sequenced and characterized the mTNAP gene and compared it, by restriction analysis and limited sequencing, to the TNAP gene from different mouse strains. The results indicate a substantial degree of polymorphism explained not only by point mutations, but also by intron-size differences between TNAP alleles.

2 Materials and Methods

Gene structure: An EMBL-3 SP6/T7 library from Balb/c liver DNA (Clontech Laboratories Inc, Palo Alto, CA; catalog number ML 1030J) was plated on NM-538 cells and screened as before [5]. A full length human TNAP cDNA [12] was used as probe for the initial library screening. Positive clones were plaque-purified and expanded according to [8]. Selected genomic fragments were subcloned into the Bluescript KS+ (Stratagene, CA) or pGEM-3 (Promega, WI) vectors. Sequences were assembled and analyzed using the MacVector sequence analysis software (International Biotechnologies Inc., New Haven, CT).

Determination of genetic variability: High molecular weight DNA was obtained from CD1, 129/J, C57BL/6J, FVB and Balb/c mouse strains according to established protocols [8]. Blots were probed with either the entire mouse TNAP cDNA [2], with a 5' fragment covering exon I through half of exon VI, or with a 3' fragment covering the remaining exonic sequences (Eag I digest, cDNA position 670). For PCR amplification of exon II and intron 2, a 5' primer corresponding to the junction sequence reported by [11], i.e., 5'-TAA CTT CTA GGA TCG GAA CGT CAA-3', and a 3' partner (5'-CCA TCT CTG GAG CTG ACA ATG GGC-3') corresponding to positions 2115-2138 (Fig. 1) were used. A 1.2 kb fragment encompassing exon VI was amplified from Balb/c and 129/J using the following primers: 5'-TCGGATGGATTTTCTTCTGCAACT-3' (positions 5295-5318) and 5'-GACATGTCAGCTCTGTGTGCAGGC-3' (positions 6503-6526).
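The primer coordinates quoted above can be checked with a few lines of arithmetic. The sketch below only uses the sequences and positions given in the text; the helper name span and the assumption that positions are inclusive and follow the Fig. 1 numbering are ours.

```python
# Primer sequences and positions as quoted in the text (spaces removed).
PRIMERS = {
    "intron 2, 3' primer": ("CCATCTCTGGAGCTGACAATGGGC", 2115, 2138),
    "exon VI, 5' primer": ("TCGGATGGATTTTCTTCTGCAACT", 5295, 5318),
    "exon VI, 3' primer": ("GACATGTCAGCTCTGTGTGCAGGC", 6503, 6526),
}


def span(start, end):
    """Number of bases covered by an inclusive position range."""
    return end - start + 1


for label, (seq, start, end) in PRIMERS.items():
    print(label, "-", len(seq), "nt; positions cover", span(start, end), "nt")

# Expected size of the exon VI PCR product: from the first base of the
# 5' primer to the last base of the 3' primer (assumed Fig. 1 numbering).
amplicon = span(5295, 6526)
print("exon VI amplicon:", amplicon, "bp")
```

Each primer is 24 nucleotides long, which matches the width of its quoted position range, and the computed product length agrees with the 1.2 kb fragment reported for Balb/c.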

3 Results and Discussion

TNAP Gene Characterization: Three overlapping lambda clones were retrieved from the EMBL-3 SP6/T7 library. Fig. 1 shows the sequence of the mTNAP gene, with the exception of the untranslated exon I and the 32 kb long intron 1. In analogy with the human AP gene family, the structure of the mTNAP gene differs greatly from the very compact nature of the mouse tissue-specific AP genes [5]. The mTNAP gene has one additional exon in the 5' untranslated region, separated from exon II by a 32 kb long intron [11], and while the 11 exons of the mouse TSAP genes are interrupted by introns ranging in length from 70 to 261 bp the corresponding introns of the mTNAP gene range in size from 0.3 to 3 kb. All intronexon junctions described here are in perfect agreement with those previously published for C57BL/6J, whereas the lengths of several introns are at variance with those reported earlier [11]. In Balb/c, introns 2 and 6 are approximately 500 bp larger than those estimated for C57BL/6J while intron 5 is 500 bp smaller in Balb/c. The figure also highlights simple sequence repeats in the mTNAP gene. Though one may expect a random distribution of simple sequence repeats throughout the genome, we observe an overrepresentation of repeats in the 5' half of the mTNAP

Polymorphisms in the Mouse Tissue-Nonspecific Alkaline Phosphatase Gene

235

gene, especially in introns 4 and 5, of possible relevance for the intron size differences in this region discussed below. As expected, di- and tri-nucleotide repeats, e.g. d(CA)n, d(GT)n, d(GT)n d(CCA)n, d(GGT)n, are most frequent, but tetra-nucleotide repeats, e.g. d(GGGC)n and d(GGGA)n, as well as longer repeats are also present in the mTNAP gene.

TNAP gene polymorphisms: TNAP in the mouse is encoded by the Akp-2 locus, mapped to chromosome 4 [9]. The existence of allelic variation for Akp-2, consistent with two common alleles designated Akp-2a and Akp-2b, has been demonstrated by gel electrophoresis [14]. Monoclonal antibodies distinguishing this variation have subsequently been developed [1]. Two mTNAP cDNA sequences have been reported, derived from Icr random-bred mice and the 129/J-derived cell line NULLI-SCC1, respectively [9,2]. The transcribed region of the TNAP gene from C57BL/6J was subsequently reported, in agreement with the cloned cDNAs [11]. The two cDNAs differed at only two amino acid positions, Ser for Leu in the signal peptide and Arg for Pro at position 504. C57BL/6J and 129/J are both of the Akp-2b phenotype, as determined by cellulose acetate electrophoresis [14]. Apparently some heterogeneity exists within this allele. We have examined the Balb/c mTNAP gene, which electrophoretically is of the Akp-2a phenotype. Differences in nucleotide and amino acid sequences compared to either or both published cDNAs are indicated in Fig. 1. Some clustering is observed, with multiple mutations in exons V, VII, IX and XI and in the translated part of exon XII, whereas no variations are observed in exons III, IV, VI, VIII and X. Of the five amino acid substitutions described, only one, Gln for Arg at position 301, affects the charge of the mature protein, making it more acidic. This is consistent with the faster mobility of the Akp-2a phenotype on cellulose acetate electrophoresis.

Fig. 2 shows genomic DNA from five mouse strains, digested with common restriction enzymes and probed either with a 720 bp 5' TNAP gene probe (Fig. 2A), the entire mTNAP cDNA (Fig. 2B), or a 1.7 kb 3' TNAP gene probe (Fig. 2C). The polymorphic nature of the mTNAP gene is readily apparent. In analogy with their Akp-2 phenotypes, the restriction patterns for C57BL/6 and 129/J were identical, whereas Balb/c differed from the other two strains. Differences between alleles can be explained in some instances by simple mutations affecting a restriction site, and in others by insertions or deletions affecting intron sizes. Single point mutations can account for the Apa I 1.2 kb and 2.2 kb fragments in Balb/c (Fig. 2A), generated from the 3.4 kb fragment by the presence of an Apa I site in intron 8. A Pst I site in intron 6 in 129/J and C57BL/6J generates a 2.9 kb and a 1.2 kb fragment from the Balb/c 4.1 kb fragment (Fig. 2B). Furthermore, some discrepancies exist between the TNAP fragments from the Balb/c genomic DNA and the restriction map based on sequencing the TNAP gene isolated from a Balb/c library: a 6.5 kb Eco RI fragment on the Southern blot is not accounted for in the sequenced DNA, while an expected 3' 6.2 kb Hind III fragment does not appear. This may reflect a high


frequency of recombination, non-homogeneity of the genomic library or cloning artifacts. Intron size differences are demonstrated by the Apa I, Bam HI and Hind II digests (Fig. 2C), indicating polymorphic introns in the 5' region of the TNAP gene. The possibility that the differences in 5' fragment sizes between Balb/c and the other two strains represent an artifact was eliminated by the demonstration of duplicate bands when the DNAs were mixed before the respective digest (data not shown). In addition, one intron size difference was characterized by sequencing the 5' end of intron 6 from the 1.2 kb PCR product from 129/J. This analysis identified a polymorphic simple sequence repeat, which explained the intron size variation of this region, as well as numerous point mutations, which result in the appearance of new Pst I and Kpn I sites not present in Balb/c DNA (Fig. 3). Sequencing the 3' end of intron 5 from the same 129/J PCR product, we found a C-G substitution that eliminated the Kpn I site present in Balb/c DNA (Fig. 1, position 5890), among other mutations (data not shown). Reprobing the Kpn I and Pst I digests with a genomic 1.2 kb Hind II fragment (positions 5669-6887) illustrates the effects of these mutations (Fig. 4). These results thus explain both the Pst I and the Kpn I polymorphisms described above and demonstrate the significance of simple sequence repeat polymorphisms for intron-size differences in the mTNAP gene. It may be concluded from our data that a limited protein polymorphism is compatible with extensive allelic variation at the DNA level. Since targeting frequency via homologous recombination has been shown to be greatly dependent on the absence of polymorphic sequences in the homology arms [7], our current analysis explains why several of our TNAP targeting constructs failed to yield positively targeted ES cells.
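The fragment arithmetic behind the Southern blot interpretation above can be sketched in a few lines. This is only an illustration: the function name `digest` and the cut coordinates are assumptions chosen to reproduce the fragment sizes quoted in the text, not coordinates taken from the sequenced gene.

```python
def digest(length_bp, cut_positions):
    """Return fragment lengths (bp) of a linear DNA segment cut at the given positions."""
    # Keep only cuts that fall strictly inside the segment, without duplicates.
    cuts = sorted(p for p in set(cut_positions) if 0 < p < length_bp)
    edges = [0] + cuts + [length_bp]
    return [right - left for left, right in zip(edges, edges[1:])]


# Hypothetical coordinates: only the resulting sizes come from the text.
apa_no_extra_site = digest(3400, [])    # allele lacking the additional Apa I site
apa_balbc = digest(3400, [1200])        # Balb/c: additional Apa I site (intron 8)
pst_balbc = digest(4100, [])            # Balb/c: no Pst I site in intron 6
pst_129 = digest(4100, [2900])          # 129/J and C57BL/6J: Pst I site in intron 6

print(apa_no_extra_site, apa_balbc)     # [3400] [1200, 2200]
print(pst_balbc, pst_129)               # [4100] [2900, 1200]
```

A single new site splits one band into two whose sizes add up to the original fragment, which is the pattern expected for a point mutation; a size change without such a split points instead to an insertion or deletion.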
From a biochemical genetics viewpoint, it is worth noting the contrasting differences between polymorphic and nonpolymorphic AP loci in humans and mice. In humans, two of the three tissue-specific AP genes, i.e., the placental and germ cell AP genes, display a very high frequency of common and rare alleles, while only one RFLP has been reported for the intestinal AP and the TNAP genes [3]. In mice, however, it is the TNAP gene that displays a remarkable degree of genetic polymorphism, as reported here, while no genetic variation is known for the tissue-specific AP isozymes. While the significance of this variability is unclear, it may relate to the conclusion from phylogenetic studies that AP isozyme diversification occurred after the rodent and primate radiation [5]. Thus, selective pressures on these genes seem to differ between humans and mice.

4 Acknowledgments

This work was supported by grants CA42595 and DE12889 from the National Institutes of Health and by grants from the Swedish Medical Research Council and the Kempe Foundation.

References

1. Daikiri, K., Nakamura, S., Ikegami, S., Nakamura, M., Fujimori, T., Tamaoki, N. & Tada, N., Immunogenetics 29 (1989) pp. 235-240.
2. Hahnel, A. C. & Schultz, G. A., Clin. Chim. Acta 186 (1990) pp. 171-174.
3. Harris, H., Clin. Chim. Acta 186 (1989) pp. 137-150.
4. Henthorn, P. S., Millan, J. L. & Leboy, P., Acid and alkaline phosphatases, in Dynamics of Bone and Cartilage Metabolism, ed. Seibel, Robins & Bilezikian (Academic Press, 1999).
5. Manes, T., Glade, K., Ziomek, C. A. & Millan, J. L., Genomics 8 (1990) pp. 541-554.
6. Narisawa, S., Frohlander, N. & Millan, J. L., Dev. Dynamics 208 (1997) pp. 432-446.
7. Riele, H. T., Maandag, E. R. & Berns, A., Proc. Natl. Acad. Sci. USA 89 (1992) pp. 5128-5132.
8. Sambrook, J., Fritsch, E. F. & Maniatis, T., Molecular cloning: A laboratory manual (Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY, 1989).
9. Terao, M. & Mintz, B., Proc. Natl. Acad. Sci. USA 84 (1987) pp. 7051-7055.
10. Terao, M., Pravtcheva, D., Ruddle, F. H. & Mintz, B., Somatic Cell Mol. Genet. 14 (1988) pp. 211-215.
11. Terao, M., Studer, M., Gianni, M. & Garattini, E., Biochem. J. 268 (1990) pp. 641-648.
12. Weiss, M. J., Henthorn, P. S., Lafferty, M. A., Slaughter, C., Raducha, M. & Harris, H., Proc. Natl. Acad. Sci. USA 83 (1986) pp. 7182-7186.
13. Weiss, M. J., Spielman, R. S. & Harris, H., Nucl. Acids Res. 15 (1987) pp. 860.
14. Wilcox, F. H. & Taylor, B. A., J. Hered. 72 (1981) pp. 387-390.

[Figure 1: annotated nucleotide sequence of the mTNAP gene with the deduced amino acid sequence, positions 1-6000 (exons II-V); the sequence listing is not reproduced here, see GenBank AF285233.]

Figure 1. Nucleotide sequence of the structural mTNAP gene (GenBank AF285233). Dots above nucleotides and underlined amino acid residues indicate differences from either or both of the previously described mTNAP cDNAs. Simple sequence repeats are underlined. A region of extensive polymorphism, to be compared with the equivalent sequence from 129/J DNA (Fig 3), is marked by brackets. The untranslated exon I and the 25 kb long first intron are not included in the figure.
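Simple sequence repeats of the kind underlined in Figure 1 can be located with a short pattern search. The sketch below is an assumption-laden illustration, not the authors' method: the function name `find_ssrs`, the thresholds (unit length 2-4, at least five tandem copies) and the toy sequence are all hypothetical, and only perfect repeats are reported.

```python
import re


def find_ssrs(seq, min_unit=2, max_unit=4, min_repeats=5):
    """Return (start, unit, copies) for perfect tandem repeats; start is 1-based."""
    seq = seq.lower()
    # A lazily matched unit followed by at least (min_repeats - 1) further copies.
    pattern = re.compile(
        r"([acgt]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq):
        unit = match.group(1)
        if len(set(unit)) == 1:
            continue  # skip homopolymer runs such as "aaaaaaaaaa"
        hits.append((match.start() + 1, unit, len(match.group(0)) // len(unit)))
    return hits


# Toy sequence (hypothetical): a d(GT)8 tract and a d(GGGA)5 tract in unique flanks.
toy = "ttagcc" + "gt" * 8 + "acgatc" + "ggga" * 5 + "ttc"
for start, unit, copies in find_ssrs(toy):
    print(f"d({unit.upper()}){copies} at position {start}")
```

Comparing the copy number reported for the same repeat in two alleles gives the length difference that such a repeat contributes to an intron.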

Figure 1 (continued). Positions 6001-12120, including exons VI and VII; the sequence listing is not reproduced here (GenBank AF285233).

Figure 1 (continued). Positions 12121-17439, including exons XI and XII; the sequence listing is not reproduced here (GenBank AF285233).

[Figure 2: Southern blot images, panels A-C, with lanes grouped by restriction enzyme (Apa I, Eco RI, Hind III, Pst I); images not reproduced here.]

Figure 2. Evidence of genetic polymorphism and intron-size variation in the mouse TNAP gene. A: Southern blot analysis of genomic DNA from Balb/c, C57BL/6J and 129/J mouse strains digested with Apa I and probed with the 5' 720 bp of the mouse TNAP cDNA [2]. Asterisk indicates the absence of an expected fragment. B: Digests of genomic DNA from CD1, C57BL/6J, Balb/c, 129/J and FVB/N mouse strains using Eco RI, Hind III and Pst I, probed with the full-length mouse TNAP cDNA. Numbers to the left of each panel designate size, in kilobases. Asterisks indicate fragments not accounted for. C: Genomic DNA from Balb/c, C57BL/6J and 129/J mouse strains digested with Apa I, Bam HI and Hind II and probed with the 3' 1.7 kb fragment of the mouse TNAP cDNA [2]. Asterisks indicate fragments not accounted for.

[Figure 3: 432-nucleotide sequence listing with marked restriction sites; not reproduced here, see GenBank AF285234.]

Figure 3. Nucleotide sequence of the polymorphic intron 6 from 129/J mouse DNA (GenBank AF285234). The 5' 0.4 kb of intron 6 from 129/J was obtained through PCR amplification. Nucleotide substitutions as compared to the corresponding Balb/c sequence are indicated by dots, unique restriction sites are labeled, and polymorphic simple sequence repeats are underlined.
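The way a single substitution creates or abolishes a restriction site, as marked in Figure 3, can be illustrated as follows. The recognition sequences GGTACC (Kpn I) and CTGCAG (Pst I) are the standard ones; everything else (function names, the two 20-nucleotide toy alleles) is hypothetical and does not reproduce the real intron sequence.

```python
# Both recognition sequences are palindromic, so searching one strand is sufficient.
SITES = {"KpnI": "GGTACC", "PstI": "CTGCAG"}


def site_positions(seq, site):
    """1-based start positions of every occurrence of a recognition sequence."""
    seq = seq.upper()
    n = len(site)
    return [i + 1 for i in range(len(seq) - n + 1) if seq[i:i + n] == site]


def compare_alleles(ref, alt):
    """List substitutions and the restriction sites gained or lost in alt."""
    if len(ref) != len(alt):
        raise ValueError("alleles must be aligned and of equal length")
    subs = [(i + 1, r, a)
            for i, (r, a) in enumerate(zip(ref.upper(), alt.upper())) if r != a]
    changes = {}
    for name, site in SITES.items():
        before = set(site_positions(ref, site))
        after = set(site_positions(alt, site))
        changes[name] = {"gained": sorted(after - before),
                         "lost": sorted(before - after)}
    return subs, changes


# Toy alleles (hypothetical): a C-to-G change removes a Kpn I site,
# an A-to-G change creates a Pst I site.
ref_allele = "AATGGTACCTTCTGCAAGAA"
alt_allele = "AATGGTACGTTCTGCAGGAA"
substitutions, site_changes = compare_alleles(ref_allele, alt_allele)
print(substitutions)
print(site_changes)
```

The sketch handles substitutions only; alleles that differ by insertions or deletions, such as repeat-length variants, would first need to be aligned.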

[Figure 4: Southern blot image with Kpn I and Pst I lanes; fragment size labels 4.1, 3.8 and 2.1 kb; image not reproduced here.]

Figure 4. Kpn I and Pst I polymorphism. Kpn I and Pst I digests of genomic DNA from Balb/c, 129/J and C57BL/6J mice probed with a 1.2 kb Hind II fragment from the Balb/c TNAP gene. Balb/c fragment sizes (kb) indicated.

LIPOXYGENASES AND CYCLOOXYGENASES OF THE TESTIS OF RAT

S. NEERAJA, P. REDDANNA AND P. R. K. REDDY

Department of Animal Sciences, University of Hyderabad, Hyderabad 500 046, INDIA. Email: [email protected]

Arachidonic acid (AA), the most abundant polyunsaturated fatty acid in mammalian systems, is oxygenated in rat testis mainly by two pathways: the cyclooxygenase (COX) pathway, leading to the formation of prostaglandin F2α, and the lipoxygenase (LOX) pathway, which produces mainly 12-HETE in the seminiferous tubules and 5-HETE in Leydig cells. LOX activity is present in both the cytosol and the microsomes, with the cytosol having three times the activity of the microsomes. Studies on the COX isozymes revealed the constitutive expression of COX-2, an inducible form expressed in response to inflammatory and mitogenic stimuli, in the spermatogonia of the seminiferous tubules. COX-2 cDNA from rat testis was synthesized by RT-PCR, using primers based on the 5' and 3' ends of the published human sequence, in order to clone and further characterize COX-2 in rat testis.

1 Introduction

Eicosanoids are a family of oxygenated derivatives of eicosapolyenoic fatty acids, such as arachidonic acid (AA). AA, the most common fatty acid precursor in mammalian cells, is incorporated as an ester into the membrane lipid complex. After an appropriate physiological signal, or after calcium ionophore treatment, cellular phospholipases and lipases of distinct pathways cleave AA from the membrane phospholipids, making it available as a substrate for eicosanoid production [1]. Apart from the normal β-oxidation, AA is enzymatically oxygenated by two important pathways: the COX pathway, leading to the production of prostaglandins, thromboxanes and prostacyclin, and the LOX pathway, generating hydroperoxy- and hydroxy-eicosatetraenoic acids, leukotrienes and lipoxins. Lipoxygenases are classified as 12-, 15-, and 5-LOX depending on the position of insertion of the oxygen molecule. Two distinct forms of cyclooxygenase have been identified, which are encoded by two different genes [2]. COX-1 is expressed ubiquitously, whereas COX-2 has a more restricted expression pattern [3]. Furthermore, in contrast to the constitutive nature of COX-1, COX-2 is highly inducible by mitogens, interleukin-1 alpha, cytokines, phorbol esters, lipopolysaccharide, and growth factors such as epidermal growth factor, human growth factor, etc. [4-6]. Both isoforms of COX are glycoproteins and function as homodimers [7-9]. Amino acid analysis indicates 60% homology between the two isoforms. Eicosanoids serve as local hormones and are extremely potent biologically active compounds with a bewildering variety of actions. The type of stimulation or inhibition depends on the metabolite, its concentration, and the metabolic activity of


the cell and the involvement of various other factors. The message may be transmitted via a membrane receptor by activating the guanylyl cyclase system, by mobilization of free cytosolic calcium, or via the membrane ion channels. The available literature on the metabolism, biochemistry and pathophysiological effects of eicosanoids indicates that they play a significant role in virtually all mammalian organ systems, including reproduction. Most of the studies on eicosanoids and reproduction, however, have concentrated on the external addition of either prostaglandins or leukotrienes, without a basic understanding of the enzyme systems involved and their cellular localization. The present study is, therefore, aimed at analyzing the cyclooxygenase and lipoxygenase pathways in testis.

2 Methodology

2.1 Materials

All fine chemicals were purchased from Sigma Chemical Company (St. Louis, USA) or from Bio-Rad Laboratories (Richmond, USA). Percoll, density gradient marker beads, low molecular weight markers for SDS-PAGE, DNA molecular weight markers, nylon membranes and nitrocellulose membranes were purchased from Amersham-Pharmacia Biotech (Uppsala, Sweden). Superscript reverse transcriptase and PCR kits were purchased from GIBCO-BRL. All other chemicals were bought from local companies and were of high quality. α-32P and a random primer labelling kit were purchased from Jonaki (BARC), Hyderabad. Wistar strain rats bred in the animal house were used in this study.

2.2 Lipoxygenase activity and its products

Lipoxygenase enzyme activity was measured polarographically [10]. A typical reaction mixture contained 2.9 mL of potassium phosphate buffer (pH 7.4) and 100 µL of the enzyme. The reaction was initiated by addition of 10 µL of the substrate (133 µM arachidonic acid, final concentration). One unit of enzyme is defined as one µmole of oxygen consumed per minute. Lipoxygenase products of the testis were extracted into hexane:ether (1:1, v/v) and subjected to HPLC analysis. The individual peaks were collected and checked for the presence of the characteristic conjugated diene spectrum with maximum absorbance at 235 nm. The peaks showing the characteristic absorbance were co-chromatographed with standard HPETEs and HETEs prepared as described [16].
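The dilution and unit arithmetic implied by this assay can be written out as a small sketch. The volumes, the 133 µM final concentration and the unit definition come from the text; the function names, the back-calculated stock concentration and the example oxygen-consumption figures are assumptions for illustration only.

```python
def stock_needed_mM(final_uM, added_uL, total_uL):
    """Stock concentration (mM) that gives final_uM after dilution into total_uL."""
    return final_uM * total_uL / added_uL / 1000.0


def final_concentration_uM(stock_mM, added_uL, total_uL):
    """Final concentration (µM) after adding added_uL of stock to a total of total_uL."""
    return stock_mM * 1000.0 * added_uL / total_uL


def specific_activity(o2_umol_per_min, protein_mg):
    """Units per mg protein, with 1 unit = 1 µmol O2 consumed per minute."""
    return o2_umol_per_min / protein_mg


# Reaction mixture from the text: 2.9 mL buffer + 100 µL enzyme + 10 µL substrate.
total_uL = 2900 + 100 + 10
stock_mM = stock_needed_mM(133, 10, total_uL)
print(f"implied arachidonic acid stock: {stock_mM:.1f} mM")

# Hypothetical measurement: 0.156 µmol O2/min from 0.1 mg of protein.
print(f"specific activity: {specific_activity(0.156, 0.1):.2f} units/mg")
```

Under these assumptions the 10 µL addition corresponds to a stock of roughly 40 mM arachidonic acid.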

2.3 Estimation of prostaglandins on TLC

Testicular homogenates in Tris buffer pH 8.0 were extracted into chloroform and applied onto pre-coated silica-G plates. The TLC plates were run in chloroform:methanol:acetic acid:water (90:8:1:0.8), and the separated products were developed with iodine vapours and identified based on the Rf values as described [11].

2.4 Production of antiserum

A Cox-2 specific peptide (300 µg; cys-tyr-ser-his-ser-arg-leu-asp-iso-asn-pro-thr-val-leu-iso-lys, purchased from Genescape, U.S.A.), conjugated to keyhole limpet hemocyanin and emulsified in Freund's complete adjuvant in a 1:1 ratio, was used for immunization of rabbits. Booster injections (100 µg) were given after 10 days and repeated every 15 days with Freund's incomplete adjuvant. Rabbits were bled a week after the final booster injection, the serum was collected, and IgG was purified using a protein A-agarose column.

2.5 Western blotting and immunohistochemistry

Protein content in the crude enzyme preparation was estimated as described [12]. The proteins in the crude extract were separated by SDS-PAGE [13] and stained with silver stain [14]. Proteins separated by SDS-PAGE were transferred onto nitrocellulose membrane [15], and immunohistochemistry was done as described [17].

2.6 Leydig cell isolation and Northern hybridization

Leydig cells were isolated according to the procedure described [18]. Total RNA from testicular cells was isolated by the guanidine-thiocyanate-phenol-chloroform extraction method [19], and polyA RNA was isolated from it using oligo dT columns (Amersham-Pharmacia) [20]. PolyA RNA (1 µg) of rat testis and total RNA (30 µg) from rat air pouch as control were electrophoresed in a 1.2% denaturing formaldehyde gel, blotted and probed [20].

2.7 Plasmid isolation and probe preparation

Plasmid pVL941 containing Cox-2 cDNA in DH5α was a generous gift from Prof. Shozo Yamamoto, Tokushima University, Japan. The plasmid was isolated from a 1.5 mL culture [20] by alkaline lysis and precipitated in isopropanol. The plasmid pVL941 was digested with Bam HI to obtain a 2 kb COX-2 cDNA probe. This fragment was gel eluted and labelled with the random primer labelling kit. Cold dATP, dTTP, dGTP and α-32P dCTP and 2 units of Klenow were used in the reaction.

2.8 Reverse transcription polymerase chain reaction (RT-PCR)

Total RNA (1 µg) from rat testis was taken for first strand cDNA synthesis [21] using 2.5 units of Superscript and 1 mM dNTPs. In the PCR reaction, 5' GTC CAG GAA CTC CTC AGC 3' [CO-1] and 5' CTG GGC CAT GGG GTG GAC TTA 3' [CO-3] were used as the forward primers, and 5' TAA GTA CAC CCC ATG GCC CAG 3' [CO-2] and 5' AGA CTT CTA CAG TTC AGT CGA 3' [CO-4] as the reverse primers. These primers were designed based on human cyclooxygenase cDNA.

3 Results and Discussion

Earlier studies from our laboratory have shown that lipoxygenase activity is present in both the cytosol and the microsomes of adult rat testis, with the activity being higher in the cytosol [22]. Analysis of the products with arachidonic acid as the substrate revealed 12-HPETE as the major product in both the cytosol and the microsomes. Substantial quantities of 5-HPETE and 15-HPETE were also detected. The exact cell types involved in the lipoxygenase pathways were not identified. In the present study, both cytosolic and microsomal fractions of the testis of 30-day-old rats were found to contain lipoxygenase activity. Cytosol contained three times the specific activity of the enzyme (1.56 units/mg) compared to that of the microsomes (0.37 units/mg). The HPLC profile of LOX products from seminiferous tubules showed a peak with a characteristic spectrum and absorption maximum at 235 nm (Fig. 1A). This peak was co-chromatographed with standard HETEs and was identified as 12-HETE. The endogenous product profile of the LOX pathway from Leydig cells gave two peaks with retention times of 13.46 minutes and 21.88 minutes (Fig. 1B) and an absorption maximum at 235 nm. These peaks were identified as 5-HPETE and 5-HETE by co-chromatography with the standards (data not presented). These studies demonstrate that 12-HETE is mainly produced in the seminiferous tubules by the 12-LOX pathway and 5-HETE in the Leydig cells by the 5-LOX pathway.

COX was measured in total testicular microsomes and cytosol as well as in the homogenates of seminiferous tubules and Leydig cells. The data showed that COX-2 is present in the cytosolic fraction of total testis and also in the microsomes. The homogenates of both seminiferous tubules and Leydig cells contained COX-2 constitutively. The enzyme has the same molecular weight (72 kDa) as the expressed human COX-2 protein (Fig. 2). These studies demonstrate for the first time that cyclooxygenase-2 is constitutively expressed in rat testis. Separation of the products on TLC showed two bands with Rf values of 0.23 and 0.45, which correspond to the Rf values of standard PGF2α and PGE2, respectively. Based on its intensity, PGF2α is identified as the major product. Immunohistochemistry showed that the enzyme is present in higher concentration in the spermatogonial cells than in the Leydig cells of the testis (Fig. 3). However, the physiological significance of the constitutive expression of COX-2 in testis is not clear. Northern analysis in the present studies showed that the polyA RNA of the rat testis was smaller in size than that of the air pouch (Fig. 4). In order to further characterize COX-2 from testis, total RNA and the primers mentioned in the methodology were used for synthesis of cDNA by RT-PCR. Primers CO-1 and CO-2 resulted in a 600 bp fragment and primers CO-3 and CO-4 resulted in a 1.2 kb fragment of COX-2 cDNA (Fig. 5a, b). These two fragments are being ligated and cloned into a yeast vector for the expression and characterization of a functional protein.
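Because the two RT-PCR fragments are to be ligated, it can be useful to check how the reverse primer CO-2 relates to the forward primer CO-3. The following sketch uses the primer sequences exactly as printed in Section 2.8; the helper names are illustrative, and nothing here is taken from the authors' own analysis. It computes the reverse complement of CO-2 and counts the positions at which it differs from CO-3.

```python
# Translation table for complementing unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def mismatches(a, b):
    """Count positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


# Primer sequences as printed in the text, with spaces removed.
PRIMERS = {
    "CO-1": "GTCCAGGAACTCCTCAGC",
    "CO-2": "TAAGTACACCCCATGGCCCAG",
    "CO-3": "CTGGGCCATGGGGTGGACTTA",
    "CO-4": "AGACTTCTACAGTTCAGTCGA",
}

co2_rc = reverse_complement(PRIMERS["CO-2"])
n_diff = mismatches(co2_rc, PRIMERS["CO-3"])
print(co2_rc, PRIMERS["CO-3"], n_diff)
```

A small mismatch count would suggest that CO-2 and CO-3 anneal at the same site on opposite strands, so that the 600 bp and 1.2 kb fragments abut there. This is an inference from the printed sequences, not something the text states.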

Figure 1. HPLC profile of endogenous lipoxygenase products. Column: Silica C18 (0.46 × 25 cm); flow rate 2 mL/min; absorbance 235 nm; solvent: hexane:isopropanol:acetic acid. A) Seminiferous tubules (1000:7:1). B) Leydig cells (1000:20:1).

From the above studies it can be concluded that AA is differentially oxygenated in different cell types of the testis: by the 12-LOX pathway in seminiferous tubules, the 5-LOX pathway in Leydig cells and the COX pathway in spermatogonia. Further studies, however, are required to understand the role of AA metabolites in different cell types of rat testis and their possible interactions.


Figure 2. Western blot analysis using a polyclonal antibody specific for COX-2. Lanes- 1) COX-2 protein (human COX-2 expressed in SF9 cells) 2) total testicular microsomes 3) total testicular cytosol 4) seminiferous tubules homogenate 5) Leydig cell homogenate.

Figure 3. Immunohistochemical detection of COX-2 protein in rat testis. A) Control treated with preimmune sera. B) Immunologically stained with a polyclonal rabbit anti-COX-2 antibody (primary antibody) and a goat biotinylated anti-rabbit antibody (secondary antibody).

Figure 4. Northern blot analysis of rat testis mRNA for COX-2 expression. Lanes: 1) 30 µg of total RNA from rat air pouch as control; 2) and 3) 1 µg of mRNA from rat testis.

Figure 5. RT-PCR of rat testicular RNA using COX-2 specific primers. A) Lanes: 1) 600 bp amplified fragment; 2) molecular weight marker (GIBCO-BRL 100 bp ladder). B) Lanes: 1) 1.2 kb amplified fragment; 2) molecular weight marker (Amersham-Pharmacia 1 kb ladder).

4 Acknowledgements

This work was supported by grants from the Council for Scientific and Industrial Research, Government of India (Project No. 37(0987)/98/EMR-II). One of us (SN) was supported by a Research Fellowship from the University Grants Commission.

References

1. Bell, R.L., Kennedy, D.A., Stanford, N. and Majerus, P.W. Proc Natl Acad Sci USA 76 (1979) pp. 3238.
2. Wen, Z.P., Warden, C., Fletcher, B.S., Kujubu, D.A., Herschmann, H.R. and Lusis, A.J. Genomics 15 (1993) pp. 458.
3. O'Neill, G.P. and Ford-Hutchinson, A.W. FEBS Lett. 330 (1993) pp. 156.
4. Kujubu, D.A., Fletcher, B.S., Varnum, B.C., Lim, R.W. and Herschmann, H.R. J Biol Chem 266 (1991) pp. 12866.
5. Fletcher, B.S., Kujubu, D.A., Perrin, D.M. and Herschmann, H.R. J Biol Chem 267 (1992) pp. 4338.
6. O'Banion, M.K., Sadowiski, H.B., Winn, V.J. and Young, D.A. J Biol Chem 266 (1991) pp. 23261.
7. Barnet, J., Chow, J. and Ives, D. Biochim Biophys Acta 130 (1994) pp. 1209.
8. Smith, W.L. and Otto, J.C. J Biol Chem 269 (1994) pp. 19868.
9. Percival, M.D., Quellet, M., Vincent, C.J., Yergey, J.A., Kennedy, B.P. and O'Neill. Arch Biochem Biophys 315 (1994) pp. 111.
10. Grossman, S., Ben-Aziz, A., Bodoviski, P., Ascarelli, I., Gertler, Y., Birk, Y. and Bondi, A. Anal Biochem 34 (1970) pp. 88.
11. Salmon, J.A. and Flower, R.J. Methods in Enzymology 86 (1982) pp. 477.
12. Lowry, O.H., Rosebrough, N.J., Farr, A.L. and Randall, R.J. J Biol Chem 193 (1951) pp. 265-275.
13. Laemmli, U. Nature 227 (1970) pp. 680-685.
14. Blum, H. and Beier, H. Electrophoresis 8 (1987) pp. 93-97.
15. Towbin, H., Staehelin, T. and Gordon, J. Proc Natl Acad Sci USA 76 (1979) pp. 4350-54.
16. Reddanna, P., Whelan, J., Reddy, P.S. and Channa Reddy, C. Methods in Enzymology 187 (1990) pp. 268.
17. Methods in Molecular Biology 34: Immunocytochemical Methods and Protocols.
18. Moyle, W.R. and Ramachandran. J Endocrinology 93 (1973) pp. 265.
19. Chomczynski, P. and Sacchi, N. Anal Biochem 162 (1) (1987) pp. 156-9.
20. Sambrook, J., Fritsch, E.F. and Maniatis, T. Molecular Cloning 1, 1989.
21. Sambrook, J., Fritsch, E.F. and Maniatis, T. Molecular Cloning 2, 1989.
22. Reddy, G.P., Prasad, M., Sailesh, S., Kumar, Y.V.K. and Reddanna, P. Prostaglandins 44 (1992) pp. 497.

EFFECT OF HETEROGENEOUS SPERM AND HYBRIDIZATION OF DNA FRAGMENT IN ALLOGYNOGENETIC SILVER CRUCIAN CARP

XIA DEQUAN1, XUE GUOXIONG2, AND ZHANG LI2

1 Freshwater Fisheries Research Center, Chinese Academy of Fishery Sciences, Wuxi 214081 Jiangsu, China

2 Institute of Developmental Biology, Chinese Academy of Sciences Beijing 100080, China

The Fang Zheng silver crucian carp is a bisexual triploid fish that can undergo gynogenesis under induction by allogynogenetic fish sperm. Its F1 generation, obtained under induction by sperm of wild carp, is called allogynogenetic silver crucian carp. The phenomenon whereby heterogeneous fish sperm, while inducing gynogenesis, also brings forth numerous biological effects in the filial generation is called the effect of heterogeneous sperm (heterogeneous sperm effect). RAPD was applied with 154 random primers to analyse polymorphism in populations of Fang Zheng silver crucian carp (♀), wild carp (♂) and their filial generation, allogynogenetic silver crucian carp. The electrophoretic bands related to the heterogeneous sperm effect were sought and, after cloning, used as probes for Southern blot hybridization with the amplification products and genomes of the above-mentioned fish. The results indicated that there exists a highly homologous fragment in the genomes of allogynogenetic crucian carp and wild carp, which is lacking in Fang Zheng silver crucian carp. This indicates possible hybridization between DNA fragments of the genomes of the male and female nuclei during fertilization, and provides a possible mechanism for the effect of heterogeneous sperm.

1 Introduction

Gynogenetic development is one of the important topics in fishery practice. Gynogenetic fish possess the advantages of high output, fast growth and relatively high pedigree purity. Therefore, research on the mechanism of gynogenetic development in fish is of profound and lasting theoretical significance and of practical value in freshwater aquaculture. Fang Zheng silver crucian carp (Carassius auratus gibelio) is mainly distributed in the waters of the Heilongjiang River and is a natural gynogenetic bisexual triploid fish. It has the following properties: 1) it needs the stimulus of artificial insemination to accomplish parthenomixis of the female nucleus; 2) after entering the egg, the sperm nucleus does not form a male pronucleus and does not fuse with the female pronucleus of the egg; 3) the phenotype of the filial generation tends towards that of the maternal fish, which means that the offspring inherit the maternal genetic material. Hence, Fang Zheng silver crucian carp is a model fish for research on the mechanism of gynogenetic development in fish. Although Fang Zheng silver crucian carp contains 5%-10% of male fish in its self-reproducing population, it can still accept the stimulus of heterogeneous sperm to reproduce, just as it accepts homogeneous sperm from male fish of the same species. Cytological observations showed that pronucleus development was inhibited after the heterogeneous sperm entered the egg. The male nucleus remained relatively independent of the female pronucleus, although the former gradually moved towards the latter and even made tight contact with it. In recent years, some researchers found that the haploid chromosome set of sperm of the same species fused with the maternal diploid nucleus to form a compound tetraploid silver crucian carp. But no fusion of the chromosome set of heterogeneous sperm with that of silver crucian carp has been reported. In short, it seems that the heterogeneous sperm, while inducing development, does not participate in inheritance in the recipient egg. However, a great number of observations showed that the entry of heterogeneous sperm into the egg not only induced gynogenesis of the egg, but also brought forth biological effects in the filial generation, reflected in reproduction, characters, growth rate, sex ratio, isozyme pattern and changes of enzyme activity, as well as the occurrence of chimeras. All of these phenomena are significantly different from the maternal inheritance associated with gynogenesis. These phenomena in the filial generation, induced by fertilization with heterogeneous sperm and showing not only the common hereditary basis from the maternal fish (Fang Zheng silver crucian carp) but also detailed hereditary characters that differ from the maternal fish to a certain extent, have been called the "effect of heterogeneous sperm". We believe that this effect is important in maintaining and strengthening the competitive capability for survival of the gynogenetic variety of fish.
Obviously, the "effect of heterogeneous sperm" must have its own mechanism and molecular basis. The successful breeding of remote hybrids such as Sorghum/Rice and Maize/Rice indicated that heterogeneous DNA sequences could be recombined and integrated into the egg cells of the mother plant, giving rise to characters that appear in the filial generation. This points to the possibility of insertion and recombination of heterogeneous DNA. A hypothesis of "hybridization of DNA fragments" has been proposed by Professors Jiang Yigui and Xue Guoxiong. It assumes that, at the level of the whole fertilization event, the heterogeneous nucleus is in general excluded by the nucleus of the silver crucian carp egg when the former enters the space of the latter. At the molecular level, however, hybridization of DNA fragments takes place between the nuclei of the heterogeneous sperm and the silver crucian carp egg, which leads to differences in inherited characters between the maternal fish and the filial generation. Jiang and Xue also believed that integration within the same species does not bring forth distinct differences in inherited characters between the maternal fish and the filial generation, owing to the high homology of the hereditary material in the parent fish, while the heterogeneous sperm effect is expressed under remote hybridization.

This paper presents the experimental results obtained by RAPD analysis: 1) analysis of the differences between the genomes of female Fang Zheng silver crucian carp, male wild carp and their hybrid offspring, the allogynogenetic silver crucian carp; 2) cloning of the electrophoretic bands of DNA fragments related to the effect of heterogeneous sperm, and their use as probes in Southern hybridization with the tested genomes, to investigate the possibility of hybridization of DNA fragments and the molecular mechanism by which the heterogeneous sperm effect arises.

2 Results

The experimental materials, Fang Zheng silver crucian carp (♀), wild carp (♂) and the hybrid filial generation derived from hybridization between Fang Zheng silver crucian carp (♀) and wild carp (♂), were obtained from the Institute of Aquaculture of Heilongjiang Province. DNA was extracted from 10 individual fish of each of the three groups, and both individual samples and pooled population samples of the three groups were used for RAPD analysis.

2.1 Optimization of RAPD

The RAPD cycling parameters, the concentrations of Mg2+, Taq enzyme and primer, the purity and concentration of template DNA, and the differences in RAPD among individuals of the 3 experimental materials were optimized to obtain the best experimental conditions and high repeatability. Twenty primers were used for each experimental group and each primer contained 10 bases. Nine groups of primers containing a total of 180 primers were used in RAPD analysis. With these primers, except primer No. 26, 2,350 electrophoretic bands were amplified, an average of 5 bands per primer, with a size range of 300-2500 bp. Three classes of amplified bands appeared: 1) band patterns that were very similar between Fang Zheng silver crucian carp and allogynogenetic silver crucian carp, with few exceptions; 2) amplified bands that appeared only in the DNA of allogynogenetic silver crucian carp and wild carp; and 3) amplified bands that appeared only in allogynogenetic silver crucian carp, but not in the other two kinds of fish.

2.2 Analysis of concordance ratio

The data were analysed following the concordance formula of Nei et al.:

Concordance rate = 2 × Nab / (Na + Nb) × 100%

where Nab is the number of amplified DNA fragments common to individuals a and b, Na is the number of amplified DNA fragments of individual a, and Nb is the number of amplified DNA fragments of individual b. Table 1 shows that the concordance ratio between wild carp and allogynogenetic crucian carp, 38.8%, is higher than that between Fang Zheng silver crucian carp and wild carp, 28.0%. This means that, at the DNA level, the genetic distance between the former two (B/C) is smaller than that between the latter two (A/B).
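The concordance formula is simple enough to express directly in code. The sketch below is a minimal illustration; the function name and the band counts in the example are hypothetical and are not the values of Table 1.

```python
def concordance_rate(n_ab, n_a, n_b):
    """Concordance rate (%) between two samples a and b.

    n_ab: number of amplified fragments shared by a and b
    n_a:  number of amplified fragments in a
    n_b:  number of amplified fragments in b
    """
    if n_a + n_b <= 0:
        raise ValueError("at least one sample must have fragments")
    if n_ab < 0 or n_ab > min(n_a, n_b):
        raise ValueError("shared count must lie between 0 and min(n_a, n_b)")
    return 2.0 * n_ab / (n_a + n_b) * 100.0


# Hypothetical example: 30 shared bands out of 80 and 70 bands respectively.
example = concordance_rate(30, 80, 70)
print(f"{example:.1f}%")
```

Two samples with identical band patterns give 100%, and samples with no bands in common give 0%, so a higher value corresponds to a smaller genetic distance in the sense used above.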

Figure 1. RAPD patterns. F: Fang Zheng silver crucian carp, Y: Allogynogenetic silver crucian carp, L: Wild carp, Ctrl: DNA-free negative control, M: 1 µg λDNA/Eco RI + Hind III. Panel A: Amplified band patterns with primers F3, F4, F6, F7 and F8. Panel B: Amplified band patterns with primers Do, DM, D15 and D16. Panel C: Amplified band patterns with primers Z» and Z19.

Figure 2. Differentially amplified DNA fragments among the 3 kinds of tested DNA samples derived from the 3 experimental fish. Panels A, B, C and D represent amplified band patterns with primers A», E05, B12 and By, respectively. F: Fang Zheng silver crucian carp. Y: Allogynogenetic silver crucian carp. L: Wild carp. M: 1 µg λDNA/Eco RI + Hind III. Arrow: amplified bands with difference.

Table 1. Number of amplified patterns of genomic DNA and concordance ratio among them.

                                          Total DNA   Number of common DNA fragments   Concordance ratio (%)
                                          fragments   Wild carp (B)   Allogyn. (C)     Wild carp (B)   Allogyn. (C)
FZ silver crucian carp (A)                7 4         26              622              28              77.30
Wild carp (B)                             8 0         -               312              -               38.80
Allogynogenetic silver crucian carp (C)   8 0         -               -                -               -

2.3 RAPD results

Seven repeatable amplified "bands with difference" were found in these experiments, which were amplified with the primers A19, Bo, D15, E05, P10, P12 and Z^, respectively. They were amplified from the allogynogenetic silver crucian carp and wild carp with the same migration rate, while no amplification band appeared in the electrophoretic gel for Fang Zheng silver crucian carp. The bands with difference were first amplified, enriched and purified. The final products were then used as hybridization probes to identify homologous sequences by dot-blot hybridization. The results showed that primer A19 gave a positive signal only in hybridization to the genome of wild carp, while primers E05 and Z^ gave positive reactions in hybridization to the genomes of both allogynogenetic silver crucian carp and wild carp. B15 gave a positive signal only in hybridization to allogynogenetic silver crucian carp, and the remaining 3 primers gave dots with varying intensity of positive reaction. Because primers E05 and Z^ showed positive reactions in dot-blot hybridization and the signal of E05 was markedly stronger than that of Z^, the band with difference amplified with E05 in RAPD was analysed further. The DNA fragment was about 700 bp long.

Figure 3. Electrophoresis pattern of thrice amplified primer E05. Lane 1: positive control. Lane 2: known DNA reference of 500 bp. Lanes 3 & 4: amplified bands with difference for thrice amplified E05. M: 1 ng λDNA/Eco RI + Hind III.

2.4 Primer E05

The 700 bp band amplified from primer E05 was cloned and recombined into a plasmid, and the recombinant was then identified by Southern blot hybridization.

Figure 4. Identification of recombinant by double restriction endonuclease. Lane 1: recombinant identified by double restriction endonucleases. Lane 2: control for product of PCR amplification. Lane 3: recombined plasmid, pBluescript. M: 1 µg λDNA/Eco RI + Hind III.

Figure 5. Identification of recombinant by Southern blot hybridization. Autoradiograph of Southern blot hybridization membrane of the above recombinant.

  1  CAGGGAGGTC CCATGGGCCT GACTGTCTAT CAAACTGCAG
 41  AGTTGAACAC AGGAGTGTTA GGCCACTCAA CGCCTCAGTG
 81  AGAGGAGCTC TAGCTCAGAG AACGTGCATG ATGGTTGAGA
121  GGATGACTTA AATCTCAGGA GACTCTGAGA CATGCGAGAC
161  ATGTGTAGCG CGTCTGGCAC GCGCAGCTTC TCGAAGCAAA
201  TACCTAGCCA GGATATCAGG GAGCACATCA CAAAGCATCA
241  TACCTGAAAA

Total number of bases: 250. DNA sequence composition: 74 A; 58 C; 70 G; 48 T; 0 other.

Figure 6. DNA sequence of the band of 700 bp DNA fragment, amplified with primer E05.
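A base-composition summary of this kind can be reproduced with a short script. The sketch below is illustrative only: the helper names are hypothetical, the test sequence is a toy string rather than the clone, and the GC content is derived here from the composition reported in Figure 6 rather than stated by the authors.

```python
from collections import Counter


def base_composition(seq):
    """Count A, C, G, T and any other symbols in a DNA sequence."""
    counts = Counter(seq.upper().replace(" ", ""))
    result = {base: counts.get(base, 0) for base in "ACGT"}
    result["other"] = sum(n for b, n in counts.items() if b not in "ACGT")
    return result


def gc_percent(composition):
    """GC content (%) from a composition dictionary."""
    total = sum(composition.values())
    if total == 0:
        raise ValueError("empty sequence")
    return 100.0 * (composition["G"] + composition["C"]) / total


# Composition as reported in Figure 6.
reported = {"A": 74, "C": 58, "G": 70, "T": 48, "other": 0}
total_bases = sum(reported.values())
gc = gc_percent(reported)
print(total_bases, f"{gc:.1f}% GC")
```

The reported counts sum to the stated 250 bases and correspond to a GC content of 51.2%.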

The FAST BLAST software and the DNA data bank were used to search for sequences homologous to the 700 bp DNA fragment. The result indicated that no DNA sequence with homology above 70% to the 700 bp fragment could be found in the data bank.

2.5 Southern blot hybridization

The genomic DNAs of the above-mentioned 3 kinds of experimental fish were digested with Eco RI, electrophoresed and transferred to a hybridization membrane. Southern hybridization was carried out on the membrane using as the probe the 700 bp fragment with difference, which was obtained from the cloned and amplified E05 band. The results are shown in Fig. 7.

Figure 7. Southern blot hybridization of the genomes. Panel A: Lane 1: control for products of PCR amplification. Lane 2: genomic DNA of Fang Zheng silver crucian carp restricted by Eco RI. Lane 3: genomic DNA of allogynogenetic silver crucian carp restricted by Eco RI. Lane 4: allogynogenetic silver crucian carp. Panel B: Autoradiograph of Southern blot hybridization membrane of the sample of panel A.

It can be seen in Fig. 7 that, among the lanes loaded with genomic DNA, no homologous DNA sequence of 700 bp existed in the filial generations of silver crucian carp, while such a homologous DNA sequence existed in the lanes of allogynogenetic silver crucian carp and wild carp. This suggests that the DNA fragment present in the genome of allogynogenetic silver crucian carp originated from the genome of the paternal wild carp.

2.6 Observation

One individual fish from each of the other sample populations of allogynogenetic silver crucian carp and wild carp was taken randomly for Southern blot hybridization to identify the above-mentioned 700 bp fragment. The results showed only one hybrid band, in the tested sample of wild carp. This indicates that the DNA fragment did not exist in all individuals of allogynogenetic silver crucian carp, and its occurrence in the population might be a random event of integration of an exogenous DNA fragment. Similarly, because 10 individual samples were taken for analysis in our experiment, the ratio of positive reaction in hybridization of the DNA fragment (20.0%) was much higher than that obtained by cytological observation (0.66%, the ratio of success in development of zygosis and fusion of female and male nuclei). This suggests that such hybridization of the DNA fragment might result in the heterogeneous sperm effect.

3 Prospect

The heterogeneous sperm effect brought forth in gynogenetic fish induced by heterogeneous sperm is a rather complicated problem. The hybridization of DNA fragments that we have shown is just one important link in the course by which this effect is expressed in the phenotype of the hybrid fish. How this DNA fragment participates in that course, whether through insertion or replacement of a fragment for integration or as an extrachromosomal element in the zygote, and at which stage of development such hybridization of DNA fragments occurs, remain unknown. We are going to investigate these problems further through in situ hybridization and karyotype analysis.

References

1. Benter T., Optimization and reproducibility of random amplified polymorphic DNA in human. Analytical Biochemistry 230 (1996) pp. 92-100.
2. Besnard G., et al., Specifying introgressed region from H. argophyllus in cultivated sunflower. Theoretical and Applied Genetics 94(1) (1997) pp. 131-138.
3. Din Jun, Xie Yuefeng, The analysis of heterologous genetic materials in allogynogenetic crucian carp and its artificial hybrids. Acta Hydrobiologica Sinica 17(1) (1993) pp. 22-26.
4. Dirlewanger E., et al., Theoretical and Applied Genetics 93(5-6) (1996) pp. 909-919.
5. Fegan M., et al., Random amplified polymorphic DNA markers reveal a high degree of genetic diversity in the entomopathogenic fungus Metarhizium anisopliae var. anisopliae. J. Gen. Microbiol. 139 (1993) pp. 2075-2081.
6. Ge Wei, Shan Shixin, Fertilization biology of gynogenetic crucian carp (Carassius auratus gibelio), with a discussion on the reproductive modes of the naturally gynogenetic crucian carp. Acta Hydrobiologica Sinica 16(2) (1992) pp. 97-100.
7. Kobayasi, H. et al., On the hybrids, 4n ginbuns (C. auratus langsdorfii) × kinbuns (C. auratus subsp.) and their chromosome. Bull. Jap. Soc. Fish. 43(1) (1977) pp. 31-37.
8. Nei, M. et al., Proc. Natl. Acad. Sci. USA 75 (1979) pp. 5269.
9. Pinto F. M., et al., Molecular genetic characterization of plant somatic hybrids. In-Vitro-Plant 31(2) (1995) pp. 96-100.
10. Tibayrenc M., et al., Genetic characterization of six parasitic protozoa: parity between random-primer DNA typing and multilocus enzyme electrophoresis. Proc. Natl. Acad. Sci. USA 90 (1993) pp. 1335-1339.
11. Yamashita M.H., et al., Breakdown of the sperm nuclear envelopes is a prerequisite for male pronucleus formation: direct evidence from the gynogenetic crucian carp, Carassius auratus langsdorfii. Dev. Biol. 137 (1990) pp. 155-160.
12. Zhang X. Y., et al., Characterization of genomes and chromosomes in partial amphiploids of the hybrid Triticum aestivum × Thinopyrum ponticum by in situ hybridization, isozyme analysis and RAPD. Genome 39(6) (1996) pp. 1062-1071.

GENE EXPRESSION DURING CARROT SOMATIC EMBRYOGENESIS

NAIHU WU1, FENGQIU DIAO1, MEI QI1,2, YULAN CHENG1,2, LEI ZHANG1, MEIJUAN HUANG2, AND FAN CHEN1

1 Institute of Developmental Biology, Chinese Academy of Sciences, Beijing 100871, P. R. China
Email: [email protected]

2 College of Life Sciences, Peking University, Beijing 100871, P. R. China

A regulated developmental system of carrot was established and used to study gene expression during somatic embryogenesis. An improved cDNA representational difference analysis (RDA) was developed to isolate somatic embryogenesis related genes. After endogenous ABA levels of carrot somatic embryos were assayed under different sucrose concentrations in MS media and under deregulated culture, ABA was considered an important factor in sucrose signal transduction. A new cDNA fragment of an LEA gene induced by ABA was obtained from carrot somatic embryos in the regulated state. The result supports the view that the regulated-deregulated cultivation of carrot somatic embryos is similar to the dormancy-germination process of seeds.

1 Introduction

Higher plant embryogenesis has been studied intensively during the past century. The morphological and anatomical changes were described in detail to characterize embryonic development. Since the initial description of somatic embryo production from carrot callus cells more than 40 years ago [1], this unique developmental potential has been recognized both as an important pathway for the regeneration of plants from cell culture systems and as a potential model for studying early regulatory and morphogenetic events in plant embryogenesis. In 1993, Mei-juan Huang et al. [2] established a regulated developmental system of carrot somatic embryos. It was found that transfer of carrot callus cells into MS culture containing 6% sucrose led to arrest of development at the cotyledon embryo stage, resulting in a state resembling dormancy (regulated cultivation). Later, when the arrested embryo was transferred into MS culture containing 1% sucrose, it was reactivated to start normal post-embryonic development (deregulated cultivation). This regulation-deregulation effect shows the same temporal sequence as the dormancy-germination process of seeds in nature. Dormancy and germination of seeds are important events in the ontogeny of higher plants, yet studying these two events has been beset with difficulties. For example, great heterogeneity of germination is found among individual seeds, and it is extremely difficult, if not impossible, to control the threshold for initiating germination [3]. The established system appears able to overcome these difficulties, thus promising to serve as a good model system for studying the mechanisms underlying dormancy-germination of seeds.

2 Isolation of Somatic Embryogenesis Related Genes from Carrot

2.1 Improved cDNA representational difference analysis (RDA)

In recent years, many new approaches have been proposed for isolating development-related genes without knowledge of their encoded products [4]. Among them, cDNA representational difference analysis (RDA) is widely used. Combining subtractive hybridization with PCR amplification, this technique was first used by N. A. Lisitsyn et al. [5] to detect deleted DNA fragments in cancer cells. Later, M. Hubank and D. G. Schatz [6] succeeded in applying it to clone differentially expressed genes. Compared with other procedures, RDA has the advantages of a low false positive rate and high reproducibility. To prevent the Driver cDNA from being amplified, however, a large amount of Driver cDNA has to be cut with Dpn II before hybridization to remove the adaptors at both ends. After the first round of PCR amplification of the hybridization mixture of Driver cDNA and Tester cDNA, single-stranded DNA also has to be removed with mung bean nuclease or S1 nuclease, and it is only after the second round of amplification that the differential products can be obtained. Moreover, since both of these nucleases also cut double-stranded DNA non-specifically, target products may be lost. Finally, as the number of rounds of PCR amplification increases, the reproducibility is likely to decrease, the false positive rate to increase, and the cost of the experiment to multiply.

To overcome these shortcomings, we have developed a novel cDNA RDA procedure (Figure 1). The key point is the efficient removal of the Driver cDNA by biotin labeling, followed by coupling with Streptavidin Paramagnetic Particles (SA-PMP, Promega Inc.) and final capture on a Magnetic Stand. The 5'-end of the primer is first labeled with biotin, so that during the PCR amplification that produces the Driver cDNA the 5'-end of each strand acquires a molecule of biotin. The Driver Bio-cDNA so prepared can then be subjected directly to subtractive hybridization without removal of the adaptors with Dpn II. Capitalizing upon their strong affinity for biotin, SA-PMPs are added to the hybridization mixture to couple with the biotinylated double-stranded DNA fragments Driver:Tester and Driver:Driver. After capture on the Magnetic Stand, no double-stranded DNA of either Driver:Tester or Driver:Driver is left, and only the specific double-stranded Tester:Tester DNA remains in the solution. Since no single-stranded DNA is produced during PCR, there is no need for digestion of single-stranded product with mung bean nuclease or S1 nuclease, nor for a second round of amplification. The target differential product can be obtained directly with only one round of amplification.
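The selection logic of the biotin capture step can be illustrated with a small idealized model. The sketch below is not part of the published protocol: it assumes that Driver is in such excess that every Tester strand of a shared species ends up in a Driver:Tester duplex, and the species names and function names are hypothetical.

```python
def classify_duplexes(tester_species, driver_species):
    """List the duplexes formed after hybridization in an idealized model.

    Assumption: Driver is in large excess, so a species present in both
    populations yields only Driver:Tester and Driver:Driver duplexes.
    """
    duplexes = []
    for species in sorted(tester_species | driver_species):
        in_tester = species in tester_species
        in_driver = species in driver_species
        if in_tester and in_driver:
            duplexes.append((species, "Driver", "Tester"))
            duplexes.append((species, "Driver", "Driver"))
        elif in_driver:
            duplexes.append((species, "Driver", "Driver"))
        else:
            duplexes.append((species, "Tester", "Tester"))
    return duplexes


def magnetic_capture(duplexes):
    """Drop every duplex carrying at least one biotinylated (Driver) strand."""
    return [d for d in duplexes if "Driver" not in d[1:]]


# Hypothetical transcript populations, for illustration only.
tester = {"housekeeping", "induced"}
driver = {"housekeeping", "driver_only"}

remaining = magnetic_capture(classify_duplexes(tester, driver))
print(remaining)
```

In this model only Tester:Tester duplexes of species absent from the Driver survive the capture, which is why a single round of amplification can suffice.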


Gene Expression During Carrot Somatic Embryogenesis

[Figure 1 flowchart. Driver and Tester: reverse transcription to cDNA; digest with Dpn II; adaptor I ligation (R-Dpn-24/R-Dpn-12); melt and amplify with primer. Tester: digest with Dpn II; adaptor II ligation (J-Dpn-24/J-Dpn-12). Add streptavidin PMPs; magnetize with Magnetic Stand; remove Driver Bio-cDNA; digest with Dpn II; difference product; clone and analysis.]

Figure 1. Schematic diagram of improved cDNA representational difference analysis (RDA) procedure.


N. Wu et al.

The merits of this method are:
1. Driver cDNA can be used directly in hybridization without prior digestion with Dpn II.
2. The specificity of amplification is greatly enhanced, as the differential products can be obtained from the hybridization mixture after only one round of PCR amplification.
3. Loss of target products is prevented, since mung bean nuclease or S1 nuclease is not needed to remove single-stranded DNA that would otherwise appear.
4. Less time is needed for the operations.
5. The amounts of reagents needed, such as Dpn II, Taq DNA polymerase, primers, and dNTPs, are much reduced.

Our tests indicate that the procedure is a promising method for isolating and cloning development-related genes in both higher animals and plants.

2.2 Gene cloning related to somatic embryogenesis

The carrot (Daucus carota L.) selected for experimentation was a commercial variety from Japan. Carrot somatic embryos were prepared according to the procedure reported by Mei-juan Huang et al. [2]. The callus cells were suspended directly in the regulation culture. After they had been induced to form somatic embryos, the culture conditions were gradually adjusted so as to synchronize their development. In the regulated culture, development of the somatic embryos was stopped at the cotyledon stage. They were then transferred to the deregulation culture and, after a defined period of further culture, taken out for the experiment.

In the present study, we synthesized cDNA from mRNA extracted from carrot somatic embryos that had been cultured in regulated medium and from embryos that had been cultured for 12 hours in deregulated medium. The synthesized cDNA, after digestion with Dpn II, was ligated to adaptor I, R-Dpn-24/12 (R-Dpn-24, 5'-AGCACTCTCCAGCCTCTCACCGCA-3'; R-Dpn-12, 5'-GATCTGCGGTGA-3'). Biotin-labeled and unlabeled R-Dpn-24 were used to amplify Driver cDNA and Tester cDNA respectively. Purified Tester cDNA, after digestion with Dpn II, was ligated to adaptor II, J-Dpn-24/12 (J-Dpn-24, 5'-ACCGACGTCGACTATCCATGAACA-3'; J-Dpn-12, 5'-GATCTGTTCATG-3').

The first round of hybridization was performed after combining the Driver Bio-cDNA with Tester cDNA in a ratio of 40:1. After addition of TE buffer (10 mM Tris-HCl, 1 mM EDTA, pH 8.0) to 50 µl, the hybridization mixture was extracted with phenol/chloroform (1:1) and precipitated at -20°C with 150 µl of ethanol and 5 µl of 3 M NaAc. After centrifugation for 5 min at 10,000 g, the precipitate was dissolved in 0.75 ml of 0.5× SSC. Separately, 0.9 ml of SA-PMP was captured on the Magnetic Stand to remove the protective solution, washed three times with 0.45 ml of 0.5× SSC, and finally resuspended in 0.15 ml of 0.5× SSC. The hybridization mixture dissolved in 0.5× SSC was then added to the SA-PMP suspension and kept at room temperature for 10 min to allow full binding of SA-PMP to Driver Bio-cDNA, forming SA-PMP-Bio-cDNA. After capture on the Magnetic Stand for 30 sec, the supernatant was carefully drawn off. The above procedure was repeated with fresh SA-PMP to completely remove the double-stranded DNA of Driver:Driver and Driver:Tester.

20 µl of supernatant was used for DP1 amplification, following the procedure reported by M. Hubank and D. G. Schatz [6] except that the number of cycles was changed to 15. The PCR products were combined, cut with Dpn II and ligated to adaptor III, N-Dpn-24/12 (N-Dpn-24, 5'-AGGCAACTGTGCTATCCGAGGGAA-3'; N-Dpn-12, 5'-GATCTTCCCTCG-3'), for the second round of hybridization. Next, after ligation to adaptor II, the third round of hybridization was performed with a higher ratio of Driver Bio-cDNA to Tester cDNA. After three rounds of hybridization, distinct bands could be observed in the lanes in which cDNA of carrot somatic embryos cultured for 12 hours in deregulated medium was used as Tester (Figure 2). The amplified products carrying adaptor II were recovered and re-amplified by PCR using J-Dpn-24 as primer. Several DNA fragments were separated and recovered by agarose gel electrophoresis (Figure 3). These fragments were named NR-1 to NR-7. They were subcloned into the pBluescript SK- vector and sequenced (Figure 4).
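The design of the three adaptor pairs can be checked computationally. The sketch below takes the six oligonucleotide sequences as listed in the text. It tests that each 12-mer consists of GATC (the Dpn II recognition sequence) followed by eight bases that are the reverse complement of the 3' end of its 24-mer partner. The helper names are illustrative, and the interpretation of the pairing is an assumption based on the sequences alone.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


# (24-mer, 12-mer) pairs as given in the text.
ADAPTORS = {
    "R-Dpn": ("AGCACTCTCCAGCCTCTCACCGCA", "GATCTGCGGTGA"),
    "J-Dpn": ("ACCGACGTCGACTATCCATGAACA", "GATCTGTTCATG"),
    "N-Dpn": ("AGGCAACTGTGCTATCCGAGGGAA", "GATCTTCCCTCG"),
}


def check_adaptor(long_oligo, short_oligo, overhang="GATC"):
    """True if the short oligo is overhang + reverse complement of the
    3' end of the long oligo, i.e. the two can anneal and leave a
    GATC-compatible end."""
    if not short_oligo.startswith(overhang):
        return False
    annealing_part = short_oligo[len(overhang):]
    return long_oligo.endswith(reverse_complement(annealing_part))


results = {name: check_adaptor(*pair) for name, pair in ADAPTORS.items()}
print(results)
```

All three pairs pass this check, which is consistent with the 12-mers serving only to position the 24-mers on the Dpn II-cut ends.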

Figure 2. Isolation of specifically expressed cDNA fragments by improved representational difference analysis (RDA). M: standard molecular weight marker (pBR322/Msp I). A, C, E, G: quiescent embryos as tester. B, D, F, H: somatic embryos in deregulated culture for 12 hrs as tester. A, B: amplification of representational population; C, D: amplification after the first round of subtractive hybridization; E, F: amplification after the second round of subtractive hybridization; G, H: amplification after the third round of subtractive hybridization.

After the cDNA fragments had been sequenced, homology comparison was performed against the GenBank/EMBL/DDBJ databases by BLAST search. The results showed that all of these cDNA fragments were found in carrot for the first time. NR-1, NR-3, NR-6, and NR-7 showed high homology to, respectively, the cDNA sequence of the DnaJ gene from S. tuberosum (77%), glycolytic glyceraldehyde-3-phosphate dehydrogenase (91%), the LEA76 type 1 protein gene from A. thaliana (62%), and xyloglucan endotransglycosylase from tomato (68%). No sequences of high homology were found for NR-2, NR-4 and NR-5.


Figure 3. Re-amplification of specific cDNA fragments. M: standard molecular weight marker (pBR322/Msp I). Lanes 1-4: re-amplification of the specifically expressed cDNA fragments NR5, NR6, NR1 and NR7.

Northern blotting was performed to characterize gene expression. A fragment labeled with 32P-dCTP was used as probe in Northern hybridization with total RNA extracted from carrot somatic embryos cultured in regulated medium and from embryos cultured for 12 hours in deregulated medium. The test showed that the gene was expressed only in carrot somatic embryos that had been cultured for 12 hours in deregulated medium, thus verifying the validity of our method (Figure 5).

3 Effects of Sucrose Regulation Culture on Endogenous ABA Levels of Carrot Somatic Embryo

The plant hormone abscisic acid (ABA) modulates the growth and development of plants, particularly during seed formation and during responses to environmental stresses involving loss of water [7]. Studies of the regulated developmental system of carrot somatic embryos [8] suggest that ABA may play an important role during the growth and development of carrot.


Figure 4. Nucleotide sequences of the specific cDNA fragments NR1 (321 bp), NR2 (241 bp), NR3 (219 bp), NR4 (187 bp), NR5 (420 bp), NR6 (393 bp) and NR7 (307 bp).


Figure 5. Northern blot using NR-4 as a probe. 1. Quiescent embryos; 2. Somatic embryos in deregulated culture for 12 hrs.

Endogenous ABA levels of carrot somatic embryos and their organs were assayed by ELISA under different sucrose concentrations in MS media. The results showed that the endogenous ABA level increases during development of the somatic embryo and is highest at the cotyledonary stage. ABA levels differ little among treatments at the early developmental stages, and the differences become larger as the cotyledonary embryos grow. The endogenous ABA concentration of carrot somatic embryos remained at a higher level in culture solution containing 6% sucrose (Figure 6). The result indicates that the inhibition of carrot somatic embryogenesis by sucrose is associated with a high level of endogenous ABA, although the nature of this relationship remains unclear.

Figure 6. Changes of endogenous ABA levels of carrot somatic embryos under different sucrose concentrations in MS media.

Figure 7. Effect of de-regulation culture on endogenous ABA levels of carrot somatic embryo. Bars: Regulation, De-regulation.

Endogenous ABA levels of carrot somatic embryos and their organs were also assayed under regulated and deregulated culture conditions. The endogenous ABA level decreased markedly once regulated embryos were transferred into deregulation culture (Figure 7). The ABA content of regulated embryos and their organs did not change over two months. These results imply that the sucrose concentration in the medium induces changes in the ABA level of somatic embryos, and that these changes depend on the developmental stage. A high level of ABA in the medium can maintain the quiescent embryo. There are some differences between the effects of exogenous ABA and of high sucrose concentration on carrot somatic embryos. ABA is probably an important factor in the transduction of the sucrose signal.

4 Gene cloning of LEA protein from carrot and analysis of its expression features

Lea proteins (late embryogenesis abundant proteins) are proteins of low molecular weight that are formed during the late stage of seed development, as the seeds mature and enter dormancy. Widely found in the seeds of higher plants, they are characterized by high affinity for water and high heat stability. Most lea proteins show specific expression patterns during seed development. As a rule, the accumulation of lea proteins in seeds starts at the end of seed maturation and the initiation of desiccation, reaches a peak during the subsequent stage of desiccation, and disappears rapidly within 2 days after germination [9].

A pair of primers was first designed based on the known conserved sequences of lea genes: the upstream primer (USP, 5'-ATGGCGTCTCATCAGGAACAG-3') and the downstream primer (DSP, 5'-TAGCCATACCCTTCACCTGCT-3'). Total RNA was extracted from carrot somatic embryos using TRIzol Reagent, and DNase I was then added to remove the remaining DNA. cDNA was synthesized by reverse transcription. After preliminary denaturation at 94°C for 3 min, PCR was run for 30 cycles of 94°C for 1 min, 52°C for 1 min and 72°C for 3 min, with a final extension at 72°C for 10 min. A unique band was obtained from the cDNA of carrot somatic embryos in the regulated state. The RT-PCR product was subcloned into the pBluescript SK- vector by the T-A method, and the cDNA fragment was named Dc226. The 381 bp of the Dc226 fragment were then sequenced by the dideoxy method; the sequence is shown in Figure 8 (primers in lowercase).

atggcgtctcatcaggaacag AGCTACAAAG CTGGTGAAGC
CAAAGGCCAT GCTCAGGAGA AAACAGGACA GATGGCTGAT
ACAATGAAAG ACAAGGCCCA AGCTGCCAAG GACAAGGCCT
CCGAAATGGC TGGATCTGCC AGGGACCGGA CAGTTGAGTC
CAAGGATCAG ACAGGCAGCT ATGTTTCGGA CAAGGCGGGA
GTGAAGGACA AGACATGTGA GACGGCTCAG GCGGCCAAGG
AGAAGACAGG AGGAGCCATG CAGGCCACCA AGGAGAAGGC
TTCTGAGATG GGAGAGTCTG CCAAGGAGAC TGCTGTGGCA
GGGAAAGAGA AGACCGGGGG GCTCATGTCT TCGGCGGCTG
agcaggtgaagggtatggcta

Figure 8. Nucleotide sequence of Dc226 (primers in lowercase).
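As a consistency check on the primers and on the sequence in Figure 8, the sketch below verifies that the sequence begins with the USP and ends with the reverse complement of the DSP, then derives a predicted amino acid sequence in reading frame 1. The sequence rows are transcribed from Figure 8 and read row by row, which is an assumption about the figure layout. The helper names are illustrative and the standard genetic code is assumed.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(seq):
    """Translate complete codons in frame 1; a trailing partial codon is ignored."""
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


USP = "ATGGCGTCTCATCAGGAACAG"  # upstream primer from the text
DSP = "TAGCCATACCCTTCACCTGCT"  # downstream primer from the text

# Rows of Figure 8 (assumed to be read left to right, top to bottom).
ROWS = [
    "atggcgtctcatcaggaacag AGCTACAAAG CTGGTGAAGC",
    "CAAAGGCCAT GCTCAGGAGA AAACAGGACA GATGGCTGAT",
    "ACAATGAAAG ACAAGGCCCA AGCTGCCAAG GACAAGGCCT",
    "CCGAAATGGC TGGATCTGCC AGGGACCGGA CAGTTGAGTC",
    "CAAGGATCAG ACAGGCAGCT ATGTTTCGGA CAAGGCGGGA",
    "GTGAAGGACA AGACATGTGA GACGGCTCAG GCGGCCAAGG",
    "AGAAGACAGG AGGAGCCATG CAGGCCACCA AGGAGAAGGC",
    "TTCTGAGATG GGAGAGTCTG CCAAGGAGAC TGCTGTGGCA",
    "GGGAAAGAGA AGACCGGGGG GCTCATGTCT TCGGCGGCTG",
    "agcaggtgaagggtatggcta",
]

seq = "".join(ROWS).replace(" ", "").upper()
protein = translate(seq)

print("starts with USP:", seq.startswith(USP))
print("ends with reverse complement of DSP:", seq.endswith(reverse_complement(DSP)))
print(protein)
```

Both primer checks succeed for the sequence as transcribed, and the reading frame that starts at the ATG of the USP contains no stop codon.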

Homology analysis was based on comparison of the predicted amino acid sequence of Dc226 (114 amino acid residues) with those of Dc3, Dc14 and Dc16, all members of the Dc3 family. The homologies were 90.4%, 96.7% and 85.1% respectively (Figure 9). Dc3 is a family of carrot lea proteins [10]. It is thus reasonable to regard Dc226 as a cDNA fragment of a new member of the Dc3 family of lea proteins.

Northern blotting was used to analyze the expression of the lea gene during carrot somatic embryogenesis. Total RNA was extracted from three groups of samples: those cultured in the regulated state, those deregulated for 12 hours, and those deregulated for 24 hours. 30 µg of RNA from each group was separated by formaldehyde gel electrophoresis and transferred to Zeta-Probe filters (Bio-Rad Inc.). Northern blotting with 32P-labelled Dc226 as probe gave a strong signal for the regulated culture, indicating strong expression of the lea gene; the signal was much reduced 12 hours after deregulation and had virtually disappeared 24 hours after deregulation (Figure 10). Northern analysis thus showed the same temporal sequence of gene expression in carrot somatic embryos going from the regulated to the deregulated state as in natural seeds going from dormancy to germination.

During development, both zygotic embryos and somatic embryos of carrot undergo the following stages: Globular Embryo → Heart-shaped Embryo → Torpedo-shaped Embryo → Cotyledon Embryo.

Figure 9. Comparison of predicted amino acid sequences between Dc226 and other members of the Dc3 family (Dc3, Dc14 and Dc16).

When they reach the last of these stages, zygotic embryos enter the maturation stage and become dormant; only under appropriate conditions (temperature, moisture and light) is the dormancy interrupted and germination induced. Somatic embryos, on the other hand, develop directly into seedlings without a stage of dormancy [11]. As shown by our experiments, however, the state of the somatic embryo can easily be controlled merely by changing the sucrose concentration in the culture. Moreover, great similarity was demonstrated between regulation-deregulation in somatic embryos and dormancy-germination in seeds, as judged from changes in morphology, gene expression and endogenous hormone content. This suggests that in-depth study of the molecular mechanism underlying this regulation-deregulation system might shed light on the molecular details of gene expression dynamics in seed development. Compared with studies using seeds, this system offers several advantages: high sensitivity, great convenience, easy synchronization and easy access to large amounts of experimental material. Using the method of representational difference analysis (RDA), we have been able to isolate specific carrot radicle development-related cDNA fragments from this system [12].

The plant hormone ABA (abscisic acid) plays an important role in seed development. Studies on maize and Arabidopsis thaliana indicate that ABA promotes dormancy of seeds, and mutants with impaired biosynthesis of ABA tend to show precocious germination [13]. There is ample evidence that stresses of various kinds (salt stress, drought stress, osmotic stress) increase the content of endogenous ABA [14]. In addition, ABA induces many physiological effects, including changes in the expression of lea genes [15]. In our experiments [16], ABA showed effects parallel to those of sucrose. Further studies are needed, however, to probe the molecular mechanisms underlying the developmental arrest of the carrot somatic embryo at the cotyledon stage and to clarify the roles of sucrose and ABA.

Figure 10. Northern blotting using Dc226 as probe. A: Total RNA under regulated state; B: Total RNA 12 hours after deregulation; C: Total RNA 24 hours after deregulation.

5 Acknowledgement

This work was supported by a grant from the National Natural Science Foundation of China (NSFC) and the National Laboratory of Protein Engineering and Plant Genetic Engineering, Peking University.

References

1. Steward, F.C., Mapes, M.O., and Smith, J. Am. J. Bot. 45 (1958) pp. 693-703.
2. Huang, M., Huang, S. et al. Chinese Science Bulletin 38 (1993) pp. 550-553.
3. Bewley. The Plant Cell 9 (1997) pp. 1055-1066.
4. Diao, F. Progress in Biotechnology 18 (1998) pp. 12-17.
5. Lisitsyn, N., and Wigler, M. Science 259 (1993) pp. 946-951.
6. Hubank, M. and Schatz, D.G. Nucleic Acids Res 22 (1994) pp. 5640-5648.
7. Chandler, P.M. and Robertson, M. Annu Rev Plant Physiol Plant Mol Biol 45 (1994) pp. 113-141.
8. Cheng, Y., Diao, F., Wu, N., et al. Acta Botanica Sinica 41 (1999) pp. 761-765.
9. Dure, L. III. The Lea Protein of Higher Plants. In: Verma, Desh Pal S., ed. Control of Plant Gene Expression (CRC Press Inc., Boca Raton, 1993) pp. 325-335.
10. Seffens, W.S., Almoguera, C., Wilde, H.D., et al. Developmental Genetics 11 (1990) pp. 65-76.
11. Zimmerman, J.L. The Plant Cell 5 (1993) pp. 1411-1423.
12. Diao, F., Zhang, L., Huang, M., Wu, N. Science in China (2000) in press.
13. Leung, J., Giraudat, J. Annu Rev Plant Physiol Plant Mol Biol 49 (1998) pp. 199-222.
14. Giraudat, J., Parcy, F., Bertauche, N., et al. Plant Molecular Biology 26 (1994) pp. 1557-1577.
15. Rock, C.D., Quatrano, R.S. The role of hormones during seed development. In: Davies, P.J., ed. Plant Hormones (Kluwer Academic Publishers, The Netherlands, 1995) pp. 671-697.
16. Qi, M., Chen, F., Huang, M., Wu, N. Chinese Science Bulletin (2000) pp. 156-160.

EPIGENETIC MODIFICATIONS IN MAIZE PARENTAL INBREDS AND HYBRIDS AND THEIR RELATIONSHIP TO HYBRID VIGOR AND STABILITY

ATHANASIOS S. TSAFTARIS, ALEXIOS N. POLIDOROS, AND ELENI TANI

Department of Genetics and Plant Breeding, Aristotelian University of Thessaloniki, 54006 Thessaloniki, Greece
E-mails: A.S.T.: [email protected], A.N.P.: [email protected], E.T.: [email protected]

DNA methylation is an epigenetic, genome-wide general regulatory mechanism that affects qualitatively and quantitatively the expression of many genes and has been considered important for the manifestation of heterosis. DNA methylation in maize was found to be genotype, tissue and developmental stage specific. Growth conditions affected the level and pattern of DNA methylation. Our studies indicated that hybrids were less methylated than their parental inbreds and remained less methylated under stress. These findings support the hypothesis that selection of inbreds may lead to gradual accumulation of methylated sites, which could be released and/or re-patterned when the lines are crossed to generate hybrids.

1 Introduction

Hybrid vigor, or heterosis, usually refers to the increase in size or growth rate of offspring over their parents. It can be related to different characters of agronomic importance, including yield. Elucidation of the molecular mechanism(s) of hybrid vigor remains a major challenge. Biometrical approaches can only evaluate average genetic effects on heterosis, while studies using physiological and biochemical approaches may contribute to a limited understanding of this phenomenon. Studies closer to the molecular level, however, may provide the most significant information, which may ultimately help to elucidate the genetic basis of heterosis.

In order to obtain genetic information concerning heterosis at the molecular level, we examined the expression of 35 random genes in different maize tissues and developmental stages. As genetic material we tested three maize parental inbreds and two of their hybrids, one highly heterotic and the other non-heterotic. Our results indicated significant quantitative differences in the expression of the 35 tested genes among the 5 genotypes at every developmental stage. The highly heterotic hybrid exceeded the non-heterotic hybrid as well as the inbreds in genome activity (average of the mRNA quantities for the 35 tested genes) at the first three developmental stages. Similarly, significant differences were observed when the male parents of the two half-sib hybrids were compared [1]. Thus, expression of the tested genes was developmental stage and genotype dependent. These results indicated that stable (under different conditions) and increased expression of certain loci might be important in the phenotypic manifestation of vigor, underlining the significance of regulatory mechanisms controlling gene expression in the molecular basis of heterosis.

One primary genome-wide regulatory mechanism of this kind involves post-replicative covalent modification of DNA by methylation of cytosine bases. Typically in plants, the modification is methylation of cytosine bases in the dinucleotide CpG and the trinucleotide CpNpG, where N can be any of the four nucleotide bases [2]. Because the methylated sequence is palindromic, both strands of DNA can be methylated. The modification is inherited epigenetically because of the existence of a system that recognizes hemimethylated sequences (with one strand modified) and converts them to fully methylated ones (with both strands modified). Removing the methyl group can reverse the epigenetic state. DNA methylation is involved in a variety of cellular processes, such as differentiation, cell cycle progression, X-chromosome inactivation and genomic imprinting. The mechanism by which DNA methylation influences these processes relies on its ability to repress gene expression [3]. DNA methylation in plants has been shown to be involved in regulation of gene expression, creation of genetic and epigenetic variation, hybrid vigor, and transgene inactivation, among others [4].

In order to examine the possible role of DNA methylation in the manifestation of heterosis, we performed detailed studies on the extent of cytosine methylation in maize genomic DNA and its variation among different genotypes (parental inbred lines and hybrids) and developmental stages. We also examined the role of stress conditions in the field as a factor affecting genomic DNA methylation in relation to heterosis.
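The strand symmetry of CpG and CpNpG sites, which is what allows a hemimethylated site to be restored to full methylation, can be made concrete with a short sketch. The code below is illustrative only: the function names and the example sequence are hypothetical, and a site is called symmetric when its reverse complement is again a CG or CNG site.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_sites(seq):
    """Return sorted (position, site) pairs for every CG and CNG on this strand."""
    sites = []
    for pattern in ("CG", "C[ACGT]G"):
        # A lookahead is used so that overlapping sites are all reported.
        for match in re.finditer(f"(?=({pattern}))", seq):
            sites.append((match.start(), match.group(1)))
    return sorted(sites)


def is_symmetric(site):
    """True if the opposite strand also reads as a CG or CNG site."""
    return re.fullmatch("CG|C[ACGT]G", reverse_complement(site)) is not None


# Hypothetical example sequence.
example = "ATCCGGTCAGT"
for position, site in find_sites(example):
    print(position, site, reverse_complement(site), is_symmetric(site))
```

Every CG and CNG site passes the symmetry test, because the complement of the flanking C is G and vice versa, so the cytosine on the opposite strand sits in the same sequence context.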

2 DNA Methylation in Maize

The percentage of 5-methylcytosine in eukaryotic DNA varies over a wide range: from 0.03% of total cytosine in some insects, through 2% to 8% in mammals, to as high as 50% in higher plants [5]. Maize DNA methylation has been estimated at an average level of 27.2% [6]. Variation in the level of DNA methylation has been reported in different plant species, including napiergrass (Pennisetum purpureum Schum.) [7], Daucus carota [8], and tomato [9]. In our studies we estimated the percentage of 5-methylcytosine in maize genomic DNA by HPLC in several genotypes and in different tissues, developmental stages and growth conditions [10].

The tested genotypes were classified into 3 groups. The first group included three inbreds and two of their hybrids, one highly heterotic and the other non-heterotic. The methylation level was estimated in the leaf at 4 different developmental stages. DNA methylation of these genotypes was between 27.2% and 30.6%, with an average value of 28.4%.

The second group included 10 S2 lines of similar pedigree, selected from cycle six (C6) of a population originating from the F2 of a single F1 hybrid. DNA methylation of the 10 S2 second cycle lines was examined along with that of the original progenitor parental inbreds and their F1 hybrid. One parental line (a very low yielding inbred) had 31.4% methylation, one of the highest values observed. The second parental line had 28.3% methylation and their F1 hybrid 27.4%. Methylation of the 10 S2 lines varied from 25.8% to 23.8%, with an average of 24.9%, and was significantly lower than that of their progenitors. This indicates that breeding for higher yield in the original F2 population for 5 consecutive generations using the mass honeycomb selection design was accompanied by a concomitant lowering of the methylation level in the derived S2 second cycle lines.

The third group of plant material included inbred lines of different pedigree, commonly used as parents in hybrid seed production. DNA methylation of these commercial lines varied from 32.1% to 27.7%, and all lines had a higher methylation level than the selected S2 lines. Leaf DNA methylation of the inbreds of different pedigree was higher than that of the shoot: the mean leaf methylation was 29.6% and differed significantly from the 26.8% mean methylation of the shoot.

The methylation of several sections of the leaf of the heterotic hybrid (first group) was also examined. The leaf of this hybrid, as an organ, had a 5-mC content of 28.5%. Sections of the leaf were found to be methylated at 27.6% for midrib and sheath, 31.6% for the upper blade section and 31.4% for the middle blade section. Methylation of the leaf was higher than that of the roots, which had 24.6% methylation, one of the lower values recorded in this study.

In conclusion, maize DNA methylation in different genotypes, tissues and developmental stages was found to vary from 32.8% to 21.8%. The average methylation of all samples was 27.6%, similar to the 27.2% estimated by Amasino et al. [5].
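The methylation percentages quoted in this section are expressed relative to total cytosine. A minimal sketch of that calculation follows. It assumes the percentage is defined as 5mC / (5mC + C) × 100 from the molar amounts of the two bases measured by HPLC. The function name and the input amounts are hypothetical, and the source does not state the exact formula used.

```python
def percent_5mc(mc_amount, c_amount):
    """Percentage of total cytosine that is 5-methylcytosine.

    Both arguments are molar amounts (any common unit) of 5-methylcytosine
    and unmethylated cytosine, e.g. as quantified from HPLC peaks.
    """
    total = mc_amount + c_amount
    if total <= 0:
        raise ValueError("total cytosine must be positive")
    return 100.0 * mc_amount / total


# Hypothetical amounts chosen only to illustrate the arithmetic.
print(f"{percent_5mc(2.84, 7.16):.1f}% of cytosines methylated")
```

Expressing the value as a fraction of total cytosine makes samples with different DNA yields directly comparable.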

3 Changes of DNA Methylation during Maize Development

Twenty-five years ago Riggs [11] and Holliday and Pugh [12] proposed that DNA methylation was part of a system controlling gene expression in mammalian cells. Even though the term control suggests reversibility, subsequent work has shown that the most likely function of DNA methylation is permanent gene suppression mediated through the modification of CpG islands located in the promoter regions of genes. Accumulated data convincingly demonstrate that methylation of CpG-rich promoters can be extensive and developmentally regulated, and can result in downregulation, presumably by the setting up of an inactive chromatin configuration. Importantly, this permanent inactivation can normally be reversed only by passage through the germline or during early development, because there are no known examples in which a methylated CpG island has been shown to become demethylated after implantation. Therefore, the data are most consistent with the idea that CpG island methylation acts as a 'one-way street' to inactivate one or two alleles of a gene or to silence promoters permanently throughout the life of an organism [13].

It is unclear, however, how well the mammalian pattern of methylation erasure and resetting fits plants, because it is difficult to measure cytosine methylation levels in early plant development owing to the inaccessibility of very early plant embryos, which are encased in maternal tissue. It is possible, though, to measure DNA methylation content in pollen and in post-embryonic tissues of varying age. Information from studies of this type suggests that there is a gradual rise in 5-methylcytosine levels in post-embryonic tissues produced by meristems at positions further from the base of the plant (i.e. tissues of increasing age). Genetic studies of transposon systems in maize also demonstrate an age-dependent gradient of increasing epigenetic modification, which is correlated with DNA methylation [14].

We examined the role of development in DNA methylation using one group of genotypes comprising three inbreds and two of their hybrids, one highly heterotic and the other non-heterotic. Total methylation was estimated at 4 developmental stages. Samples were prepared from the three upper leaves, including the interval shoot portion, of plants grown in the field for 20 days (stage 1) and 32 days (stage 2), and from the leaf of the first emerging ear at 47 days (stage 3) and 66 days (stage 4). The average DNA methylation of the three upper leaves increased from 27.48% at stage 1 to 29.3% at stage 2. However, there was no difference in leaf methylation between stage 3 and stage 4: it was 28.4% at both stages.

4 The Effects of Growth Conditions

In order to study the impact of the environment, we estimated DNA methylation in a group of genotypes grown under two different plant densities: 1.5 m (spaced) and 25 cm (dense) distance between individual plants, corresponding to densities of 0.513 plants/m² and 18.5 plants/m² respectively. The tested material included five S4 lines selected from the F2 of the commercial hybrid 'Pioneer 3183'. Two more unrelated lines were also included in the experiment for comparison. Samples were prepared at two developmental stages: an early stage at which the density stress was negligible and a late stage at which the density stress was fully imposed. In spaced planting, inbreds showed on average 27.18% and 27.97% methylation at the two stages respectively, which is comparable to that of older commercially used inbreds but significantly higher than that of the selected high yielding S2 lines from the F5 selected population (see above). In dense planting the same inbreds showed 27.54% methylation at the early stage and 29.19% methylation at the late stage. Thus, at the early stage, when the density-induced stress is negligible, the inbreds showed no difference in methylation between the two planting conditions. In contrast, at the late stage, when density-induced stress was established, methylation was significantly higher under dense than under spaced planting conditions.

The situation is different in hybrids. At the late stage a group of hybrids showed exactly the same methylation in dense and in spaced planting. This indicates that the hybrids are more resistant to density-induced stress, at least as reflected in their total genome methylation [10]. These results are in agreement with previous data obtained from studies of the role of growth conditions in the methylation of an individual sequence, namely the Ac element of maize. When we quantified the frequency of demethylation (and thus activation) of a methylated Ac element, we found that for three consecutive years demethylation of the methylated element was significantly more frequent in plants grown under spaced conditions than in plants grown under the more stressful, dense conditions [15].

Site-specific Methylation Changes Induced by Stress

The effect of density stress on methylation at random sites of genomic DNA in maize was also examined using the Coupled Restriction Enzyme Digestion and Random Amplification (CRED-RA) assay. The CRED-RA technique is based on the assumption that a DNA fragment cannot be amplified if it contains a specific restriction site in the region between the two primer binding sites and that site is cut by restriction enzyme (RE) digestion prior to PCR. If DNA methylation of the restriction site prevents digestion at that site within the genomic fragment, the fragment can be amplified (Fig. 1). However, the amplified product will be susceptible to cleavage, because the restriction site is not methylated during DNA amplification. Thus, comparing the banding patterns of a sample amplified without restriction and the same sample amplified after restriction can reveal the presence of DNA methylation. CRED-RA has been used to identify methylation and to map methylation polymorphism in citrus [16] and maize [17]. In our study, genomic DNA was isolated from the same genotypes grown under spaced and dense planting and was digested with HpaII prior to PCR amplification with random 10-mer primers. PCR samples of the same genotype grown under dense and spaced conditions were electrophoresed side by side. Thus, the presence of a band in the sample from only one of the two planting densities indicates the presence of a methylated site for the tested genotype under that planting density. The presence of a band in both samples for the same genotype indicates either the absence of HpaII sites between the primer locations or that any HpaII sites in this region are methylated at both planting densities. Plant material included 6 inbred lines and 6 hybrids resulting from crosses of these inbreds, tested with 10 primers, and 7 unrelated hybrids tested with 8 primers. The random 10-mer primers (AB0320-kit 7) were from Advanced Biotechnologies, Surrey, UK. An example of the amplification pattern detected with the CRED-RA assay in these materials is presented in Fig. 2.
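The assay logic described above can be sketched in a few lines of Python. This is only an illustration of the principle, not the authors' procedure: the function names, the example fragment and the way methylation is represented (a set of site positions) are all assumptions made for the sketch.

```python
def ccgg_sites(fragment):
    """Return the 0-based start position of every CCGG (HpaII) site."""
    seq = fragment.upper()
    return [i for i in range(len(seq) - 3) if seq[i:i + 4] == "CCGG"]


def amplifies_after_hpaii(fragment, methylated_sites=()):
    """True if the fragment between the primers survives HpaII digestion.

    The fragment stays intact, and can therefore be amplified, only if every
    CCGG site it contains is methylated (or if it contains no CCGG site).
    """
    protected = set(methylated_sites)
    return all(site in protected for site in ccgg_sites(fragment))


# Hypothetical fragment (not from the study) with two CCGG sites.
fragment = "ATGCCGGTTACCGGA"
print(ccgg_sites(fragment))                      # site positions
print(amplifies_after_hpaii(fragment))           # unmethylated: cut, no band
print(amplifies_after_hpaii(fragment, {3, 10}))  # fully methylated: band
```

A fragment with no CCGG site at all also returns True, which is why a band present at both planting densities cannot by itself distinguish "no site" from "site methylated in both".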

[Figure 1 schematic (Coupled Restriction Enzyme Digestion - Random Amplification): a CCGG site lies between Primer A and Primer B. After digestion with HpaII, cleavage of the CCGG site leads to no amplification, whereas no cleavage of the methylated C*CGG site leads to PCR amplification. The PCR products are then resolved by electrophoresis.]

Figure 1. Schematic representation of the concept of the CRED-RA assay used for site-specific detection of DNA methylation.

[Figure 2 gel image: lanes for a 1 kb marker and for A, B and AxB under SPACED planting, followed by A, B and AxB under DENSE planting.]

Figure 2. An example of the CRED-RA analysis in DNA of two inbred lines A and B and their hybrid AxB, grown under dense or spaced conditions. The sequences of 700 bp and 450 bp indicated with arrows are methylated in dense planting in line A but unmethylated in widely spaced planting. The 450 bp sequence is methylated in the AxB hybrid under spaced but not under dense planting.

The accumulation of methylated sites under spaced and dense planting in inbreds and hybrids is shown in Fig. 3. The bar designated as equal in Fig. 3 represents the presence of a band at both planting densities, indicating either the absence of HpaII sites between the primer locations or that any HpaII sites in this region are methylated at both planting densities. More sites of this category are present in hybrids. The bar designated as dense represents the appearance of a band only in the dense planting, indicating the presence of a site(s) methylated only in dense planting. Inbreds have a significantly higher number of such sites than hybrids. The bar designated as spaced represents the appearance of a band only in the spaced planting, indicating the presence of a site(s) methylated only in spaced planting. A low number of such sites is present in hybrids, and an even lower number in inbreds.
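The three categories used in Fig. 3 amount to a set comparison of the bands seen for one genotype at the two densities. A minimal sketch follows; the function name is an assumption, and the 1000 bp band in the example is hypothetical, added only so that the "equal" category is not empty. The 700 bp and 450 bp bands follow the description of line A in Fig. 2.

```python
def classify_bands(spaced_bands, dense_bands):
    """Sort band sizes (bp) into the three Fig. 3 categories."""
    spaced, dense = set(spaced_bands), set(dense_bands)
    return {
        "spaced": sorted(spaced - dense),  # band only in spaced planting
        "dense": sorted(dense - spaced),   # band only in dense planting
        "equal": sorted(spaced & dense),   # band at both densities
    }


# Line A as described for Fig. 2, plus one hypothetical shared band.
print(classify_bands(spaced_bands=[1000], dense_bands=[1000, 700, 450]))
```

Counting the entries in each category over all primers and genotypes would give bar heights of the kind plotted in Fig. 3.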

[Figure 3 bar chart: vertical axis from 0 to 80; bars for Spaced, Dense and Equal, shown separately for Hybrids and Inbreds.]

Figure 3. Estimation of methylated sites present in inbreds and hybrids under dense and spaced planting using the CRED-RA analysis. Dense represents the appearance of a band only in the dense planting, indicating the presence of a site(s) methylated only in dense planting. Spaced represents the appearance of a band only in the spaced planting, indicating the presence of a site(s) methylated only in spaced planting. Equal represents the presence of a band at both planting densities, indicating either the absence of HpaII sites between the primer locations or that any HpaII sites in this region are methylated at both planting densities.

In conclusion, significant differences were revealed between inbreds and hybrids examined for site-specific methylation using the CRED-RA analysis. F1 hybrids accumulated fewer methylated sites than inbreds. There were more methylated sites in inbreds under dense (more stressful) planting conditions than under spaced planting. These data indicate that hybrids may be more resistant to density-induced methylation than their parents.
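For orientation, the dense and spaced plantings compared here were specified in Section 4 as 25 cm and 1.5 m between plants, or 18.5 and 0.513 plants/m2. As a quick arithmetic check, these densities are reproduced if plants are assumed to sit on an equilateral-triangle (honeycomb) lattice. The text does not state the field layout, so the lattice and the function name below are assumptions.

```python
import math


def honeycomb_density(spacing_m):
    """Plants per square metre on an equilateral-triangle lattice.

    Assumption: every plant is `spacing_m` metres from its six nearest
    neighbours, so it occupies a cell of area (sqrt(3) / 2) * spacing_m ** 2.
    """
    return 2.0 / (math.sqrt(3) * spacing_m ** 2)


print(round(honeycomb_density(1.5), 3))   # spaced planting
print(round(honeycomb_density(0.25), 1))  # dense planting
```

A square grid at the same spacings would give about 0.444 and 16 plants/m2 instead, which is why the triangular layout is the one assumed here.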

6

Conclusion

Data accumulated from our studies on DNA methylation in maize show that there is extensive variation dependent upon the genotype, the developmental stage and the growth conditions. Although these differences occur in a very narrow range, they could be significant if critical cytosine residues (e.g. in promoter regions of genes) are preferentially involved. The lower DNA methylation in maize inbred lines bred for high yield could indicate that selection is changing their methylation level. Mass honeycomb selection was the breeding program from which the 10 second cycle S2 lines originated. This selection procedure is based upon the yielding ability of the individual plant, grown under spaced planting conditions. The uniformly low methylation of all the S2 lines tested suggests that their improvement may be related to a decrease of methylation after 8 continuous generations of selection in wide spacing. DNA methylation of the second cycle S2 lines of similar pedigree was lower and varied less than that of the commercial inbreds of different pedigree. Apparently, the improvement of the commercial inbreds, which is almost exclusively based upon selection schemes testing their combining ability, has not been accompanied by a concomitant decrease of DNA methylation. A systematic evaluation of methylation in inbred lines of different eras in the history of maize, particularly today's inbreds bred for line yield per se, always in conjunction with their yield and combining ability, will be highly informative. Stressful growth conditions result in more methylated (less expressed) DNA and, in general, vigorous hybrids are more resistant to such density-induced methylation and suppression of genome activity. This resistance of the hybrid genome to methylation under different stresses and, consequently, avoidance of suppression of many of its genes could be at the core of high F1 yield and, maybe more importantly, stable F1 yield.
Taking advantage of existing criteria for evaluating the stability of performance of different hybrids [18], experiments are underway for a systematic study of a possible correlation between stability of performance and methylation changes in the genotype (A. S. Tsaftaris and colleagues, unpublished data). If such a correlation exists, then time and money could be saved from multiyear and multisite evaluation of the genetic material, since criteria and tests could be devised for fast evaluation of genotypes for tolerance to density-induced integrated stress, securing later stability of performance. As emphasis in line selection has been shifting to more productive lines that could generate high yielding hybrids with stable yield in higher densities (substituting double and triple hybrids with single F1 seeds), there was a concomitant shift to parental lines giving less heterosis and less inbreeding depression, with less methylation. What will be the result if this trend in crossbreeding species such as maize continues (or is even intensified by selecting for line performance per se in the isolation environment that promotes demethylation of many genes)?

To what extent epigenetic phenomena contribute to hybrid vigor will determine the answer to the above question. Bringing into homozygous condition in an inbred, for instance, many gene pairs currently in repulsion phase linkage (which is apparently mostly responsible for pseudo-overdominant effects on hybrid vigor) will require time. Thus, in the meantime, exploiting them in heterozygous condition in F1 hybrids will continue to be a fast solution to the pressing and competitive demands for more productive crops. In contrast, if epigenetic changes like DNA methylation also contribute to hybrid vigor, the transition to more productive inbreds will be faster, since demethylation of different genes is not generally related to their linkage. Selection for line performance per se is what nature and breeders are doing with crops that are obligate inbreeders. Despite their homozygosity, high yielding inbred lines developed in these species are widely cultivated.

References

1. Tsaftaris, A.S., and Polidoros, A.N., Studying the expression of genes in maize parental inbreds and their heterotic and non-heterotic hybrids. In: A. Bianci, E. Lupotto and M. Motto (eds.), Proc. XVI Eucarpia Maize and Sorghum Conference (Bergamo, Italy, 1993) pp. 283-292.
2. Gruenbaum, Y., Naveh-Many, T., Cedar, H., and Razin, A., Sequence specificity of methylation in higher plant DNA. Nature 292 (1981) pp. 860-862.
3. Yisraeli, J., and Szyf, M., Gene methylation patterns and expression. In: A. Razin, H. Cedar and A.D. Riggs (eds.), DNA methylation. Biochemistry and biological significance (Springer-Verlag, New York, 1984) pp. 353-378.
4. Tsaftaris, A.S., and Polidoros, A.N., DNA methylation and plant breeding. Plant Breed. Rev. 18 (2000) pp. 87-176.
5. Doerfler, W., DNA methylation and gene activity. Ann. Rev. Biochem. 52 (1983) pp. 93-124.
6. Amasino, R.M., John, M.C., Klaas, M., and Crowell, D.N., Role of DNA methylation in the regulation of gene expression in plants. In: G.A. Clawson, D.B. Willis, A. Weissbach and P.A. Jones (eds.), Nucleic acid methylation (Alan R. Liss, New York, 1990) pp. 187-198.
7. Morrish, F.M., and Vasil, I.K., DNA methylation and embryogenic competence in leaves and callus of napiergrass (Pennisetum purpureum Schum). Plant Physiol. 90 (1989) pp. 37-40.
8. Palmgren, G., Mattsson, O., and Okkels, F.T., Specific levels of DNA methylation in various tissues, cell lines, and cell types of Daucus carota. Plant Physiol. 95 (1991) pp. 174-178.
9. Messeguer, R., Ganal, M., Steffens, J.C., and Tanksley, S.D., Characterization of the level, target sites and inheritance of cytosine methylation in tomato nuclear DNA. Plant Mol. Biol. 16 (1991) pp. 753-770.
10. Tsaftaris, A.S., Kafka, M., Polidoros, A., and Tani, E., Epigenetic changes in maize DNA and heterosis. In: J. Coors (ed.), The genetics and exploitation of heterosis in crops (Am. Soc. Agronomy, Madison, WI, 1999), in press.
11. Riggs, A.D., X inactivation, differentiation, and DNA methylation. Cytogenet. Cell Genet. 14 (1975) pp. 9-25.
12. Holliday, R., and Pugh, J.E., DNA modification mechanisms and gene activity during development. Science 187 (1975) pp. 226-232.
13. Jones, P.A., The DNA methylation paradox. Trends Genet. 15 (1999) pp. 34-37.
14. Richards, E.J., DNA methylation and plant development. Trends Genet. 13 (1997) pp. 319-323.
15. Kafka, M., Control of gene expression in lines and hybrids of maize (Zea mays). Ph.D. Thesis, Aristotelian Univ., Thessaloniki, Greece (1996).
16. Cai, Q., Guy, C.L., and Moore, G.A., Detection of cytosine methylation and mapping of a gene influencing cytosine methylation in the genome of Citrus. Genome 39 (1996) pp. 235-242.
17. Tsaftaris, A.S., Kafka, M., and Polidoros, A.N., Epigenetic modifications of total genomic maize DNA: The role of growth conditions. In: A.S. Tsaftaris (ed.), Genetics, Biotechnology and Breeding of Maize and Sorghum (Royal Society of Chemistry, Cambridge, UK, 1997) pp. 125-130.
18. Fasoula, D.A., and Fasoula, V.A., Gene action and plant breeding. Plant Breed. Rev. 15 (1997) pp. 315-374.

CIS-ELEMENTS AND TRANSCRIPTION FACTORS REGULATING ANTIOXIDANT GENE EXPRESSION IN RESPONSE TO BIOTIC AND ABIOTIC SIGNALS

JOHN G. SCANDALIOS*, AND LINGQIANG M. GUAN

Department of Genetics, North Carolina State University, Raleigh, N.C. 27695-7614, USA

E-mail: [email protected]

* Corresponding author

Although O2 is essential for aerobic life, when reduced, it produces unstable reactive oxygen species (ROS) that are highly toxic and, if unabated, can lead to severe physiological dysfunctions and cell death. At certain cellular concentrations, ROS are also known to play roles as second messengers in signal transduction pathways in both plant and animal cells. ROS are formed during normal metabolism and under extreme biotic and abiotic environmental conditions. Antioxidant enzymes play a key role in protecting cells against oxidative damage by scavenging and/or modulating ROS levels. However, the underlying mechanisms by which antioxidant genes perceive environmental cues to effect the antioxidant defense system are as yet not understood. The maize antioxidant defense system has been extensively investigated in our laboratory, and recent results suggest that component genes are significantly induced by many environmental factors related, directly or indirectly, to oxidative stress. Our studies indicate that the observed changes in antioxidant gene expression are likely caused by stress-mediated metabolic alterations leading to elevated ROS levels, and that each gene promoter contains multiple regulatory motifs to perceive different stress signals at different developmental stages and under oxidative stress. Hence, the maize antioxidant defense system is an ideal model to decipher the mechanisms by which these important genes are regulated to perceive external cues to effect a response by eukaryotes against the ravages of oxidative stress.

1

Introduction

The evolution of oxygenic photosynthesis altered the Earth's atmosphere, enabling the emergence and sustaining of aerobic life. However, the incomplete reduction of oxygen to water during normal aerobic metabolism generates reactive oxygen species (ROS) that pose a serious threat to all organisms. ROS are also crucial for many physiologic processes, and usually exist in the cell in balance with antioxidants. But excess ROS resulting from exposure to environmental oxidants, toxicants, radiation, or numerous biostressors perturb cellular redox balance (toward a more oxidized state) and disrupt normal biological functions. This condition is referred to as "oxidative stress" and may be detrimental to the organism by contributing to the pathogenesis of disease and aging, and to numerous physiologic dysfunctions leading to cell death.

To counteract the oxidant effects of ROS and to restore a state of redox balance, cells must reset critical homeostatic parameters. Changes associated with oxidative damage and restoration of cellular homeostasis often lead to "activation" or "silencing" of genes encoding regulatory transcription factors, antioxidant defense genes and enzymes, and structural proteins. We have worked to elucidate aspects of the antioxidant defenses in maize. We isolated and characterized the three catalase (Cat) and nine of the superoxide dismutase (Sod) genes and their products. The architecture of these genes has been resolved, and motifs have been identified in their promoters which suggest that the differential responses of these genes to various developmental and environmental signals may be determined by specific sequences within each gene. How these genes are regulated at the transcriptional level by internal, genetically programmed cues and by environmental signals is key to understanding the specific role(s) of each gene and its product in maize, and will enhance our efforts to engineer organisms for increased tolerance to oxidative stress.

2

Formation of Reactive Oxygen Species

During respiration, molecular oxygen accepts four electrons to produce two molecules of water; however, because of spin restrictions, O2 accepts electrons one at a time [1], producing ROS such as the superoxide radical (O2•−), hydrogen peroxide (H2O2), and the hydroxyl radical (•OH). O2•− can be produced from many biological sources. In mitochondria, O2•− is primarily generated at the site of the respiratory chain [2]. In plants, illuminated chloroplasts generate O2•− by the occasional transfer of an electron from an excited chlorophyll molecule to molecular oxygen [3]. Most, but not all, H2O2 production occurs in peroxisomes [4] by oxidases that can remove two electrons from substrates and transfer them to O2 to form H2O2. The β-oxidation of fatty acids and peroxisomal photorespiration reactions also produce H2O2 [5]. The •OH can be formed through the transfer of an electron from O2•− to H2O2 (Haber-Weiss reaction) or in a Fenton-type reaction of H2O2 with Fe2+ [6]. ROS can also be formed under various adverse environmental conditions.

3

Toxic Effects of Oxygen Free Radicals

ROS can cause significant cellular stress or severe metabolic dysfunctions [7]. O2•− can cause peroxidation of membrane lipids, resulting in loss of membrane integrity and inactivation of membrane-bound enzymes. H2O2 can react with SOD and other metalloproteins to release the metal ion from the protein, resulting in loss of its biological activity. H2O2 can inactivate some Calvin cycle enzymes by oxidizing critical thiols in the enzymes [3]. One very important feature of H2O2 is that it can penetrate cell membranes rapidly [1], leading to toxic effects at different subcellular locations. The major cytotoxic effect of H2O2 is due to its role in generating •OH [8], a powerful oxidant that indiscriminately attacks and damages proteins, lipids, and DNA [9]. Intracellular accumulation of ROS generally cannot be tolerated; consequently, organisms evolved both enzymatic and non-enzymatic antioxidant defenses. Two of the most important antioxidant enzymes are catalase (CAT) and superoxide dismutase (SOD).

4

The Maize Antioxidant Defense Genes: A Model System

Among eukaryotes, CAT and SOD and their encoding genes have been most extensively investigated in maize [10], which makes maize an ideal model system for investigating the underlying regulatory mechanisms of these important genes. Unlike animals, plants possess multiple genes encoding multiple but functionally distinct isozymes, allowing for precise identification of responses to a variety of signals. In maize, three unlinked structural genes, Cat1, Cat2, and Cat3, encode three distinct CAT isozymes (CAT-1, CAT-2, CAT-3). Expression of each of the Cat genes is highly regulated spatially, temporally, and in response to various environmental signals. Maize SOD exists as nine distinct isozymes encoded by the unlinked nuclear genes Sod1, Sod2, Sod3.1, Sod3.2, Sod3.3, Sod3.4, Sod4, Sod4A, and Sod5. SOD-2, SOD-4, SOD-4A, and SOD-5 are cytosolic enzymes; SOD-1 is a chloroplast enzyme, while SOD-3.1, SOD-3.2, SOD-3.3, and SOD-3.4 are compartmentalized in mitochondria [10].

5

ROS Production by Environmental Stresses

ROS have been implicated in the damaging effects of various environmental stress conditions due to such factors as xenobiotics (herbicides), ozone, drought, and pathogen attack [7,10]. Some of the stressors we have investigated include:

(a) Xenobiotics, natural or synthetic substances that cannot be utilized by plants for energy-yielding processes. In most cases xenobiotics are toxic compounds; chemicals commonly classified as xenobiotics include pesticides, herbicides, and arsenicals [11]. Xenobiotics that interfere with chloroplastic or mitochondrial electron transfer systems produce ROS. Herbicides that block photosynthetic electron transport, such as atrazine, allow excitation energy to be transferred from chlorophyll to carotenoids, which are progressively damaged. Once carotenoids are destroyed, the light energy may be transferred to oxygen, generating ROS that can initiate membrane lipid peroxidation. Specific xenobiotics examined include paraquat, or methyl viologen (MV), a redox-active compound that interacts with photosystem I (PSI), acting as an electron acceptor and interrupting electron transfer. MV, by accepting electrons from one of the iron-sulfur centers of PSI, forms the bipyridyl cation radical, which is unstable and reacts rapidly with oxygen to form superoxide. The plant is able to detoxify O2•− via SOD, producing H2O2 and O2. H2O2 is further detoxified by catalase.

(b) Cercosporin, a fungal toxin produced by the phytopathogenic fungi Cercospora sp. in the presence of light. The toxin molecule absorbs energy and is converted to an excited triplet state that reacts with molecular oxygen, producing both singlet oxygen (¹O2) and O2•−. These ROS are directly toxic to the host's tissues, causing oxidation of cellular components, membrane leakage, and cell death [12].

(c) Ozone, a powerful oxidant and the major component of air pollution responsible for causing significant damage to plants. The detrimental effects of ozone are mainly due to the production of excess amounts of ROS through various chemical reactions, leading to reduced photosynthesis, electrolyte leakage, DNA mutation, and accelerated senescence [13].

(d) Drought, osmotic stress and wounding, which greatly affect crop production. There are indications that ROS are produced under these stress conditions and mediate the stress response. The plant hormone abscisic acid (ABA) appears to mediate physiological processes in response to osmotic stress. ABA levels increase in tissues subjected to osmotic stress, salt, dehydration, and cold [14]. ABA is also involved in the wounding response in plants. Thus, ABA might be a common mediator of the response to such stresses, and may act by inducing production of ROS in cells through currently unknown mechanisms.
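For reference, the reactions named in the text (the stepwise one-electron reduction of O2, the Haber-Weiss and Fenton-type reactions from Section 2, and the SOD and catalase detoxification steps mentioned under (a)) can be written out as balanced equations. These are the standard textbook forms, supplied here as a reading aid rather than taken from the chapter itself:

```latex
\begin{align*}
\text{Stepwise reduction:}\quad
  & \mathrm{O_2} + e^- \rightarrow \mathrm{O_2^{\bullet-}} \\
  & \mathrm{O_2^{\bullet-}} + e^- + 2\,\mathrm{H^+} \rightarrow \mathrm{H_2O_2} \\
  & \mathrm{H_2O_2} + e^- + \mathrm{H^+} \rightarrow {}^{\bullet}\mathrm{OH} + \mathrm{H_2O} \\
  & {}^{\bullet}\mathrm{OH} + e^- + \mathrm{H^+} \rightarrow \mathrm{H_2O} \\[4pt]
\text{Haber-Weiss:}\quad
  & \mathrm{O_2^{\bullet-}} + \mathrm{H_2O_2} \rightarrow \mathrm{O_2} + \mathrm{OH^-} + {}^{\bullet}\mathrm{OH} \\
\text{Fenton-type:}\quad
  & \mathrm{Fe^{2+}} + \mathrm{H_2O_2} \rightarrow \mathrm{Fe^{3+}} + \mathrm{OH^-} + {}^{\bullet}\mathrm{OH} \\[4pt]
\text{SOD:}\quad
  & 2\,\mathrm{O_2^{\bullet-}} + 2\,\mathrm{H^+} \rightarrow \mathrm{H_2O_2} + \mathrm{O_2} \\
\text{Catalase:}\quad
  & 2\,\mathrm{H_2O_2} \rightarrow 2\,\mathrm{H_2O} + \mathrm{O_2}
\end{align*}
```

Summing the four reduction steps gives the overall four-electron reduction, O2 + 4 e− + 4 H+ → 2 H2O, which matches the statement that molecular oxygen accepts four electrons to produce two molecules of water.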

6

Molecular Mechanisms in Response to ROS

To minimize the damaging effects of ROS, aerobes evolved numerous non-enzymatic and enzymatic antioxidant defense systems. The latter include CAT, peroxidases (Px), SOD, and glutathione S-transferases (GST). The mechanisms by which cells sense ROS are not well understood, but a number of transcription factors that regulate the expression of antioxidant genes are well characterized. In E. coli and other prokaryotes, the transcription factor OxyR activates a number of genes inducible by H2O2, while the transcription factors SoxR/SoxS mediate responses to O2•− [15]. In yeast, there also exist two distinct adaptive stress responses, one towards H2O2 and one towards O2•− [16]. In higher eukaryotes, oxidative stress responses are more complex and are modulated by several different regulators. In mammalian systems, two classes of transcription factors, nuclear factor κB (NF-κB) and activator protein-1 (AP-1), are involved in the regulation of the oxidative stress response [17,18]. Antioxidant-specific gene induction has been reported for a number of enzymes involved in xenobiotic metabolism, and is mediated by a regulatory motif common to the promoter regions of these genes. This motif, named the "antioxidant responsive element" (ARE), is present in the promoters of the mammalian GST, metallothionein-I, and MnSod genes [19-21]. An ARE motif has not been identified in any plant Gst gene. However, ARE-like motifs are present in the promoter regions of the three maize Cat genes [22], although the effects of antioxidants on the expression of these genes are not yet resolved. In plants, ROS have been implicated in the damaging effects of various environmental stresses including light, radiation, xenobiotics, ozone, drought, and pathogen attack [7]. Many plant defense genes are activated in response to these conditions, including the three maize Cat genes. The presence of such motifs as ARE, NF-κB, and AP-1 in the maize Cat gene promoters renders them suitable for characterization of their potential regulatory roles in plants, and may uncover the signal transduction pathways involved in a global regulation of the antioxidant response in plants. Due to page limitations, we will herein primarily discuss our results with the catalase genes (Cat1, Cat2, and Cat3), but will refer to the Sod genes as necessary.

7

Structure of the Promoter Region of the Maize Cat Genes

There is no extensive similarity among the three Cat gene promoters, but several regulatory motifs related to oxidative stress are present in each (Fig. 1). The ARE motif (PuGTGACNNNGC) has been found in the promoter region of all three Cat genes. The ARE responds to H2O2 and to phenolic antioxidants that undergo redox cycling, generating ROS [20]. The AP-1 motif (TGANTCA), which serves as the binding site of the mammalian antioxidant transcription factor AP-1, is also present in the Catl and Cat3 promoters. NF-κB is a major transcription factor of defensive responses mounted by cells against diverse environmental challenges. It binds to a consensus sequence, GGGPuNNPyPyCC, located within promoters or enhancers of these genes, where it functionally interacts with other transcription factors to regulate expression [23]. The NF-κB binding site is present twice in the Cat3 promoter within 500 bp of the start of transcription. It is also found in the promoters of Cat1 and Cat2, but is located beyond 1500 bp from the start of transcription. Xenobiotic responsive elements (XRE) with the core sequence GCGTG are present in mammalian cytochrome P450 gene promoters and are activated in response to xenobiotics [24]. XRE motifs are repeatedly present in the Catl and Cat3 promoter. No XRE-like sequence is found in the Catl promoter. ACGT-core regulatory elements recognized by basic leucine zipper transcription factors [25] are found in diverse gene promoters, such as the Em1a motif (ABRE, 'CCACGTGG', ABA responsive element) of the ABA-regulated wheat Em gene [26]. The ACGT core is present in the promoter region of all three Cat genes, but is more frequent in the Cat1 and Cat2 promoters. The ABRE is found in the Cat1 promoter, in agreement with the fact that Cat1 is highly induced by ABA.
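The consensus sequences quoted above can be turned into a simple motif scan. The sketch below is illustrative only: the helper names, the example sequence and the convention that the fragment ends at the transcription start site are assumptions, and only the forward strand is searched. It is not the procedure used to annotate the Cat promoters.

```python
import re

# Consensus motifs as quoted in the text
# (Pu = purine A/G, Py = pyrimidine C/T, N = any base).
MOTIFS = {
    "ARE": "PuGTGACNNNGC",
    "AP-1": "TGANTCA",
    "NF-kB": "GGGPuNNPyPyCC",
    "XRE core": "GCGTG",
    "ABRE": "CCACGTGG",
    "ACGT core": "ACGT",
}


def consensus_to_regex(consensus):
    """Translate the Pu/Py/N shorthand into a compiled regular expression."""
    pattern = (consensus.replace("Pu", "[AG]")
                        .replace("Py", "[CT]")
                        .replace("N", "[ACGT]"))
    # A lookahead lets overlapping occurrences be reported as well.
    return re.compile("(?=(" + pattern + "))")


def find_motifs(promoter, tss_index):
    """Return {motif name: [start positions relative to the start site]}."""
    seq = promoter.upper()
    hits = {}
    for name, consensus in MOTIFS.items():
        regex = consensus_to_regex(consensus)
        positions = [m.start() - tss_index for m in regex.finditer(seq)]
        if positions:
            hits[name] = positions
    return hits


# Hypothetical 34 bp fragment ending immediately upstream of the start site.
example = "TTGGTGACAAAGCTTCCACGTGGTTTGAGTCAGG"
print(find_motifs(example, tss_index=len(example)))
```

In this made-up fragment the ABRE also registers as an ACGT-core hit two bases further in, since CCACGTGG contains the ACGT core.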

Figure 1. Schematic of the motifs located in the promoter region of the three maize Cat genes. The location of each motif is relative to the transcription start site of each gene. ABRE, ABA responsive element; ARE, antioxidant responsive element; ACGT core, leucine zipper protein binding site; AP-1, AP-1 binding site; NF-κB, NF-κB binding site; XRE, xenobiotic responsive element; MRE, metal responsive element.

8

Cat genes Respond Differentially to Environmental Stresses

Expression of the Cat genes during maize development, and in response to stress, has been extensively studied in this laboratory [22]. Herein, we discuss some recent findings on Cat gene expression in response to oxidative stress related signals.

8.1

The Cat1 transcript is increased in response to ABA, osmotic stress, and dehydration

Plant hormones play a crucial role in regulating plant growth and development. The expression of many plant genes is affected by changes in hormone levels in different tissues at different developmental stages. Abscisic acid (ABA) mediates physiological processes in response to osmotic stress. ABA levels increase in tissues subjected to osmotic stress because of high osmoticum, salt, dehydration, and cold. The effect of ABA and high osmoticum on Cat gene expression has been examined in developing and germinating maize embryos, and in leaves [27]. Steady-state levels of Cat1 transcript increased with all applied ABA doses, with a maximum at lO^M ABA. Cat1 transcript also increased in response to osmotic stress (mannitol) in immature embryos and in young leaves. The Cat2 and Cat3 transcripts are down-regulated by ABA and osmotic stress. These data suggest that Cat1 mRNA accumulation in response to ABA is not developmental stage-dependent and might represent a general stress response in which the Cat1 gene product increases to protect plants from ABA-mediated stress. We also examined the effect of dehydration on Cat gene expression in maize, in order to determine whether ABA may be involved in such a response. Cat1 is induced after 4 h of dehydration and continues to increase, reaching a maximum at 12-24 h. However, Cat2 and Cat3 transcripts decreased after 4 h and are almost undetectable after 24 h. The pattern is similar to that of the ABA response in W64A leaves. Carotenoid-deficient viviparous mutants of maize have great potential for providing information concerning the regulation of gene expression in response to ABA and osmotic stress. Seeds of viviparous mutants germinate precociously while still attached to the ear, before maturity. Some viviparous mutants (e.g. vp5) are deficient in ABA [28]. Seedlings rescued from viviparous kernels contain less ABA than do wild-type seedlings, and their ABA concentration does not increase in response to water deficit [29]. This ABA-deficient mutant was used to determine whether ABA is involved in the response of Cat genes to osmotic stress and dehydration. As in W64A, Cat1 is induced while Cat2 and Cat3 are repressed in response to dehydration in Vp5/- wild-type leaves. In vp5/vp5 mutant leaves, Cat1 is induced in response to dehydration while Cat2 is repressed; however, the pattern of Cat3 transcript accumulation is different from that in the Vp5 wild-type leaves. Cat3 transcript is greatly induced in vp5 mutant leaves in response to dehydration, while in wild-type leaves Cat3 transcript accumulation is repressed, implying that ABA may mask the effects of dehydration on Cat3. That is, when ABA is present at high levels, as in W64A and in Vp5 wild-type leaves (or when ABA is induced by dehydration), Cat3 transcript is repressed by dehydration. However, when ABA levels are low, as in the ABA-deficient vp5 mutant, Cat3 mRNA is induced by dehydration.
This suggests that Cat gene responses to dehydration occur via at least two pathways: an ABA-dependent and an ABA-independent pathway [27]. Cat transcript accumulation in Vp5 wild-type leaves in response to ABA and mannitol is similar to that in W64A leaves, with Cat1 increasing while Cat2 and Cat3 transcripts are down-regulated. In vp5 mutant leaves, Cat1 mRNA also accumulates in response to ABA and mannitol.
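The inference about ABA-dependent and ABA-independent responses can be laid out as a small comparison of the wild-type and vp5 observations summarized above. The table below restates only what the text reports for dehydration; the dictionary layout and the function name are assumptions, and the rule (a response that changes when ABA is lacking is taken to depend on ABA) is a deliberately simplified reading of the argument.

```python
# Responses to dehydration as summarized in the text.
DEHYDRATION_RESPONSE = {
    "Cat1": {"wild_type": "induced", "vp5": "induced"},
    "Cat2": {"wild_type": "repressed", "vp5": "repressed"},
    "Cat3": {"wild_type": "repressed", "vp5": "induced"},
}


def aba_dependent(responses):
    """Genes whose dehydration response differs in the ABA-deficient mutant."""
    return sorted(gene for gene, r in responses.items()
                  if r["wild_type"] != r["vp5"])


print(aba_dependent(DEHYDRATION_RESPONSE))
```

Under this reading only the Cat3 response requires ABA, while the Cat1 and Cat2 responses to dehydration persist without it.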

8.2

Catl and Cat3 transcripts accumulate to high levels in response to toxin and xenobiotics

The Cat genes respond differentially to the cercosporin toxin, which when photoactivated produces both ¹O2 and O2•−. In embryos incubated on plates with various toxin doses for 24 h in constant light, both Catl and Cat3 transcripts increased significantly at the higher doses of purified cercosporin, while Catl transcript increased only slightly [12]. A similar effect on Cat gene expression was observed with paraquat (a chemical inducer of O2•−). In embryos incubated on variable paraquat-dose plates for 24 h in constant dark, Cat1 transcript increased slightly with increasing concentrations, with a maximum at 0.1 mM paraquat. The Cat2 transcript increased dramatically with 0.1 mM and 1 mM paraquat, while Cat3 transcript decreased [22].

8.3

Cat1 and Cat3 transcripts accumulate to high levels in response to wounding and ozone

In leaves, Cat1 and Cat3 transcripts increased significantly 12h or 24h after wounding, while Cat2 transcript decreased slightly. Wounding is likely an indirect signal for Cat transcript accumulation, since it takes 12h to induce Cat1 and Cat3. We also used the ABA-deficient mutant vp5 to examine Cat transcript accumulation in response to wounding and found that the Cat1 response pattern is the same in vp5 and in wild-type Vp5 leaves, suggesting that ABA is not involved in the wounding response (unpublished results).

W64A seedlings were exposed to acute and chronic doses of ozone (O3). Acute exposure consisted of a single 6h fumigation at concentrations of 0, 100, 200, 300, 500 or 1000ppb O3. Cat1 and Cat3 mRNAs generally increased, while Cat2 mRNA levels decreased, in response to increasing levels of O3 [31]. None of the transcripts exhibited a significant change at low (<200ppb) O3 concentrations. As the concentration rose from 200 to 1000ppb, Cat1 levels increased steadily, while Cat3 levels rose more abruptly at 200-300ppb, with less marked changes thereafter. Cat2 transcript levels decreased at 300ppb and continued to decrease at O3 concentrations up to 1000ppb [31].

9 Expression of Cat1::Gus Constructs in Maize Culture Cells and in Intact Maize Tissues

Several deletion fragments of the Cat1 promoter were obtained by either restriction digestion or PCR amplification, with 5' ends at -638bp, -338bp, -169bp, -128bp and -95bp, respectively, relative to the transcription start site. The only difference in sequence between the 128bp and 95bp fragments is that a putative ABRE element is deleted from the 95bp fragment. Each deletion fragment was then fused upstream of the 35S minimal promoter (-46 bp)::intron::Gus construct (pIG46), generating serial Cat1 promoter deletion-Gus reporter plasmids. To locate any cis-elements in the Cat1 promoter responsible for induction by ABA, the plasmids were individually delivered by particle bombardment into maize (var. Black Mexican Sweet; BMS) cells for transient expression assays. Cell extracts were prepared from bombarded BMS cells treated with or without ABA after bombardment. β-Glucuronidase activities in the cell extracts were detected by chemiluminescence (Fig. 2a) and calibrated by internal luciferase activities, as well as by colorimetric detection of GUS (Fig. 2b). Expression of the 638bp construct in cells treated with ABA was increased 1.3-fold compared to cells without ABA; the 169bp and 128bp constructs were also induced, but the induction decreased as the deletion size increased (Fig. 2). In contrast, induction of the 95bp construct by ABA was almost abolished by the deletion of the ABRE element, suggesting that the ABRE is at least one of the cis-acting elements responsible for the ABA induction of Cat1.
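The logic of the deletion series and of the luciferase calibration can be illustrated with a minimal sketch. It is not the authors' analysis: the function names are hypothetical, the ABRE coordinate is an arbitrary value chosen inside the interval the text gives (between -128 and -95), and the readings in the usage lines are made up purely to show the arithmetic.

```python
# Hypothetical sketch: which Cat1 promoter deletion constructs keep an element,
# and how a luciferase-normalized fold induction could be computed.

# 5' end points (bp relative to the transcription start site), as listed in the text.
DELETION_5PRIME_ENDS = [-638, -338, -169, -128, -95]

# ASSUMPTION: the text only places the ABRE between -128 and -95;
# -110 is an arbitrary illustrative coordinate, not a measured position.
ASSUMED_ABRE_POSITION = -110


def retains_element(five_prime_end: int, element_position: int) -> bool:
    """A fragment starting at five_prime_end keeps every element downstream of that end."""
    return five_prime_end <= element_position


def fold_induction(gus_treated, luc_treated, gus_control, luc_control):
    """Ratio of luciferase-normalized GUS activity, treated over control."""
    if min(luc_treated, luc_control, gus_control) <= 0:
        raise ValueError("luciferase and control GUS readings must be positive")
    return (gus_treated / luc_treated) / (gus_control / luc_control)


if __name__ == "__main__":
    for end in DELETION_5PRIME_ENDS:
        kept = retains_element(end, ASSUMED_ABRE_POSITION)
        print(f"{end} bp construct {'retains' if kept else 'lacks'} the assumed ABRE")
    # Made-up readings, only to show the calculation.
    print(fold_induction(gus_treated=300.0, luc_treated=100.0,
                         gus_control=200.0, luc_control=100.0))
```

Dividing each GUS reading by the co-delivered luciferase reading before taking the treated/control ratio is one common way to correct for differences in bombardment efficiency between samples; whether the authors used exactly this ratio is not stated in the text.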

Figure 2. (a) GUS activity in maize BMS cells after biolistic transfer of Cat1::Gus constructs and treatment with ABA for 24h. GUS activity is the average of three independent experiments ± S.E. (b) Cat1 promoter fragments used for transient assays. Constructs were introduced into maize BMS cells by particle bombardment. GUS activity was determined by color staining after ABA treatment.

10 The ABRE Motif Interacts with Nuclear Proteins Isolated from Embryos

Nuclear proteins binding to the ABRE motif of the Cat1 promoter were identified by gel retardation. Two oligos were used: one covering the ABRE motif of the Cat1 promoter (5' GAAGTCCACGTGGAGGTGG 3') and one carrying a mutated ABRE core (5' GAAGTaacatgttcGGTGG 3'). At least four binding complexes formed in the interaction between the ABRE probe and the nuclear protein extract prepared from 21dpp embryos of the vp5 mutant or its wild-type Vp5 sibling. Binding was competed by cold "wild-type" ABRE, but not by mutated ABRE, indicating that the binding is specific for the ABRE. The binding protein complex-1 (CBF1) appears to be the major protein complex in 21dpp embryos (Fig. 3). No major difference is observed in the binding intensity of the four complexes between wild-type (Vp5/-) and mutant (vp5/vp5) embryos, in agreement with previous observations that endogenous ABA does not play a major role in Cat1 expression via the ABRE in 21dpp embryos.

We also examined the binding of the ABRE motif to proteins isolated from 17dpp embryos. In untreated embryos, the protein complex distribution is not the same as in 21dpp embryos (Fig. 4): complex-2 (CBF2) is the major binding complex in 17dpp embryos, while CBF1 is the major binding complex in 21dpp embryos. Upon ABA treatment, CBF1 was induced in the nuclear extracts from embryos of both the vp5 mutant and its wild-type siblings, compared with its absence in control embryos of the vp5 mutant and its lower amount in control Vp5/- embryos. CBF1 was significantly induced in embryos of the Vp5/- genotype in the presence of 11% mannitol in the MS media compared to the control, but not in embryos of the vp5 mutant (Fig. 4). However, CBF2 was induced only in mannitol-treated embryos, not in ABA-treated embryos. These results suggest that there are two different pathways for Cat1 expression in response to mannitol: an ABA-dependent pathway, likely through the CBF1 complex, and an ABA-independent pathway, likely through the CBF2 complex.
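To make the difference between the two gel-retardation oligos concrete, the short sketch below compares the wild-type and mutant sequences quoted above and looks for an ACGT core, the feature commonly associated with ABRE-type elements. The helper names are illustrative only and are not taken from the source.

```python
# Sketch: compare the wild-type and mutant ABRE oligos quoted in the text.
from __future__ import annotations

WILD_TYPE_ABRE = "GAAGTCCACGTGGAGGTGG"
MUTANT_ABRE = "GAAGTaacatgttcGGTGG"  # lower case marks the substituted bases


def mismatch_positions(a: str, b: str) -> list[int]:
    """0-based positions where two equal-length sequences differ (case-insensitive)."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return [i for i, (x, y) in enumerate(zip(a.upper(), b.upper())) if x != y]


def core_index(seq: str, core: str = "ACGT") -> int:
    """0-based index of the first occurrence of the core motif, or -1 if absent."""
    return seq.upper().find(core)


if __name__ == "__main__":
    print("substituted positions:", mismatch_positions(WILD_TYPE_ABRE, MUTANT_ABRE))
    print("ACGT core in wild type at:", core_index(WILD_TYPE_ABRE))
    print("ACGT core in mutant at:", core_index(MUTANT_ABRE))
```

Run on these two sequences, the comparison shows that the nine central bases are all substituted and that the mutant no longer contains an ACGT core, which is consistent with its failure to compete for binding.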

Figure 3. Gel retardation assay of binding activities to the ABRE motif of Cat1 in the nuclear protein extracts from 21dpp embryos of the vp5/vp5 mutant and its Vp5/- siblings. Competitor: cold, wild-type ABRE at 5-, 10- and 20-fold excess; Mut. ABRE: mutated ABRE.

11 Cat1 Transcript Increases in Response to Hydrogen Peroxide

The direct effect of H2O2 on Cat gene expression was examined in 9dpi leaves treated hydroponically with increasing H2O2 concentrations (0 to 50mM) for 24h. Cat1 transcript increased with increasing H2O2 concentrations, with maximum induction at 50mM H2O2, whereas both Cat2 and Cat3 transcripts decreased with increasing H2O2 concentrations (Fig. 5). The accumulation pattern of the three Cat transcripts in response to H2O2 is similar to the previously reported ABA effects on Cat expression [27]. Thus, we hypothesized that ABA may cause changes in endogenous H2O2 levels. Consequently, we examined H2O2 levels in cultured maize cells after ABA treatment. Maize BMS cells were treated with 10^-4 M ABA for 0.5, 1, 2, 3, and 4h. After treatment, cells were collected and H2O2 levels determined. H2O2 levels increased in response to ABA between 0.5h and 3h. At 4h after ABA treatment, H2O2 levels were almost the same in control and ABA-treated cells (Fig. 6). The reduced H2O2 levels after 4h of ABA treatment may be due to activation of the Cat1 gene and protein.

Figure 4. Gel retardation assay of binding activities to the ABRE motif of Cat1 in the nuclear protein extracts from 17dpp embryos of the vp5/vp5 mutant and its Vp5/- siblings. Embryos were cultured for 24h on MS media with ABA (A) or mannitol (M). ip: embryos isolated directly from seeds. C: control.

12 Concluding Comments and Prospects

Plants, because they grow under high intensities of sunlight and with a high cellular concentration of dioxygen, are subjected to more severe oxidative stress than other organisms; this stress can result in membrane leakage, senescence, chlorophyll destruction, and impaired photosynthetic capacity. Aerobes evolved efficient antioxidant defenses, among which are the enzymes superoxide dismutase (SOD) and catalase (CAT). All aerobes contain CAT to modulate and degrade H2O2. No aerobes are known to completely lack catalase and survive in an O2 environment, underscoring the importance of CAT for living in an aerobic environment.

Figure 5. Cat mRNA accumulation in response to increasing concentrations of H2O2. W64A seedlings (9dpi) were treated with increasing concentrations of H2O2 for 24h. Total RNA was isolated from treated leaves and Cat transcript levels were determined by RNA blots with Cat gene-specific probes (gsp).

Figure 6. Hydrogen peroxide concentration in maize BMS cells after ABA treatment. BMS cells were treated with 10^-4 M ABA for 0.5, 1, 2, 3, and 4h. After treatment, cells were collected and H2O2 levels determined using the luminol assay. H2O2 levels are based on the average of three independent experiments.

The correlation between elevated Cat gene expression and oxidative stress may suggest that enzyme levels increase to provide better protection. Whether all the stresses examined induce CAT via the same mechanisms is unclear. The changes in CAT expression under these environmental stresses or during development may be caused, in part, by changes in ROS levels or redox potentials. Thus, the maize catalases provide an excellent opportunity to decipher the mechanisms involved in the regulation of these important defense genes. Elucidation of Cat gene structure and regulation during normal development and under stress may provide important insights into the molecular mechanisms for the differential expression of the Cat and other antioxidant genes, and into the functional roles of their encoded proteins in eukaryotes.

The data presented are helping to unravel the roles of CAT and the possible signaling role of H2O2 in modulating Cat gene expression and other metabolic activities. Deciphering the mechanisms by which the Cat genes are regulated during development and under various stressful environments is critical in determining the various physiological roles these important genes play in all aerobic organisms. We have established, in addition to other factors, that the Cat genes respond differentially to biotic and abiotic environmental signals related to oxidative stress. Our results also show that each Cat gene may respond differently to a given signal, or that all three Cat genes may respond similarly to several different stresses. Our results support the hypothesis that the observed Cat gene responses are mediated by oxidative stress signals, directly or indirectly. Further investigation of Cat gene responses to stress in different tissues and developmental stages will serve as a guide to define the precise molecular mechanisms by which the Cat genes perceive and respond to various signals.

We isolated and characterized all three Cat genes. Sequence comparisons show that each Cat promoter is unique, with no homology to the other Cat promoters. Each Cat promoter contains a unique set of motifs known to induce gene expression in response to oxidative stress in other eukaryotes. Promoter deletions, gel retardation, southwestern and DNase footprinting analyses are helping to identify the cis-acting elements in each Cat promoter responsible for increased transcription under oxidative stress. Once relevant cis-elements have been identified (as we have done with ABRE and ARE), gel retardation will lead to the identification of trans-acting factors that interact with the DNA motifs to trigger expression of these and other antioxidant genes in tandem to effect a global response to oxidative stress. This will likely be a more effective defense mechanism than merely overexpressing any one gene at a given time and place.

Our results are providing insights into the underlying mechanisms utilized by plant genomes to perceive stress signals and effect the appropriate defenses against environmental oxystress. These findings are relevant not only for maize, but for all other eukaryotes. Knowledge of the architecture of these genes and use of transgenics to decipher the underlying regulatory mechanisms for their developmental expression will provide a sound basis for efficiently engineering stress-tolerant organisms.

13 Acknowledgment

We dedicate this paper, with affection and respect, to the memory of our friend and colleague (JGS), and teacher (LMG), the late Professor Clement L. Markert. Research supported, in part, by NSF, EPA, and USDA.


References

1. Halliwell, B. and Gutteridge, J.M.C. Oxygen toxicity, oxygen radicals, transition metals and disease. Biochem. J. 219 (1984) pp. 1-14.
2. Turrens, J. and Boveris, A. Generation of superoxide anion by NADH-dehydrogenase of bovine heart mitochondria. Biochem. J. 191 (1980) pp. 421-427.
3. Asada, K. and Takahashi, M. Production and scavenging of active oxygen in photosynthesis. In: Kyle, D.J., Osmond, C.B. and Arntzen, C.J. (eds), Photoinhibition (Elsevier Science Publisher, 1987) pp. 227-287.
4. De Duve, C. and Baudhuin, P. Peroxisomes. Physiol. Rev. 46 (1966) pp. 323-357.
5. Beevers, H. Microbodies in higher plants. Ann. Rev. Plant Physiol. 30 (1979) pp. 159-193.
6. Elstner, E.F. Oxygen activation and oxygen toxicity. Ann. Rev. Plant Physiol. 33 (1982) pp. 73-96.
7. Scandalios, J.G. Oxygen stress and superoxide dismutases. Plant Physiol. 101 (1993) pp. 7-12.
8. Fridovich, I. Oxygen is toxic! Bioscience 27 (1977) pp. 462-466.
9. Imlay, J.A., Chin, S.M. and Linn, S. Toxic DNA damage by hydrogen peroxide through the Fenton reaction in vivo and in vitro. Science 240 (1988) pp. 640-642.
10. Scandalios, J.G. (ed), Oxidative Stress and the Molecular Biology of Antioxidant Defenses (Cold Spring Harbor Laboratory Press, Plainview, NY, 1997).
11. Beeler, T. Oxidation of sulfhydryl groups and inhibition of the (Ca++ and Mg++)-ATPase by arsenazo III. Biochim. Biophys. Acta 1027 (1990) pp. 264-267.
12. Williamson, J.D. and Scandalios, J.G. Plant antioxidant gene responses to fungal pathogens. Trends in Microbiol. 1 (1993) pp. 239-245.
13. Sandermann, H. Ozone and plant health. Ann. Rev. Phytopath. 34 (1996) pp. 347-366.
14. Shinozaki, K. and Yamaguchi-Shinozaki, K. Gene expression and signal transduction in water-stress responses. Plant Physiol. 115 (1997) pp. 327-334.
15. Jamieson, D.J. and Storz, G. Transcriptional regulators of oxidative stress responses. In: Scandalios, J.G. (ed), Oxidative Stress and the Molecular Biology of Antioxidant Defenses (Cold Spring Harbor Laboratory Press, Plainview, NY, 1997) pp. 91-115.
16. Ruis, H. and Koller, F. Biochemistry, molecular biology, and cell biology of yeast and fungal catalases. In: Scandalios, J.G. (ed), Oxidative Stress and the Molecular Biology of Antioxidant Defenses (Cold Spring Harbor Laboratory Press, Plainview, NY, 1997) pp. 309-342.
17. Angel, P. and Karin, M. The role of Jun, Fos and the AP-1 complex in cell-proliferation and transformation. Biochim. Biophys. Acta 1072 (1991) pp. 129-157.
18. Meyer, M., Schreck, R. and Baeuerle, P.A. Hydrogen peroxide and antioxidants have opposite effects on activation of NF-κB and AP-1 in intact cells: AP-1 as secondary antioxidant-responsive factor. EMBO J. 12 (1993) pp. 2005-2015.
19. Kahl, R. Phenolic antioxidants: Physiological and toxicological aspects. In: Packer, L. and Cadenas, E. (eds), Handbook of Synthetic Antioxidants (Marcel Dekker, Inc., New York, 1997) pp. 177-224.
20. Rushmore, T.H., Morton, M.R. and Pickett, C.B. The antioxidant responsive element. J. Biol. Chem. 266 (1991) pp. 11632-11639.
21. Nguyen, T. and Pickett, B. Transcriptional regulation of the rat glutathione transferase Ya subunit. J. Biol. Chem. 265 (1990) pp. 14648-14653.
22. Scandalios, J.G., Guan, L. and Polidoros, A.N. Catalases in plants: gene structure, properties, regulation, and expression. In: Scandalios, J.G. (ed), Oxidative Stress and the Molecular Biology of Antioxidant Defenses (Cold Spring Harbor Laboratory Press, Plainview, NY, 1997) pp. 343-406.
23. Siebenlist, U., Brown, K. and Franzoso, G. NF-κB: a mediator of pathogen and stress responses. In: Baeuerle, P.A. (ed), Inducible Gene Expression: Environmental Stress and Nutrients (1995) pp. 93-141.
24. Pabo, C.O. and Sauer, R.T. Transcription factors: Structural families and principles of DNA recognition. Ann. Rev. Biochem. 61 (1992) pp. 1053-1095.
25. Armstrong, G.A., Weisshaar, B. and Hahlbrock, K. Homodimeric and heterodimeric leucine zipper proteins and nuclear factors from parsley recognize diverse promoter elements with ACGT cores. Plant Cell 4 (1992) pp. 525-537.
26. Guiltinan, M.J., Marcotte, W.R. Jr. and Quatrano, R.S. A plant leucine zipper protein that recognizes an abscisic acid response element. Science 250 (1990) pp. 267-271.
27. Guan, L. and Scandalios, J.G. Effects of the plant growth regulator abscisic acid and high osmoticum on the developmental expression of the maize catalase genes. Physiol. Plant. 104 (1998) pp. 413-422.
28. Neill, S.J., Horgan, R. and Parry, A.D. The carotenoid and abscisic acid content of viviparous kernels and seedlings of Zea mays L. Planta 169 (1986) pp. 87-96.
29. Moore, R. and Smith, J.D. Growth, graviresponsiveness and abscisic acid content of Zea mays seedlings treated with fluridone. Planta 162 (1984) pp. 342-344.
30. Pena-Cortes, H., Sanchez-Serrano, J.J., Mertens, R., Willmitzer, L. and Prat, S. Abscisic acid is involved in the wound-induced expression of the proteinase inhibitor II gene in potato and tomato. Proc. Natl. Acad. Sci. USA 86 (1989) pp. 9851-9855.
31. Ruzsa, S.M., Mylona, M. and Scandalios, J.G. Differential response of antioxidant genes in maize leaves exposed to ozone. Redox Report 4 (1999) pp. 95-103.

INDEX

A
Abscisic acid • 266
  plant hormone • 270
Adaptation • 79
Alcohol
  consumption • 141
  dehydrogenase (ADH) • 52
  metabolism • 49
Aldehyde dehydrogenase (ALDH) • 49, 141
Allele • 197
All-fish gene • 44
Allozyme
  diversity • 75
  genotype • 78
Altman, S. • xvii
Alzheimer's Disease • 30
Amino acid • 185
  distribution • 192
  sequence • 186
Analysis
  mathematical • 186
Antioxidant • 31, 287
Apoptosis • 31
Arachidonic acid • 243
Atherosclerosis
  coronary • 89
  plaque • 90

B
Biochip • 171
Bioinformatics • 5, 6
Biosafety • 42

C
Calpastatin • 21
Calvin cycle • 105
Cancer
  cell • 158
  therapy • 161
Carbonic anhydrase • 157
Carrot
  callus cell • 263
  Daucus carota L. • 265
  somatic embryo • 263
Catalase • 288
cDNA • 5
  full length • 5
  trophoblast • 203
Cell
  bacteria • 173
  isolation • 172
  red blood • 172
  white blood • 172
Characterization
  transvection • 197
Chromosome
  X • 67
Cigarette smoking • 89
Compatibility
  histo- • 149
Cornea
  human • 50

D
Dehydrogenase
  lactate • xii
DNA • 21
  amplification • 174
  array • 180
  diversity • 73
  levels • 74
  methylation • 278
  microarray • 171
  polymerase • 174, 181
  sequencing • 179
Dosage compensation • 61
Drug
  discovery • 180
  target • 179
D'Trends, Inc. • ix

E
Electrophoresis
  gel • xii
Embryogenesis • 263
Endothelial
  cell • 90
  nitric oxide synthase (eNOS) • 89
Enzyme • 23
  ACE • 24
  form • 21
Epigenetic
  state • 278
  variation • 278
Esterase • xii
Estrogen • 133
  receptor • 134
Eutherian mammal • 61
Evolution • 39
  genome • 73
  molecular • 73
Expressed sequence tag (EST) • 6

F
Feed utilization • 42
Fish
  silver crucian carp • 251
Force
  intermolecular • 133

G
Gene • 5
  activation • 134
  duplication • 107
  expression • 134, 180, 266
  family • 21, 5, 180, 233
  function • 21, 180
  homologue • 180
  multiplicity • xii
  product • 179
  regulation • 197, 293
  silencing • 197
  transfer • 40, 105
  tree • 103
  yellow • 195
Genetic
  diversity • 73
  drift • 73
  polymorphism • 73
Genome
  human • 171, 179
Genomics
  functional • 180
Glyoxylate cycle • 103
Gordon, J. • xiii
Gynogenesis
  induced • 253

H
Hematopoietic stem cells • 6
Heterosis • 277
  genetic effect • 277
Human leukocyte antigen (HLA) • 185
Human genome project • 5
Hunter, R. • xii
Hybridization
  DNA fragment • 253
  Southern • 253

I
Incompatibility
  self- • 149
Information
  fusion • 217
  system • 217
Isozyme • x, xi, 21, 158
  ALDH • 49
  diversification • 235

L
Lactate dehydrogenase (LDH) • 25
Lens epithelial cells • 50
LIM • 10

M
Maize
  antioxidant defense • 288
  parental inbred • 277
Markert, C. • x, xi, 20
Marsupial • 61
Melanogaster
  Drosophila • 195, 215
MEMS • 171
Metabolic
  knowledge • 218
  network • 217
  pathway • 218
  rules • 218
Metabolism
  anaerobic • 29
  oxidative • 29
Microbody • 105
Mitochondria
  isoenzyme • 103
  progenitor • 103
Moller, F. • xi, 21
mRNA • 5
Mutation
  codon • 217
  null activity • 215
  single • 144
  spontaneous • 216
  tissue-specific • 197

O
Oncogenesis • 158
  control of • 158
Oriental variant • 142
Oxidation
  photo • 30
Oxidative stress • 288

P
Pathogen
  avirulence (Avr) • 115
  infection • 115
Pathway • 103
  evolution of • 103
Pharmaceutical
  industry • 180
  research • 179
Phenoloxidase • 215
  activity • 215
Phosphatase
  alkaline • 233
Pollen • 149
  genotype • 150
  phenotype • 149
  S gene • 151
Polymorphism
  allelic, haplotypic • 185
Protein
  isoform • 23
  oxidation • 30

R
Rat
  testis • 246
Reactive oxygen species (ROS) • 29
RNA • 21
RNase
  gene • 152
  S- • 150
  self- • 151

S
Selection
  natural • 73
Sequence
  repetitive • 234
Simulation
  metabolic network • 217
  program • 189
Sperm
  heterogenous • 252
  homogenous • 253
Spermatogenesis • 21, 24, 60
Substitutions
  synonymous • 120
Superoxide • 90
  dismutase (SOD) • 90
  scavenging role • 90

T
Testis • 21
  cyclooxygenase • 245
  lipoxygenase • 245
Tomato
  Cf loci • 115
  Pto gene • 116
Transcription factor • 5, 288
Transgenic
  fish • 39
Transgenics • xiii
Transvection • 196
  meiotic • 197
Tricarboxylic acid (TCA) cycle • 103
Trophoblast • 203
  human • 206
  suppressor cDNA • 203

U
UDP-N-acetylglucosamine pyrophosphorylase • 23
UV-B absorption • 50

W
Wu-Kabat
  algorithm • 185
  formula • 189

Z
Zymogram • xii

GENE FAMILIES
Studies of DNA, RNA, Enzymes and Proteins

This archival volume is an invaluable collection of rigorously reviewed articles by experts in the fields of gene families, DNA, RNA and proteins, to commemorate the passing of a giant of science, Professor Clement L. Markert (1917-1999).

In 1959, Clement Markert and Freddy Moller developed the concept of the isozyme, which paved the way for extensive studies of enzyme, protein and gene multiplicity across all living organisms. This important scientific discovery has had a profound influence on the biological sciences for more than 40 years, and has provided the basis for regular international meetings to discuss the biological and biomedical implications of enzyme multiplicity. More recently, this concept has been extended to a wide range of gene families of DNA, RNA, proteins and enzymes.

www.worldscientific.com
