THE LEUCOCYTE ANTIGEN
FactsBook Second Edition
Other books in the FactsBook Series:
A. Neil Barclay, Albertus D. Beyers, Marian L. Birkeland, Marion H. Brown, Simon J. Davis, Chamorro Somoza and Alan F. Williams, The Leucocyte Antigen FactsBook, 1st edn
Robin Callard and Andy Gearing, The Cytokine FactsBook
Steve Watson and Steve Arkinstall, The G-Protein Linked Receptor FactsBook
Rod Pigott and Christine Power, The Adhesion Molecule FactsBook
Shirley Ayad, Ray Boot-Handford, Martin J. Humphries, Karl E. Kadler and C. Adrian Shuttleworth, The Extracellular Matrix FactsBook
Robin Hesketh, The Oncogene FactsBook, 1st edn
Grahame Hardie and Steven Hanks, The Protein Kinase FactsBook; The Protein Kinase FactsBook CD-Rom
Edward C. Conley, The Ion Channel FactsBook I: Extracellular Ligand-Gated Channels
Edward C. Conley, The Ion Channel FactsBook II: Intracellular Ligand-Gated Channels
Marion E. Reid and Christine Lomas-Francis, The Blood Group Antigen FactsBook
Kris Vaddi, Margaret Keller and Robert Newton, The Chemokine FactsBook
Robin Hesketh, The Oncogene and Tumour Suppressor Gene FactsBook, 2nd edn
Jeff Griffiths and Clare Sansom, The Transporter FactsBook
A. Neil Barclay, Marion H. Brown, S.K. Alex Law, Andrew J. McKnight, Michael G. Tomlinson, P. Anton van der Merwe
MRC Cellular Immunology Unit, MRC Immunochemistry Unit and the Sir William Dunn School of Pathology, University of Oxford, Oxford, UK
Academic Press, Harcourt Brace & Company, Publishers
SAN DIEGO, LONDON, SYDNEY, BOSTON, TOKYO, NEW YORK, TORONTO
This book is printed on acid-free paper.
Copyright © 1997 by ACADEMIC PRESS
All Rights Reserved. No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopy, recording, or any information storage and retrieval system, without permission in writing from the publisher.
Academic Press, 525 B Street, Suite 1900, San Diego, California 92101-4495, USA
http://www.apnet.com
Academic Press Limited, 24-28 Oval Road, London NW1 7DX, UK
http://www.hbuk.co.uk/ap/
ISBN 0-12-078185-9
A catalogue record for this book is available from the British Library
Typeset in Great Britain by Alden, Oxford, Didcot and Northampton Printed in Great Britain by WBC, Bridgend, Mid Glamorgan 97 98 99 00 01 02 EB 9 8 7 6 5 4 3 2 1
Contents

Preface
Abbreviations
Dedication

Chapter 1 Introduction
Chapter 2 The Discovery and Biochemical Analysis of Leucocyte Surface Antigens
Chapter 3 Protein Superfamilies and Cell Surface Molecules
Chapter 4 The Architecture and Interactions of Leucocyte Surface Molecules

CD1; CD2; CD3/TCR; CD4; CD5; CD6; CD7; CD8; CD9; CD10; CD11a; CD11b; CD11c; Integrin αD subunit; CDw12; CD13; CD14; CD15; CD15s; CD16; CDw17; CD18
CD19; CD20; CD21; CD22; CD23; CD24; CD26; CD27; CD28; CD29; CD30; CD31; CD32; CD33; CD34; CD35; CD36; CD37; CD38; CD39; CD40; CD41
CD42a, b; CD43; CD44; CD45; CD46; CD47; CD48; CD49a; CD49b; CD49c; CD49d; CD49e; CD49f; CD50; CD51; CD52; CD53; CD54; CD55; CD56; CD57; CD58
CD59; CD60; CD61; CD62E; CD62L; CD62P; CD63; CD64; CD65; CD66; CD68; CD69; CD70; CD71; CD72; CD73; CD74; CDw75; CDw76; CD77; CD79/BCR; CD80; CD81; CD82; CD83; CDw84; CD85; CD86; CD87; CD88; CD89; CD90; CD91; CDw92; CD93; CD94; CD95; CD96; CD97; CD98; CD99; CD100; CD101; CD102; CD103; CD104; CD105
CD106; CD107a, CD107b; CDw108; CD109; CD117; CD120a, CD120b; CD134; CD135; CDw137; CD138; CD147; CD148; CDw150; CD151; CD152; CD153; CD154; CD161; CD162; CD163; CD166
114/A10; 2B4; 4-1BBL; Aminopeptidase A; B-G; Chemokine receptors; c-kitL; CMRF35 antigen; DEC-205; DNAM-1; ESL-1; F4/80; FasL; FcεRI; FLT3 ligand; FPR; Galectin 3; G-CSFR; GM-CSFR; GlyCAM-1; gp42; gp49; HTm4; IFNγR; IL-1R
IL-2R; IL-3R; IL-4R; IL-5R; IL-6R; IL-7R; IL-8R; IL-9R; IL-10R; IL-11R; IL-12R; IL-13R; IL-14R; IL-15R; IL-17R; Integrin β7 subunit; KIR family; L1; LAG-3; LDLR; LPAP; ltk; Ly-6; Ly-9; Ly-49; Mac-2-BP; Macrophage lectin; MAdCAM-1; Mannose receptor; MARCO; M-CSFR; MDR1; MHC Class I; MHC Class II; MS2; NKG2 family; OX2; OX40L; PC-1; PD-1; RT6; Sca-2; Scavenger RI and II; Sialoadhesin; Thrombopoietin receptor; WC1

Index
Preface

Due to the very large increase in data on leucocyte antigens since the first edition was published, all the entries have been completely revised. In addition, about a further 70 new entries have been added. The introductory chapters have been revised and reorganized. The chapter in the first edition on gene locations has been deleted in the second edition as these data will rapidly become outdated as the genome project progresses.

The authors acknowledge all those who helped in the completion of the second edition of this book. In particular we thank our colleagues for their patience and advice during the period in which this book, known colloquially as the CD book, was being written. Many colleagues around the world communicated unpublished information and helped with parts of the manuscript. These included Iain Campbell, Marco Colonna, Paul Crocker, Simon Davis, Anthony Day, Mark Gorrell, Peter Gough, Vicky Heath, David Jackson, William James, Lewis Lanier, Peter Linsley, Don Mason, Steve Rosen, Dave Simmons, Antony Symons and Masahide Tone. We also thank Tessa Picknett of Academic Press for her contributions.

The authors acknowledge their funding by the UK Medical Research Council in the MRC Cellular Immunology Unit (ANB, MHB, MGT, and PAV), the MRC Immunochemistry Unit (AL) and the Sir William Dunn School of Pathology (AJM).

The authors hope that there are a minimum of omissions and inaccuracies and that these can be rectified in later editions. We would appreciate it if such points were forwarded to the Editor, Leucocyte Antigen FactsBook, Academic Press, 24-28 Oval Road, London NW1 7DX, UK.
From top left: Marion Brown, Alex Law, Andrew McKnight; bottom left: Michael Tomlinson, Neil Barclay and Anton van der Merwe.
Abbreviations

CCP                     Complement control protein
CS                      Chondroitin sulfate
CSF                     Colony-stimulating factor
CTL                     Cytotoxic T cell
EGF                     Epidermal growth factor
fMET                    Formyl MetLeuPhe
Fn                      Fibronectin
GAG                     Glycosaminoglycan
GAP                     p21ras GTPase-activating protein
GCSF                    Granulocyte colony-stimulating factor
GMCSF                   Granulocyte-macrophage colony-stimulating factor
GPI                     Glycosyl-phosphatidylinositol
IFNγ                    Interferon γ
Ig                      Immunoglobulin
IgSF                    Immunoglobulin superfamily
IL-1 etc.               Interleukin 1 etc.
kDa                     Kilodalton
LRR                     Leucine-rich repeat
LDLR                    Low-density lipoprotein receptor
mAb                     Monoclonal antibody
MHC                     Major histocompatibility complex
M-CSF                   Macrophage colony-stimulating factor
mIg                     Membrane immunoglobulin
MLR                     Mixed lymphocyte reaction
NGFR                    Nerve growth factor receptor
N-linked glycosylation  Asparagine-linked glycosylation
NMR                     Nuclear magnetic resonance
NK                      Natural killer
O-linked glycosylation  Hydroxyl-linked glycosylation
PI 3-kinase             Phosphatidylinositol 3'-kinase
PLC-γ1                  Phospholipase C-γ1
PBL                     Peripheral blood lymphocytes
PCR                     Polymerase chain reaction
PMA                     Phorbol 12-myristate 13-acetate
R                       Receptor
RTPCR                   Reverse transcriptase polymerase chain reaction
SRCR                    Scavenger receptor cysteine-rich
SD                      Standard deviation
SDS-PAGE                Polyacrylamide gel electrophoresis in sodium dodecyl sulfate
SF                      Superfamily
TM4                     Membrane protein with four transmembrane regions
TCR                     T cell receptor for antigen
TNF                     Tumour necrosis factor
WWW                     World Wide Web
DEDICATION
This book is dedicated to the memory of
Alan F. Williams
THE INTRODUCTORY CHAPTERS
AIMS OF THE BOOK
The primary aim of this book is to provide a compendium of the molecules that are found at leucocyte surfaces. The book includes entries for 206 antigens. Prior to the entries there are three chapters that provide a perspective for thinking about cell surface molecules. These chapters deal with the architecture of cell surfaces and with the domain types which can be found in leucocyte surface proteins.

A systematic approach to the naming of leucocyte antigens was made with the "cluster of differentiation" (CD) designation for human leucocyte antigens identified with monoclonal antibodies (mAbs) 1. mAbs were submitted to workshops and were placed into groups based on fluorescent labelling patterns for different leucocyte populations. In this classification, antigens are given a CD number, and this book provides entries on the CD antigens defined in the 5th International Workshop in 1993. We also include entries on antigens from any species that are known to be expressed on the surface of leucocytes and that have been characterized at the level of their primary sequence, including those that were given a CD number at the 6th International Workshop in 1996 and those that have not been given a CD number. Molecules that have been sequenced but for which there are no additional data are not included. The CD numbering was extended to endothelial antigens in the 6th International Workshop in 1996 but these are not included.

The cytokine receptors often have more than one polypeptide chain, and the cytokine receptor designation is given first together with the CD numbers. Thus CD25 appears under IL-2R rather than between CD24 and CD26. The cytokine receptor for a particular CD number can be found from the index.
Table 1 lists all the entries in the book and Table 2 contains a brief summary of those CD antigens for which entries are not included, that is the endothelial antigens mentioned above, those CD numbers between CD1 and CD166 which have not yet been allocated and CDs in the range CD131-CD166 which had not been included on the above criteria at the time of completion of the book.
ORGANIZATION OF THE DATA

Name
The CD nomenclature is used for both the human antigens and their homologues in rat and mouse. This nomenclature is now being used for other species homologues as illustrated in the recent swine workshop on leucocyte markers 2. In some cases the antigens are known by several other names and the most common of these are also given. The term homologue is reserved for equivalent molecules between species and is not used to indicate related molecules in a superfamily. In cases where the CD nomenclature is not widely used (e.g. cytokine receptors and integrins) the commonly used names are used in the introductory chapters.
Table 1 Distribution of domains in leucocyte surface molecules. The names of the antigens are those given in the entries in Section II and the abbreviations of the domain types are those defined in Fig. 1. The data are for single polypeptides although some will form homo- or heterodimers. Some of the cytokine receptors have polypeptide chains in common and these are indicated in each case.

[Table 1: one row per antigen or polypeptide chain, in the order of the entries in Section II. Column groups: Antigen name; Extracellular regions (number of domains of each type); TM regions (Number, GPI, Type); Cytoplasm; Integrin; page of the entry.]

TM, Transmembrane regions; Number, the number of predicted passes through the membrane; GPI, whether or not the molecule has a GPI anchor; Type, whether it is a type I, II, III or VI transmembrane protein (see Chapter 4); Cytoplasm, number and type of domains in the cytoplasm; Integrin, α or β chain. In the antigen section the term NS indicates that the sequence has not been determined and CHO that the determinant is dependent on carbohydrate.

Table 2 CD antigens for which there are no individual entries in this book

Name        Identity
CD67        Reassigned to CD66b
CDw78       MHC Class II
CD110       Not assigned
CD111       Not assigned
CD112       Not assigned
CD113       Not assigned
CD118       Interferon α/β receptor
CD133       Not assigned
CDw136      Macrophage-stimulating protein receptor
CD139       Germinal centre B cells
CD140a,b    PDGF receptor
CD141       Thrombomodulin
CD142       Tissue factor
CD143       Angiotensin-converting enzyme
CD144       VE-cadherin
CDw145
CD146       MUC-18
CDw149
CD155       Poliovirus receptor
CD156       ADAM-8
CD157       BST-1
CD159       Not assigned
CD160       Not assigned
CD164       MGC-24
CD165       AD2/gp37

Distribution column, entries in order: Broad; Broad; Activated monocytes and endothelium; Endothelial subsets; Endothelium, stromal; Endothelium, activated T cells; Broad; Monocytes, macrophages, thymocytes; Monocytes, macrophages, granulocytes; Monocytes, neutrophils, endothelium; Haematopoietic cells; Thymocytes, thymic epithelial cells.
Mr (kDa) column, entries in order: 180; 209, 228; 180; 100; 45; 170; 135; 25, 90, 110; 115; 80-90; 80; 42-45; 80; 37.

Note: CDw78 has been found to be an MHC Class II antigen (Slack, J.L. et al. (1995) Int. Immunol. 7, 1087-1092).

Molecular diagram
This gives a visual representation of the molecule and includes details such as the mode of membrane attachment, the protein domains that are present, and the degree of glycosylation. Figure 1 shows the symbols that are used to represent the various domains and their N- and O-linked oligosaccharide structures. A summary of the domains present in the leucocyte molecules is given in Table 1 which also includes the page numbers of the individual entries. For molecules attached to the cell surface by glycosyl-phosphatidylinositol (GPI) anchors, the anchors are shown in the diagrams by an arrow. The orientation of molecules with protein transmembrane sequences is evident from the labelling of the N- and/or the C-termini. This scheme for the diagrams is chosen in preference to that being standardized by SWISSPROT in order to illustrate features such as glycosylation which may be important in leucocyte function. The SWISSPROT system is aimed more at describing the modular content of proteins 3,4.

The criteria used for assigning sequences to domain types or superfamilies are discussed in Chapter 3. In those cases where no assignment can be made, the extracellular region or domain is shown as a circle containing a question mark with the diameter of the circle being proportional to the size of the extracellular region or domain. If the protein sequence contains a high proportion of Ser, Thr and Pro residues and is probably heavily O-glycosylated (see below), it is shown as an extended structure to distinguish it from regions likely to have a folded conformation. It is not possible to show the positions of all the disulfide bonds in the diagrams and these bonds are shown only for the inter-sheet disulfides of Ig superfamily (IgSF) domains since this has been traditionally done in illustrating this domain type. The majority of the cytoplasmic regions do not have recognizable domains and are represented by squiggly lines whose length is proportional to the sequence length.

The number and approximate positions of potential N-linked glycosylation sites are deduced from the presence of the sequences Asn-Xaa-Thr or Asn-Xaa-Ser where Xaa is any amino acid except for Pro. The nature of the Xaa may affect the degree of glycosylation. Thus Pro prevents glycosylation and in vitro studies indicate that Trp, Asp, Glu and Leu give relatively inefficient glycosylation 5. Asn-Xaa-Ser-Pro and Asn-Xaa-Thr-Pro are variably glycosylated and in the entries they are assumed to be glycosylated unless there are contrary biochemical data 6-8.

O-linked glycosylation occurs at Ser or Thr amino acids but there is no sequence motif that invariably indicates this type of glycosylation. However, O-linked glycosylation is usually found in stretches of sequence with a preponderance of Ser, Thr and Pro 9,10. In many cases there are biochemical data to support the assignment of O-glycosylation to the Ser, Thr, Pro rich regions.

Glycosaminoglycans (GAG) have been identified in three leucocyte membrane glycoproteins: CD44, CD138 (syndecan) and F4/80. The number of these sites has not been confirmed biochemically and the symbols indicate only their approximate number and positions. Where the GAG has been characterized, for example, the chondroitin sulfate in CD44, this is shown in the diagram.
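The sequence rule for potential N-linked sites can be illustrated with a short sketch. It scans a protein sequence in single letter code for Asn-Xaa-Ser/Thr with Xaa not Pro, and flags sites followed by Pro, which the text describes as variably glycosylated. The function names and the test sequence are hypothetical, chosen only for illustration; this is not a procedure taken from the book.

```python
import re

# Asn, then any residue except Pro, then Ser or Thr (single letter codes).
# The lookahead allows overlapping sequons (e.g. NNSS) to be reported.
SEQUON = re.compile(r"(?=N[^P][ST])")


def potential_n_glyc_sites(seq):
    """Return 1-based positions of Asn residues in Asn-Xaa-Ser/Thr sequons."""
    return [m.start() + 1 for m in SEQUON.finditer(seq.upper())]


def followed_by_pro(seq, asn_pos):
    """True if the sequon whose Asn is at 1-based asn_pos is followed by Pro.

    Such Asn-Xaa-Ser/Thr-Pro sites are described as variably glycosylated.
    """
    # Asn is at index asn_pos - 1, so the residue after the sequon
    # is at index asn_pos + 2.
    return seq[asn_pos + 2:asn_pos + 3].upper() == "P"


if __name__ == "__main__":
    example = "MNGTANPSLNKSPQNNSS"  # hypothetical sequence
    for pos in potential_n_glyc_sites(example):
        note = " (followed by Pro, variable)" if followed_by_pro(example, pos) else ""
        print(f"Asn {pos}{note}")
```

A scan of this kind only counts potential sites. As noted in the text, not every site is necessarily occupied.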
Figure 1 Icons used for the protein domains and repeats which are present in leucocyte membrane proteins. These icons are used to depict all the molecules described in Section II. Additional abbreviations used in the domain diagrams are: S, signal sequence; TM, transmembrane sequence; CY, cytoplasmic sequence; G, GPI anchor signal sequence.
[Figure 1 panels, each giving a domain type and its approximate size in amino acids: Complement control protein (CCP); Cytokine receptor (R); Epidermal growth factor (EGF); Fibronectin type II (Fn2); Fibronectin type III (Fn3); Immunoglobulin (Ig) V set, C1 set and C2 set; LDLR; Ly-6; MHC; Scavenger receptor; Somatomedin; TNF; Tumour necrosis factor receptor (TNFR); Tyrosine kinase; Phosphotyrosine phosphatase (PTPase); Lectin C-type; Galectin or Lectin S-type; LRR repeats; Link. Other symbols used: N-glycosylation sites; O-linked glycosylation; chondroitin sulfate; glycosaminoglycan; GPI anchor in lipid bilayer.]
Size of the processed form of the molecule
The calculated relative molecular weight (Mr) of the mature fully processed polypeptide backbone is given together with the Mr values obtained from polyacrylamide gel electrophoresis in sodium dodecyl sulfate (SDS-PAGE) under reducing and/or non-reducing conditions. There can be considerable variation according to the conditions used for the SDS-PAGE and the values given are typical of those in the literature.
The degree of glycosylation
The number of potential N-linked glycosylation sites in the extracellular portions of membrane glycoproteins is shown as determined by the presence of the consensus sequence Asn-Xaa-Thr/Ser (see above). To obtain a rough idea of the weight of the glycoprotein likely to be accounted for by N-linked oligosaccharides an average value of 3000 Mr can be used per N-linked structure. It should however be noted that some oligosaccharides can be considerably larger, due, for example, to the presence of repeating lactosamine units or complex blood group antigens, and not all potential N-linked sites are necessarily occupied. The number of O-linked sites cannot be directly predicted from the sequence and thus the following terms are used to describe the extent of O-glycosylation:
1 Unknown: There are no data from sequence or immunochemical studies to indicate the level of O-glycosylation.
2 Nil: No O-linked oligosaccharides indicated by biochemical analysis or the Mr determined by SDS-PAGE analysis is fully accounted for by the polypeptide Mr and any N-linked oligosaccharides.
3 Probable +: Sequence data suggest the likelihood of O-glycosylation.
4 + abundant: There are biochemical data to indicate a high level of O-glycosylation.
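The rough estimate described above (3000 Mr per N-linked structure) can be written out as a small calculation. The sketch below is illustrative only: the function names, the optional occupancy parameter and the numbers in the example are assumptions, not values from any entry.

```python
N_LINKED_MR = 3000  # average Mr per N-linked structure, as suggested in the text


def estimate_glycoprotein_mr(polypeptide_mr, n_sites, occupied_fraction=1.0):
    """Polypeptide Mr plus an average contribution from N-linked sites.

    occupied_fraction is a hypothetical knob reflecting that not all
    potential sites are necessarily occupied.
    """
    return polypeptide_mr + N_LINKED_MR * n_sites * occupied_fraction


def unexplained_mr(sds_page_mr, polypeptide_mr, n_sites):
    """Mr seen on SDS-PAGE that polypeptide and N-linked sugars do not explain.

    A value near zero corresponds to the 'Nil' category for O-glycosylation;
    a large positive value is one hint of additional modification.
    """
    return sds_page_mr - estimate_glycoprotein_mr(polypeptide_mr, n_sites)


if __name__ == "__main__":
    # Hypothetical molecule: 35 000 Mr polypeptide with four potential sites.
    print(estimate_glycoprotein_mr(35_000, 4))
    print(unexplained_mr(55_000, 35_000, 4))
```

Because SDS-PAGE values vary with conditions and some oligosaccharides are much larger than average, a difference of this kind is only a rough guide.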
Gene location and size
The chromosome location of the human gene from the Genome Data Base 11,12 is given together with the approximate size of the region encompassing the exons.
Domain and exon organization
A diagrammatic representation of the positions of domains and exons is given for those proteins containing clear domains. These diagrams are not drawn to scale. Domain positions are defined by indicating the positions of two key conserved residues within the domain. The first residue is shown with two flanking residues on the C-terminal side and the second with two flanking residues on the N-terminal side. The sequences which are shown in the upper part of the domain diagram can be used to identify the domain within the full sequence given at the end of the entry. The key conserved residues for each domain type are marked with asterisks in the superfamily sequence alignments given in Chapter 3. Internal positions are identified rather than domain boundaries since in some cases the end of one domain overlaps with the beginning of another (e.g. between CD4 domains 1 and 2 13). The identification of conserved internal residues allows a domain to be analysed by comparison with the alignment of other superfamily members shown in Chapter 3. Also the ends of the domains seldom have conserved residues.
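As an illustration of how the two marker sequences in a domain diagram locate a domain in the full sequence, the sketch below searches for the first tripeptide (key residue plus two C-terminal flanking residues) and then for the second tripeptide (two N-terminal flanking residues plus key residue) downstream of it. The function name, the marker tripeptides and the sequence are all hypothetical.

```python
def locate_domain(seq, first_marker, second_marker):
    """Return 1-based positions of the two key conserved residues, or None.

    first_marker: key residue followed by its two C-terminal neighbours.
    second_marker: two N-terminal neighbours followed by the key residue.
    """
    start = seq.find(first_marker)
    if start == -1:
        return None
    # Look for the second marker only downstream of the first one.
    end = seq.find(second_marker, start + len(first_marker))
    if end == -1:
        return None
    first_key = start + 1                  # first residue of first_marker
    second_key = end + len(second_marker)  # last residue of second_marker
    return first_key, second_key


if __name__ == "__main__":
    sequence = "AAACSVKKKKKKDYYCAAA"  # hypothetical sequence
    print(locate_domain(sequence, "CSV", "YYC"))
```

In a real entry a marker may occur more than once in the sequence, so a match should be checked against the domain order shown in the diagram.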
Exon junctions are identified by three or four residues of protein sequence in the lower part of the diagram. The amino acid(s) encoded at the splice junction is aligned with the intron/exon boundary marker. The type of intron/exon boundary is indicated as type 1 (splicing after the first nucleotide of the triplet), type 2 (splicing after the second nucleotide) and type 0 (splicing between the codons) 14. Thus if the junction was type 1 and a Trp residue was shown at the boundary of an exon/intron splice site, the point of junction would be after the T of the TGG codon. By comparing the intron/exon boundaries and the domain designations it can be seen whether a given domain is encoded within one exon. Where exons are known to be alternatively spliced a space is inserted between the rectangles that represent the exons; otherwise exons are drawn contiguously.
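The junction types can be made concrete with a minimal sketch that splits a codon according to the type of boundary, reproducing the Trp (TGG) example in the text. The function name and return convention are assumptions made for illustration.

```python
def split_codon(codon, junction_type):
    """Split a codon at an intron/exon boundary of type 0, 1 or 2.

    Returns (nucleotides in the upstream exon, nucleotides in the
    downstream exon), or None for type 0, where splicing falls between
    codons and no codon is interrupted.
    """
    if len(codon) != 3:
        raise ValueError("a codon has exactly three nucleotides")
    if junction_type not in (0, 1, 2):
        raise ValueError("junction type must be 0, 1 or 2")
    if junction_type == 0:
        return None
    return codon[:junction_type], codon[junction_type:]


if __name__ == "__main__":
    # Type 1 junction at a Trp codon: the break falls after the T of TGG.
    print(split_codon("TGG", 1))
    print(split_codon("TGG", 2))
```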
Tissue distribution
A brief description is given of the cells and tissues that have been clearly shown to express the antigen. In many cases a full analysis of the tissue distribution has not been done. A failure to mention a cell type cannot be taken as meaning that it is known that the cell type does not express the antigen. The distribution of the CD antigens in humans is reviewed in reports from the Leucocyte Typing Workshops 15, and a computerized database of tissue distributions based on the 5th Workshop, called LDAD, is available for downloading by anonymous FTP from the NIH site, e.g. using the URL ftp://[email protected]/. The Leucocyte Typing Workshop reference is used for each of the entries on the CD antigens but is not explicitly cited. There are often considerable differences in the tissue distribution between species and this is indicated where data are available. A pullout diagram of the distribution of the human CD antigens on leucocyte populations is enclosed inside the back cover.
Structure
Biochemical and structural data are summarized in this section.

Ligands and associated molecules
Many of the cell surface proteins interact through their extracellular domains with soluble proteins (e.g. cytokines) or other cell surface proteins (adhesion molecules). Those that have been identified are listed together with proteins that interact through the membrane or cytosolic regions of the antigens.

Function
Functional data are summarized in this section.
Database entries
Accession numbers are given for the amino acid sequences in the PIR (Protein Identification Resource) and SWISSPROT databases and the nucleic acid sequences in the GENBANK and EMBL databases. Every new sequence submitted to the databases is given an accession number. These are then incorporated into entries which are given a specific identifier. One problem is that identifiers can change if, for instance, two entries are merged or when sequences become fully annotated. Therefore the only definitive way to identify a sequence is through the accession numbers and these are given in the entries. Where there are multiple accession numbers for a sequence the primary number is given. In some cases genomic but not cDNA sequence is available and is spread over several database entries. In these cases more than one accession number is given. Apart from some early entries EMBL and GENBANK use the same accession number for each entry. If available, the accession numbers are given for human, rat and mouse homologues but not generally for other species.
Retrieval of sequences from databases
Most sequence analysis software will identify and retrieve sequences from local computers using the accession number. Sequences can also be obtained directly through the Internet from computer servers at the database centres, again using the accession number. These procedures are changing rapidly as more servers become available. The simplest and most powerful methods use World Wide Web (WWW) browsers such as Netscape. A selection of currently available WWW sites of relevance to protein and DNA sequences and protein structures is given below. It should be stressed that new resources are becoming available all the time. New WWW sites can often be identified by checking the links available from well-established sites 16-18 (see below). A guide to the Internet has been published on the WWW by Elsevier Trends Journals 19 and there are regular articles in the "Computing Corner" of TIBS, in Current Biology and various books (e.g. ref. 20).

Several of the major databases are available from more than one site, i.e. "mirrored" copies exist. It is usually quicker to use a local site. For instance EMBL is available from many sites with one official EMBnet node in each European country 21 and the SWISSPROT database is based in Geneva 22 but a mirror version exists in Cambridge, UK 16. The Protein Data Bank (PDB) is based in the US but there are mirror sites in China, Israel and the UK (see ref. 23). In addition to retrieving data directly from the databases many sites offer services for searching databases using programmes such as BLAST or BLITZ, together with other modelling and alignment programmes.
WWW sites for DNA and protein databases
Entrez (http://www3.ncbi.nlm.nih.gov/Entrez/)
An interlinked database of DNA sequences, protein sequences, protein structures and a subsection of MEDLINE containing references including abstracts, for those references with sequence-related information.

European Bioinformatics Institute (EBI) (http://www.ebi.ac.uk/)
A wide range of databases including the Protein Data Bank (PDB), SWISSPROT, EMBL and many more specialized databases.

GenBank (http://www.ncbi.nlm.nih.gov)
The complete nucleotide database along with many other databases and searching tools such as BLAST (see below for Email).

Genome Database (GDB) (http://gdbwww.gdb.org/)
The central repository for genomic mapping data resulting from the Human Genome Initiative.

OMIM, On-line Mendelian Inheritance in Man (http://www3.ncbi.nlm.nih.gov/Omim/)
This database is a catalogue of human genes and genetic disorders with many links to the Entrez database (see above).

Protein Data Bank (PDB) (http://pdb.pdb.bnl.gov/)
An archive of experimentally determined three-dimensional structures of biological macromolecules compiled at the Brookhaven National Laboratory.

SwissProt (http://expasy.hcuge.ch/)
This contains all the SWISSPROT entries. The database has links to entries in the protein structure database (PDB) and references in ENTREZ (see above).
Email servers for biological databases
A comprehensive list of useful servers is given by SWISSPROT on the WWW at http://expasy.hcuge.ch/info/serv_ema.txt. A good starting point is to send Email containing the single word HELP to the Email address (e.g. see below).
A. Obtaining DNA and protein sequences from the EMBL and SWISSPROT databases The Email address is:
[email protected] The commands used are: GET to specify the request and NUC or PROT to specify the EMBL or SWISSPROT databases, respectively. A typical file might look like: GET PROT:P01830 GET NUC:X03152 This file would request the sending of two entries, one from SWISSPROT (protein) and one from EMBL (DNA).
B. Obtaining protein sequences from the PIR databases The Email address is:
[email protected] The file should contain the command GET followed by the accession number. A typical file would look like: GET A02107
C. Obtaining DNA sequence from the GENBANK database The Email address is:
[email protected] List the DATABASE required, the command
and then the identifiers or accession numbers in a single column. A typical file would look like: DATALIB GenBank Begin J02852 J02855
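The request files in sections A to C are plain text with one command per line, so they are easy to generate for a list of accession numbers. The following is a minimal sketch in Python that only builds the message bodies shown above. The function names are hypothetical, and no mail is sent because the server addresses are not reproduced here.

```python
def embl_request(protein_ids=(), nucleotide_ids=()):
    """Body for the EMBL/SWISSPROT server: one GET line per entry."""
    lines = [f"GET PROT:{acc}" for acc in protein_ids]
    lines += [f"GET NUC:{acc}" for acc in nucleotide_ids]
    return "\n".join(lines)


def pir_request(accessions):
    """Body for the PIR server: GET followed by the accession number."""
    return "\n".join(f"GET {acc}" for acc in accessions)


def genbank_request(accessions, datalib="GenBank"):
    """Body for the GenBank server: DATALIB line, Begin, then one ID per line."""
    return "\n".join([f"DATALIB {datalib}", "Begin", *accessions])


if __name__ == "__main__":
    # Reproduces the three example files given in the text.
    print(embl_request(["P01830"], ["X03152"]))
    print(pir_request(["A02107"]))
    print(genbank_request(["J02852", "J02855"]))
```

Each function returns a string that can be pasted into the body of a message to the corresponding server.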
[Figure 2: structural formulae of the 20 amino acids: Alanine (Ala, A), Arginine (Arg, R), Asparagine (Asn, N), Aspartate (Asp, D), Cysteine (Cys, C), Glutamine (Gln, Q), Glutamate (Glu, E), Glycine (Gly, G), Histidine (His, H), Isoleucine (Ile, I), Leucine (Leu, L), Lysine (Lys, K), Methionine (Met, M), Phenylalanine (Phe, F), Proline (Pro, P), Serine (Ser, S), Threonine (Thr, T), Tryptophan (Trp, W), Tyrosine (Tyr, Y), Valine (Val, V).]
Figure 2 Chemical formulae for the amino acids found in proteins. The single letter amino acid code is used in the entries and the three letter code when discussing particular amino acids in the text. The side-chains are shown in bold.
D. Database searching
Several sites allow database searching by BLAST, BLITZ and FASTA. For details, send a "help" message to, for example: [email protected] [email protected] [email protected]
Sequence
The human amino acid sequence is given if it is known. If a sequence has only been obtained in a different species, then this is given. The single letter code for amino acids is used and this is defined in Fig. 2 together with their chemical formulae. In most cases where structural and functional data are discussed, the sequence numbering refers to the fully processed form of the molecule. If the N-terminus has not been defined by sequence analysis, the signal sequence for secretion is predicted from comparisons with sequences where the position of signal cleavage has been determined. The consensus rules for defining cleavage sites are reviewed in ref. 24. The signal sequence is shown on a separate line. For molecules likely to have transmembrane sequences the proposed hydrophobic regions are underlined. For GPI anchors, signal sequences are predicted using criteria discussed in Chapter 4 and the GPI signal sequence is shown on a separate line below the proposed processed sequence. Thus the amino acid length of the predicted mature protein is given (and used in the Mr value given at the beginning of the entry) as well as the length of the GPI signal sequence if present. Variants produced by alternative splicing are shown only where they are known to be expressed.
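The relationship between the precursor sequence, the predicted mature protein and the polypeptide Mr can be illustrated with a short calculation. The sketch below is an assumption-laden illustration, not the procedure used for the entries: the helper `mature_mr` is hypothetical, the residue masses are standard average values rather than figures from this book, and glycosylation and the GPI anchor itself are ignored.

```python
# Average residue masses in daltons (amino acid minus one water molecule).
# Standard reference values, assumed here for illustration.
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.1086, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.0153  # one water per free polypeptide chain


def mature_mr(precursor, signal_len=0, gpi_signal_len=0):
    """Return (length, Mr) of the mature polypeptide.

    precursor      -- full translated sequence, single letter code
    signal_len     -- residues removed from the N-terminus (signal sequence)
    gpi_signal_len -- residues removed from the C-terminus (GPI signal sequence)
    """
    end = len(precursor) - gpi_signal_len
    mature = precursor[signal_len:end]
    if not mature:
        raise ValueError("no residues left after removing signal sequences")
    mass = sum(RESIDUE_MASS[aa] for aa in mature) + WATER
    return len(mature), round(mass, 1)


if __name__ == "__main__":
    # Hypothetical five-residue precursor with one-residue signals at each end.
    print(mature_mr("MAGGA", signal_len=1, gpi_signal_len=1))
```

The value obtained is for the polypeptide alone; the apparent molecular weight of a glycoprotein on SDS-PAGE is usually higher.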
References
It is not feasible to give a comprehensive list of references. The references that are given are recent ones that should allow access to the rest of the literature and key references are highlighted in bold. In some cases these are recent reviews but in others it is the most recent paper containing relevant references.
References
1 Bernard, A. et al. (1984) Leukocyte Typing. Springer-Verlag, Berlin, pp. 1-814.
2 Saalmuller, A. (1996) Characterization of swine leucocyte differentiation antigens. Immunol. Today 17, 352-354.
3 Bork, P. and Bairoch, A. (1995) Extracellular protein modules: A proposed nomenclature. Trends Biochem. Sci. 20, Suppl. March, C03.
4 http://swan.embl-heidelberg.de:8080/Modules/. Modules in extracellular proteins.
5 Shakin-Eshleman, S. et al. (1996) The amino acid at the X position of an Asn-X-Ser sequon is an important determinant of N-linked core-glycosylation efficiency. J. Biol. Chem. 271, 6363-6366.
6 Bause, E. and Hettkamp, H. (1979) Primary structure requirements for N-glycosylation of peptides in rat liver. FEBS Lett. 108, 341-344.
7 Kornfeld, R. and Kornfeld, S. (1985) Assembly of asparagine-linked oligosaccharides. Annu. Rev. Biochem. 54, 631-664.
8 Gavel, Y. and von Heijne, G. (1990) Sequence differences between glycosylated and non-glycosylated Asn-X-Thr/Ser acceptor sites: implications for protein engineering. Protein Eng. 3, 433-442.
9 Wilson, I.B.H. et al. (1991) Amino acid distributions around O-linked glycosylation sites. Biochem. J. 275, 529-534.
10 Gooley, A.A. et al. (1991) Glycosylation sites identified by detection of glycosylated amino acids released from Edman degradation: the identification of Xaa-Pro-Xaa-Xaa as a motif for Thr-O-glycosylation. Biochem. Biophys. Res. Commun. 178, 1194-1201.
11 http://gdbwww.gdb.org/gdb/gdbtop.html. The Genome Database.
12 http://www.hgmp.mrc.ac.uk/gdb/gdbtop.html. The Genome Database.
13 Ryu, S.E. et al. (1990) Crystal structure of an HIV-binding recombinant fragment of human CD4. Nature 348, 419-426.
14 Sharp, P.A. (1981) Speculations on RNA splicing. Cell 23, 643-646.
15 Schlossman, S. et al. (1995) Leucocyte Typing V. Oxford University Press, Oxford, UK, pp. 1-2044.
16 http://www.ebi.ac.uk/. European Bioinformatics Institute (EBI).
17 http://www.fmi.ch/biology/research_tools.html. Pedro's BioMolecular Research Tools.
18 http://www.ncbi.nlm.nih.gov/. National Center for Biotechnology Information (NCBI).
19 http://www.elsevier.com/locate/trendsguide. The Trends Guide to the Internet.
20 Swindell, S. et al. (1996) Internet for the Molecular Biologist. Horizon Scientific Press, Wymondham.
21 Harper, R. (1996) EMBnet: an institute without walls. Trends Biochem. Sci. 21, 150-152.
22 http://expasy.hcuge.ch/. SWISSPROT.
23 http://pdb.pdb.bnl.gov/mirror sites.html. Protein database.
24 von Heijne, G. (1983) Patterns of amino acids near signal-sequence cleavage sites. Eur. J. Biochem. 133, 17-21.
2 The discovery and biochemical analysis of leucocyte surface antigens
EARLY STUDIES ON CELL MEMBRANES
In the 1960s and early 1970s there were few useful techniques for the analysis of the cell surface molecules of eukaryotic cells and important early data were obtained from studies on cells that had relatively simple plasma membranes such as human erythrocytes. Studies with the electron microscope established the concept of a lipid bilayer encompassing eukaryotic cells and it was demonstrated, by radiolabelling inner and outer membranes of erythrocytes, that the membrane proteins glycophorin and Band 3 spanned this bilayer 1. This was later confirmed by the sequencing of glycophorin, which established the presence of a stretch of hydrophobic amino acids sufficient to traverse the bilayer 2. The concept of a fluid membrane with proteins surrounded by lipids and free to move in the bilayer was proposed 3, along with the concept that signal transduction would occur by an event outside the cell somehow being transmitted to the interior via a surface molecule with a transmembrane sequence and a cytoplasmic domain. These ideas were given credence by the observation that molecules could mix between membranes when cells were fused together 4, and by the finding that reactions with antibodies could lead to a cell surface molecule capping at one pole of a cell with cytoskeletal proteins accumulating underneath the cap 5. The interpretation of the capping phenomenon has not become simple with further study; for example, capping can occur via glycolipids or GPI-anchored molecules that do not traverse the bilayer. Despite these complications the phenomenon was of considerable conceptual influence in the early 1970s. The early studies on the erythrocyte and other model systems did not give a general method for analysing complex cell membranes.
However, considerable progress was possible with the introduction of techniques for solubilizing membrane molecules using detergents 6 and affinity chromatography 7 (reviewed in ref. 8). It was established that ionic detergents like sodium dodecyl sulfate (SDS) would bind to both hydrophilic and hydrophobic parts of protein sequences and this led to the technique of SDS polyacrylamide gel electrophoresis, which allowed resolution of proteins roughly in proportion to their molecular weight 9. SDS was of minimal use in the purification of membrane molecules since it often destroyed biological activities and led to the loss of antigenic determinants. Weakly ionic detergents like deoxycholate were more useful since they bound predominantly to the hydrophobic parts of cell surface molecules. They did, however, interfere with some ionic interactions, notably the binding of histones to DNA, and thus deoxycholate could not be used on whole cells; instead, membranes had to be prepared prior to extraction. The advantages of deoxycholate were that it gave very good solubilization of molecules and that it had a small micelle size such that its binding did not interfere with the behaviour of solubilized molecules in fractionation techniques such as gel filtration. Also, it could be removed from proteins by dialysis against deoxycholate-free buffer. Non-ionic detergents bound only to hydrophobic domains, forming a micelle of lipid-like molecules around the hydrophobic region. These detergents have the disadvantage that they interfere with molecular properties due to their large micelle size and are difficult to remove from proteins. In addition they sometimes yield membrane molecules in complexes, rather than as a single molecule plus detergent.
For purification studies this was a disadvantage 10, but what was formerly a problem is now exploited as a way of studying multimolecular complexes in the membrane that may be of biological relevance. The differences between deoxycholate and non-ionic detergents in solubilizing cell surface molecules are illustrated in Fig. 1.
[Figure 1: schematic of MHC Class I and Thy-1 solubilized in non-ionic detergent (e.g. NP40, Digitonin) and in weakly ionic detergent (e.g. Deoxycholate).]
Figure 1 Detergent binding to cell membrane proteins. The binding of detergent to the solubilized monomeric forms of two cell surface antigens, one GPI-anchored and one with a conventional transmembrane sequence, is illustrated. The non-ionic detergents can form large micelles and complexes with more than one protein. The size of deoxycholate micelles is generally much smaller but depends on the pH and salt concentrations. Individual molecules of the non-ionic detergent and of the weakly ionic detergent, deoxycholate, are drawn with different symbols.
The technique of affinity chromatography was developed in the late 1960s with the highly effective cyanogen bromide coupling method being first reported in 1967 7. However, whilst antibodies against known antigens could be purified, the converse of purifying a cell surface antigen with antibody affinity chromatography was not considered to be useful. Poor results with the method were probably due to the inability to raise high titre, specific sera against molecules that were not available in pure form (reviewed in ref. 8). In contrast, affinity chromatography with lectins was developed as a most useful technique for purifying cell surface glycoproteins 11 and was a key element of the first studies in which leucocyte antigens were purified.
MOUSE IMMUNOGENETICS AND THE SEROLOGICAL APPROACH TO CELL SURFACE ANTIGENS
Mouse H2 antigens were discovered by Gorer using haemagglutination assays on red cells, with the first antiserum that reliably identified a polymorphism being an absorbed rabbit anti-mouse erythrocyte serum (reviewed in ref. 12). Then sera were raised between mouse strains and the concept was introduced of backcrossing a strain to isolate one polymorphic antigenic determinant against a background of molecules identical and thus non-immunogenic between the strains (Fig. 2) 13. In this way one molecule could be confidently recognized from a complex mixture of other molecules. Agglutination assays could not be used to study nucleated cells but an effective cytotoxicity assay was developed 14. With this assay, alloantigens of mouse thymocytes and lymphocytes were sought, leading to the discovery of the Tla (MHC-like molecules) 15 and Thy-1 antigens 16 and later to the Ly-1 (CD5), Ly-2 (CD8α) and Ly-3 (CD8β) antigens ss. Other antigens to be discovered at an early stage were the rat ART-1 (or Ly-1 antigen) 17,18 and the mouse Ly-5 antigen that were later both established as polymorphic determinants of the CD45 antigen. Studies on Thy-1 antigen were of major importance in delineating B from T lymphocytes 19 and work on the Ly-1 and Ly-2,3 antigens led to the discovery of subsets amongst T cells 20.
[Figure 2: schematic; cells of strains x and y carry antigens A, B and F and differ only in B; a strain y animal is immunized with strain x cells to get anti-B.]
Figure 2 Mouse immunogenetic approach to the identification of cell surface antigens. Production of alloantisera by the immunization of strain "y" mice with cells from strain "x" that differ only in a polymorphic determinant of the antigen "B".
Studies on mouse alloantigens presented a major step forward in the analysis of lymphocyte surfaces but this approach had considerable problems. It was applicable only to rodents, where inbred strains could be produced, and sera were often weak and dependent on the use of the cytotoxicity assay for analysis.
Many sera contained heterophilic and anti-viral antibodies and specificity was probably often achieved by the antibody to an alloantigen giving the final increment of binding necessary to achieve cell death in the cytotoxicity assay which is of an all-or-none nature. In quantitative binding assays alloantisera often gave poor specificity. The dependence on cytotoxicity assays was a major problem for biochemical studies involving the detergents necessary for solubilization (see below) because these detergents in trace amounts also caused cell lysis.
QUANTITATIVE SEROLOGY AND XENOGENEIC ANTIBODIES
The first quantitative binding studies were done for Thy-1 and H2 antigens using 3H-labelled antibodies 21 and in the analysis of the amount of surface Ig on B cells using 125I-labelled antibodies 22. Anti-Ig antibodies could be purified by affinity chromatography but saturating binding was not obtained with such antibodies due to aggregation involving the uncharacterized interactions of the Fc regions of acid-eluted antibodies. However, 125I-labelled purified antibody in the form of F(ab')2 fragments did give saturating binding 23 and also avoided any problems due to binding by Fc receptors. Thus purified F(ab')2 antibodies became the preferred reagent for the serological detection of antigen by an indirect binding assay using a variety of markers to reveal the reactions (Fig. 3). Protein A was also used as an alternative to F(ab')2 anti-immunoglobulin as a second reagent 24. Binding in the presence of detergents was possible if glutaraldehyde-fixed cells were used and antigen in extracts could then be measured by inhibition assays 25. The development of flow cytometry revolutionized studies on cell subsets 26 and analysis of the apparent molecular weight of an antigen was often made simple by the finding that indirect binding assays could be performed on antigens after SDS-PAGE (Western blotting or immunoblotting 27). Thus binding serology replaced the cytotoxicity assay and a quantitative and biochemical approach to the cell surface became possible 28.
[Figure 3: flow diagram. Cells bearing antigen X are incubated with anti-X and washed (the antibody can be preadsorbed to measure antigen by inhibition); a second reagent is then added (125I anti-Ig, peroxidase anti-Ig, fluorescent anti-Ig, erythrocyte anti-Ig or complement); after washing, detection is by γ-counter, enzyme assay, flow cytometry analysis and cell fractionation, rosettes or cell killing.]
Figure 3 Binding assays for cell surface antigens.
The major problem in using antibodies to study cell surface molecules was to produce a specific antibody against a cell surface molecule. If an immunization was made across a species barrier then potentially all cell surface molecules might be antigenic due to divergence in evolution and thus antibodies to previously unknown cell surface molecules could be produced, but the problem was that unless a pure antigen was used, a mixture of antibodies would result. Various attempts to overcome this problem were made with one approach being to try to identify antigens via inhibition assays with absorbed xenoantisera and use this assay to follow the purification of the antigens. Then strong specific antisera could be produced against the purified antigen. This scheme was successful for obtaining anti-Thy-1 xenoantisera 29 and in initial studies on the CD45 antigen (T200 or the leucocyte common antigen), but there was no general solution to resolving the complexity of xenoantisera (reviewed in ref. 28).
MONOCLONAL ANTIBODIES
In 1975 the hybridoma method to produce monoclonal antibodies was described and initially was discussed in terms of producing antibodies "of predetermined specificity" 30. A more interesting possibility from the viewpoint of analysis of the cell surface was to use the method in a "shot gun" technique to resolve the complexity of a xenogeneic immunization and to discover new cell surface molecules 31. The principles are shown in Fig. 4. The method was investigated in a mouse anti-rat thymocyte fusion and five new antigens that all marked subsets of lymphoid cells were discovered including the CD4 (W3/25) and CD43 (W3/13) antigens 31. Binding serology as in Fig. 3 was essential for the effective use of the mAb approach since a single antibody against one epitope is not usually effective in a cytotoxicity assay. The xenogeneic approach was applicable to human cells and mAbs to human antigens were soon produced including antibodies to MHC Class I antigen 32, CD1 33, CD3, CD4 and CD8 34 in the earliest studies. The use of mAbs was particularly effective in combination with flow cytometry since background binding was almost non-existent and flow cytometry became increasingly used to characterize new mAbs. In studies on human antigens the use of flow cytometry labelling became the basis for grouping antigens in "Clusters of Differentiation" or CD antigens 35. Those mAbs that gave the same patterns of labelling on different cell types were considered to label the same antigen and the naming of antigens was formalized in these workshops on human leucocytes 36. This systematic approach has been of great benefit in the field since the CD names have allowed a common nomenclature not only for human antigens but also for the homologous antigens of other species. The CD groupings have mostly identified unique antigens and a CD antigen is validated when it is cloned and sequenced at the cDNA level and expressed in soluble or cell-bound form to provide a reagent to check that relevant mAbs all react with the same gene product.
[Figure 4: schematic. A mouse is immunized with cells bearing antigens A, B and C; the serum (anti-A, anti-B, anti-C) is too complex; spleen cells are fused with a mouse myeloma cell line to give hybrid cell lines, e.g. one producing anti-A mAb.]
Figure 4 Monoclonal antibodies for the identification of cell surface proteins.
As well as identifying antigens, mAbs have been of great use in analysing cellular and molecular functions. It was found that the W3/25 (CD4) mAb could be used to separate functional subpopulations of T cells 37 and would block the mixed lymphocyte response 38 and an autoimmune disease in vivo 39. MAbs that could activate responses were seen in the activation of T cells with CD3 mAbs 40. Inhibition or stimulation of cell responses have subsequently been seen in many other cases and while the meaning of such results can sometimes be difficult to interpret, the use of mAbs in functional studies has been of major importance.
MOLECULAR ANALYSIS - AMINO ACID SEQUENCING
Molecular analysis of leucocyte antigens began with the purification of the papain fragment of mouse MHC Class I antigen 41 and the demonstration that radiolabelled antigens could be immunoprecipitated from detergent extracts 42. Biochemical purification of leucocyte membrane molecules was difficult but was achieved for Thy-1 antigen by use of lentil lectin affinity chromatography and gel filtration. Antibody affinity chromatography using polyclonal antisera was used and elution at high pH in deoxycholate was introduced 43. The only leucocyte antigens to be entirely sequenced at the protein level were Ig 44,45, β2-microglobulin 46, MHC Class I 47 and II 48 and Thy-1 49.
MOLECULAR ANALYSIS - NUCLEIC ACID SEQUENCING OF LEUCOCYTE GENES
Apart from the antigens described in the paragraph above, all other leucocyte antigens have been sequenced first at the DNA level and this has been achieved in a variety of ways.
1 MAb affinity columns were found to be very effective for antigen purification 50,51 and from the pure protein, peptide sequence could be obtained to allow the prediction of a mixture of oligonucleotides that would encode the sequence in question. The oligonucleotide mixture could be synthesized and then used to screen a cDNA library and isolate the relevant clone 52-55. In some recent cases the mixture of oligonucleotides has been used in the polymerase chain reaction to amplify cDNA. As far as we are aware, the primary cloning of the following antigens was achieved in this way: CD2, CD3~, CD3~, CD5, CD8β, CD9, CD10, CD11a, CD11b, CD11c, CD18, CD21, CD23, CD25, CD32, CD35, CD43, CD45, CD46, CD47, CD49a, CD49f, CD50, CD52, CD54, CD55, CD58, CD61, CD62L, CD62P, CD69, CD87, CD103, CD107a, CD120a, CD132, CD151, CD158, aminopeptidase A, BP1/6C3, DEC-205, DNAM-1, E-selectin ligand, FcεRIα, FcεRIβ, FcεRIγ, FLT3 ligand, GlyCAM-1, L1, LPAP, Ly-6, Ly-49, M130, Mac-2 binding protein, mannose receptor, OX2, PC-1, scavenger receptor, sialoadhesin, TCR~.
2 DNA was transfected into recipient cells and expression of antigen at the cell surface assayed. This was followed by re-isolation of the transfected DNA which was used for screening cDNA or genomic libraries 56,57. With this procedure the following antigens were cloned: CD4, CD8α, CD33, CD63, CD98.
3 Bacterial expression systems were used in which cDNA was cloned into λ phage and screened with antibodies 58. This gave rise to cloning of: CD2, CD3~, CD3?, CD9, CD13, CD26, CD29, CD30, CD31, CD41, CD42a, CD42bα, CD42bβ, CD44, CD48, CD49b, CD49c, CD49d, CD49e, CD51, CD96, CD99, CD105, CD119, CD130, CD147, CD166, B-G, gp42, F4/80, RT6.
4 Differential subtraction of cDNA was used to isolate clones specific for a certain cell type 59. This gave rise to the cloning of: CD20, CD22, CD45, CD62L, CD79a, CD79b, CD82, CD83, CD124, CDw137, CD152, HTm4, LAG-3, MS2, NKG2, PD1, TCRβ, TCRγ.
5 Crosshybridization using probes for related proteins was used 60. This gave rise to cloning of: CD24, CD66, CD88, CD91, CD117, CD128b, CD135, CD158, ltk, mannose R, MARCO, macrophage lectin, MDR1, M-CSFR (CD115), thrombopoietin receptor.
6 Methods were used such as "hybrid arrest translation", involving the purification of mRNA by antibody selection of mRNA on membrane-associated polysomes or testing of cDNA clones for their ability to bind to specific mRNA. The mRNA was assayed by translation in vitro or in Xenopus oocytes 61,62. This gave rise to cloning of: CD1, CD56, CD71, CD74.
7 A major impetus to cloning of leucocyte antigens was provided by the expression cloning system devised by Seed in which cDNA is cloned into a bacterial vector that will also replicate and give expression in eukaryotic cells. The cDNA library is transfected into COS cells and those cells expressing the required antigens are selected using antibody in a panning or similar technique. The plasmids are recovered from the positive COS cells, amplified in bacteria and the cycle is repeated until a single clone giving expression of the antigen is isolated 63-65. In some cases ligands such as cytokines were used instead of antibodies (e.g. ref. 66). This method has been highly effective and has resulted in the primary cloning of the following antigens: CD2, CD6, CD7, CD14, CD16, CD19, CD22, CD24, CD27, CD28, CD31, CD33, CD34, CD36, CD37, CD38, CD40, CD44, CD50, CD52, CD53, CD54, CD58, CD59, CD62L, CD64, CD66, CD68, CD69, CD70, CD72, CD80, CD81, CD85, CD86, CD94, CD95, CD100, CD101, CD120b, CD121a, CD121b, CD122, CDw123, CDw125, CD126, CD127, CDw128, CD129, CDw131, CD134 (OX40), CDw150, CD161, CD162, CD40 ligand, 114/A10, 2B4, 4-1BB ligand, CMRF35 antigen, Fas ligand, FcαR, G-CSFR (CD114), GM-CSFRα (CD116), IL-1R AcP, IL-10R, IL-12Rβ, IL-13Rα, IL-15Rα, IL-17R, Ly-9, Ly-49, OX40 ligand, Sca-2, WC-1.
8 The isolation of related proteins by polymerase chain reaction has been successful for many antigens in general, although for only a few leucocyte surface proteins 67. In some cases the method was combined with oligonucleotides based on peptide data or expression cloning in phage 68. This method gave rise to cloning of: β7 integrin, β4 integrin (CD104), CD11d, CD50, CD91, CD148, IL-11Rα.
9 A strategy based on direct cloning of human transcripts from hybrid cell lines using a PCR strategy together with Alu repeats was successful in cloning the EMR1 antigen which is probably the human homologue of the F4/80 antigen 69. A comparable strategy using cosmids for chromosome 21 to screen a cDNA library was successful for IFNγR AF-1.
10 Many new sequences are being determined by random sequencing of cDNA clones or by genomic sequencing. Many of the former are available in the "expressed sequence tag" or EST databases. This has revealed a number of new genes often related to those already identified. They are not included in the entries in this book where additional data are required such as evidence of expression, tissue distribution and availability of antibodies recognizing the antigens.
One exception to the above methods is the TCRδ which was isolated first at the genomic level by sequencing clones encoding the TCRα V-genes 70. In some cases the same cDNA was cloned independently by more than one group at about the same time and hence appears more than once in the above list. A plot of the progress of sequence determination of cell surface molecules versus time is shown in Fig. 5. Currently the Seed method is the cloning method of choice with the proviso that it has drawbacks for very large molecules and for molecules that cannot be expressed as single chains. The method via protein purification is reliable but it can be tedious to isolate enough pure antigen for sequence determination, particularly if an antigen is present at very low numbers or on a minor cell type. However, techniques of microsequencing have been improving rapidly, including the development of sensitive mass spectrographic methods 71.
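Method 1 above depends on predicting a mixture of oligonucleotides that could encode a stretch of peptide sequence. Because most amino acids are specified by more than one codon, the size of that mixture grows quickly with the choice of peptide. The following Python sketch, given as an illustration under stated assumptions rather than the procedure of the cited studies, counts the oligonucleotides needed for a peptide using the standard genetic code and picks the least degenerate window. The function names and the example peptide are hypothetical.

```python
# Number of codons specifying each amino acid in the standard genetic code.
CODON_COUNT = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def degeneracy(peptide):
    """Number of distinct oligonucleotides that can encode the peptide."""
    total = 1
    for aa in peptide:
        total *= CODON_COUNT[aa]
    return total


def best_window(peptide, length=6):
    """Return (degeneracy, start, window) for the least degenerate stretch.

    Ties are broken in favour of the earliest start position.
    """
    if len(peptide) < length:
        raise ValueError("peptide shorter than requested window")
    candidates = [
        (degeneracy(peptide[i:i + length]), i, peptide[i:i + length])
        for i in range(len(peptide) - length + 1)
    ]
    return min(candidates)


if __name__ == "__main__":
    # Hypothetical peptide; Met and Trp (one codon each) keep the mixture small.
    print(best_window("LSRWMDKFYLS", length=6))
```

Stretches rich in Met and Trp and poor in Leu, Ser and Arg give the smallest mixtures, which is why such regions are favoured when peptide sequence is chosen for probe design.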
[Figure 5: plot of the number of antigens against the date of primary sequencing or cloning; series labelled "Seed's method" and "cDNA cloning".]
Figure 5 Time course for the primary cloning of leucocyte antigens. The number of new leucocyte antigens cloned each year is indicated.
AMINO ACID SEQUENCES AND THE SUPERFAMILY CONCEPT
The amino acid sequences of most leucocyte surface proteins contain regions of sequence that have similarities to regions present in other proteins. These are termed superfamilies 72 and it was predicted that these sequences were derived by divergent evolution from common precursors and would have similar structures and types of functions. The first superfamily of leucocyte surface proteins to be defined was the immunoglobulin superfamily (IgSF) 73 and this is now one of the largest with more than 100 different polypeptides expressed by a variety of cell types 74. In protein superfamilies there is often only 15-25% sequence identity and these initial assignments were controversial 74. However, the determination of structures of several of these proteins such as CD2 and CD4 (see below) confirmed the predictions. One important aspect of the superfamilies is that they often correspond to domains. The conserved residues tend to relate to important structural features that are characteristic of the superfamily in question, as shown for IgSF domains where these residues are clustered mainly in regions corresponding to the inpointing residues of β strands of the Ig-fold with a subset of these forming the core of the fold 75. In contrast, the regions corresponding to the loops at the ends of the strands mostly show greater sequence diversity. Several different superfamilies have been identified among leucocyte surface molecules and these are described in Chapter 3 together with their relationship to protein domains.
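The figure of 15-25% sequence identity quoted above refers to the fraction of aligned positions at which two sequences carry the same residue. A minimal sketch of that calculation is given below, assuming the two sequences have already been aligned and that positions with a gap in either sequence are excluded; other conventions for the denominator exist, and the function name and example strings are hypothetical.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percentage of identical residues over aligned, ungapped positions."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue  # skip positions where either sequence has a gap
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped positions to compare")
    return 100.0 * identical / compared


if __name__ == "__main__":
    # Hypothetical pre-aligned fragments, for illustration only.
    print(percent_identity("ACDE-FGH", "ACDFKFGW"))
```

At identities this low, the positions that do match matter more than the overall percentage, which is why the conserved structural residues described above carry most of the weight in assigning a sequence to a superfamily.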
MOLECULAR ANALYSIS - PROTEIN STRUCTURE
The presence of a transmembrane or GPI anchor in membrane proteins hinders the preparation of soluble protein at the high concentrations necessary for techniques of structure determination such as NMR and X-ray crystallography. NMR is not suitable for large proteins but the structures of several single domains have been determined. The first structural information came from the X-ray crystallography of membrane proteins that also occurred naturally as soluble forms, namely immunoglobulin light chain dimers (Bence Jones proteins) 76 and β2-microglobulin from urine 77; later fragments of immunoglobulins were prepared by limited proteolysis and their structure determined 76. Subsequently the structures of the intra- and extracellular parts of membrane proteins have been determined separately. The first structure determined was the extracellular region of MHC Class I antigen which could be cleaved by papain to give a soluble fragment 78. A key to this success was the ability to select cell lines that expressed the antigen at high levels and the good efficiency of the papain cleavage. This approach in combination with recombinant expression in insect cells was successful for MHC Class II 79. The structure of the Ly-6SF domain of CD59 was determined by NMR from native soluble fragments found in urine 80. With the introduction of recombinant DNA techniques and a variety of different expression systems many structures of domains of leucocyte surface antigens have been determined. NMR structures were obtained by expressing single domains in E. coli, e.g. rat CD2d1 81, in transfected cell lines such as Chinese hamster ovary (CHO) cell lines, e.g. human CD2d1 82 and CD59 83, and in yeast expression systems such as Pichia, e.g. NCAM (CD56) domain 1 84. All these systems have been used for X-ray crystallography studies, e.g. CHO cells for CD4d1+2 85,86, CD4d3+4 87, rat CD2 88, human CD2 89, CD8α 90 and CD62Ed1+2 91; E. coli for integrin I-domain 92, T cell receptor α chain 93, TNFR 94 and VCAM-1d1 95; a myeloma expression system for T cell receptor β chain 96; and yeast (Pichia pastoris) for CD40L 97. In addition, crystal structures for the T cell receptor and complexes of TCR with MHC Class I antigens have recently been determined using expression in insect cells 98 and in E. coli 99. Structures for nearly all the domain types found on the extracellular side of leucocytes have now been determined 100. Many antigens contain large numbers of domains in a single polypeptide and the complete polypeptides are less amenable to structural analysis than single domains, probably because of flexibility between some of the domains. In most of the cases above the structures are for only one or two domains unless the polypeptides form a larger complex, e.g. the TNFR contains four domains but the structure determined was of a complex of the trimeric TNF and three TNFR chains 94. However, recently the structure of a stretch of four Fn3 domains from fibronectin has been determined 101. Electron microscopy has been valuable in examining the overall topology of domains of larger proteins and this is discussed further in Chapter 4.
References
1 Bretscher, M.S. (1971) Major human erythrocyte glycoprotein spans the cell membrane. Nature New Biol. 231, 229-232.
2 Tomita, M. and Marchesi, V.T. (1975) Amino-acid sequence and oligosaccharide attachment sites of human erythrocyte glycophorin. Proc. Natl Acad. Sci. USA 72, 2964-2968.
3 Singer, S.J. and Nicolson, G.L. (1972) The fluid mosaic model of the structure of cell membranes. Science 175, 720-731.
4 Frye, L.D. and Edidin, M. (1970) The rapid intermixing of cell surface antigens after formation of mouse-human heterokaryons. J. Cell. Sci. 7, 319-335.
5 Taylor, R.B. et al. (1971) Redistribution and pinocytosis of lymphocyte surface immunoglobulin induced by anti-immunoglobulin antibodies. Nature New Biol. 233, 225-229.
6 Helenius, A. and Simons, K. (1975) Solubilization of membranes by detergents. Biochim. Biophys. Acta 415, 29-79.
7 Axen, R. et al. (1967) Chemical coupling of peptides and proteins to polysaccharides by means of cyanogen halides. Nature 214, 1302-1304.
8 Arvieux, J. and Williams, A.F. (1988) Antibodies, a Practical Approach, Catty, D., ed. IRL Press, Oxford, pp. 113-136.
9 Weber, K. and Osborn, M. (1969) The reliability of molecular weight determinations by dodecyl sulfate-polyacrylamide gel electrophoresis. J. Biol. Chem. 244, 4406-4412.
10 Muirhead, M.L. et al. (1974) Preliminary characterization of Thy-1.1 and Ag-B antigens from rat tissues solubilized in detergents. Biochem. J. 143, 51-61.
11 Allan, D. et al. (1972) Glycoprotein receptors for concanavalin A isolated from pig lymphocyte plasma membrane by affinity chromatography in sodium deoxycholate. Nature New Biol. 236, 23-25.
12 Klein, J. (1975) Biology of the Mouse Histocompatibility-2 Complex. Springer-Verlag, Berlin.
13 Snell, G.D. (1981) Studies in histocompatibility. Science 213, 172-178.
14 Gorer, P.A. and O'Gorman, P. (1956) The cytotoxic activity of isoantibodies in mice. Transplant. Bull. 3, 142-143.
15 Boyse, E.A. and Old, L.J. (1969) Some aspects of normal and abnormal cell surface genetics. Annu. Rev. Genet. 3, 269-290.
16 Reif, A.E. and Allen, J.M.V. (1964) The AKR thymic antigen and its distribution in leukemias and nervous tissue. J. Exp. Med. 120, 413-433.
17 Lubaroff, D.M. (1973) An alloantigenic marker on rat thymus and thymus-derived cells. Transplant. Proc. 5, 115-118.
18 Fabre, J.W. and Morris, P.J. (1974) The definition of a lymphocyte-specific alloantigen system in the rat (Ly-1). Tissue Antigens 4, 238-246.
19 Raff, M.C. (1971) Surface antigenic markers for distinguishing T and B lymphocytes in mice. Transplant. Rev. 6, 52-80.
20 Kisielow, P. et al. (1975) Ly antigens as markers for functionally distinct subpopulations of thymus-derived lymphocytes of the mouse. Nature 253, 219-220.
21 Hammerling, U. and Eggers, H.J. (1970) Quantitative measurement of uptake of alloantibody on mouse lymphocytes. Eur. J. Biochem. 17, 95-99.
22 Nossal, G.J.V. and Lewis, H. (1972) Variation in accessible cell surface immunoglobulin among antibody-forming cells. J. Exp. Med. 135, 1416-1422.
23 Jensenius, J.C. and Williams, A.F. (1974) The binding of anti-immunoglobulin antibodies to rat thymocytes and thoracic duct lymphocytes. Eur. J. Immunol. 4, 91-97.
24 Dorval, G. et al. (1975) A radioimmunoassay of cellular surface antigens on living cells using iodinated soluble protein A from Staphylococcus aureus. J. Immunol. Methods 7, 237-230.
25 Williams, A.F. (1973) Assays for cellular antigens in the presence of detergents. Eur. J. Immunol. 3, 628-632.
26 Bonner, W.A. et al. (1972) Fluorescence activated cell sorting. Rev. Sci. Instrum. 43, 404-409.
27 Towbin, H. et al. (1979) Electrophoretic transfer of proteins from polyacrylamide gels to nitrocellulose sheets: procedure and some applications. Proc. Natl Acad. Sci. USA 76, 4350-4354.
28 Williams, A.F. (1977) Differentiation antigens of the lymphocyte cell surface. Contemp. Top. Mol. Immunol. 6, 83-116.
29 Morris, R.J. and Williams, A.F. (1975) Antigens on mouse and rat lymphocytes recognized by rabbit antiserum against rat brain: the quantitative analysis of a xenogeneic antiserum. Eur. J. Immunol. 5, 274-281.
30 Kohler, G. and Milstein, C. (1975) Continuous cultures of fused cells secreting antibody of predefined specificity. Nature 256, 495-497.
31 Williams, A.F. et al. (1977) Analysis of cell surfaces by xenogeneic myeloma-hybrid antibodies: differentiation antigens of rat lymphocytes. Cell 12, 663-673.
32 Barnstable, C.J. et al. (1978) Production of monoclonal antibodies to group A erythrocytes, HLA and other human cell surface antigens - new tools for genetic analysis. Cell 14, 9-20.
33 McMichael, A.J. et al. (1979) A human thymocyte antigen defined by a hybrid myeloma monoclonal antibody. Eur. J. Immunol. 9, 205-210.
34 Terhorst, C. et al. (1980) Biochemical analysis of human T lymphocyte differentiation antigens T4 and T5. Science 209, 520-521.
35 Bernard, A. et al. (1984) Leucocyte Typing. Springer-Verlag, Berlin, pp. 1-814.
36 Knapp, W. et al. (1989) Leucocyte Typing IV. Oxford University Press, Oxford, pp. 1-1182.
37 White, R.A.H. et al. (1978) T-lymphocyte heterogeneity in the rat: separation of functional subpopulations using a monoclonal antibody. J. Exp. Med. 148, 664-673.
38 Webb, M. et al. (1979) Inhibition of mixed lymphocyte response by monoclonal antibody specific for a rat T lymphocyte subset. Nature 282, 841-843.
39 Brostoff, S.W. and Mason, D.W. (1984) Experimental allergic encephalomyelitis: successful treatment in vivo with a monoclonal antibody that recognises T helper cells. J. Immunol. 133, 1938-1942.
40 Van Wauwe, J.P. et al. (1980) OKT3: a monoclonal anti-human T lymphocyte antibody with potent mitogenic properties. J. Immunol. 124, 2708-2713.
41 Shimada, A. and Nathenson, S.G. (1969) Murine histocompatibility-2 (H-2) alloantigens. Purification and some chemical properties of soluble products from H-2b and H-2d genotypes released by papain digestion of membrane fractions. Biochemistry 8, 4048-4062.
42 Schwartz, B.D. et al. (1973) H-2 histocompatibility alloantigens. Some biochemical properties of the molecules solubilized by NP-40 detergent. Biochemistry 12, 2157-2164.
43 Letarte-Muirhead, M. et al. (1975) Purification of the Thy-1 molecule, a major cell-surface glycoprotein of rat thymocytes. Biochem. J. 151, 685-697.
44 Hill, R.L. et al. (1966) The evolutionary origins of the immunoglobulins. Proc. Natl Acad. Sci. USA 56, 1762-1769.
45 Hilschmann, N. and Craig, L.C. (1965) Amino acid sequence studies with Bence Jones proteins. Proc. Natl Acad. Sci. USA 59, 613-619.
46 Cunningham, B.A. et al. (1973) The complete amino acid sequence of beta 2-microglobulin. Biochemistry 12, 4811-4822.
47 Orr, H.T. et al. (1979) The heavy chain of human histocompatibility antigen HLA-B7 contains an immunoglobulin-like region. Nature 282, 266-270.
48 Gotz, H. et al. (1983) Primary structure of human class II histocompatibility antigens 3rd communication. Amino acid sequence comparison between DR and DC subclass antigens derived from a lymphoblastoid B cell line homozygous at the HLA loci (HLA-A3,3; B7,7; Dw2,2; DR2,2: MT1,1; Dc1,1: MB1,1). Hoppe Seylers Z. Physiol. Chem. 364, 749-755.
49 Campbell, D.G. et al. (1981) Rat brain Thy-1 glycoprotein. The amino acid sequence, disulfide bonds and an unusual hydrophobic region. Biochem. J. 195, 15-30.
50 Sunderland, C.A. et al. (1979) Purification with monoclonal antibody of a predominant leukocyte-common antigen and glycoprotein from rat thymocytes. Eur. J. Immunol. 9, 155-159.
51 Parham, P. (1979) Purification of immunologically active HLA-A and -B antigens by a series of monoclonal antibody columns. J. Biol. Chem. 254, 8709-8712.
52 Wallace, R.B. et al. (1981) The use of synthetic oligonucleotides as hybridization probes. II. Hybridization of oligonucleotides of mixed sequence to rabbit beta-globin DNA. Nucleic Acids Res. 9, 879-894.
53 Stetler, D. et al. (1982) Isolation of a cDNA clone for the human HLA-DR antigen alpha chain by using a synthetic oligonucleotide as a hybridization probe. Proc. Natl Acad. Sci. USA 79, 5966-5970.
54 Moriuchi, T. et al. (1983) Thy-1 cDNA sequence suggests a novel regulatory mechanism. Nature 301, 80-82.
55 Cosman, D. et al. (1984) Cloning, sequence and expression of human interleukin-2 receptor. Nature 312, 768-771.
56 Kavathas, P. et al. (1984) Isolation of the gene encoding the human T-lymphocyte differentiation antigen Leu-2 (T8) by gene transfer and cDNA subtraction. Proc. Natl Acad. Sci. USA 81, 7688-7692.
57 Maddon, P.J. et al. (1985) The isolation and nucleotide sequence of a cDNA encoding the T cell surface protein T4: a new member of the immunoglobulin gene family. Cell 42, 93-104.
58 Young, R.A. and Davis, R.W. (1983) Efficient isolation of genes by using antibody probes. Proc. Natl Acad. Sci. USA 80, 1194-1198.
59 Hedrick, S.M. et al. (1984) Isolation of cDNA clones encoding T cell-specific membrane-associated proteins. Nature 308, 149-153.
60 Qiu, F.H. et al. (1988) Primary structure of c-kit: relationship with the CSF-1/PDGF receptor kinase family - oncogenic activation of v-kit involves deletion of extracellular domain and C terminus. EMBO J. 7, 1003-1011.
61 Korman, A.J. et al. (1982) cDNA clones for the heavy chain of HLA-DR antigens obtained after immunopurification of polysomes by monoclonal antibody. Proc. Natl Acad. Sci. USA 79, 1844-1848.
62 Long, E.O. et al. (1982) Isolation of distinct cDNA clones encoding HLA-DR beta chains by use of an expression assay. Proc. Natl Acad. Sci. USA 79, 7465-7469.
63 Seed, B. and Aruffo, A. (1987) Molecular cloning of the CD2 antigen, the T-cell erythrocyte receptor, by a rapid immunoselection procedure. Proc. Natl Acad. Sci. USA 84, 3365-3369.
64 Seed, B. (1987) An LFA-3 cDNA encodes a phospholipid-linked membrane protein homologous to its receptor CD2. Nature 329, 840-842.
65 Aruffo, A. and Seed, B. (1987) Molecular cloning of two CD7 (T-cell leukemia antigen) cDNAs by a COS cell expression system. EMBO J. 6, 3313-3316.
66 Yamasaki, K. et al. (1988) Cloning and expression of the human interleukin-6 (BSF-2/IFN beta 2) receptor. Science 241, 825-828.
67 Vazeux, R. et al. (1992) Cloning and characterization of a new intercellular adhesion molecule ICAM-R. Nature 360, 485-488.
68 Van der Vieren, M. et al. (1995) A novel leukointegrin, αdβ2, binds preferentially to ICAM-3. Immunity 3, 683-690.
69 Corbo, L. et al. (1990) Direct cloning of human transcripts with HnRNA from hybrid cell lines. Science 249, 652-655.
70 Chien, Y.H. et al. (1987) A new T-cell receptor gene located within the alpha locus and expressed early in T-cell differentiation. Nature 327, 677-682.
71 Wilm, M. et al. (1996) Femtomole sequencing of proteins from polyacrylamide gels by nano-electrospray mass spectrometry. Nature 379, 466-469.
72 Dayhoff, M.O. et al. (1983) Establishing homologies in protein sequences. Meth. Enzymol. 91, 524-545.
73 Williams, A.F. and Gagnon, J. (1982) Neuronal cell Thy-1 glycoprotein: homology with immunoglobulin. Science 216, 696-703.
74 Williams, A.F. and Barclay, A.N. (1988) The immunoglobulin superfamily - domains for cell surface recognition. Annu. Rev. Immunol. 6, 381-405.
75 Harpaz, Y. and Chothia, C. (1994) Many of the immunoglobulin superfamily domains in cell-adhesion molecules and surface-receptors belong to a new structural set which is close to that containing variable domains. J. Mol. Biol. 238, 528-539.
76 Amzel, L.M. and Poljak, R.J. (1979) Three-dimensional structure of immunoglobulins. Ann. Rev. Biochem. 48, 961-997.
77 Becker, J.W. and Reeke, G.J. (1985) Three-dimensional structure of beta 2-microglobulin. Proc. Natl Acad. Sci. USA 82, 4225-4229.
78 Bjorkman, P.J. et al. (1987) Structure of the human class I histocompatibility antigen, HLA-A2. Nature 329, 506-512.
79 Brown, J.H. et al. (1993) Three-dimensional structure of the human class II histocompatibility antigen HLA-DR1. Nature 364, 33-39.
80 Fletcher, C.M. et al. (1994) Structure of a soluble, glycosylated form of the human-complement regulatory protein CD59. Structure 2, 185-199.
81 Driscoll, P.C. et al. (1991) Structure of domain 1 of rat T lymphocyte CD2 antigen. Nature 353, 762-765.
82 Wyss, D. et al. (1993) 1H resonance assignments and secondary structure of the 13.6 kDa glycosylated adhesion domain of human CD2. Biochemistry 32, 10995-11006.
83 Kieffer, B. et al. (1994) 3-dimensional solution structure of the extracellular region of the complement regulatory protein CD59, a new cell-surface protein domain related to snake-venom neurotoxins. Biochemistry 33, 4471-4482.
84 Thomsen, N. et al. (1996) The three-dimensional structure of the first domain of neural cell adhesion molecule. Nature Struct. Biol. 3, 581-585.
85 Ryu, S.E. et al. (1990) Crystal structure of an HIV-binding recombinant fragment of human CD4. Nature 348, 419-426.
86 Wang, J. et al. (1990) Atomic structure of a fragment of human CD4 containing two immunoglobulin-like domains. Nature 348, 411-418.
87 Brady, R.L. et al. (1993) Crystal structure of domains 3 and 4 of rat CD4: relationship to the N-terminal domains. Science 260, 979-983.
88 Jones, E.Y. et al. (1992) Crystal structure of a soluble form of the cell adhesion molecule CD2 at 2.8 Å. Nature 360, 232-239.
89 Bodian, D.L. et al. (1994) Crystal structure of the extracellular region of the human cell adhesion molecule CD2 at 2.5 Å resolution. Structure 2, 755-766.
90 Leahy, D.J. et al. (1992) Crystal structure of a soluble form of the human T cell coreceptor CD8 at 2.6 Å resolution. Cell 68, 1145-1162.
91 Graves, B.J. et al. (1994) Insight into E-selectin/ligand interaction from the crystal structure and mutagenesis of the lec/EGF domains. Nature 367, 532-538.
92 Qu, A. and Leahy, D.J. (1995) Crystal structure of the I-domain from the CD11a/CD18 (LFA-1, alpha(L)beta2) integrin. Proc. Natl Acad. Sci. USA 92, 10277-10281.
93 Fields, B.A. et al. (1995) Crystal structure of the V(alpha) domain of a T cell antigen receptor. Science 270, 1821-1824.
94 Banner, D.W. et al. (1993) Crystal structure of the soluble human 55kd TNF receptor-human TNFβ complex: implications for TNF receptor activation. Cell 73, 431-445.
95 Jones, E.Y. et al. (1995) Crystal structure of an integrin-binding fragment of vascular cell adhesion molecule-1 at 1.8 Å resolution. Nature 373, 539-544.
96 Bentley, G.A. et al. (1995) Crystal structure of the beta chain of a T cell antigen receptor. Science 267, 1984-1987.
97 Karpusas, M. et al. (1995) 2 Å crystal structure of an extracellular fragment of human CD40 ligand. Structure 3, 1031-1039.
98 Garcia, K. et al. (1996) An αβ T cell receptor structure at 2.5 Å and its orientation in the TCR-MHC complex. Science 274, 209-219.
99 Garboczi, D. et al. (1996) Structure of the complex between T-cell receptor, viral peptide and HLA-A2. Nature 384, 134-141.
100 Bork, P. et al. (1996) Structure and distribution of modules in extracellular proteins. Q. Rev. Biophys. 29, 119-167.
101 Leahy, D.J. et al. (1996) 2.0 Å crystal structure of a four-domain segment of human fibronectin encompassing the RGD loop and synergy region. Cell 84, 155-164.
CONCEPTS CONCERNING PROTEIN SUPERFAMILIES

Introduction

The amino acid sequences of most leucocyte surface proteins contain regions that are similar to regions present in other proteins; groups of sequences related in this way are termed superfamilies (see Chapter 2). It has now been established that these regions often correspond to structural units, called domains or modules (see below), and the general sequence term "superfamily" is often dropped in favour of the predicted structural terms "domains" and "modules". This chapter describes the methods for the identification of superfamilies and shows alignments of some sequences to illustrate the key residues that are often conserved in these superfamilies, together with examples of their structures.
Nomenclature for superfamilies, protein domains, modules, repeats and motifs

The usage of terms such as superfamily, domain, repeat and motif is not hard and fast. In this book we have tried to conform to the most commonly used names and in some cases to introduce abbreviations that might be useful. The domains and repeats discussed in this chapter include only those present on leucocyte surface molecules. Additional domains have been described for secreted molecules and for surface molecules of other cell types. These include cadherin, Fn type I, kringle, perforin, serine protease and thrombospondin domains 1-3. Recently, a single example of a semaphorin domain has been reported in CD100, and a cytokine domain has been reported on a transmembrane protein (FLT3 ligand). Details of these domain types have not been included but are reviewed in refs 4 and 5.

Domain
The term "domain" is used where it is likely that a segment of sequence forms a discrete structural unit, i.e. a peptide sequence whose three-dimensional conformation is not determined by other parts of the total protein sequence but is "self-contained" (discussed in ref. 6). Although at one time the term was used exclusively for regions of known structure, it is now used where this seems likely by analogy to other structures. Three criteria are used here.

First, proof of a domain structure comes from tertiary structure determination and domains established at this level include: IgSF, complement control protein (CCP), cytokineR, EGF, fibronectin type III (Fn3), integrin I domain, lectin C-type, galectin, link, LDLR, LRR and Ly-6. The MHC domain has also been revealed by X-ray crystallography in MHC Class I and II antigens. It might be argued that this should not be referred to as a domain, since it is not clear that it will be found as an isolated unit rather than appearing always as a structural pair, as for the α1 and α2 domains. However we will follow the precedent in the field and refer to these segments as domains. TNF and CD40L are members of an increasingly large superfamily whose structures show that they usually exist as trimers. They have not been found as multiples or in association with other superfamily types, but the folds for TNF and all the above domains are illustrated later in this chapter and are discussed further in the commentary on each superfamily.

Secondly, a domain structure can also be argued for any superfamily segments that occur as the sole component of an extracellular sequence, or as a sequence (or sequences) within proteins that is contiguous with hinge-like regions of sequence containing a high content of Ala, Gly, Pro, Ser and Thr residues. Most Ly-6 domains fall into this category and a structure now confirms this 7,8.

A third criterion for defining a sequence as a domain is that the superfamily segments are found in the genome in single exons that can be readily spliced with other exons to form a new gene with an open reading frame. Proteins containing a variety of structural domains could then arise by recombination. The scavenger receptor cysteine-rich domain provides an example based on the last two criteria.
Module

The term "module" is used for the subset of domains that are often used as building blocks in functionally diverse proteins. In the majority of extracellular proteins the modules correlate with exons with phases suitable for "exon shuffling" (see below and ref. 9). However other methods of shuffling are possible and the term module is no longer restricted to domains with compatible exon boundaries 10.
Repeat

In other cases it is not clear that a superfamily segment is an independent structural unit, and for these the term "repeat" is used. Examples are the leucine-rich repeats (LRR), which are usually present only as multiples forming a larger structure 11,12, and the short repeated segments of sequence found in individual proteins such as CD43 and P-selectin glycoprotein ligand 1 (CD162).
Motif

The term "motif" is used to describe a sequence pattern that is smaller than would be expected to form a folded structural unit. Thus the patterns of signal sequences for protein secretion and GPI attachment (see Chapter 2) would be considered as motifs, albeit of rather ill-defined character in terms of sequence identities. A good example of a motif is the conserved sequence pattern found in the cytoplasmic domains of the CD3, CD79a and CD79b antigens and other molecules of signal transduction complexes, now called the ITAM or immunoreceptor tyrosine-based activation motif. Alignments identifying this motif are shown in Fig. 37.
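Because a motif is a short pattern rather than a folded unit, it can be searched for with a simple regular expression. The sketch below is illustrative only: the text does not give the ITAM consensus, so the commonly quoted form Y-x-x-L/I, a spacer of 6-8 residues, then Y-x-x-L/I is assumed here, and the test sequence is invented rather than taken from any real protein.

```python
import re

# Assumed ITAM consensus (not stated in the text):
# Y-x-x-[L/I], 6 to 8 residues of any kind, Y-x-x-[L/I].
ITAM = re.compile(r"Y..[LI].{6,8}Y..[LI]")

def find_itams(seq):
    """Return (start_index, matched_text) for each non-overlapping ITAM-like match."""
    return [(m.start(), m.group()) for m in ITAM.finditer(seq)]

# Hypothetical cytoplasmic tail, made up for illustration only.
tail = "AAGSEDTYEGLNAPKDSQTYSTIKKAA"
print(find_itams(tail))
```

A match of this kind is only a candidate: as noted above for signal sequences, short patterns are ill-defined and are expected to occur by chance, so alignments such as those in Fig. 37 remain the real evidence.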
Superfamily

The term "superfamily" was widely used to describe sequence similarities predicted to give rise to similar structures. As discussed in Chapter 2, this concept has been proven in many cases by structure determination and it is now more common to assume the prediction and to talk about domains or modules. However the term "superfamily" is useful in discussing sequence similarities in general and also when one wants to distinguish between, for example, an Ig domain, meaning a domain in immunoglobulins, and an Ig-like domain found in many leucocyte proteins. We use the common name IgSF for the latter. We also use TNFSF and TNFRSF to prevent confusion between TNF and TNFR and related proteins.

Finally, the term superfamily is often used to describe a protein, e.g. "Thy-1 is a member of the IgSF". When IgSF domains were first described, there were very few cases in which they were found together with other superfamily domains (see ref. 13). Since then many examples have been found with two or even more superfamilies (see Table 1 in Chapter 1) and these are often termed "mosaic" proteins. It is confusing to name these mosaic proteins as members of a superfamily and it is more useful to discuss the proteins in terms of their content of domains, e.g. CD62L contains one C-type lectin, one EGF and two CCP domains. In other cases there is no ambiguity, for example phosphotyrosine phosphatase, where the name signifies a family rather than a single example.

The term superfamily is also used to describe proteins with clear sequence similarity but no clear domain or module structure. These include the proteins with multiple membrane-spanning regions such as the TM4SF. Integrins are also likely to have domains but, apart from the I-domain discussed below, integrins tend not to show sequence similarities with other proteins and the full extracellular sequences are discussed as a single superfamily.
Identifying domains and repeats: testing the significance of relationships

The main difficulty in identifying superfamily domains and repeats is their low level of amino acid sequence identity. Many methods have been developed to analyse the data from the various large-scale sequencing projects; these are reviewed in detail in refs 14-17 and only an outline is given here. A first step is to use a database searching program such as FASTA 17,18 or BLAST 19. In many cases these programs will pick up some of the members of a superfamily that the new protein sequence matches. However, no search program picks up all superfamily members and it is not uncommon for a relationship to be entirely missed. A second approach is to look by eye, or to apply various computer programs 20. A third method is to make a consensus sequence for a particular domain type; for example, the PROSITE database contains a large compilation of patterns and sites found in protein sequences (see ref. 21 and see also SWISSPROT 22 and Chapter 1). These patterns can then be used to search a novel sequence for the presence of domains or to search the databases. With the wealth of sequence data these tools are becoming more reliable.

Problems arise in aligning sequences if there is variability in the lengths of sequences and also if there is only a low level of identity. This is the case for IgSF domains, although there are small patches of characteristic sequence patterns. For example, in the IgSF one would look for Cys residues with the patterns L/I/VXL/IXC and DXGXYXC for candidate regions that might occur around the conserved disulfide bond. In relation to these there should be other patches, for example V/L/YXW corresponding to β strand C. If the various conserved patterns fall into place then a possible domain has been identified. The candidate domain can be defined in relation to conserved sequence positions and then tested for statistical significance. For example in the IgSF, the positions of the conserved Cys residues, or equivalent residues if the domain lacks the typical disulfide bond, are nominated and the domain is defined as beginning 20 residues before the first of these positions and ending 20 residues after the second. This proposed domain is then tested for the statistical significance of sequence similarities against a set of domains that are accepted to be in the IgSF. For other superfamilies, other conserved residues would be chosen and the domain defined in relation to these. Possible key conserved residues are shown on the diagrams in this chapter and the designated residues are used to identify the domains in the entries for the molecules.
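The pattern-spotting step described above can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used for the entries in this book. The three regular expressions are direct transcriptions of the patterns quoted in the text (X = any residue); the assignment of the two Cys patterns to the first and second conserved Cys, the maximum Cys-to-Cys spacing (`max_span`) and the test sequence are assumptions introduced for the example.

```python
import re

# Patterns quoted in the text, written as regular expressions (X = any residue).
CYS1 = re.compile(r"[LIV].[LI].C")  # L/I/V-X-L/I-X-C, assumed to mark the first conserved Cys
TRP = re.compile(r"[VLY].W")        # V/L/Y-X-W, beta strand C
CYS2 = re.compile(r"D.G.Y.C")       # D-X-G-X-Y-X-C, assumed to mark the second conserved Cys

def candidate_domains(seq, flank=20, max_span=120):
    """Return (start, end, cys1, cys2) for each candidate IgSF-like region.

    start/end delimit a slice seq[start:end] running from `flank` residues
    before the first Cys to `flank` residues after the second, clipped to
    the ends of the sequence. max_span is an arbitrary illustrative cut-off.
    """
    hits = []
    for m1 in CYS1.finditer(seq):
        c1 = m1.end() - 1                  # index of the first conserved Cys
        w = TRP.search(seq, c1 + 1)        # strand C patch must follow it
        if w is None:
            continue
        m2 = CYS2.search(seq, w.end())     # second Cys pattern must follow strand C
        if m2 is None:
            continue
        c2 = m2.end() - 1                  # index of the second conserved Cys
        if c2 - c1 > max_span:
            continue
        hits.append((max(0, c1 - flank), min(len(seq), c2 + flank + 1), c1, c2))
    return hits

# Hypothetical sequence built only to contain the three patterns in order.
demo = "G" * 25 + "VTLTC" + "A" * 8 + "YSW" + "A" * 40 + "DSGTYTC" + "G" * 25
print(candidate_domains(demo))
```

A hit from such a scan is only a candidate: many real IgSF domains lack one or more of these patches, and the statistical test against accepted family members is still needed.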
In testing for statistical significance of a superfamily relationship it could be argued that the conserved pattern for a domain should be defined and the extent to which this occurs in the new sequence should be tested. This works well for many superfamilies (see PROSITE discussed above). However, it can be difficult to define precisely a pattern for use in a statistical analysis since at many positions in the conserved pattern one of a group of alternative amino acids can occur. Moreover it is difficult to know how to treat sequence gaps in defining a pattern. For example in the IgSF there can be very large differences in the length of the middle of the domain and this creates problems in defining a pattern that is characteristic of the IgSF to use in statistical analysis.

An alternative to testing a sequence against a single superfamily sequence pattern is to test it against a set of sequences (e.g. 20 sequences) that are accepted as being members of the superfamily in question. In such an analysis a simple statistical program that compares sequences pairwise for similarity can be used, and the ALIGN program 23 has proved satisfactory for this purpose 13. In these comparisons no account is taken of superfamily patterns. However if a set of good scores is obtained against a family of sequences, then the superfamily pattern must be present since this is the only pattern in common amongst the family of sequences against which the new domain is being tested. This method is discussed in detail in ref. 13.

One feature of both the comparison of sequences using methods like ALIGN and database searches is the requirement for a matrix to assess the scores for the likelihood of amino acid exchange within a protein during evolution. One of the most widely used matrices was the 250 PAMS mutation matrix of Dayhoff 24. However the availability of very many more sequences now makes it possible to construct more accurate matrices such as BLOSUM62, which is shown in comparison with the Dayhoff 250 PAMS matrix in Table 1. The amino acids are grouped according to structure and the full formulae are given in Chapter 1 (Fig. 2), together with the single- and three-letter amino acid codes. The derivation and applications of these matrices are discussed in refs 16, 17, 24 and 25.

The importance of inspection of sequences is illustrated by the example of LAG-3. This protein contains four IgSF domains with similarities to the four domains of CD4. It seems likely that both these proteins arose by gene duplication from the same precursor. However domain 1 of LAG-3 is atypical in that it contains about 30 residues of extra sequence that is predicted to form an extended loop. The patterns of sequence are also compatible with an unusual disulfide bridge between strands B and G rather than B and F 26.
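The family-set test described above can be sketched as follows. This is not the ALIGN program itself: it is a minimal stand-in that assumes a global alignment with a linear gap penalty and expresses each score in standard deviation units above the mean score of shuffled versions of the query, which is one common way of judging pairwise significance. The scoring function is a toy based only on the physicochemical groups of Table 1; in practice a lookup in the 250 PAMS or BLOSUM62 matrix would replace it. All sequences, parameter values and function names are hypothetical.

```python
import random
import statistics

# Physicochemical groups as listed in Table 1.
GROUPS = ["C", "STPAG", "NDEQ", "HRK", "MILV", "FYW"]
GROUP_OF = {aa: i for i, g in enumerate(GROUPS) for aa in g}

def toy_score(x, y):
    """Toy stand-in for a substitution matrix lookup (values are assumptions)."""
    if x == y:
        return 4
    return 1 if GROUP_OF[x] == GROUP_OF[y] else -2

def nw_score(a, b, score=toy_score, gap=-4):
    """Best global alignment score (Needleman-Wunsch, linear gap penalty)."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i, x in enumerate(a, 1):
        cur = [i * gap]
        for j, y in enumerate(b, 1):
            cur.append(max(prev[j - 1] + score(x, y),  # align x with y
                           prev[j] + gap,              # gap in b
                           cur[j - 1] + gap))          # gap in a
        prev = cur
    return prev[-1]

def sd_score(query, ref, n_shuffles=100, seed=0):
    """Real score minus mean shuffled score, in SD units of the shuffled scores."""
    rng = random.Random(seed)
    real = nw_score(query, ref)
    letters = list(query)
    shuffled = []
    for _ in range(n_shuffles):
        rng.shuffle(letters)
        shuffled.append(nw_score("".join(letters), ref))
    sd = statistics.pstdev(shuffled)
    return 0.0 if sd == 0 else (real - statistics.mean(shuffled)) / sd

def family_scores(query, family, **kwargs):
    """Score one candidate domain against every accepted family member."""
    return [sd_score(query, ref, **kwargs) for ref in family]

# Hypothetical sequences, invented for illustration.
family = ["MKVTLSCRASGDTFSNYAIHWVRQAPGKGLEW",
          "MRVSLTCKASGETFTNYGLHWIRQSPGRGLEW"]
print(family_scores("MKVTITCRASGDSFSNYAVHWVKQAPGKGLEW", family))
```

As the text notes, the argument rests on obtaining a set of good scores across the family rather than on any single comparison, so the whole list of values would be examined, not just the best one.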
Domain sequence and structure: divergent and convergent evolution

In the above section, criteria for defining a superfamily have been based on identifying a sequence pattern that is shared in a non-trivial way between sequences of different molecules. It is then argued that the presence of the sequence pattern indicates a relationship in evolution such that the domains that share the sequence pattern derive from one primordial domain. However it could be argued that a certain structure dictates a sequence pattern and the sharing of the pattern is due to convergent evolution from different ancestral molecules rather than divergent evolution from a primordial domain. Conversely it may be found that sequences with no detectable common pattern form similar tertiary structures and thus that these are in the same superfamily even though there is no detectable sequence relationship.
Table 1 The 250 PAMS Mutation Matrix and BLOSUM62 matrix of scores for amino acid substitutions used in database searching and sequence comparisons

Amino acid groups
Sulfhydryl: Cysteine, Cys (C)
Small hydrophilic: Serine, Ser (S); Threonine, Thr (T); Proline, Pro (P); Alanine, Ala (A); Glycine, Gly (G)
Acid, acid amide, hydrophilic: Asparagine, Asn (N); Aspartic acid, Asp (D); Glutamic acid, Glu (E); Glutamine, Gln (Q)
Basic: Histidine, His (H); Arginine, Arg (R); Lysine, Lys (K)
Small hydrophobic: Methionine, Met (M); Isoleucine, Ile (I); Leucine, Leu (L); Valine, Val (V)
Aromatic: Phenylalanine, Phe (F); Tyrosine, Tyr (Y); Tryptophan, Trp (W)

250 PAMS Mutation Matrix
      C   S   T   P   A   G   N   D   E   Q   H   R   K   M   I   L   V   F   Y   W
C    12
S     0   2
T    -2   1   3
P    -3   1   0   6
A    -2   1   1   1   2
G    -3   1   0  -1   1   5
N    -4   1   0  -1   0   0   2
D    -5   0   0  -1   0   1   2   4
E    -5   0   0  -1   0   0   1   3   4
Q    -5  -1  -1   0   0  -1   1   2   2   4
H    -3  -1  -1   0  -1  -2   2   1   1   3   6
R    -4   0  -1   0  -2  -3   0  -1  -1   1   2   6
K    -5   0   0  -1  -1  -2   1   0   0   1   0   3   5
M    -5  -2  -1  -2  -1  -3  -2  -3  -2  -1  -2   0   0   6
I    -2  -1   0  -2  -1  -3  -2  -2  -2  -2  -2  -2  -2   2   5
L    -6  -3  -2  -3  -2  -4  -3  -4  -3  -2  -2  -3  -3   4   2   6
V    -2  -1   0  -1   0  -1  -2  -2  -2  -2  -2  -2  -2   2   4   2   4
F    -4  -3  -3  -5  -4  -5  -4  -6  -5  -5  -2  -4  -5   0   1   2  -1   9
Y     0  -3  -3  -5  -3  -5  -2  -4  -4  -4   0  -4  -4  -2  -1  -1  -2   7  10
W    -8  -2  -5  -6  -6  -7  -4  -7  -7  -5  -3   2  -3  -4  -5  -2  -6   0   0  17

BLOSUM62 matrix
      C   S   T   P   A   G   N   D   E   Q   H   R   K   M   I   L   V   F   Y   W
C     9
S    -1   4
T    -1   1   5
P    -3  -1  -1   7
A     0   1   0  -1   4
G    -3   0  -2  -2   0   6
N    -3   1   0  -2  -2   0   6
D    -3   0  -1  -1  -2  -1   1   6
E    -4   0  -1  -1  -1  -2   0   2   5
Q    -3   0  -1  -1  -1  -2   0   0   2   5
H    -3  -1  -2  -2  -2  -2   1  -1   0   0   8
R    -3  -1  -1  -2  -1  -2   0  -2   0   1   0   5
K    -3   0  -1  -1  -1  -2   0  -1   1   1  -1   2   5
M    -1  -1  -1  -2  -1  -3  -2  -3  -2   0  -2  -1  -1   5
I    -1  -2  -1  -3  -1  -4  -3  -3  -3  -3  -3  -3  -3   1   4
L    -1  -2  -1  -3  -1  -4  -3  -4  -3  -2  -3  -2  -2   2   2   4
V    -1  -2   0  -2   0  -3  -3  -3  -2  -2  -3  -3  -2   1   3   1   4
F    -2  -2  -2  -4  -2  -3  -3  -3  -3  -3  -1  -3  -3   0   0   0  -1   6
Y    -2  -2  -2  -3  -2  -3  -2  -3  -2  -1   2  -2  -2  -1  -1  -1  -1   3   7
W    -2  -3  -2  -4  -3  -2  -4  -4  -3  -2  -2  -3  -3  -1  -3  -2  -3   1   2  11
The amino acids are arranged in groups according to their physicochemical properties. The single-letter and three-letter codes are given in addition to the full names. The upper panel shows the 250 PAMS Mutation Matrix and the lower panel the BLOSUM62 matrix. Data are from refs 24 and 25. The Mutation Matrix is based on the frequency of evolutionary replacements of one amino acid for another at homologous positions between present-day sequences and inferred ancestral sequences. One PAM unit is the unit of evolution represented by the matrix corresponding to one accepted amino acid substitution per 100 residues. This is discussed in detail in ref. 24. The BLOSUM series of matrices are calculated differently in that substitution probabilities are calculated from amino acid pairs in multiple alignments of related sequences. Thus the BLOSUM62 (blocks substitution matrix at 62%) is the log-odds matrix derived from pair counts between sequence segments that are less than 62% identical 16,25.
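The phrase "log-odds matrix derived from pair counts" can be made concrete with a small calculation. The sketch below assumes the usual BLOSUM-style formulation, in which the observed frequency of each unordered residue pair is compared with the frequency expected from the residue composition and the result is expressed in half-bit units (twice the base-2 logarithm, rounded to the nearest integer). The two-letter alphabet and the pair counts are invented for illustration and are not data from refs 24 or 25.

```python
import math
from collections import defaultdict

def log_odds_scores(pair_counts, scale=2.0):
    """Turn counts of unordered aligned pairs into rounded log-odds scores.

    pair_counts maps (a, b) to the number of times residues a and b were
    seen aligned, each unordered pair listed once. All counts must be > 0.
    """
    total = sum(pair_counts.values())
    q = {pair: n / total for pair, n in pair_counts.items()}  # observed pair frequencies

    # Residue frequencies implied by the pairs: a mixed pair contributes half to each.
    p = defaultdict(float)
    for (a, b), f in q.items():
        if a == b:
            p[a] += f
        else:
            p[a] += f / 2
            p[b] += f / 2

    scores = {}
    for (a, b), f in q.items():
        # Expected frequency by chance; mixed pairs can arise in two orders.
        expected = p[a] * p[b] if a == b else 2 * p[a] * p[b]
        scores[(a, b)] = round(scale * math.log2(f / expected))
    return scores

# Invented counts for a two-letter alphabet.
counts = {("A", "A"): 60, ("A", "B"): 20, ("B", "B"): 20}
print(log_odds_scores(counts))
```

Pairs seen more often than expected by chance receive positive scores and pairs seen less often receive negative scores, which is the pattern visible in both panels of Table 1.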
It now seems unlikely that a general structure will dictate a unique sequence pattern. This can be seen from a consideration of sequences that can give rise to domains with the Ig fold. There are now four different sets of sequences found at the surface of cells with no convincing sequence similarity between them but which all give rise to an Ig-like fold. These are the sequences of IgSF, Fn3, cytokineR and cadherin domains 2,3,27-31. There is also enormous diversity of sequences within the IgSF that leads to the argument that there is no unique sequence required to determine any part of the IgSF fold. Thus it seems rather unlikely that convergent evolution yielding the same structure would give rise to any common sequence pattern.

The converse argument is that all the sequences that give the same folding structure have derived by divergent evolution and that all sequences with this structure should be included in the same superfamily. For example, the four sets of sequences referred to above might all be considered as IgSF sequences. It does not seem useful to take this point of view since there may be a relatively limited number of small stable protein folds that can occur and these may have evolved on numerous occasions in evolution. In this case each of the sets of sequences with the Ig fold would have an independent primordial ancestor. Alternatively, there may have been one primordial structure which acquired mutations such that a new solution to the structure was produced, ultimately giving rise to sequences that were not detectably similar to the ancestor family of sequences. At this stage there is no way to estimate the probability of the divergent versus the convergent case for generation of the same structure without recognizable sequence similarity and it seems best to stick to sequence patterns as the criteria for defining superfamilies.
This is sensible from a practical as well as a theoretical standpoint since sequence data are much more readily obtained than tertiary structural data. The superfamilies defined on the basis of sequence would be grouped as subsets within superfamilies based on tertiary structure considerations. Thus it seems better to retain the sequence criterion and to note that certain superfamilies have the same folding patterns in their domains. Given that the same structure can arise from various sequences, the question arises as to why sequence patterns are conserved in evolution. Molecules on the cell surface present unique determinants for interaction with a soluble molecule or with other cell surface receptors. Such interactions require diversity between molecules and not conservation of epitopes. However the sequence patterns shared within a superfamily conserve the fold of the molecule and usually involve residues pointing inwards in the folded structure rather than out-pointing residues that are available for intermolecular interactions. Thus the question arises as to what evolutionary force can operate to preserve the tertiary structure of the molecule. For cell surface molecules it can be argued that the key evolutionary pressure is the requirement for molecular stability and in particular resistance to proteolysis. The small, tightly folded domains that make up most of the leucocyte molecules may have evolved as parts of stable coat proteins on single cell eukaryotes 32. These coat proteins then gave rise to the families of molecules that evolved along with the evolution of multicellular organisms, to mediate cell division and regulation of cell differentiation. Surface molecules are generally resistant to proteolytic enzymes and this resistance is based on the folded structure, since denatured molecules are easily digested. 
One could argue that mutation to give new recognition epitopes would be constrained by the necessity of preserving the folded structure of the domain. In general this led to preservation of certain sequence patterns that determine one particularly stable solution for the fold. Numerous alternative sequence patterns may exist that could also give a stable fold, but to reach these a number of simultaneous mutations may be required and hence a switch to a new pattern may be a rare event in evolution. If a new pattern did form this may have become the founder of a new set of sequences in which the new pattern is retained, again because of the pressure of proteolysis. From this viewpoint it seems likely that the IgSF, Fn3 and cytokineR domains all arose from a common ancestor via sequence shifts as described above. This view might be favoured because these domains are found in molecules with similar functions and often a molecule may contain combinations of IgSF domains and Fn3 and cytokineR domains. In particular, IgSF and Fn3 domains are often found together in a single polypeptide and examples of these are particularly common in nervous tissue.
Genomic structure and evolution of proteins with mixtures of domain types

The number of domains in a cell surface protein can vary greatly. In the Thy-1 antigen (CD90) there is a single IgSF domain making up the whole of the extracellular segment, whilst for the complement receptor 1 protein (CD35) the extracellular region consists of 30 CCP domains in a linear array. In these molecules only one domain type is present but in other proteins there can be a mixture of domain types. For example the L-selectin (CD62L) antigen contains lectin C-type, EGF and CCP domains. The efficient build-up of proteins from individual domains during evolution appears to depend on two aspects of genomic structure. First, there should be an approximate concordance of the domain ends with intron/exon boundaries. Secondly, the position of the intron with respect to the reading frame of a gene should be such that an open reading frame results from the recombination of an exon into the intron of an existing sequence 34. The term "module" is often used for these types of domains 1. Introns that are inserted after the first base of a codon are called phase 1, those after the second base, phase 2, and those between codons, phase 0 35. Analysis of the intron/exon boundaries of domains present on leucocyte surface molecules shows that they are usually of phase 1 type as illustrated in Table 2. Duplication and recombination of such exons may lead to a different combination of exons within a gene but because of the continuity of phase of the intron/exon boundaries (Table 2), a new open reading frame is still formed. The domain does not need to be contained within a single exon to allow shuffling as long as the outermost intron/exon boundaries, corresponding to the ends of domains, are compatible. For instance some IgSF domains are coded for by two exons 13 and the cytokineR domain in CD132 is also coded by two exons.
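The reading-frame argument above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names and the exon lengths are hypothetical, and an exon is described simply by the phase of the intron on its 5' side and its length in bases.

```python
def intron_phase(coding_bases_upstream):
    """Phase of an intron given the number of coding bases that precede it.

    0: the intron lies between codons
    1: the intron follows the first base of a codon
    2: the intron follows the second base of a codon
    """
    return coding_bases_upstream % 3


def exon_phases(five_prime_phase, exon_length):
    """Return the (5', 3') intron phases that flank an exon."""
    three_prime_phase = (five_prime_phase + exon_length) % 3
    return five_prime_phase, three_prime_phase


def can_insert(exon, target_intron_phase):
    """True if inserting the exon into an intron of the given phase keeps
    the downstream reading frame unchanged."""
    five, three = exon
    return five == target_intron_phase and three == target_intron_phase


# Hypothetical single-exon domain of 300 bases bounded by phase 1 introns.
whole_domain = exon_phases(1, 300)   # (1, 1)

# Hypothetical domain split by an internal phase 2 intron into two exons.
first_half = exon_phases(1, 151)     # (1, 2)
second_half = exon_phases(2, 149)    # (2, 1)
```

With these numbers `can_insert(whole_domain, 1)` is true, whereas neither half on its own can be placed in a phase 1 intron without shifting the frame, although the two halves together behave as a 1-1 unit.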
In the CD132 cytokineR domain the internal splice site is phase 2 whilst the external ones are phase 1; thus it is not possible to get half of the domain integrated into a sequence containing phase 1 splice sites, although multiples of the full domain can be duplicated. One of the consequences of the mechanisms of domain duplication and shuffling discussed above is that the domains are unlikely to be found either with insertions or as partial domains. However, the fact that some domains can be found with introns within the coding sequence for a module means that insertions are possible. So far, examples of this are rare, with one example in the neural cell adhesion molecule (NCAM or CD56) where a short exon is alternatively spliced, leading to a sequence called VASE of 10 amino acids being inserted into the fourth IgSF domain 37. A similar mechanism seems likely to be responsible for the large insertion in one domain of the LAG-3 protein although there is no evidence that this is alternatively spliced.

That domains originated from smaller subdomains is possible but difficult to prove. The finding that an IgSF domain of CD2 can form a folding intermediate involving half domains from two polypeptides suggests that the domain may have arisen from a homodimer of half domains 38. Galectin 2 is unusual in that three β strands within one sheet are formed by a contiguous sequence contained within one exon and all residues involved in carbohydrate binding are present in this region. This raises the possibility that this region originated as a functional mini-domain 39.

Table 2 Exon boundaries for domains present in leucocyte surface molecules

Domain or repeat type                  Do the domain boundaries    Splice site   Usual number of
                                       coincide with introns                     exons per domain
                                       with same splice sites?
Complement control protein (CCP)       Yes                         type 1        1
CytokineR                              Yes                         type 1        2
Death domain                           No                          -             -
EGF                                    Yes                         type 1        1
Fibronectin type II                    Yes                         type 1        1
Fibronectin type III                   Yes                         type 1        1
Galectin (lectin S-type)               No                          -             -
IgSF                                   Yes                         type 1        1 or 2
ITAM motif                             No                          -             -
Lectin C-type (e.g. selectins)         Yes                         type 1        1
Lectin C-type (e.g. Kupffer cell R)    No                          -             -
Leucine-rich repeat                    No                          -             -
Link                                   Yes                         type 1        1
LDLRSF                                 Yes                         type 1        1
Ly-6                                   No                          -             -
MHC                                    Yes                         type 1        1
Protein tyrosine phosphatase           No                          -             -
Protein tyrosine kinase                No                          -             -
ScavengerR cysteine-rich               Yes                         type 1        1 a
Somatomedin B                          Yes                         type 1        1
Tumour necrosis factorSF               No                          -             -
Tumour necrosis factorRSF              No                          -             -

In both IgSF and CCP domains there are examples where a domain is encoded by two exons and also where two domains are encoded by one exon. Only limited data are available on some of the domains and it is possible that other examples with different numbers of exons per domain or motif may be found.
a See recent data on CD6 36.

It is notable that the cytoplasmic parts of membrane proteins, or indeed most cytosolic proteins, do not contain modules as defined above. For instance the CD45 antigen contains two domains with sequence similarity to phosphotyrosine phosphatases and one has clear enzymatic activity but the intron/exon boundaries do not correlate with these structural features 33.

Another feature of the extracellular parts of leucocyte membrane proteins is that they are often poorly conserved between species, e.g. around 40-50% amino acid identity for CD4, CD8 and CD45 between rodents and man, which contrasts with around 90% for the cytoplasmic part of CD45 and many cytosolic proteins. One suggestion is that cytosolic proteins need to interact with more than one protein and so mutations in one protein would need to be compensated by complementary mutations in two or more proteins to maintain the desired interactions. Alternatively, a higher rate of mutations will increase the risk of new, "non-specific interactions" occurring. Although these may be of very low affinity, they may cause unwanted interactions at the high concentrations of proteins found in the cytosol. Weak, non-specific interactions may be less of a problem at the cell surface because the surface proteins are continuously recycling and are usually glycosylated. It is possible that a major function of glycosylation, which is a feature of most cell surface proteins, is to provide a shield to prevent unwanted interactions. Receptor aggregation is often a key step in signalling events across the cell membrane and fortuitous aggregation due to non-specific interactions could affect the balance of signalling with potentially undesirable effects. Although this rapid divergence of extracellular domains is particularly common in leucocyte membrane proteins, it may not be general to all tissues. Thus there are several examples in the nervous system of high levels of conservation between membrane proteins, e.g. IgSF proteins such as myelin protein Po and NCAM.
THE DOMAINS AND SUPERFAMILIES THAT ARE FOUND IN LEUCOCYTE CELL SURFACE MOLECULES
The frequencies of the different types of domains and motifs found in leucocyte membrane proteins are summarized in Table 3 and itemized in Table 1 of Chapter 1. It is clear that IgSF domains are the most common domain type, present in about 34% of leucocyte antigens, and these are often involved in protein interactions. Fn3 and cytokineR domains have a similar Ig-like fold to the IgSF domains. Thus about 54% of leucocyte membrane proteins contain at least one domain with an Ig fold. TNFSF domains are often present as trimeric proteins and exist in membrane-associated and soluble forms and the latter have cytokine activities. Integrins are involved in adhesion events with membrane proteins and extracellular matrix. IgSF and Fn3 domains are common in cytokine receptors and these are usually complexes of two or three different polypeptides. Thus by analysis of the sequence some idea of the function of a protein may be obtained.

Table 3 Analysis of structural features of leucocyte surface molecules

Domain type                              % of leucocyte antigens where domain is present

A Distribution of superfamily domains or motifs in extracellular parts of leucocyte antigens
CCP                                      5
CytokineR                                8
EGF                                      4
Fibronectin type II                      1
Fibronectin type III                     12
Galectin (lectin S-type)                 <1
IgSF                                     34
Integrin                                 8
Lectin C-type                            6
LRR                                      2
Link                                     <1
LDLR                                     1
Ly-6                                     2
MHC                                      2
TNFSF                                    3
TNFRSF                                   3
ScavengerRCR                             3
Somatomedin B                            <1

B Percentage of domains in the intracellular parts of leucocyte antigens
Phosphotyrosine phosphatase (PTPase)     1
Tyrosine kinase                          2

Note that leucocyte polypeptides often contain more than one type of domain or motif.

The superfamilies that are present in leucocyte surface molecules are discussed below, together with alignments of their amino acid sequences. The alignments were made using a variety of computer programs, ALIGN 23, AMPS 40 and PILEUP 41, and then modified after visual examination. The ends of the domains can be difficult to define from the sequence and this problem is illustrated by consideration of the structure for CD4. In CD4 the last β strand of domain 1 continues directly into domain 2 and the last β strand of domain 3 also continues to form the first β strand of domain 4 42-44. Thus in the alignments shown, the domains are defined with respect to key internal residues that are marked with an asterisk and the beginnings and ends can be taken for statistical comparisons as being a constant number of residues before and after the conserved positions. For example, in the case of the IgSF this is taken as 20 residues before and after the conserved Cys positions. If the goal was to express a single domain in an expression system then sequence alignments and structure should be taken into account and a structural prediction would be attempted on the basis of all the data to decide on the sequence that should be expressed. In order to obtain stable proteins it may be advantageous to have a few extra amino acids at the N-terminus of a domain as discussed in ref. 45.
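The convention of taking a fixed number of residues either side of the conserved positions is simple to apply in code. The following is a minimal sketch, assuming 0-based positions for the two conserved residues; the function name and the synthetic test sequence are hypothetical and stand in for a real IgSF domain.

```python
def domain_window(sequence, first_conserved, last_conserved, flank=20):
    """Return (start, end, segment) for a domain defined by two conserved
    positions plus a fixed flank on each side.

    Positions are 0-based indices into `sequence`; `end` is exclusive.
    The window is clipped if it would run past either end of the sequence.
    """
    if not 0 <= first_conserved <= last_conserved < len(sequence):
        raise ValueError("conserved positions must lie within the sequence")
    start = max(0, first_conserved - flank)
    end = min(len(sequence), last_conserved + flank + 1)
    return start, end, sequence[start:end]


# Synthetic sequence with two "conserved" Cys residues at indices 30 and 91.
synthetic = "A" * 30 + "C" + "G" * 60 + "C" + "A" * 30
start, end, segment = domain_window(synthetic, 30, 91)
```

A window defined in this way is suitable for statistical comparisons only; as noted above, choosing a construct for expression needs the structural information as well.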
Complement control protein (CCP) domains (Figs 1 and 2)

This domain is named CCP because it is commonly found in proteins that control the complement cascade 46. For instance Factor H consists solely of 20 CCP domains whilst other complement components contain CCP domains mixed with other domains, e.g. Factor B and C2 each contain three CCP domains together with a serine protease domain. The CCP domain is also commonly called the short consensus repeat (SCR) or Sushi 46. It is present in widely different numbers in cell surface molecules, ranging from 30 domains in complement receptor 1 (CD35) to two in L-selectin (CD62L). These domains are clearly involved in protein binding and the CR1 (CD35) and CR2 (CD21) complement binding regions have been mapped to the first four CCP domains of each of the first three groups of seven domains in CD35 and to the first two domains of CD21. The structure of a pair of CCP domains from complement control protein Factor H (domains 15+16) has been solved using NMR 47. Each domain consists of two segments of antiparallel β sheet and a short triple-stranded β sheet with no α-helical structure 47,48. The folding pattern for this domain is shown in Fig. 2 and the β strand positions are marked above the sequence alignments shown in Fig. 1.
Figure 1 CCP domains. [Alignment rows: Factor H, CD35 d12, Factor B, CD62L, C4BPA, IL-2R1, FXIII.] Residues identical in four or more sequences are boxed. The lines above the sequences correspond to the positions of the β strands determined from the structure of Factor H domain 16, residues 927-985 (see Fig. 2) 48. The asterisks mark the positions of the conserved residues used to identify domains for the corresponding entries in Section II. The sequences of the following proteins are from the SWISSPROT database unless otherwise indicated and the database accession number and residue numbers are given in brackets. Factor H, human complement Factor H precursor domain 16 (P08603, 929-985); CD35 d12, complement receptor 1 precursor domain 12 (P17927, 745-799); Factor B, human complement Factor B precursor (P00751, 35-99); CD62L, human CD62L or L-selectin precursor (P14151, 195-252); C4BPA, human complement C4 binding protein (P04003, 249-313); IL-2R1, human interleukin 2 receptor α chain (CD25) precursor (P01589, 22-83); FXIII, human coagulation Factor XIII B chain precursor (P05160, 452-516).

Figure 2 The folding pattern of a CCP domain. [Panel: Factor H CCP domain.] Ribbon diagram showing the folding pattern of CCP domain 16 from Factor H determined by NMR 48. The β strands are shown as broad arrows pointing from the N- to C-terminal direction and the connecting loops as thinner lines.
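The boxing rule used in the alignment figures (a residue is boxed when it is identical in at least a stated number of sequences) can be expressed as a short routine. This is an illustrative sketch, not the procedure used to prepare the figures; the function name, the gap characters and the toy alignment are assumptions.

```python
from collections import Counter


def boxed_columns(aligned_seqs, min_identical=4, gap_chars="-."):
    """Return (column, residue) for every alignment column in which one
    residue occurs in at least `min_identical` sequences.

    All sequences must have the same aligned length; gap characters are
    not counted as residues.
    """
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("aligned sequences must have equal length")

    boxed = []
    for col in range(length):
        residues = [seq[col] for seq in aligned_seqs if seq[col] not in gap_chars]
        if not residues:
            continue  # column consists only of gaps
        residue, count = Counter(residues).most_common(1)[0]
        if count >= min_identical:
            boxed.append((col, residue))
    return boxed


# Toy alignment of five hypothetical four-residue fragments.
toy_alignment = ["CPAW", "CPSW", "CT-W", "CPGW", "CAEF"]
toy_boxed = boxed_columns(toy_alignment)
```

In the toy alignment only the first and last columns reach the threshold of four identical residues, so only those two are reported.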
Cytokine receptor (cytokineR) domains (Figs 3 and 4)

The cytokine receptor domain is often found in association with Ig and Fn3 domains in receptors for cytokines. A common arrangement is to have a single N-terminal cytokineR domain followed by a Fn3 domain, but there are variations on this theme. Initially these two domain types were not distinguished 20 and the term haematopoietin receptor superfamily was widely used for molecules containing this pair of domain types 49,50. We use the term cytokine receptor domain for the region of about 100 amino acids usually found N-terminal to the Fn3 domain and alignments of domains from this superfamily are shown in Fig. 3. The IL-7 receptor contains a clear Fn3 domain but the sequence at the N-terminal region shows only a marginal similarity to the cytokineR domains. The possible cytokineR domain in the IL-7 receptor 51 is shown below the other sequences in Fig. 3 but the correctness or otherwise of this assignment will require validation by tertiary structure determination. The structures of the extracellular region of the growth hormone receptor and the related prolactin receptor have been solved by X-ray crystallography 29,52. This has revealed the fold for the cytokineR and the Fn3 domains that constitute the extracellular portion of this receptor. (The structures of Fn3 domains from fibronectin and neuroglian have been solved by NMR and X-ray crystallography; see below.) Both the cytokineR and the Fn3 domains have similar folds which are also similar to the folds of IgSF C2-set domains, e.g. in CD4 42-44, CD2 53,54 and telokin (Fig. 4) 55. Bazan 56 had previously argued that there may be structural similarities between cytokineR domains, Fn3 domains and IgSF domains on the basis of predicting patterns of β strands in the sequences. Despite the success of these predictions the degree of sequence similarity between these domain types is low. The cytokineR domains have a characteristic Cys-X-Trp sequence together with three other conserved Cys residues, whilst the Fn3 domains lack a conserved pattern of Cys residues. The possible origin of the cytokineR, Fn3 and IgSF domains by divergent evolution is discussed above (see Domain sequence and structure).

Epidermal growth factor (EGF) domains (Figs 5 and 6)

EGF domains are found in EGF itself and in transforming growth factor α (TGFα). This domain type is also found in a variety of secreted proteins such as blood coagulation factor IX and cell surface molecules such as in three selectins, L-selectin, E-selectin and P-selectin (CD62). The structures of several EGF domains have been determined and all show similarity in folding pattern, e.g. EGF 57, TGFα 58, Factor IX EGF domain 59,60 and the EGF domain found in CD62E 61. The structures of EGF and Factor IX EGF domains are shown in Fig. 6. The latter is slightly smaller than EGF itself but is probably representative of the repeating EGF domains found in many proteins (see Fig. 5). The single EGF domain from Factor IX has functional activities distinct from the EGF itself, for example it has Ca2+ binding activity 59,62. The structure of a pair of calcium binding EGF domains in human fibrillin 1 has been determined by NMR and it is likely that the orientation of tandem EGF domains can vary 63. In addition to mediating interactions directly, EGF domains may be important in giving the correct spacing and orientation of other domains.
Figure 3 Cytokine receptor (cytokineR) domains. [Alignment rows: GHR, PLR, IL-6Rβ, EPOR, IL-3R d1, GM-CSFR, IL-6Rα, IL-2Rβ, IL-4R, IL-3R d3, IL-7R; β strands labelled A, B, C, C', E, F and G above the sequences.] Residues identical in four or more sequences are boxed. The asterisks mark the positions of the conserved residues used to identify domains for the corresponding entries in Section II. The sequences of the following proteins are from the SWISSPROT database unless otherwise indicated and the database accession number and residue numbers are given in brackets. GHR, human growth hormone receptor precursor (P10912, 46-147); PLR, rat prolactin receptor precursor (P05710, 21-116); IL-6Rβ, human IL-6 receptor β (CD130, gp130) precursor (P40189, 124-218); EPOR, mouse erythropoietin receptor precursor (P14753, 42-140); IL-3R, mouse IL-3 receptor precursor (CDw123) domains 1 and 3 (PIR: A35782, d1 29-127; d3 243-347); GM-CSFR, human GM-CSFR precursor (CD116) (P15509, 116-214); IL-6Rα, human IL-6 receptor α chain precursor (CD126) (P08887, 112-214); IL-2Rβ, human IL-2 receptor β chain precursor (CD122) (P14784, 26-125); IL-4R, mouse IL-4 receptor α chain precursor (CD124) (P16382, 24-122); IL-7R, human IL-7 receptor α chain precursor (CD127) (P16871, 32-127). The sequence alignments are from 20 amino acids N-terminal from the conserved CXW. The sequence start corresponds to residue 2 in the prolactin receptor. The C-terminus is more difficult to define due to the lack of conserved residues and that shown is close to the predicted boundary between the cytokineR domains and the Fn3 domains in GHR, PLR, IL-6R (CD126). The evidence for a cytokineR domain in IL-7R (CD127) is controversial and this sequence is given below the main alignments.
Figure 4 The folding patterns of the cytokineR and fibronectin type III (Fn3) domains. [Panels: Human GHR domain 1; Human GHR domain 2; Fibronectin Fn3 domain 10; Human CD4 domain 2.] Ribbon diagrams for the cytokineR (domain 1) and Fn3 (domain 2) from human growth hormone receptor 29, and Fn3 domain 10 from human fibronectin 30. The IgSF C2-set domain from human CD4 domain 2 is included for comparison 42,43. The β strands are shown as broad arrows pointing from the N- to C-terminal direction and the connecting loops as thinner lines. Some gaps are present in the loops of the growth hormone receptor where the structure has not been fully resolved 29. Each β strand is labelled using the same nomenclature as in the IgSF. This lettering corresponds to that in the sequence alignments (Figs 3, 8 and 15).
Figure 5 EGF domains. [Alignment rows: FA9-1, FA9-2, EGF, CD62L, CD62P, CD62E, PRTC, 114/A10, NOTCH.] Residues identical in five or more sequences are boxed. The asterisks mark the positions of the conserved residues used to identify domains for the corresponding entries in Section II. The sequences of the following proteins are from the SWISSPROT database and the database accession number and residue numbers are given in brackets. FA9-1 and FA9-2, human coagulation Factor IX precursor (P00740, 92-130 and 131-172); EGF, human epidermal growth factor precursor (P01133, 971-1014); CD62L, human CD62L or L-selectin precursor (P14151, 157-193); CD62P, human CD62P or P-selectin precursor (P16109, 160-196); CD62E, human CD62E or E-selectin precursor (P16581, 138-176); PRTC, human protein C precursor (P04070, 96-133); 114/A10, mouse haematopoietic cell surface protein 114/A10 precursor (P19467, 232-274); NOTCH, Drosophila notch protein (P07207, 1021-1059). The ends of the alignment correspond to those of the coagulation Factor IX EGF domain whose structure has been determined 59. The structure of EGF itself has been determined for a sequence that extends a further four residues beyond that shown (see Fig. 6) 57.
Figure 6 The folding pattern of EGF domains. [Panels: EGF; Factor IX EGF domain.] Ribbon diagrams of EGF 57 and a coagulation Factor IX EGF domain 59. The β strands are shown as broad arrows pointing from the N- to C-terminal direction and the connecting loops as thinner lines. The N-terminal core of the structure is similar in both domains but the EGF structure extends further with two more short β strands.
Fibronectin type II (Fn2) domains (Fig. 7)

The Fn2 domains were first identified as one of three different repeating sequence patterns within the fibronectin molecule. The Fn2 domain has been found in few other proteins and the only leucocyte molecules with this domain are the mannose receptor and DEC-205 which each contain one Fn2 domain. The structure of a sequence from bovine seminal fluid protein PDC-109 that shows sequence similarity over part of the Fn2 domain alignment shown in Fig. 7 has been determined by NMR 64.

Figure 7 Fibronectin type II (Fn2) domains. [Alignment rows: Fibr d1, Fibr d2, Mannose R, Factor XII, Collag.] Residues identical in three or more sequences are boxed. The asterisks mark the positions of the conserved residues used to identify domains for the corresponding entries in Section II. The sequences of the following proteins are from the SWISSPROT database and the database accession number and residue numbers are given in brackets. Fibr, human fibronectin precursor (P02751; d1, 345-405; d2, 405-465); Mannose R, human mannose receptor precursor (P22897, 153-212); Factor XII, human coagulation factor XII precursor (P00748, 32-91); Collag, human type V collagenase precursor (EC 3.4.24.7) (P14780, 332-391). The ends of the alignments are based on the exon boundaries of the fibronectin type II domains.
Fibronectin type III (Fn3) domains (Figs 4 and 8)

The largest group of Fn3 domains in leucocytes is found in many of the receptors for cytokines. These polypeptides also contain cytokineR domains (see above) and often have a characteristic WSXWS sequence between β strands F and G. The Fn3 domain is also particularly common in membrane molecules found in the nervous system, which in addition often have IgSF domains 13,65. Fn3 domains are also found within cells including large numbers in the group of muscle proteins that bind myosin such as twitchin in Caenorhabditis elegans 66 and also in titin in vertebrates 67,68. Fn3 domains, together with IgSF domains, are one of the few examples of domains present both intracellularly and extracellularly in leucocytes. Another example of cytoplasmic localization of Fn3 domains is in the cytoplasmic segment of the integrin β4 chain 69 (note the external regions of integrins do not contain any Fn3 domains). This is currently the only example of a domain found at the surface of leucocytes which is also present on the cytoplasmic side of a transmembrane protein (the muscle proteins discussed above are not membrane associated). Several structures for Fn3 domains have been solved by NMR 30 and X-ray crystallography 29,52,70. This domain consists of two β sheets with a similar folding pattern to the IgSF fold, the cytokineR domain and the domains of the PapD chaperone protein 28. However, there is no significant sequence similarity amongst these proteins as analysed by the methods discussed above. Recently, the structures of a pair of Fn3 domains from neuroglian and four domains of fibronectin have been determined by X-ray crystallography 70,71. These structures have shown that the orientations between domains can vary considerably within and between proteins.
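The short sequence signatures mentioned for the cytokine receptors, Cys-X-Trp in the cytokineR domain and WSXWS in the accompanying Fn3 domain, are easy to locate with regular expressions. The sketch below is only a screening aid under stated assumptions: the motif table, function name and toy sequence are hypothetical, and a match to such a short pattern is not evidence of domain membership without the surrounding alignment.

```python
import re

# Lookahead patterns so that overlapping occurrences are all reported.
# "." stands for the X position (any residue).
MOTIFS = {
    "CXW": re.compile(r"(?=(C.W))"),
    "WSXWS": re.compile(r"(?=(WS.WS))"),
}


def find_motifs(sequence):
    """Return {motif name: [(1-based position, matched residues), ...]}."""
    hits = {}
    for name, pattern in MOTIFS.items():
        hits[name] = [(m.start() + 1, m.group(1)) for m in pattern.finditer(sequence)]
    return hits


# Toy sequence, not taken from any database entry.
toy_sequence = "AAACTWGGGWSEWSKK"
toy_hits = find_motifs(toy_sequence)
```

Positions are reported 1-based to match the residue numbering convention used in the figure legends.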
Figure 8 Fibronectin type III (Fn3) domains. [Alignment with β strands labelled A, B, C, C', E, F and G above the sequences.] Residues identical in five or more sequences are boxed. The positions of the β strands determined for domain 10 of human fibronectin are indicated above the sequences 30. See Fig. 4 for folding patterns of Fn3 domains from fibronectin and growth hormone receptor. The asterisks mark the positions of the conserved residues used to identify domains for the corresponding entries in Section II. The sequences of the following proteins are from the SWISSPROT database unless otherwise indicated and the database accession number and residue numbers are given in brackets. GHR, human growth hormone receptor precursor (P10912, 148-251); Fibr, human fibronectin precursor (P02751: d1 605-700, d2 719-809, d5 996-1085, d10 1447-1541); LAR, human LAR precursor (P10586, 596-694); TWIT, twitchin cytoplasmic protein from Caenorhabditis elegans (PIR: S07571, 1761-1854); L1, mouse neural adhesion molecule L1 precursor (P11627, 916-1012); IL-7R, human IL-7 receptor precursor (CD127) (P16871, 128-231); IL-6Rβ, human IL-6 receptor β chain precursor (gp130) (PIR: A36337, 221-324); PLR, rat prolactin receptor precursor (P05710, 121-224); IL-3LR, mouse IL-3 receptor-like protein precursor (AIC2B) (PIR: A35782, d2 135-243, d4 342-441).
Galectin or the lectin S-type domains (Figs 9 and 10)

Galactoside binding proteins have been sequenced from several species and shown to contain a sequence pattern different from that of the lectin C-type domain. These were termed S-type to distinguish them from lectin C-type because the first examples contained free accessible thiol groups 72. However, they are now generally referred to as galectins 73. They are found both intracellularly and extracellularly and a region with strong sequence similarity is found in the Mac-2 leucocyte antigen now called galectin 3 (Fig. 9). However, in this case analysis of protein produced by recombinant DNA techniques shows no requirement for a reducing environment for lectin activity and no accessible thiol groups 74. Thus the thiol requirement is no longer general for this domain type. The structures of galectins determined by X-ray crystallography show two antiparallel β sheets, with five and six strands each, which associate to form a β sandwich (Fig. 10). The topology is different from the Ig fold with one sheet made up of strands AJCDEF and the other sheet strands KBGHI (see Fig. 12 for topology of IgSF domains). However, the galectin 2 structure does resemble some leguminous plant lectins although they lack significant sequence similarity 39.
G protein-coupled receptor or transmembrane 7 superfamily (Fig. 11)

This large superfamily of several hundred proteins is expressed by a wide range of cell types and is characterized by the presence of seven hydrophobic membrane-spanning sequences (reviewed in ref. 75 and see WWW site at ref. 76). The proteins are oriented with the N-terminus on the extracellular side and the C-terminus on the cytoplasmic side of the plasma membrane. Several names have been used to describe this superfamily such as G protein-coupled receptor, 7TMS (seven-transmembrane) and rhodopsin superfamilies and about six different groups of receptors have been distinguished 76. We use the term G protein-coupled receptor as this is the most widely used name. The sequence conservation is highest in the potential transmembrane segments, with most diversity in the N- and C-termini and the cytoplasmic loop between transmembrane segments 5 and 6. Most members of the G protein-coupled receptor superfamily have been shown to couple to various G proteins and include a large family of receptors for chemokines. Many more examples may be found on leucocytes as the monoclonal antibody approach may not recognize them well because of their low site number per cell and the high degree of amino acid sequence identity between species homologues. Experiments using chimeric proteins have shown that the sequences contributing to G protein attachment are found in transmembrane segments 5 and 6 and the cytoplasmic loop between them. A subset of closely related G protein-coupled R members is found on leucocytes and includes the C5aR (CD88), fMLPR (FPR) and IL-8Rs (CDw128 and IL-8Rb). Two further members, the F4/80 and CD97 antigens, are unusual in that they contain several EGF domains in their first extracellular regions 77.
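One common way to look for candidate membrane-spanning sequences of the kind described above is a sliding-window hydropathy scan. The sketch below uses the Kyte-Doolittle hydropathy scale, which is a technique brought in here for illustration and not one described in this chapter; the window length of 19 and the threshold of 1.6 are conventional choices treated as assumptions, and the test sequence is synthetic.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def count_hydrophobic_segments(sequence, window=19, threshold=1.6):
    """Count runs of consecutive windows whose mean hydropathy reaches
    the threshold. Each run is treated as one candidate segment."""
    segments = 0
    previous_above = False
    for i in range(len(sequence) - window + 1):
        mean = sum(KYTE_DOOLITTLE[aa] for aa in sequence[i:i + window]) / window
        above = mean >= threshold
        if above and not previous_above:
            segments += 1  # a new run of hydrophobic windows starts here
        previous_above = above
    return segments


# Synthetic sequence: seven Leu stretches separated by Lys stretches.
synthetic_7tm = ("K" * 15 + "L" * 22) * 7 + "K" * 15
n_segments = count_hydrophobic_segments(synthetic_7tm)
```

On real sequences such a scan gives only a first indication; closely spaced or weakly hydrophobic segments can merge or be missed, so the count should be checked against an alignment with known family members.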
Immunoglobulin (Ig) superfamily domains
(Figs 12-15)
Immunoglobulin superfamily (IgSF) domains are the most abundant domain type found in leucocyte membrane proteins, as is evident from the collated data in Table
Figure 9 Galectin or lectin S-type domains. Residues identical in four or more sequences are boxed. The asterisks mark the positions of the conserved residues used to identify domains for the corresponding entries in Section II. The positions of the β strands present in galectin 2 39 are indicated above the sequences and on the fold (Fig. 10). The sequences of the following proteins are from the SWISSPROT database and the database accession number and residue numbers are given in brackets except where the full sequence is given. Human galectin 2, human galectin 2 precursor or HL14 (P05162, 2-131); Human galectin 1, human galectin 1 or β-galactoside binding lectin (P09382); Rat galectin 1, rat galectin 1 or β-galactoside binding lectin (P11762); Eel galectin, electric eel β-galactoside binding lectin (P08520); Human galectin 3, human galectin 3 or Mac-2 antigen precursor (P17931, 111-248); Rat galectin 3, rat galectin 3 or Mac-2 antigen precursor (P08699, 123-260).
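The boxing rule used in the alignment figures ("residues identical in four or more sequences") can be stated as a short calculation. The sketch below is illustrative only; the function name `conserved_columns`, the gap character and the toy alignment are assumptions made here, and the rows are not the galectin sequences of Fig. 9.

```python
from collections import Counter


def conserved_columns(aligned, min_identical=4, gap="-"):
    """Return {column (1-based): residue} for every alignment column in which
    at least `min_identical` sequences carry the same residue."""
    length = len(aligned[0])
    if any(len(row) != length for row in aligned):
        raise ValueError("aligned sequences must all have the same length")
    boxed = {}
    for col in range(length):
        # Gaps are ignored when counting identities.
        counts = Counter(row[col] for row in aligned if row[col] != gap)
        if counts:
            residue, n = counts.most_common(1)[0]
            if n >= min_identical:
                boxed[col + 1] = residue
    return boxed


# Made-up six-row alignment, five columns wide.
toy_alignment = ["WGAEK", "WGSDK", "WGTDR", "WAAER", "W-AQK", "FGAQR"]
boxed = conserved_columns(toy_alignment)
```

The threshold differs between figures (three, four or five sequences), so `min_identical` would be set to match the legend of the alignment in question.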
Protein superfamilies and cell surface molecules
Figure 10 The folding pattern of a galectin or lectin S-type domain. Ribbon diagram of one domain of the homodimer of galectin 2 39. The β strands are shown as broad arrows pointing from the N- to C-terminal direction. Each strand is labelled and corresponds to the labels in the alignments (Fig. 9).
3, which shows that approximately 34% of leucocyte membrane polypeptides contain IgSF domains. The structures of several IgSF domains have been determined by X-ray crystallography, including Ig V- and C-domains, β2-microglobulin 27, MHC Class I antigen α3 domain 78, MHC Class II α2 and β2 domains 79-81, CD4 domains 1+2 and CD4 domains 3+4 42-44, CD8α 82, CD2 53,54,83, VCAM domain 1 84 and T cell antigen receptor α and β V- and C-domains 85-88. These structures show that the IgSF domains characterized by sequence similarities over about 100 amino acids correspond to structural units with distinct folding patterns referred to as the Ig fold (reviewed in ref. 27). The Ig fold consists of a sandwich of two β sheets, each consisting of antiparallel β strands of 5-10 amino acids, with a conserved disulfide between the two sheets in most but not all domains (Fig. 12). The sequence similarities are mainly found at the positions of in-pointing residues in the β strands, with considerable differences in the loops that connect the strands and in the out-pointing residues on the faces of the β sheets. The core of the fold is made up of two sets of three β strands labelled ABE and GFC, and the positioning of these is shown in the various folds illustrated in Fig. 12. The folds vary considerably in length in the middle of the sequence, with Ig V-domain folds being the archetype for the longer fold. The extra sequence in comparison with C-domains forms an additional pair of β strands (C' and C" in Fig. 12) and the connection between these forms the second complementarity determining region in antibody and TCR V-domains. In IgSF domains there are limited sequence patterns in β strands B, C, E and F that are common across the superfamily (Figs 13-15) and other patterns that allow a subdivision of the domains. Ig and TCR V-domains have a characteristic pattern in
Figure 11 G protein-coupled receptors. Residues identical in four or more sequences are boxed. The bars over the sequences indicate the transmembrane regions (I-VII). The sequences of the following proteins are from the SWISSPROT database and the database accession numbers are given in brackets. IL-8R, human high-affinity IL-8 receptor192; C5aR, human C5a anaphylatoxin chemotactic receptor (P21730); fMLPR, human fMet-Leu-Phe receptor (P21462); Rhodopsin, human rhodopsin (P08100); NeurokininR, human neurokinin A receptor (P21452); DopaR, human D(1) dopamine receptor (P21728).
Figure 12 The folding pattern of IgSF domains. Ribbon diagrams for four IgSF domains: Ig V-set (VH of human NEW Fab), Ig C1-set (β2-microglobulin), Ig C2-set (CD4 domain 2) and Ig V-set lacking the normally conserved disulfide between β strands B and F (rat CD2 domain 1). These are labelled with the corresponding strand letters used in the alignments for the Ig V-set, C1-set and C2-set sequences (Figs 13-15) and in the Fn3 domains (Figs 4 and 8). The data are from the Brookhaven Protein Structure Database 83.
the region leading into β strand F of Asp-X-Gly/Ala-X-Tyr-X-Cys. The receptor C-domains have a characteristic pattern between β strands B and C of Gly-Phe-Tyr-Pro and another on the C-terminal side of β strand F of Cys-X-Val-X-His. The Ig, TCR and MHC antigen C-type domains all share the same types of sequence patterns and are referred to as the C1-set within the IgSF. With the sequencing of various cell surface molecules a third category of domains became evident, namely domains of length similar to C-domains but with some of the sequence patterns of V-domains. These domains are referred to as the C2-set 13. They have V-type patterns in the β
Figure 13 Immunoglobulin V-set domains. Residues identical in five or more sequences are boxed. The positions of the β strands are indicated above the sequences. The asterisks mark the positions of the conserved residues used to identify domains for the corresponding entries in Section II. The sequences of the following proteins are from the SWISSPROT database and the database accession number and residue numbers are given in brackets. Ig λ, mouse Ig λ chain precursor (MOPC 104E) (P01724, 21-129); Ig κ, human Ig κ chain Roy (P01608, 3-107); IgG heavy, human IgG heavy chain NEWM (P01825, 3-116); TCR β, human TCR β chain precursor (P01733, 22-135); TCR α, mouse TCR α chain precursor (P01739, 23-132); CD8 β, rat CD8 β chain precursor (P05541, 21-134); CD8 α, rat CD8 α chain precursor (P07725, 27-138); CD4 d1, human CD4 precursor domain 1 (P01730, 21-123); Thy-1, rat Thy-1 precursor (P01830, 18-128); CD2 d1, rat CD2 precursor domain 1 (P08921, 20-120).
Figure 14 Immunoglobulin C1-set domains. Residues identical in four or more sequences are boxed. The positions of the β strands are indicated above the sequences. The asterisks mark the positions of the conserved residues used to identify domains for the corresponding entries in Section II. The sequences of the following proteins are from the SWISSPROT database unless otherwise indicated and the database accession number and residue numbers are given in brackets. Ig λ, human Ig λ chain C region (P01842, 7-104); Ig κ, human Ig κ chain C region (P01834, 6-106); IgG heavy, human Ig γ-1 C region (P01857, 230-329); TCRβ, human TCRβ chain (P01850, 10-117); β2-Microglobulin, human β2-microglobulin precursor (P01884, 24-119); MHC Class I d3, human MHC Class I HLA α chain precursor domain 3 (PIR: A02189, 203-301); MHC Class II d2, human MHC Class II DR α chain precursor domain 2 (PIR: A02206, 113-209).
Figure 15 Immunoglobulin C2-set domains. Residues identical in four or more sequences are boxed. The positions of the β strands are indicated above the sequences. The asterisks mark the positions of the conserved residues used to identify domains for the corresponding entries in Section II. The sequences of the following proteins are from the SWISSPROT database unless otherwise indicated and the database accession number and residue numbers are given in brackets. CD4 d2, human CD4 precursor domain 2 (P01730, 123-204); L1 d3, mouse neural cell adhesion molecule L1 precursor domain 3 (P11627, 243-331); MAG d4, rat myelin-associated glycoprotein precursor domain 4 (P07722, 327-412); Amalgam d3, Drosophila amalgam protein precursor domain 3 (P15364, 231-327); NCAM, chicken neural cell adhesion molecule precursor (P13590, 203-295); CEA d4, human carcinoembryonic antigen precursor domain 4 (CD66e) (P06731, 325-414); IgFcRII, mouse IgG FcRII precursor (CD32) domain 1 (P08101, 37-116); CD2 d2, human CD2 precursor domain 2 (P06729, 127-203); CD3 ε, human CD3 ε precursor (P07766, 29-117).
Figure 16 Integrin α chains. Residues identical in three or more sequences are boxed. The sequences of the following proteins are from the SWISSPROT database and the database accession number and residue numbers are given in brackets. CD49a, rat integrin α1 precursor (P18614, 25-1172); CD49b, human integrin α2 precursor (P17301, 26-1161); CD51, vitronectin receptor integrin αV precursor (P06756, 27-1023); CD49d, integrin α4 precursor (P13612, 36-1013). The extracellular and transmembrane regions are shown. The position of the I-domain in those sequences where it is present is indicated. Alignments of the I-domain are shown in Fig. 18.
strand E-F region, a pattern of Pro-X-Pro is relatively common between β strands B and C, and the pattern Cys-X-Ala-X-Asn is common after β strand F. This distinction between the C2- and C1-sets has been confirmed by structural studies, e.g. CD4 domain 2 is a C2-set sequence and its structure is classified in terms of sheet assignments as ABE/GFCC'. This is in comparison to ABED/GFC for C1-set sequences and ABED/GFCC'C" for V-set sequences. That is, for C2-set sequences the middle β strand may be generally in line with the GFC β sheet rather than the ABE sheet as is the case with antibody C-domains. So far, none of the structures of C2-set domains has a C1-set structure, i.e. like TCR, MHC antigens and Ig C-domains themselves. In all IgSF domains the structure of the core of the domain is maintained. The C2-set domains are mostly included in an I-set that has been defined on structural grounds (reviewed in refs 89 and 90), but in this review, where in most cases only sequence data are available, we use the widely accepted C2-set nomenclature. The points about conserved patterns and the positioning in β sheets are made evident by comparing the sequence alignments in Figs 13-15 with the folding patterns for the domains in Fig. 12. The sequence alignments are discussed in more detail in ref. 13.
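The residue patterns quoted in this section for the V-, C1- and C2-sets can be written as regular expressions and used as a first, crude screen of a sequence. The sketch below is an illustration under stated assumptions: the expressions are a direct transcription of the patterns in the text (X becomes any residue, Gly/Ala becomes a two-residue class), the names `IGSF_PATTERNS` and `find_igsf_patterns` are hypothetical, and the toy sequences are made up. A match is only a hint; assigning a domain to a set still depends on the full alignment and, where available, the structure.

```python
import re

# Patterns transcribed from the text: "X" -> ".", "Gly/Ala" -> "[GA]".
IGSF_PATTERNS = {
    "V-set, leading into strand F": re.compile(r"D.[GA].Y.C"),
    "C1-set, between strands B and C": re.compile(r"GFYP"),
    "C1-set, after strand F": re.compile(r"C.V.H"),
    "C2-set, between strands B and C": re.compile(r"P.P"),
    "C2-set, after strand F": re.compile(r"C.A.N"),
}


def find_igsf_patterns(sequence):
    """Return a list of (pattern name, 1-based start, matched residues)."""
    hits = []
    seq = sequence.upper()
    for name, regex in IGSF_PATTERNS.items():
        for match in regex.finditer(seq):
            hits.append((name, match.start() + 1, match.group()))
    return hits


# Made-up fragments, not taken from any database entry.
toy_v_like = "AAADSATYFCAAA"
toy_c1_like = "GFYPAAACQVTHAA"
v_hits = find_igsf_patterns(toy_v_like)
c1_hits = find_igsf_patterns(toy_c1_like)
```

Short patterns such as Pro-X-Pro will match by chance in many unrelated proteins, which is why the text describes them as "relatively common" rather than diagnostic.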
Integrins
(Figs 16-19)
The integrins have a heterodimeric structure with α and β chains that both traverse the lipid bilayer. There are at least 16 α and 8 β chains which can be found in various but not all possible combinations. Sequence similarities are seen within the α and β chains across all the integrin types (Figs 16 and 17 and discussed in detail in refs 91-93). The integrins are known to be involved in cell interactions and include receptors for the extracellular matrix proteins fibronectin and vitronectin and for the cell surface molecules CD54 (ICAM-1) and CD102 (ICAM-2). The integrins are reviewed elsewhere 94, including a companion volume in this FactsBook series 95. They are expressed on many different cell types; the CD11/CD18 family and the CD49 very late activation antigen (VLA) family are expressed mainly on leucocytes. This family of related proteins does not generally contain other domain types. One exception is the β4 integrin, which contains two Fn3 domains in the cytoplasmic region (ref. 69 and see also section on Fn3 domains). Some integrin α chains contain an inserted sequence or I-domain which shows sequence similarity to sequences in von Willebrand factor (where the domain is usually termed the A-domain), some matrix proteins and complement factor B (Fig. 18) 96. The structures of two I-domains, from CD11a and CD11b, have recently been determined by X-ray crystallography (Fig. 19) 96,97. The fold consists of alternating amphipathic α helices and hydrophobic β strands that form a classic dinucleotide binding fold, a common topology of many intracellular enzymes.
Figure 17 Integrin β chains. Residues identical in all three sequences are boxed. The sequences of the following proteins are from the SWISSPROT database and the database accession number and residue numbers are given in brackets. β1, human fibronectin receptor, integrin β1 precursor (CD29) (P05556, 26-752); β2, human integrin β2 (CD18) precursor (P05107, 24-724); β3, human integrin β3 (CD61) precursor (P05106, 30-742). The extracellular and transmembrane regions are shown.
Figure 18 Integrin I-domains. Residues identical in five or more sequences are boxed. The positions of the β strands and α helices are indicated by letters and numbers respectively, shown above the sequences as determined for the structure of the I-domain from CD11b 96. The sequences of the following proteins are from the SWISSPROT database and the database accession number and residue numbers are given in brackets. CD11b, human CD11b precursor (P11215, 141-338); CD11a, human CD11a precursor (P20701, 147-337); CD49b, human CD49b precursor (P17301, 166-367); CD49a, rat CD49a precursor (P18614, 165-365); CD11c, human CD11c precursor (P20702, 142-338); VWF, human von Willebrand factor precursor (P04275; 1268-1463, 1490-1672, 1681-1874).
Lectin C-type domains
(Figs 20 and 21)
This family of lectin domains is termed C-type because some members have been shown to require Ca2+ to bind carbohydrate 98. This domain has been found in a number of lectins such as the Kupffer cell fucose/galactose receptor, hepatocyte galactose receptor, mannose binding protein from plasma, and galactose binding proteins in two invertebrate species, the flesh fly and sea urchin 99-101. Lectin C-type domains are found in leucocyte cell surface antigens such as CD62L (L-selectin) and the low-affinity Fc receptor for IgE (CD23), and in a number of proteins not originally known to bind carbohydrate such as the proteoglycan core protein 102. In some cases carbohydrate binding for the lectin C-type domain has been established, e.g. CD62E 61. Two groups of lectin C-type domains can be distinguished on genetic organization and sequence patterns. L-selectin has the lectin C-type domain plus about 10 residues of the signal sequence contained completely within one exon with phase 1 intron boundaries 103. In cases other than the selectins 103,104, the lectin domain is usually found spread over three exons which also include the C-terminus of the protein and the 3' untranslated sequence 101. The majority of lectin C-type domains are found as single domains, although the macrophage mannose receptor contains eight domains in tandem 105. In this case there is no simple correlation between exons and domains: the eight lectin C-type domains are encoded by 26 exons, with 2-4 exons per domain and a variety of phases of intron/exon boundaries 106,107. As well as the differences in intron/exon organization, there are sequence patterns that distinguish the lectin C-type domain present in selectins from the other lectins, as shown in the sequence alignments in Fig. 20. There is a characteristic Trp residue at the N-terminus of the selectins E-selectin (CD62E), P-selectin (CD62P) and L-selectin (CD62L), whilst in the other group there is a longer patch of sequence at the N-terminus that shows a conserved sequence pattern of Cys-X-X-X-Trp. At the N-terminus of the mannose receptor there is a region similar to the carbohydrate binding domain of the B chain of the plant lectin ricin and related lectins. This is part of the galactose binding domain of ricin and is the only example of this domain type at the surface of leucocytes to date.
Figure 19 The folding pattern of the integrin I-domain from CD11b. The β strands are shown as broad arrows pointing from the N- to C-terminal direction and the α helices as coils 96. The β strands are labelled A-F and the helices as numbers 1-6 as in the alignments (Fig. 18).
Figure 20 Lectin C-type domains. Residues identical in five or more sequences are boxed. The positions of the β strands, α helices and loops (L) determined for the structure of the rat mannose binding protein 108 are shown above the sequences. The asterisks mark the positions of the conserved residues used to identify domains for the corresponding entries in Section II. The sequences of the following proteins are from the SWISSPROT database and the database accession number and residue numbers are given in brackets. ManBP, rat mannose binding protein A precursor (P19999, 117-238); ManR, human mannose receptor precursor (P22897, 362-490); KUCR, Kupffer cell carbohydrate binding receptor (P10716; 412-540); RHL-1, rat hepatic lectin 1 or asialoglycoprotein receptor 1 (P02706; 152-279); CD23, low-affinity IgE receptor (P06734; 163-286); PGCA, cartilage-specific proteoglycan core (P07897; 1914-2038); CD62E, CD62E or E-selectin precursor (P16581; 22-142); CD62P, CD62P or P-selectin precursor (P16109; 42-162); CD62L, human leucocyte CD62L or L-selectin precursor (P14151; 39-159).
Figure 21 The folding pattern of a lectin C-type domain. Ribbon diagram of the lectin C-type domain from the rat mannose binding protein 108. The β strands are shown as broad arrows pointing from the N- to C-terminal direction, α helices as coiled ribbons and the connecting loops as thinner lines. The labelling of the β strands (β1-5), α helices (α1-2) and loops (L1-4) corresponds to that in the sequence alignments in Fig. 20. The numbers 1 and 2 refer to the positions of the two holmium ions that are known to stabilize this region, which contains a high proportion of non-regular secondary structure 108.
The structure of a lectin C-type domain from a rat mannose binding protein has been determined by X-ray crystallography 108 and the fold is shown in Fig. 21. The structure contains an unusual region of non-regular secondary structure stabilized by two holmium ions in the crystal structure. The main part of the domain contains both β sheet and α helix, which is unusual as most of the domains for the other cell surface molecules consist solely of β structure (the exceptions are the link domain, which has a similar fold to the lectin C-type domain, the MHC superfamily and the integrin I-domain). The lectin C-type domain of the human mannose binding protein has also been crystallized with some additional C-terminal sequence. This extra sequence formed a triple α-helical coiled coil structure leading to the display of three lectin C-type domains 109. This provides a method of increasing the avidity of the interaction, as the monomeric affinities of the lectin domains are very low. The structure of the lectin C-type domain of CD62E has also been determined 61. In both the mannose binding protein and CD62E 112, key contact residues in determining the carbohydrate specificity have been identified, such that the specificity can be readily modified by mutagenesis. The topology of the fold is similar to that of the link domains (see below).
Leucine-rich repeats (LRR) or leucine-rich glycoprotein (LRG) repeats
(Figs 22 and 23)
The leucine-rich repeat (LRR) is characterized by a pattern of conserved residues including about 5 or 6 leucines and some other residues in a tightly defined repeat of about 24-29 residues (Fig. 22). It is found both intracellularly and extracellularly in a variety of species including Drosophila and yeast and has also been found in the platelet glycoproteins CD42a and CD42b. It often occurs in an array of tandem repeats 11. For instance, there are nine repeats in the leucine-rich glycoprotein where the repeat was first noted, 26 repeats in the yeast adenyl cyclase, 10 in the proteoglycan protein 113 and three in the trkB protein 114. In some cases sequence similarities are observed beyond the alignments shown and it has not been clear exactly where the repeat starts. The determination of the structure of porcine ribonuclease inhibitor, which contains 15 repeats, shows that each repeat forms a short β strand and an α helix approximately parallel to each other. The β strands and α helices are parallel to a common axis, resulting in a horseshoe-shaped molecule with the helices flanking the circumference (Fig. 23) 11. An unusual property of the repeats is their exposed parallel β sheets, which may be useful for protein-protein interactions 12. Two types of repeat alternate in the ribonuclease inhibitor and in general there is considerable heterogeneity within the LRR (Fig. 22). Thus the dimensions of the α helices and β strands may vary, and indeed the sequence alignments for LRR in CD42 should be regarded as tentative (reviewed in ref. 12). The α chain of CD42b contains seven LRRs and these, together with all the remaining coding sequence, are encoded by a single exon 115. Thus this repeat is not generally coded by single exons.
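The statement that the repeat is about 24-29 residues long can be checked against the residue ranges quoted in the legend to Fig. 22. The short sketch below does only that arithmetic; the dictionary name `LRR_RANGES` and the function `repeat_lengths` are hypothetical, and the ranges are copied from the legend (with the B5 range read as 280-307).

```python
# Residue ranges (start, end; inclusive) as quoted in the legend to Fig. 22.
LRR_RANGES = {
    "RNAase inhib A4": (195, 222),
    "RNAase inhib B4": (223, 250),
    "RNAase inhib A5": (252, 279),
    "RNAase inhib B5": (280, 307),
    "CD42a": (70, 98),
    "CD42b": (74, 101),
    "A2g-6": (148, 171),
    "A2g-7": (172, 195),
    "A2g-8": (196, 219),
    "PG-2": (100, 123),
    "PG-3": (124, 147),
}


def repeat_lengths(ranges):
    """Length in residues of each repeat, counting both end points."""
    return {name: end - start + 1 for name, (start, end) in ranges.items()}


lengths = repeat_lengths(LRR_RANGES)
shortest, longest = min(lengths.values()), max(lengths.values())
```

With these ranges the repeats from the leucine-rich glycoprotein and the proteoglycan come out shorter than those from the ribonuclease inhibitor and CD42, in line with the heterogeneity noted in the text.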
Link domains
(Figs 24 and 25)
Two link domains were originally noted in the link protein that binds hyaluronic acid 116. This protein also has one IgSF domain. Subsequently a further four link domains were observed in the proteoglycan core protein that has a chondroitin sulfate binding site. This protein also contains one IgSF domain, a CCP domain and a lectin C-type domain 102. The only link domain found on leucocytes to date is a single domain in CD44, and this domain is known to bind to hyaluronate 117,118. The structure of a link domain from TSG6, a hyaluronic acid binding protein that can be induced by TNF, has recently been determined by NMR 119. The fold has a similar topology to that of a lectin C-type domain, although there is no significant sequence similarity (Fig. 25). However, the large loops that make up the Ca2+ binding region in the mannose binding protein are much shorter in TSG6. A small β strand, number 2, is indicated in Fig. 25 which is not shown in the lectin C-type domain (Fig. 21). Thus the strand numbering differs, with β strands 3-6 in the link domain corresponding to strands 2-5 in the lectin C-type domain. The link domain also has a similar topology to a domain found in subunits of a toxin from Bordetella pertussis 119,120. The predicted hyaluronan binding site is in the same position as the carbohydrate binding site of E-selectin 119.
Figure 22 Leucine-rich repeats. Residues identical in four or more sequences are boxed. The positions of the β strand and α helix determined from the structure of the ribonuclease inhibitor 12 are indicated above the sequence. The sequences of the following proteins are from the SWISSPROT database and the database accession number and residue numbers are given in brackets. RNAase inhib, pig ribonuclease inhibitor (P10775; A4 195-222, B4 223-250, A5 252-279, B5 280-307); CD42a, human platelet glycoprotein IX precursor (P14770, 70-98); CD42b, human platelet glycoprotein IB β chain precursor or CD42c (P13224, 74-101); A2g-6, A2g-7, A2g-8, human leucine-rich α2-glycoprotein (LRG) (P02750, 148-171, 172-195, 196-219 respectively); PG-2, PG-3, human bone proteoglycan II precursor (P07585, 100-123 and 124-147 respectively). Note the domain borders have been modified from earlier assignments 19a to account for the recent structural information for LRR from ribonuclease inhibitor 12.
Figure 23 The folding pattern of the leucine-rich repeats present in the ribonuclease inhibitor. Each repeat consists of a short β strand and a longer α helix 12. The repeats assemble into an ordered horseshoe-like structure with the curved β sheets lining the interior and the helices flanking the outer circumference.
Figure 24 Link domains. Residues identical in four or more sequences are boxed. The positions of the β strands and α helices determined for the structure of TSG6, a tumour necrosis factor-inducible protein 119, are shown above the sequences. The asterisks mark the positions of the conserved residues used to identify domains for the corresponding entries in Section II. The sequences of the following proteins are from the SWISSPROT database and the database accession number and residue numbers are given in brackets. TSG6, human tumour necrosis factor-inducible protein TSG6 precursor (P98066, 36-132); CD44, human CD44 antigen precursor (P16070, 32-123); CORE1, CORE2, CORE3, CORE4, rat cartilage-specific proteoglycan core protein precursor (P07897, 152-251, 253-353, 486-585 and 587-687 respectively); LINK1, LINK2, rat proteoglycan link protein (P03994, 158-257 and 259-354 respectively).
Figure 25 The folding pattern of a link domain. Ribbon diagram of the link domain from TSG6, a tumour necrosis factor-inducible protein 119. The β strands are shown as broad arrows pointing from the N- to C-terminal direction, α helices as coiled ribbons and the connecting loops as thinner lines. The short lines in bold indicate the positions of disulfide bridges. The labelling of the β strands (β1-6) and α helices (α1-2) corresponds to that in the sequence alignments in Fig. 24.
Low-density lipoprotein receptor (LDLR) domains
(Figs 26 and 27)
The LDL receptor contains seven domains of about 40 amino acids with six conserved cysteine residues that have been called LDLR domains 121. The LDLR also contains three EGF domains. LDLR domains have also been found in other proteins, notably some complement components such as C6, C9 and Factor I. In the LDLR, four of the LDLR domains are encoded by individual exons whilst the other three are encoded by a single exon 122. Mutational analysis has indicated that the LDLR domains are important in the binding of some lipoproteins, but otherwise the function of this domain type is not known 123. The structure of the N-terminal LDLR domain of the LDL receptor has been determined by NMR 124. It consists of a β hairpin structure followed by a series of β turns (Fig. 27). It lacks the extensive α helix and/or β sheet found in many domain types and in this respect it is more like the TNFR 125, although these structures are not particularly similar. However, both the LDLR and TNFR contain repeats of about 40 amino acids with a high content of Cys bridges and an unusually low percentage of hydrophobic residues. The LDLR domain shows no resemblance in topology to any other domain type to date.
Ly-6 domains
(Figs 28 and 29)
The Ly-6 antigens are a group of leucocyte antigens first identified in the mouse that consist of 70-80 amino acids containing 10 Cys residues 126,127. Southern blot
Figure 26 Low-density lipoprotein receptor (LDLR) domains. Residues identical in four or more sequences are boxed. The asterisks mark the positions of the conserved residues used to identify domains for the corresponding entries in Section II. The sequences of the following proteins are from the SWISSPROT database and the database accession number and residue numbers are given in brackets. LDLR, human low-density lipoprotein receptor precursor (P01130, d1 20-59, d2 61-102, d3 103-141); Comp 9, rainbow trout complement C9 (P06682, 72-112); Hemo.Linker, marine worm giant extracellular haemoglobin linker 2 chain (P18208, 61-202); Factor I, human complement factor I precursor (P05156, 253-291); Comp 7, human complement C7 precursor (P10643, 77-116); Comp 6, human complement C6 precursor (P13671, 131-171).
Figure 27 The folding pattern of a low-density lipoprotein receptor (LDLR) domain from the LDL receptor (domain 1) determined by NMR 124. The domain consists of a β hairpin structure followed by a series of β turns.
analysis indicates that many Ly-6-related genes are present in the mouse and, of these, 10 distinct genes have been identified 128. The Ly-6 antigens are expressed in non-lymphoid tissues, for example kidney, as well as on leucocytes. Homologues of the Ly-6 antigens have been found in the rat but not yet in other species. In humans the CD59 antigen contains a single Ly-6 domain and inhibits complement-mediated lysis of cells expressing CD59. An invertebrate member of the Ly-6 superfamily has been isolated from squid optic and central nervous tissue 127,129. All the above molecules consist of a single Ly-6 domain attached to the cell surface by a GPI anchor. The urokinase plasminogen activator receptor (CD87) contains three Ly-6 domains separated by hinge-like sequences and is also attached to the cell surface by a GPI anchor. The alignments for these domains are shown in Fig. 28. No Ly-6 domain has been found in combination with domains of any other superfamily, and this may be because the exon structures known for this superfamily are not suited to exon shuffling (Table 3). The structure of the Ly-6 domain of human CD59 has been determined by NMR 7,8. It forms a relatively flat disk-like shape consisting of a two-stranded β sheet finger packed against a protein core formed by a three-stranded β sheet and a short α helix (Fig. 29) 7,8. The topology of the fold is similar to that of snake venom neurotoxins, consistent with earlier predictions based on sequence analysis 7,8.
MHC domains (Figs 30-32)
The MHC antigens and related molecules contain membrane-proximal IgSF C1-set domains. However, their N-terminal segments, including the α1 and α2 domains of the MHC Class I heavy chain and the α1 and β1 domains of the Class II α and β chains, show no sequence similarity to IgSF sequences 130, and the Class I and II domains are known to form an independent structural unit as shown in Fig. 30 78,79. Thus in the sequence alignments in Figs 31 and 32 the sequences are shown as an MHC Iα1-set and MHC Iα2-set. There are numerous sequences related to the classical MHC antigens and these show a Class I-type structural organization, including the binding of β2-microglobulin, with no examples so far of a Class II-like organization.
Figure 28 Ly-6 domains. Residues identical in three or more sequences are boxed. The asterisks mark the positions of the conserved residues used to identify domains for the corresponding entries in Section II. The positions of the β strands (A-E) and single α helix are indicated by the bars above the sequence. The structure of the cobratoxin has also been determined and the strands are in similar positions although the penultimate strand is smaller. The sequences of the following proteins are from the SWISSPROT database unless otherwise indicated and the database accession number and residue numbers are given in brackets. Human CD59, human CD59 antigen (P13987, 26-95); Mouse Ly-6A (P05533, 27-105); Mouse Ly-6C (P09568, 27-102); UPAR-1, UPAR-2, UPAR-3, human urokinase plasminogen activator receptor (CD87) (PIR: S12376, 23-99, 115-199 and 214-294 respectively); Squid Sgp2, squid glycoprotein 2 residues 1-92; Cobratoxin, Naja siamensis long neurotoxin (α-cobratoxin, P01391, 1-63).
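The boxing rule used in these alignment figures (a residue is boxed when it is identical in at least a stated number of the aligned sequences) can be sketched in a few lines. This is only an illustration: the function name, the gap character and the toy sequences are assumptions made for the example and are not the real Ly-6 alignment.

```python
from collections import Counter


def conserved_columns(aligned, min_count=3, gap="-"):
    """Return the 0-based columns in which one residue occurs in at
    least `min_count` of the aligned sequences (gaps are ignored)."""
    if len({len(seq) for seq in aligned}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    hits = []
    for col in range(len(aligned[0])):
        counts = Counter(seq[col] for seq in aligned if seq[col] != gap)
        # most_common(1) gives the single most frequent residue and its count
        if counts and counts.most_common(1)[0][1] >= min_count:
            hits.append(col)
    return hits


# Made-up aligned fragments, used only to exercise the function.
toy = ["CYNC-A", "CYSCTA", "CFNC-G", "CLKCSG"]
print(conserved_columns(toy, min_count=3))  # [0, 3]
```

Raising or lowering `min_count` mirrors the different thresholds quoted in the captions (two, three, four or five sequences).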
Protein superfamilies and cell surface molecules
Figure 29 The folding pattern of a Ly-6 domain. Ribbon diagram showing the β strands of human CD59 8 as broad arrows pointing from the N- to C-terminal direction, the single helix as a coil and the connecting loops as thinner lines. The labelling of the β strands (A-E) and single helix (α) corresponds to that in the sequence alignments in Fig. 28.
The Qa and Tla antigens of mice are very similar in sequence to MHC Class I antigens. Human CD1 antigens, an Fc receptor of rodent neonatal gut 131 and a Class I-related molecule expressed by cytomegalovirus 132 all show sequence identity at the level of about 30%. A more detailed discussion of MHC-related sequences can be found in refs 78, 80 and 133-135.
The MHC Class I α1 and α2 domains show weak sequence similarity to each other and form a similar fold containing a platform of β strands and a single α helix 78. The two domains together form the peptide binding groove of the MHC molecule. In the MHC Class II molecules the α1 domain shows strong sequence similarity to Class I α1 and the Class II β1 domain is most similar to Class I α2 133. The MHC Class II fold is similar to that of MHC Class I with a typical peptide binding groove 79-81. The rodent neonatal Fc receptor has sequence similarity to MHC Class I, and X-ray crystallography showed that it has a typical MHC Class I-like structure although the peptide binding groove is closed and unable to bind peptides 136.
Figure 30 The folding pattern of MHC domains (panels: HLA A2 α1 and HLA A2 α2). The β strands are shown as broad arrows pointing from the N- to C-terminal direction, α helices as coiled ribbons and the connecting loops as thinner lines. The arrowheads indicate where the MHC Class I α1 joins the α2 domain to form the peptide binding groove flanked by the two α helices. The data are from the Brookhaven Protein Database.
Figure 31 MHC Class I α1-set domains. Residues identical in three or more of the sequences are boxed. The positions of the β strands and α helices determined for the structure of the human HLA Class I are shown above the sequences. The sequences of the following proteins are from the SWISSPROT database and the database accession number and residue numbers are given in brackets. MHC Class I, human HLA Class I A-2α precursor (P01892, 25-116); CD1a, human CD1a antigen precursor (P06126, 26-112); FcR rat, rat gut Fc receptor precursor (P13599, 25-115); HCMV, human cytomegalovirus glycoprotein H301 precursor (P08560, 19-101); Class II A-B α, mouse MHC Class II A-B α chain precursor (P14434, 21-103); Class II DQ(3) α, human MHC Class II DQ(3) α chain precursor (P01909, 28-109).
Figure 32 MHC Class I α2-set domains. Residues identical in three or more of the sequences are boxed. The positions of the β strands and α helices determined for the structure of the human HLA Class I are shown above the sequences. The sequences of the following proteins are from the SWISSPROT database and the database accession number and residue numbers are given in brackets. MHC Class I, human HLA Class I A-2α precursor (P01892, 115-203); CD1a, human CD1a antigen precursor (P06126, 109-199); FcR rat, rat gut Fc receptor precursor (P13599, 110-199); HCMV, human cytomegalovirus glycoprotein H301 precursor (P08560, 112-210); Class II A β, mouse H2 Class II A β chain precursor (P14483, 32-122); Class II DQ(3) β, human MHC Class II DQ(3) β chain precursor (P06126, 109-199).
Figure 33 Protein tyrosine phosphatase (PTPase) domains. Residues identical in four or more sequences are boxed. The positions of the β strands and α helices shown above the sequences are those from the structure of protein tyrosine phosphatase 1B determined by X-ray crystallography 144. The Cys residue of the active site of this structure is indicated in bold together with the asterisk. The sequences of the following proteins are from the SWISSPROT database and the database accession number and residue numbers are given in brackets. PT1B, human protein tyrosine phosphate phosphatase PT1B (P18031, 4-282); TCTP, human T cell protein tyrosine phosphate phosphatase (P17706, 25-280); CD45, human CD45 antigen (P08575, d1 639-915, d2 931-1232); LAR, human LAR protein (P10586, d1 1331-1602, d2 1617-1894).
Protein tyrosine phosphatase (PTPase) (Figs 33 and 34)
The PTPase family of integral membrane proteins was discovered when two cytoplasmic repeats of the CD45 antigen 137 were matched with the sequence of a placental cytoplasmic phosphotyrosine phosphatase 138,139. PTPase activity has since been shown for the membrane-proximal cytoplasmic domain of CD45 but not yet for the C-terminal domain 140. Many other sequences with similarities to these have subsequently been identified by cross-hybridization with cDNA probes, and these include cytosolic proteins and membrane phosphatases. The LAR cell surface protein contains a tandem pair of PTPase domains together with an extracellular portion with three IgSF domains and eight Fn3 domains, although this protein is not expressed widely on leucocytes 141. Sequences with many similarities to the PTPase domains have also been identified in Drosophila 142. Some examples of the sequences are shown in Fig. 33.
The second domain in CD45 is unusual in comparison with other PTPases in that it contains an insertion of 19 amino acids with a very high content of acidic and Ser residues. The Ser residues may be phosphorylated by Ser kinases to produce an extremely negatively charged region of sequence. The complete genomic structure for mouse CD45 143 shows that the region illustrated in Fig. 33 is encoded by 6 or 8 exons for each domain. However, the ends of the domains as defined from the sequence similarities do not correspond to the ends of exons with the same phase of intron/exon boundaries 33. The genetic origin of these domains is unclear.
The structure of the human protein tyrosine phosphatase 1B domain has been determined (Fig. 34) 144. This 37 kDa domain is organized into eight α helices and 12 β strands with the active Cys in a loop between the final β strand and the fourth α helix (Fig. 34) 144.
Figure 34 The folding pattern of protein tyrosine phosphatase domains. Ribbon diagram to show the folding pattern of human protein tyrosine phosphatase 1B determined by X-ray crystallography 144. The structure contains a 10-stranded β sheet flanked by regions of α helix. Two α helices at the N-terminus correspond to regions with lower levels of sequence conservation and wrap around the N-terminus of α6.
Figure 35 Scavenger receptor cysteine-rich (SRCR) domains. Residues identical in four or more sequences are boxed. The asterisks mark the positions of the conserved residues used to identify domains for the corresponding entries in Section II. The sequences of the following proteins are from the SWISSPROT database unless otherwise indicated and the database accession number and residue numbers are given in brackets. SREC, sea urchin egg peptide speract receptor precursor (P16264, d1 38-145, d2 148-258, d3 259-367, d4 377-486); CD5, human CD5 antigen (P06127, d1 30-134, d2 156-269, d3 271-369); CFAI, human complement factor I precursor (P05156, 109-216); SCAV, human scavenger receptor type I (P21757, 345-451). Alignments are from 20 residues before the conserved glycine to the C-terminus of the scavenger receptor.
Scavenger receptor cysteine-rich (SRCR) (Fig. 35)
Three domains with sequence similarities were identified in the extracellular region of the CD5 antigen, and later single domains were detected in macrophage scavenger receptors and the complement control protein Factor I, three each in the CD6 antigen and the speract receptor protein present in sea urchins 145, 11 in the bovine WC1 antigen 146 and nine in CD163 (M130), a macrophage cell surface protein 147. This domain type was first reported in the scavenger receptor and thus is named the scavenger receptor cysteine-rich (SRCR) domain 14s; alignments for this superfamily are shown in Fig. 35. However, the role of this domain type in the binding of various ligands by the scavenger receptor is unclear, as some ligands will bind to both scavenger receptors I and II but the latter lacks this type of domain 148. Initially it was argued that the CD5 antigen domains were related to IgSF domains 149 and then to the PapD bacterial protein 28. It is now clear that the CD5 domains are related to the SRCR domains. A ligand for CD6 called ALCAM (CD166) has been identified on thymic epithelial cells. It contains five IgSF domains 150 and interacts via its N-terminal IgSF domain with the membrane-proximal domain of the three SRCR domains of CD6 151. No tertiary structure data are available yet for this domain type.
Signal transduction sequence motifs: ITAM, ITIM and the death domain; motifs in cytoplasmic parts of membrane proteins (Figs 36 and 37)
The cytoplasmic parts of leucocyte membrane proteins tend not to have domains like those found on the extracellular parts. However, a number of different motifs have been identified and clearly play an important part in transmitting the effects of signals from the outside of the cell to the nucleus and vice versa. There is a wide range of proteins recognizing these motifs and in turn being recognized by other proteins. The binding may also be dependent on phosphorylation, and in addition these regions may interact with components of the cytoskeleton. This is a large field and beyond the scope of this review. However, four important motifs will be summarized.
Programmed cell death has been shown to involve signals through cell surface receptors. In the immune system two well-characterized examples involve TNFRI (CD120a) and Fas (CD95) 152,153. Both these proteins contain a motif in the cytoplasmic region, termed the death domain, that is necessary for the self-association of the receptors that generates the signals leading to cell death 153. Related sequences have been identified in other proteins, including some isolated by the yeast two-hybrid system, e.g. RIP and TRADD (reviewed in refs 152 and 153), and are illustrated in Fig. 36. Thus TNFRI interacts with TRADD, which in turn can interact with RIP, which has serine/threonine kinase activity and can associate with Fas 154. Some proteins with no apparent involvement in "death", such as the cytoskeletal protein ankyrin, also contain a "death" domain.
Unlike the cytoplasmic domains of TNFRI and Fas, these death domains show no significant sequence similarity to "reaper", a protein that is expressed in Drosophila cells that are destined to die 155.
One motif that used to be called the signal transduction sequence motif is present in the cytoplasmic regions of several membrane proteins of the antigen receptor
Figure 36 Death domain. Residues identical in four or more sequences are boxed. The sequences of the following proteins are from the SWISSPROT database unless otherwise indicated and the database accession number and residue numbers are given in brackets. Fas, human Fas precursor (P25445, 228-311); TNFR, human TNFR I precursor (P19438, 354-438); Ankyrin, human ankyrin 1 (P16157, 1400-1483); TRADD, human TRADD protein (PIR: A56911, 213-302); RIP, human RIP protein (PIR: 138992, 282-367); NFKB, human NF-κB p105 (P19838, 817-890); NGFR, human low-affinity NGFR precursor (P08138, 344-418).
Figure 37 ITAM (immunoreceptor tyrosine-based activation motif) or signal transduction motifs. Residues identical in five or more sequences are boxed. The sequences of the following proteins are from the SWISSPROT database and the database accession number and residue numbers are given in brackets. Human CD3γ, human CD3γ chain precursor (P09693, 149-181); Human CD3δ, human CD3δ chain precursor (P04234, 138-171); CD3ε, mouse CD3ε chain precursor (P22646, 159-184); CD3ζ, human CD3ζ chain precursor (P20963, 61-94, 99-130, 131-163); CD79a, mouse CD79a (MB1) precursor (P11911, 171-204); CD79b, mouse CD79b (B29) precursor (P15530, 184-217); FcεRβ, rat IgE receptor β subunit (P13386, 207-239); FcεRγ, rat IgE receptor γ subunit precursor (P20411, 54-86); Human CD5, human CD5 antigen precursor (P06127, 442-475).
complexes on B cells, T cells, and the IgE receptor on mast cells 156. This is now called ITAM 157,158 or "immunoreceptor tyrosine-based activation motif" and alignments are shown in Fig. 37. One common feature of these molecules is that they are components of membrane complexes which, when crosslinked, give signals that lead to cell activation. This can result in cell proliferation in the case of the antigen receptors and in degranulation in the case of mast cells. The role of the ITAM in this signalling response has been extensively studied. Receptor clustering results in the rapid activation of Src family protein tyrosine kinases which phosphorylate the ITAM on each tyrosine. This results in the association of the Syk and/or Zap tyrosine kinases via their tandem SH2 domains with the phosphorylated ITAM. This leads to the activation of Syk and/or Zap, which is a critical step for the generation of a signal transduction cascade (reviewed in refs 159 and 160).
A second motif involved in giving an inhibitory signal is called "immunoreceptor tyrosine-based inhibition motif" or ITIM. In B cells the ITIM of FcγRIIB is thought to be the target for the SH2-containing phosphatase SHP, which is critical in determining the threshold at which B cells respond to antigens 157. Similar motifs are found in CD22, IL-2Rβ (CD122) and IL-3Rβ (CDw131). However, the motif is short and not clearly defined. Thus an alignment is not given, but the motif is reviewed in refs 157 and 161.
Both SH2 and SH3 domains are common in cytosolic proteins and recognize phosphotyrosine residues and proline-rich motifs respectively. These motifs are common in other cytosolic proteins, particularly those involved in signal transduction. A proline-rich motif consensus sequence for SH3 domain binding is XpφPPXP, where X is any amino acid, p is usually a Pro and φ is a hydrophobic
Figure 38 Somatomedin B domains. Residues identical in three or more sequences are boxed. The asterisks mark the positions of the conserved residues used to identify domains for the corresponding entries in Section II. The sequences of the following proteins are from the SWISSPROT database and the database accession number and residue numbers are given in brackets. PC1, mouse plasma cell antigen PC1 (P06802; d1 54-93; d2 95-137); PP11, human placental protein precursor (P21128, 47-88); Vitronectin, human vitronectin precursor (P04004; 22-63).
residue 162. A putative proline-rich motif has been described in the cytoplasmic domain of CD2 163.
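The seven-position proline-rich consensus for SH3 binding quoted in the text (any residue, a position that is usually Pro, a hydrophobic residue, Pro, Pro, any residue, Pro) lends itself to a simple pattern search. The sketch below is illustrative only and rests on assumptions that are not in the text: the "usually Pro" position is treated as unrestricted, the set of residues counted as hydrophobic is a choice made for the example, and the peptide is made up rather than taken from CD2.

```python
import re

# Assumption: residues counted as "hydrophobic" for this illustration.
HYDROPHOBIC = "AVLIMFWY"

# Positions: X, p (left unrestricted here), hydrophobic, P, P, X, P.
# A lookahead is used so that overlapping candidate sites are all reported.
SH3_LIKE = re.compile(rf"(?=(..[{HYDROPHOBIC}]PP.P))")


def find_sh3_like(seq):
    """Return (0-based start, matched 7-mer) for every consensus-like site."""
    return [(m.start(1), m.group(1)) for m in SH3_LIKE.finditer(seq.upper())]


# Made-up peptide, used only to exercise the pattern.
print(find_sh3_like("EKRPLPPLPSTQ"))  # [(2, 'RPLPPLP')]
```

A match of this kind only flags a candidate site; it says nothing about whether an SH3 domain actually binds there.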
Somatomedin B domains (Fig. 38)
Somatomedin B is a serum peptide derived from vitronectin (also called serum-spreading factor) by proteolysis. The plasma cell surface antigen PC1 was noted to contain two somatomedin B repeats 99. This glycoprotein has nucleotide pyrophosphatase/alkaline phosphodiesterase activity 164, but this is associated with a different region of the molecule from that containing the somatomedin B repeat. The domain has not been found on other cell surface molecules although it is present in placental protein 11 (PP11) 165.
Transmembrane 4 pass (TM4) superfamily and FcεRIβ/CD20 superfamily (Figs 39 and 40)
The TM4 superfamily (also called the tetraspan superfamily) is a large group of proteins that are thought to traverse the lipid bilayer four times with both the N- and C-termini on the cytoplasmic face of the membrane. This superfamily includes several leucocyte antigens such as CD9, CD37, CD53, CD63, CD81 and CD82 166. Alignments for the TM4SF are shown in Fig. 39. The genomic organization of the leucocyte TM4SF proteins is similar, pointing to a single primordial gene. The non-leucocyte antigens form a separate group that seems to have branched off in
Figure 39 Transmembrane 4 pass (TM4) superfamily. Residues identical in four or more sequences are boxed. The putative positions of the four transmembrane sequences are indicated with a solid bar above the sequences. The dashed line indicates the extracellular loops. The sequences of the following proteins are from the SWISSPROT database and the database accession number is given in brackets (the complete sequences are shown). Sm23, Schistosoma mansoni protein Sm23 (P19331); CD63, human melanoma-associated antigen ME491 (P08962); CD9, human CD9 antigen (P21926); CD81, human CD81 (TAPA-1) antigen (P18582); CO029, human tumour-associated antigen CO-029 (P19075); CD82, human CD82 (R2) antigen (P27701); CD37, human CD37 antigen (P11049); CD53, human CD53 antigen (P19397).
Figure 40 FcεRIβ/CD20 superfamily. Residues identical in two or more sequences are boxed. The putative positions of the four transmembrane sequences are indicated with a bar above the sequences. The sequences of the following proteins are from the SWISSPROT database or the reference given and the database accession number is given in brackets (the complete sequences are shown). Human CD20 (P22836); mouse FcεRI β chain (P20490) and the HTm4 (translated from EMBL accession number L35848).
evolution from the leucocyte members 167. The majority of the differences in sequence between TM4SF molecules reside in the extracellular loop between TM sequences 3 and 4, where there are considerable differences in sequence length. This loop of sequence is known to be extracellular because it includes the N-linked glycosylation sites and because the MRC OX44 epitope, which can be labelled at the cell surface, maps to an Ile/Thr interchange in this region 168. One feature of this superfamily is the high degree of sequence identity between species homologues, e.g. mouse CD81 is 92% identical to human CD81. This superfamily appears to have a role outside the immune system because TM4SF members have been identified in schistosomes 169, the nematode Caenorhabditis elegans 170 and Drosophila 171. There is no known function for any member of this superfamily although recent data have suggested roles in tumour cell metastasis 172, T cell development 173 and neural synapse formation 171. TM4SF proteins probably mediate these functions as components of multimolecular complexes since they are known to associate with molecules such as the B and T cell antigen receptor complex, integrins and other TM4SF members (reviewed in ref. 166).
There is one example of a protein predicted to pass through the membrane five times, the CD47 or integrin-associated protein. A related protein has been found in vaccinia but so far a superfamily of TM5 molecules has not been defined.
Figure 41 Tumour necrosis factor (TNF) superfamily repeats. Residues identical in five or more sequences are boxed. The positions of the β sheets determined for TNFα 181,182 are indicated by bars above the sequence. The asterisks mark the positions of the conserved residues used to identify domains for the corresponding entries in Section II. The sequences of the following proteins are from the SWISSPROT database and the accession number and residue numbers are given in brackets. TNFα, human tumour necrosis factor α (P01375, 87-233); LTα, human lymphotoxin α (P01374, 62-205); LTβ, lymphotoxin β (Q06643, 87-243); CD40L, human CD40 ligand (P29965, 121-261); CD30L, human CD30 ligand (P32971, 97-225); FasL, human Fas ligand (P48023, 144-281); OX40L, human OX40 ligand (P23510, 58-172); CD27L, human CD27 ligand or CD70 (P32970, 55-191); 4-1BBL, human 4-1BB ligand (P41273, 90-240).
There are other leucocyte proteins that are predicted to traverse the plasma membrane four times, including CD20 and the FcεRI β chain, but these do not show sequence similarity to the TM4SF given above. However, CD20 and the FcεRI β chain show clear sequence similarities to each other 174 in three of the four transmembrane regions of these molecules and a lower similarity to another gene called HTm4 175, as shown in Fig. 40. All three genes are closely linked in humans on chromosome 11q12-13.1, and the CD20 and FcεRI β chain genes are very closely linked on mouse chromosome 19 174. Thus these three sequences form a small distinct superfamily termed the FcεRIβ/CD20 superfamily. There are data to suggest that CD20 is a Ca2+ channel 176. It seems likely that CD20 and HTm4 are associated with other membrane proteins since the FcεRIβ is one chain of the complex which forms the high-affinity receptor for IgE on mast cells and functions as an amplifier of signals transduced by the FcεRI 177.
Tumour necrosis factor superfamily (TNFSF) (Figs 41 and 42)
This domain type is found in the extracellular domain of type II membrane proteins. It is not associated with any other extracellular domain type. The extracellular regions of
Figure 42 The folding pattern of tumour necrosis factor (TNF) and TNFR superfamily. The structure shown is of one receptor (TNFR) molecule binding one monomer of TNF and is based on the structure of the complex of TNFRI and TNFβ 125. The TNF is itself trimeric but the receptor is thought to be monomeric and only made trimeric by binding of ligand. The TNF is on the left of the figure and the receptor on the right. The N-terminus of the receptor is at the top of the figure and the membrane attachment would be at the base of the figure. TNF consists of a group of β sheets arranged into a "jelly-roll" whilst the receptor contains four TNFR repeats characterized by their lack of α helix or β sheet and held together by disulfide bridges.
Figure 43 Tumour necrosis factor receptor (TNFR) superfamily repeats. Residues identical in five or more sequences are boxed. The asterisks mark the positions of the conserved residues used to identify domains for the corresponding entries in Section II. The sequences of the following proteins are from the SWISSPROT database unless otherwise indicated and the database accession number and residue numbers are given in brackets. The sequences are contiguous over the four repeats except for OX40 which contains a short sequence in place of the third repeat. OX40, rat OX40 (CD134) antigen precursor (P15725, 25-102, 124-164); TNFRI, human tumour necrosis factor receptor precursor I (CD120a) (P19438, 43-196); TNFRII, human tumour necrosis factor receptor precursor II (CD120b) (P20333, 39-201); NGFR, rat nerve growth factor receptor precursor (P07174, 32-190); CD40, human CD40 antigen precursor (PIR: S04460, 25-187).
Figure 44 Tyrosine kinase domains. Residues identical in five or more sequences are boxed. The bars above the sequences represent the subdomains defined in this superfamily and the numbering is as in ref. 191. These correspond to structural features as determined from the structure of the mouse cAMP-dependent protein kinase α subunit illustrated in Fig. 45 and discussed in ref. 194. The sequences of the following proteins are from the SWISSPROT database and the database accession number and residue numbers are given in brackets. SRC, human src proto-oncogene tyrosine kinase (P12931, 268-526); LCK, human T cell-specific tyrosine kinase (P06239, 243-501); M-CSFR, human macrophage colony-stimulating factor receptor precursor (P07333, 580-917); KIT, human kit proto-oncogene precursor (CD117) (P10721, 587-931); EGFR, human EGF receptor precursor (P00533, 710-975); PKA-C, mouse cAMP-dependent protein kinase (P05132, 41-297).
members are often released by proteolysis to give soluble factors with biological activity such as TNF, lymphotoxin and FasL (reviewed in refs 178-180). Several members have been defined recently and are often called ligands of the respective receptor, e.g. CD40L, CD27L (also called CD70), 4-1BBL, OX40L and FasL (CD95L). The receptors themselves form a superfamily termed the TNFRSF (see below).
Structural studies by X-ray crystallography on TNFα 181,182, TNFβ 125 and CD40 ligand 183 show they form homotrimers with a characteristic "jelly roll" β sandwich. The stoichiometry of the interaction between TNFRI and the TNFβ trimer is 3:1 125. This ratio is probably true for most of the other members of the superfamily although the 4-1BB ligand forms a disulfide-linked homodimer 184,185 and, unlike most of the receptors, CD27 seems to form a homodimer and may interact with its ligand in a different manner. The structure of TNFRI complexed with TNFβ is illustrated in Fig. 42 and discussed below.
Tumour necrosis factor receptor superfamily (TNFRSF) (Figs 42 and 43)
This superfamily was previously called the nerve growth factor receptor (NGFR) superfamily as NGFR was its first member to be defined. Four cysteine-rich repeats were recognized in the extracellular part of the low-affinity NGFR and subsequently related sequence repeats have been identified in a number of leucocyte cell surface antigens including CD40, CD134 (OX40), CD27, TNF receptors (CD120), CD30 and 4-1BB 178,179. Figure 43 shows an alignment of some of the repeats. All these examples show more similarity amongst themselves, for instance in gene structure and nature of ligands, than with NGFR. Thus this group is now more generally known as the TNFRSF after the well-characterized TNFR. One unusual feature is that most of the TNFRSF molecules contain 3 or 4 repeats. No TNFRSF molecule with a single repeat has been found and the repeat has not been associated with any other domain types. The gene structures for TNFRSF members show that the boundaries of each repeat do not correspond to exon boundaries. The gene for NGFR shows a different pattern of intron/exon boundaries from that of the members on leucocytes, which form a separate group 186,187. It seems possible that a primordial gene with four repeats may have evolved by unequal crossing-over during recombination. This gene probably gave rise to all known members of the TNFR
superfamily by duplication and divergence, with the NGFR forming a separate branch from that of the leucocyte members 187. The ligands for the leucocyte members form a group of type II membrane proteins with sequence similarities to TNF (see above).
The structure of the TNFRI complexed with TNFβ has been determined by X-ray crystallography 125. Three receptor molecules form a sheath around the trimeric TNFβ. Figure 42 shows the structure of one receptor and one monomer unit of the ligand 125. The four repeats in the receptor are arranged in a linear array. Each receptor molecule contacts two of the ligand molecules through a combination of hydrophobic and hydrophilic interactions. There is an absence of β sheet or α helix structure, with the three disulfide bridges in each repeat forming a ladder-like pattern. The interaction is intriguing as the two related receptors TNFRI and II share only about 24% identity in amino acid sequence in the Cys-rich region yet both can bind the same ligands, TNFα and TNFβ, which themselves share only 33% identity.
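Percentage identities of the kind quoted above are computed from aligned sequences, and the value depends on the convention chosen for gaps. The sketch below shows one common convention, counting only columns in which neither sequence has a gap; that choice, the function name and the two toy fragments are assumptions made for illustration and are not the sequences or the method behind the figures in the text.

```python
def percent_identity(a, b, gap="-"):
    """Percent identity of two aligned sequences, counted over the
    columns in which neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(a, b):
        if x == gap or y == gap:
            continue  # skip gapped columns under this convention
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Made-up aligned fragments: 5 identical residues over 7 ungapped columns.
print(round(percent_identity("CPAG-TYC", "CPQGKSYC"), 1))  # 71.4
```

Dividing by the full alignment length, or by the length of the shorter sequence, would give lower values for the same pair, so identities from different sources are comparable only when the convention is known.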
Tyrosine kinase domains
(Figs 44 and 45)
Two groups of tyrosine kinases can be distinguished: receptor tyrosine kinases, which are transmembrane proteins with the tyrosine kinase domain in the cytoplasmic part, and non-receptor tyrosine kinases, which are located in the cytoplasm. The non-receptor group includes members of the Src family, all of which are anchored to the inner leaflet of the plasma membrane by a myristate moiety. On activation they phosphorylate Tyr residues on their own cytoplasmic domains or on other proteins in the cytoplasm, and this is believed to be one of the early events in signal transduction pathways after ligand recognition. In the Src kinases the kinase domain is found together with SH2 and SH3 domains, which can mediate interactions in the cascade (see details on ITAM earlier under Signal transduction sequence motifs). In leucocytes the best studied example is Lck, which associates with the cytoplasmic domains of CD4 and CD8 and regulates signal transduction by these molecules. Other examples are Fyn, which associates with the T cell receptor complex 188, and Lyn, Fyn and Blk, which couple to the membrane Ig complex of B cells (reviewed in refs 159 and 189).
Receptor tyrosine kinases are expressed on a wide variety of cells and examples include the PDGF receptor (CD140), the EGF receptor and c-kit (CD117). When these receptors bind their natural ligands they oligomerize, and the cytoplasmic tyrosine kinase domains become activated and autophosphorylated. This leads to the phosphorylation and activation of various intracellular substrates including phospholipase C-γ, phosphatidylinositol 3-kinase and the c-raf serine kinase. These effector molecules concomitantly associate with the activated receptor kinases.
Tyrosine kinase domains consist of about 260-360 amino acids. The difference in size is due to the insertion of a "kinase insert domain" of about 70-100 amino acids in certain receptor kinases, including the PDGFR (CD140), M-CSFR (CD115) and c-kit (CD117) kinases. These insert regions appear to regulate the interaction of the kinase with certain cellular substrates/effector molecules (reviewed in refs 159 and 190). The tyrosine kinase domain of a given molecule is particularly well conserved across species, and the identities between molecules within the superfamily are about 40%, as illustrated in Fig. 44. This is much higher than for many of the superfamilies with domains that are found at the cell surface.
Figure 45
The folding pattern of a tyrosine kinase domain. The structure shown is that of the mouse cAMP-dependent protein kinase α subunit 194. The correlation between the secondary structure and the subdomains indicated in Fig. 44 is discussed in ref. 194.
The amino acid sequences of tyrosine kinase domains are not conserved uniformly, but consist of 11 highly conserved subdomains (I-XI) separated by regions of lower conservation 191. Subdomain I contains the Gly-X-Gly-X-X-Gly consensus, which forms part of the binding site for ATP. Subdomain II contains an invariant lysine, which appears to be directly involved in the phosphotransfer reaction. Subdomain VIII contains a Pro-Ile/Val-Lys/Arg-Trp-Thr/Met-Ala-Pro-Glu consensus, which is characteristic of the tyrosine kinases. In the serine/threonine kinases the corresponding consensus is Gly-Thr/Ser-X-X-Tyr/Phe-X-Ala-Pro-Glu. These subdomains have now been correlated with structural elements from the X-ray crystallography structure of the cAMP-dependent protein kinase α subunit shown in Fig. 45.
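The three consensus sequences quoted above can be written as simple patterns and searched for in a protein sequence given in one-letter code. The following is a minimal sketch using Python regular expressions, in which "." stands for X (any residue) and square brackets list the allowed alternatives. The dictionary keys, the function name and the toy sequence are hypothetical, and a pattern match alone is of course not evidence that a sequence is a kinase domain.

```python
import re

# Consensus motifs from the text, translated to one-letter code.
MOTIFS = {
    "subdomain I (ATP binding)": re.compile(r"G.G..G"),
    "subdomain VIII (tyrosine kinase)": re.compile(r"P[IV][KR]W[TM]APE"),
    "subdomain VIII (serine/threonine kinase)": re.compile(r"G[TS]..[YF].APE"),
}


def find_motifs(sequence):
    """Return, for each motif, a list of (1-based start position, matched text).

    Note: finditer reports non-overlapping matches only.
    """
    seq = sequence.upper()
    hits = {}
    for name, pattern in MOTIFS.items():
        hits[name] = [(m.start() + 1, m.group()) for m in pattern.finditer(seq)]
    return hits


# Hypothetical toy sequence, constructed only to contain two of the motifs.
toy = "MKLGSGQFGEVAAAPIKWTAPEAAA"
for name, found in find_motifs(toy).items():
    print(name, found)
```

With this toy sequence the subdomain I pattern matches GSGQFG at position 4 and the tyrosine kinase subdomain VIII pattern matches PIKWTAPE at position 15, while the serine/threonine pattern finds no match.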
References
1 Baron, M. et al. (1991) Protein modules. Trends Biochem. Sci. 16, 13-17.
2 Shapiro, L. et al. (1995) Structural basis of cell-cell adhesion by cadherins. Nature 374, 327-336.
3 Nagar, B. et al. (1996) Structural basis of calcium-induced E-cadherin rigidification and dimerization. Nature 380, 360-364.
4 Kolodkin, A.L. (1996) Semaphorins: Mediators of repulsive growth cone guidance. Trends Cell Biol. 6, 15-22.
5 Mott, H. and Campbell, I. (1995) Four helix bundle growth factors and their receptors: protein-protein interactions. Curr. Opin. Struct. Biol. 5, 114-121.
6 Doolittle, R. (1995) The multiplicity of domains in proteins. Annu. Rev. Biochem. 64, 287-314.
7 Fletcher, C.M. et al. (1994) Structure of a soluble, glycosylated form of the human complement regulatory protein CD59. Structure 2, 185-199.
8 Kieffer, B. et al. (1994) 3-dimensional solution structure of the extracellular region of the complement regulatory protein CD59, a new cell-surface protein domain related to snake-venom neurotoxins. Biochemistry 33, 4471-4482.
9 Patthy, L. (1987) Intron-dependent evolution: preferred types of exons and introns. FEBS Lett. 214, 1-7.
10 Bork, P. et al. (1996) Structure and distribution of modules in extracellular proteins. Q. Rev. Biophys. 29, 119-167.
11 Kobe, B. and Deisenhofer, J. (1993) Crystal structure of porcine ribonuclease inhibitor, a protein with leucine-rich repeats. Nature 366, 751-756.
12 Kobe, B. and Deisenhofer, J. (1994) The leucine-rich repeat: a versatile binding motif. Trends Biochem. Sci. 19, 415-421.
13 Williams, A.F. and Barclay, A.N. (1988) The immunoglobulin superfamily domains for cell surface recognition. Annu. Rev. Immunol. 6, 381-405.
14 Brutlag, D. and Sternberg, M. (1996) Sequences and topology, challenges for algorithms and experts. Curr. Opin. Struct. Biol. 6, 343-345.
15 Bork, P. and Koonin, E. (1996) Protein sequence motifs. Curr. Opin. Struct. Biol. 6, 366-376.
16 Henikoff, S. (1996) Scores for sequence searches and alignments. Curr. Opin. Struct. Biol. 6, 353-360.
17 Pearson, W.R. (1995) Comparison of methods for searching protein-sequence databases. Protein Sci. 4, 1145-1160.
18 Pearson, W.R. and Lipman, D.J. (1988) Improved tools for biological sequence comparison. Proc. Natl Acad. Sci. USA 85, 2444-2448.
19 Altschul, S.F. et al. (1990) Basic local alignment search tool. J. Mol. Biol. 215, 403-410.
20 Patthy, L. (1990) Homology of a domain of the growth hormone/prolactin receptor family with type III modules of fibronectin [letter]. Cell 61, 13-14.
21 Bairoch, A. (1993) The PROSITE dictionary of sites and patterns in proteins, its current status. Nucleic Acids Res. 21, 3097-3103.
22 WWW. http://expasy.hcuge.ch/
23 Dayhoff, M.O. et al. (1983) Establishing homologies in protein sequences. Meth. Enzymol. 91, 524-545.
24 George, D.G. et al. (1990) Mutation data matrix and its uses. Meth. Enzymol. 183, 333-351.
25 Henikoff, S. and Henikoff, J.G. (1992) Amino acid substitution matrices from protein blocks. Proc. Natl Acad. Sci. USA 89, 10915-10919.
26 Brady, R.L. and Barclay, A.N. (1995) The structure of CD4. In Current Topics in Microbiology and Immunology, vol. 205: the CD4 Molecule (Littman, D.R., ed.). Springer-Verlag, Berlin, pp. 1-18.
27 Amzel, L.M. and Poljak, R.J. (1979) Three-dimensional structure of immunoglobulins. Annu. Rev. Biochem. 48, 961-997.
28 Holmgren, A. and Branden, C.I. (1989) Crystal structure of chaperone protein PapD reveals an immunoglobulin fold. Nature 342, 248-251.
29 de Vos, A.M. et al. (1992) Human growth hormone and extracellular domain of its receptor: crystal structure of the complex. Science 255, 306-312.
30 Baron, M. et al. (1992) 1H NMR assignment and secondary structure of the cell adhesion type III module of fibronectin. Biochemistry 31, 2068-2073.
31 Overduin, M. et al. (1995) Solution structure of the epithelial cadherin domain responsible for selective cell adhesion. Science 386-389.
32 Williams, A.F. (1987) A year in the life of the immunoglobulin superfamily. Immunol. Today 8, 298-303.
33 Wong, E. et al. (1993) Leukocyte common antigen-related phosphatase (LRP) gene structure: Conservation of organization of transmembrane protein tyrosine phosphatases. Genomics 17, 33-38.
34 Patthy, L. (1987) Detecting homology of distantly related proteins with consensus sequences. J. Mol. Biol. 198, 567-577.
35 Sharp, P.A. (1981) Speculations on RNA splicing. Cell 23, 643-646.
36 Bowen, M.A. et al. (1997) Structure and chromosomal location of the human CD6 gene. J. Immunol. 158, 1149-1156.
37 Doherty, P. et al. (1992) The VASE exon downregulates the neurite growth-promoting activity of NCAM 140. Nature 356, 791-793.
38 Murray, A.J. et al. (1995) One sequence, two folds: a metastable structure of CD2. Proc. Natl Acad. Sci. USA 92, 7337-7341.
39 Lobsanov, Y.D. et al. (1993) X-ray crystal structure of the human dimeric S-Lac lectin, L-14-II, in complex with lactose at 2.9-Å resolution. J. Biol. Chem. 268, 27034-27038.
40 Barton, G.J. and Sternberg, M.J. (1987) A strategy for the rapid multiple alignment of protein sequences. Confidence levels from tertiary structure comparisons. J. Mol. Biol. 198, 327-337.
41 Devereux, J. et al. (1984) A comprehensive set of sequence analysis programs for the VAX. Nucleic Acids Res. 12, 387-395.
42 Ryu, S.E. et al. (1990) Crystal structure of an HIV-binding recombinant fragment of human CD4. Nature 348, 419-426.
43 Wang, J. et al. (1990) Atomic structure of a fragment of human CD4 containing two immunoglobulin-like domains. Nature 348, 411-418.
44 Brady, R.L. et al. (1993) Crystal structure of domains 3 and 4 of rat CD4: relationship to the NH2-terminal domains. Science 260, 979-983.
45 Politou, A.S. et al. (1994) Immunoglobulin-type domains of titin are stabilized by amino-terminal extension. FEBS Lett. 352, 27-31.
46 Reid, K.B. and Day, A.J. (1989) Structure-function relationships of the complement components. Immunol. Today 10, 177-180.
47 Barlow, P.N. et al. (1993) Solution structure of a pair of complement modules by nuclear magnetic resonance. J. Mol. Biol. 237, 268-284.
48 Barlow, P.N. et al. (1991) Secondary structure of a complement control protein module by two-dimensional 1H NMR. Biochemistry 30, 997-1004.
49 Idzerda, R.L. et al. (1990) Human interleukin 4 receptor confers biological responsiveness and defines a novel receptor superfamily. J. Exp. Med. 171, 861-873.
50 Cosman, D. et al. (1990) A new cytokine receptor superfamily. Trends Biochem. Sci. 15, 265-270.
51 Goodwin, R.G. et al. (1990) Cloning of the human and murine interleukin-7 receptors: demonstration of a soluble form and homology to a new receptor superfamily. Cell 60, 941-951.
52 Somers, W. et al. (1994) The X-ray structure of growth hormone-prolactin receptor complex. Nature 372, 478-481.
53 Jones, E.Y. et al. (1992) Crystal structure of a soluble form of the cell adhesion molecule CD2 at 2.8 Å. Nature 360, 232-239.
54 Bodian, D.L. et al. (1994) Crystal structure of the extracellular region of the human cell adhesion molecule CD2 at 2.5 Å resolution. Structure 2, 755-766.
55 Holden, H.M. et al. (1992) X-ray structure determination of telokin, the C-terminal domain of myosin light chain kinase, at 2.8 Å resolution. J. Mol. Biol. 227, 840-851.
56 Bazan, J.F. (1990) Structural design and molecular evolution of a cytokine receptor superfamily. Proc. Natl Acad. Sci. USA 87, 6934-6938.
57 Cooke, R.M. et al. (1987) The solution structure of human epidermal growth factor. Nature 327, 339-341.
58 Tappin, M.J. et al. (1989) A high-resolution 1H-NMR study of human transforming growth factor alpha. Structure and pH-dependent conformational interconversion. Eur. J. Biochem. 179, 629-637.
59 Handford, P.A. et al. (1990) The first EGF-like domain from human factor IX contains a high-affinity calcium binding site. EMBO J. 9, 475-480.
60 Rao, Z. et al. (1995) The structure of a Ca2+-binding epidermal growth factor-like domain: Its role in protein-protein interactions. Cell 82, 131-141.
61 Graves, B.J. et al. (1994) Insight into E-selectin/ligand interaction from the crystal structure and mutagenesis of the lec/EGF domains. Nature 367, 532-538.
62 Handford, P.A. et al. (1991) Key residues involved in calcium-binding motifs in EGF-like domains. Nature 351, 164-167.
63 Downing, A.K. et al. (1996) Solution structure of a pair of calcium-binding epidermal growth factor-like domains: Implications for the Marfan syndrome and other genetic disorders. Cell 85, 597-605.
64 Constantine, K.L. et al. (1991) Sequence-specific 1H NMR assignments and structural characterization of bovine seminal fluid protein PDC-109 domain b. Biochemistry 30, 1663-1672.
65 Brummendorf, T. and Rathjen, F. (1993) Axonal glycoproteins with immunoglobulin and fibronectin type III-related domains in vertebrates: structural features, binding activities, and signal transduction. J. Neurochem. 61, 1207-1219.
66 Benian, G.M. et al. (1989) Sequence of an unusually large protein implicated in regulation of myosin activity in C. elegans. Nature 342, 45-50.
67 Labeit, S. et al. (1990) A regular pattern of two types of 100-residue motif in the sequence of titin. Nature 345, 273-276.
68 Labeit, S. and Kolmerer, B. (1995) Titins: Giant proteins in charge of muscle ultrastructure and elasticity. Science 270, 293-296.
69 Suzuki, S. and Naitoh, Y. (1990) Amino acid sequence of a novel integrin beta 4 subunit and primary expression of the mRNA in epithelial cells. EMBO J. 9, 757-763.
70 Leahy, D.J. et al. (1996) 2.0 Å Crystal structure of a four-domain segment of human fibronectin encompassing the RGD loop and synergy region. Cell 84, 155-164.
71 Huber, A.H. et al. (1994) Crystal structure of tandem type III fibronectin domains from Drosophila neuroglian at 2.0 Å. Neuron 12, 717-731.
72 Drickamer, K. (1988) Two distinct classes of carbohydrate-recognition domains in animal lectins. J. Biol. Chem. 263, 9557-9560.
73 Barondes, S. et al. (1994) Galectins: A family of animal β-galactoside-binding lectins. Cell 76, 597-598.
74 Frigeri, L.G. et al. (1990) Expression of biologically active recombinant rat IgE-binding protein in Escherichia coli. J. Biol. Chem. 265, 20763-20769.
75 Watson, S. and Arkinstall, S. (1994) The G-Protein Linked Receptor FactsBook. Academic Press, London, pp. 1-427.
76 WWW. http://www.sander.embl-heidelberg.de/7tm/
77 McKnight, A. and Gordon, S. (1996) EGF-TM7: a novel subfamily of seven-transmembrane-region leukocyte cell surface molecules. Immunol. Today 17, 283-287.
78 Bjorkman, P.J. et al. (1987) Structure of the human class I histocompatibility antigen, HLA-A2. Nature 329, 506-512.
79 Brown, J.H. et al. (1993) Three-dimensional structure of the human class II histocompatibility antigen HLA-DR1. Nature 364, 33-39.
80 Madden, D.R. (1995) The three-dimensional structure of peptide-MHC complexes. Annu. Rev. Immunol. 13, 587-622.
81 Jardetzky, T.S. et al. (1994) Three-dimensional structure of a human class II histocompatibility molecule complexed with superantigen. Nature 368, 711-718.
82 Leahy, D.J. et al. (1992) Crystal structure of a soluble form of the human T cell coreceptor CD8 at 2.6 Å resolution. Cell 68, 1145-1162.
83 Driscoll, P.C. et al. (1991) Structure of domain 1 of rat T lymphocyte CD2 antigen. Nature 353, 762-765.
84 Jones, E.Y. et al. (1995) Crystal structure of an integrin-binding fragment of vascular cell adhesion molecule-1 at 1.8 Å resolution. Nature 373, 539-544.
85 Bentley, G.A. et al. (1995) Crystal structure of the beta chain of a T cell antigen receptor. Science 267, 1984-1987.
86 Fields, B.A. et al. (1995) Crystal structure of the V(alpha) domain of a T cell antigen receptor. Science 270, 1821-1824.
87 Garboczi, D. et al. (1996) Structure of the complex between T-cell receptor, viral peptide and HLA-A2. Nature 384, 134-141.
88 Garcia, K. et al. (1996) An αβ T cell receptor structure at 2.5 Å and its orientation in the TCR-MHC complex. Science 274, 209-219.
89 Harpaz, Y. and Chothia, C. (1994) Many of the immunoglobulin superfamily domains in cell-adhesion molecules and surface-receptors belong to a new structural set which is close to that containing variable domains. J. Mol. Biol. 238, 528-539.
90 Thomsen, N. et al. (1996) The three-dimensional structure of the first domain of neural cell adhesion molecule. Nature Struct. Biol. 3, 581-585.
91 Erle, D.J. et al. (1991) Complete amino acid sequence of an integrin beta subunit (beta 7) identified in leukocytes. J. Biol. Chem. 266, 11009-11016.
92 Takada, Y. and Hemler, M.E. (1989) The primary structure of the VLA-2/collagen receptor alpha 2 subunit (platelet GPIa): homology to other integrins and the presence of a possible collagen-binding domain. J. Cell Biol. 109, 397-407.
93 Hemler, M.E. (1990) VLA proteins in the integrin family: structures, functions, and their role on leukocytes. Annu. Rev. Immunol. 8, 365-400.
94 Stewart, M. et al. (1995) Leukocyte integrins. Current Biol. 7, 690-696.
95 Pigott, R. and Power, C. (1993) The Adhesion Molecule FactsBook. Academic Press, London.
96 Lee, J.O. et al. (1995) Crystal structure of the A domain from the alpha subunit of integrin CR3 (CD11b/CD18). Cell 80, 631-638.
97 Qu, A. and Leahy, D.J. (1995) Crystal structure of the I-domain from the CD11a/CD18 (LFA-1, alpha(L)beta2) integrin. Proc. Natl Acad. Sci. USA 92, 10277-10281.
98 Drickamer, K. (1993) Ca2+-dependent carbohydrate-recognition domains in animal proteins. Curr. Opin. Struct. Biol. 3, 393-400.
99 Patthy, L. (1988) Detecting distant homologies of mosaic proteins. Analysis of the sequences of thrombomodulin, thrombospondin complement components C9, C8 alpha and C8 beta, vitronectin and plasma cell membrane glycoprotein PC-1. J. Mol. Biol. 202, 689-696.
100 Lasky, L.A. et al. (1989) Cloning of a lymphocyte homing receptor reveals a lectin domain. Cell 56, 1045-1055.
101 Hoyle, G.W. and Hill, R.L. (1991) Structure of the gene for a carbohydrate-binding receptor unique to rat Kupffer cells. J. Biol. Chem. 266, 1850-1857.
102 Doege, K. et al. (1987) Complete primary structure of the rat cartilage proteoglycan core protein deduced from cDNA clones. J. Biol. Chem. 262, 17757-17767.
103 Collins, T. et al. (1991) Structure and chromosomal location of the gene for endothelial-leukocyte adhesion molecule 1. J. Biol. Chem. 266, 2466-2473.
104 Johnston, G.I. et al. (1990) Structure of the human gene encoding granule membrane protein-140, a member of the selectin family of adhesion receptors for leukocytes. J. Biol. Chem. 265, 21381-21385.
105 Ezekowitz, R.A. et al. (1990) Molecular characterization of the human macrophage mannose receptor: demonstration of multiple carbohydrate recognition-like domains and phagocytosis of yeasts in Cos-1 cells. J. Exp. Med. 172, 1785-1794.
106 Kim, S. et al. (1992) Organisation of the gene encoding the human macrophage mannose receptor (MRC1). Genomics 14, 721-727.
107 Harris, N. et al. (1994) The exon-intron structure and chromosomal localization of the mouse macrophage mannose receptor gene Mrc1: identification of a ricin like domain at the N-terminus of the receptor. Biochem. Biophys. Res. Commun. 198, 682-692.
108 Weis, W.I. et al. (1992) Structure of the calcium-dependent lectin domain from a rat mannose-binding protein determined by MAD phasing. Science 254, 1608-1615.
109 Sheriff, S. et al. (1994) Human mannose-binding protein carbohydrate recognition domain trimerizes through a triple α-helical coiled-coil. Nature Struct. Biol. 1, 789-794.
110 Iobst, S.T. and Drickamer, K. (1994) Binding of sugar ligands to Ca2+-dependent animal lectins. II. Generation of high-affinity galactose binding by site-directed mutagenesis. J. Biol. Chem. 269, 15512-15519.
111 Blanck, O. et al. (1996) Introduction of selectin-like binding specificity into a homologous mannose-binding protein. J. Biol. Chem. 271, 7289-7292.
112 Kogan, T.P. et al. (1995) A single amino acid residue can determine the ligand specificity of E-selectin. J. Biol. Chem. 270, 14047-14055.
113 Hickey, M.J. et al. (1989) Human platelet glycoprotein IX: an adhesive prototype of leucine-rich glycoproteins with flank-center-flank structures. Proc. Natl Acad. Sci. USA 86, 6773-6777.
114 Schneider, R. and Schweiger, M. (1991) A novel modular mosaic of cell adhesion motifs in the extracellular domains of the neurogenic trk and trkB tyrosine kinase receptors. Oncogene 6, 1807-1811.
115 Wenger, R.H. et al. (1988) Structure of the human blood platelet membrane glycoprotein Ib alpha gene. Biochem. Biophys. Res. Commun. 156, 389-395.
116 Perin, J.P. et al. (1987) Link protein interactions with hyaluronate and proteoglycans. Characterization of two distinct domains in bovine cartilage link proteins. J. Biol. Chem. 262, 13269-13272.
117 Aruffo, A. et al. (1990) CD44 is the principal cell surface receptor for hyaluronate. Cell 61, 1303-1313.
118 Miyake, K. et al. (1990) Hyaluronate can function as a cell adhesion molecule and CD44 participates in hyaluronate recognition. J. Exp. Med. 172, 69-75.
119 Kohda, D. et al. (1996) Solution structure of the link module: a hyaluronan-binding domain involved in extracellular matrix stability and cell migration. Cell 86, 767-775.
120 Stein, P. et al. (1994) The crystal structure of pertussis toxin. Structure 2, 45-57.
121 Yamamoto, T. et al. (1984) The human LDL receptor: a cysteine-rich protein with multiple Alu sequences in its mRNA. Cell 39, 27-38.
122 Südhof, T.C. et al. (1985) The LDL receptor gene: a mosaic of exons shared with different proteins. Science 228, 815-822.
123 Esser, V. et al. (1988) Mutational analysis of the ligand binding domain of the low density lipoprotein receptor. J. Biol. Chem. 263, 13282-13290.
124 Daly, N. et al. (1995) 3-dimensional structure of a cysteine-rich repeat from the low-density-lipoprotein receptor. Proc. Natl Acad. Sci. USA 92, 6334-6338.
125 Banner, D.W. et al. (1993) Crystal structure of the soluble human 55 kd TNF receptor-human TNFβ complex: implications for TNF receptor activation. Cell 73, 431-445.
126 Shevach, E.M. and Korty, P.E. (1989) Ly-6: a multigene family in search of a function. Immunol. Today 10, 195-200.
127 Williams, A.F. (1991) Emergence of the Ly-6 superfamily of GPI-anchored molecules. Cell Biol. Int. Reports 15, 769-777.
128 LeClair, K.P. et al. (1986) Isolation of a murine Ly-6 cDNA reveals a new multigene family. EMBO J. 5, 3227-3234.
129 Williams, A.F. et al. (1988) Squid glycoproteins with structural similarities to Thy-1 and Ly-6 antigens. Immunogenetics 27, 265-272.
130 Orr, H.T. et al. (1979) Complete amino acid sequence of a papain-solubilized human histocompatibility antigen, HLA-B7.2. Sequence determination and search for homologies. Biochemistry 18, 5711-5720.
131 Simister, N.E. and Mostov, K.E. (1989) An Fc receptor structurally related to MHC class I antigens. Nature 337, 184-187.
132 Beck, S. and Barrell, B.G. (1988) Human cytomegalovirus encodes a glycoprotein homologous to MHC class-I antigens. Nature 331, 269-272.
133 Brown, J.H. et al. (1988) A hypothetical model of the foreign antigen binding site of Class II histocompatibility molecules. Nature 332, 845-850.
134 Bjorkman, P.J. and Parham, P. (1990) Structure, function, and diversity of class I major histocompatibility complex molecules. Annu. Rev. Biochem. 59, 253-288.
135 Lawlor, D.A. et al. (1990) Evolution of class-I MHC genes and proteins: from natural selection to thymic selection. Annu. Rev. Immunol. 8, 23-63.
136 Burmeister, W.P. et al. (1994) Crystal structure at 2.2 Å resolution of the MHC-related neonatal Fc receptor. Nature 372, 336-343.
137 Thomas, M.L. et al. (1985) Evidence from cDNA clones that the rat leucocyte-common antigen (T200) spans the lipid bilayer and contains a cytoplasmic domain of 80,000 Mr. Cell 41, 83-93.
138 Charbonneau, H. et al. (1988) The leucocyte common antigen (CD45): a putative receptor-linked protein tyrosine phosphatase. Proc. Natl Acad. Sci. USA 85, 7182-7186.
139 Tonks, N.K. et al. (1988) Demonstration that the leucocyte common antigen CD45 is a protein tyrosine phosphatase. Biochemistry 27, 8695-8701.
140 Streuli, M. et al. (1990) Distinct functional roles of the two intracellular phosphatase like domains of the receptor-linked protein tyrosine phosphatases LCA and LAR. EMBO J. 9, 2399-2407.
141 Streuli, M. et al. (1988) A new member of the immunoglobulin superfamily that has a cytoplasmic region homologous to the leukocyte common antigen. J. Exp. Med. 168, 1523-1530.
142 Streuli, M. et al. (1989) A family of receptor-linked protein tyrosine phosphatases in humans and Drosophila. Proc. Natl Acad. Sci. USA 86, 8698-8702.
143 Hall, L.R. et al. (1988) Complete exon-intron organization of the human leukocyte common antigen (CD45) gene. J. Immunol. 141, 2781-2787.
144 Barford, D. et al. (1994) Crystal-structure of human protein-tyrosine-phosphatase 1B. Science 263, 1397-1404.
145 Freeman, M. et al. (1990) An ancient, highly conserved family of cysteine-rich protein domains revealed by cloning type I and type II murine macrophage scavenger receptors. Proc. Natl Acad. Sci. USA 87, 8810-8814.
146 Walker, I.D. et al. (1994) A novel multi-gene family of sheep gammadelta T cells. Immunology 83, 517-523.
147 Law, S.K. et al. (1993) A new macrophage differentiation antigen which is a member of the scavenger receptor superfamily. Eur. J. Immunol. 23, 2320-2325.
148 Freeman, M. et al. (1991) Expression of type I and type II bovine scavenger receptors in Chinese hamster ovary cells: lipid droplet accumulation and nonreciprocal cross competition by acetylated and oxidized low density lipoprotein. Proc. Natl Acad. Sci. USA 88, 4931-4935.
149 Huang, H.J. et al. (1987) Molecular cloning of Ly-1, a membrane glycoprotein of mouse T lymphocytes and a subset of B cells: molecular homology to its human counterpart Leu-1/T1 (CD5). Proc. Natl Acad. Sci. USA 84, 204-208.
150 Bowen, M.A. et al. (1995) Cloning, mapping, and characterization of activated leucocyte-cell adhesion molecule (ALCAM), a CD6 ligand. J. Exp. Med. 181, 2213-2220.
151 Bajorath, J. et al. (1995) Molecular model of the N-terminal receptor-binding domain of the human CD6 ligand ALCAM. Protein Sci. 4, 1644-1647.
152 Feinstein, E. et al. (1995) The death domain - a module shared by proteins with diverse cellular functions. Trends Biochem. Sci. 20, 342-344.
153 Cleveland, J. and Ihle, J. (1995) Contenders in FasL/TNF death signaling. Cell 81, 479-482.
154 Hsu, H.L. et al. (1996) TNF-dependent recruitment of the protein-kinase RIP to the TNF receptor-1 signaling complex. Immunity 4, 387-396.
155 Hofmann, K. and Tschopp, J. (1995) The death domain motif found in Fas (Apo-1) and TNF receptor is present in proteins involved in apoptosis and axonal guidance. FEBS Lett. 371, 321-323.
156 Reth, M. (1989) Antigen receptor tail clue. Nature 338, 383-384.
157 Thomas, M. (1995) Of ITAMs and ITIMs; Turning on and off the B cell antigen receptor. J. Exp. Med. 181, 1953-1956.
158 Cambier, J. et al. (1994) Signal transduction by the B cell antigen receptor or its coreceptors. Annu. Rev. Immunol. 12, 457-486.
159 Chan, A.C. and Shaw, A.S. (1996) Regulation of antigen receptor signal transduction by protein tyrosine kinases. Curr. Opin. Immunol. 8, 394-401.
160 Cambier, J.C. (1995) Antigen and Fc receptor signaling the awesome power of the immunoreceptor tyrosine-based activation motif (ITAM). J. Immunol. 155, 3281-3285.
161 Daeron, M. et al. (1995) The same tyrosine-based inhibition motif, in the intracytoplasmic domain of FcgammaRIIB, regulates negatively BCR-, TCR-, and FcR-dependent cell activation. Immunity 3, 635-646.
162 Yu, H. et al. (1994) Structural basis for the binding of proline-rich peptides to SH3 domains. Cell 76, 933-945.
163 Bell, G.M. et al. (1996) The SH3 domain of p56(lck) binds to proline-rich sequences in the cytoplasmic domain of CD2. J. Exp. Med. 183, 169-178.
164 Rebbe, N.F. et al. (1991) Identification of nucleotide pyrophosphatase/alkaline phosphodiesterase I activity associated with the mouse plasma cell differentiation antigen PC-1. Proc. Natl Acad. Sci. USA 88, 5192-5196.
165 Grundmann, U. et al. (1990) Cloning and expression of a cDNA encoding human placental protein 11, a putative serine protease with diagnostic significance as a tumor marker. DNA Cell Biol. 9, 243-250.
166 Wright, M.D. and Tomlinson, M.G. (1994) The ins and outs of the transmembrane 4 superfamily. Immunol. Today 15, 588-594.
167 Tomlinson, M.G. and Wright, M.D. (1996) Characterization of mouse CD37: cDNA and genomic cloning. Mol. Immunol. 33, 867-872.
168 Tomlinson, M.G. et al. (1993) Epitope mapping of anti-rat CD53 monoclonal antibodies. Implications for the membrane orientation of the transmembrane 4 superfamily. Eur. J. Immunol. 23, 136-140.
169 Wright, M.D. et al. (1990) An immunogenic Mr 23,000 integral membrane protein of Schistosoma mansoni worms that closely resembles a human tumor-associated antigen. J. Immunol. 144, 3195-3200.
170 Tomlinson, M. and Wright, M. (1996) A new transmembrane 4 superfamily molecule in the nematode C. elegans. J. Mol. Evol. 43, 312-314.
171 Kopczynski, C. et al. (1996) A neural tetraspanin, encoded by late bloomer, that facilitates synapse formation. Science 271, 1867-1870.
172 Dong, J.-T. et al. (1995) KAI1, a metastasis suppressor gene for prostate cancer on human chromosome 11p11.2. Science 268, 884-886.
173 Boismenu, R. et al. (1996) A role for CD81 in early T cell development. Science 271, 198-200.
174 Hupp, K. et al. (1989) Gene mapping of the three subunits of the high affinity FcR for IgE to mouse chromosomes 1 and 19. J. Immunol. 143, 3787-3791.
175 Adra, C.N. et al. (1994) Cloning of the cDNA for a hematopoietic cell-specific protein related to CD20 and the β subunit of the high-affinity IgE receptor: evidence for a family of proteins with four membrane-spanning regions. Proc. Natl Acad. Sci. USA 91, 10178-10182.
176 Tedder, T.F. and Engel, P. (1994) CD20: a regulator of cell-cycle progression of B lymphocytes. Immunol. Today 15, 450-454.
177 Lin, S. et al. (1996) The FcεRIβ subunit functions as an amplifier of FcεRIγ-mediated cell activation signals. Cell 85, 985-995.
178 Van Kooten, C. and Banchereau, J. (1996) CD40-CD40 ligand: A multifunctional receptor-ligand pair. Adv. Immunol. 1-77.
179 Armitage, R.J. (1994) Tumor necrosis factor receptor superfamily members and their ligands. Curr. Opin. Immunol. 6, 407-413.
180 Gruss, H.J. and Dower, S.K. (1995) Tumor necrosis factor ligand superfamily: involvement in the pathology of malignant lymphomas. Blood 85, 3378-3404.
181 Eck, M.J. and Sprang, S.R. (1989) The structure of tumor necrosis factor-alpha at 2.6 Å resolution. Implications for receptor binding. J. Biol. Chem. 264, 17595-17605.
182 Jones, E.Y. et al. (1989) Structure of tumour necrosis factor. Nature 338, 225-228.
183 Karpusas, M. et al. (1995) 2 Å crystal structure of an extracellular fragment of human CD40 ligand. Structure 3, 1031-1039.
184 Goodwin, R.G. et al. (1993) Molecular cloning of a ligand for the inducible T cell gene 4-1BB: A member of an emerging family of cytokines with homology to tumor necrosis factor. Eur. J. Immunol. 23, 2631-2641.
185 Camerini, D. et al. (1991) The T cell activation antigen CD27 is a member of the nerve growth factor receptor gene family. J. Immunol. 147, 3165-3169.
186 Sehgal, A. et al. (1988) A constitutive promoter directs expression of the nerve growth factor receptor gene. Mol. Cell. Biol. 8, 3160-3167.
187 Birkeland, M.L. et al. (1995) Gene structure and chromosomal localization of the mouse homologue of rat OX40 protein. Eur. J. Immunol. 25, 926-930.
188 Samelson, L.E. et al. (1990) Association of the fyn protein-tyrosine kinase with the T-cell antigen receptor. Proc. Natl Acad. Sci. USA 87, 4358-4362.
189 Chan, A.C. et al. (1994) The role of protein tyrosine kinases and protein tyrosine phosphatases in T cell antigen receptor signal transduction. Annu. Rev. Immunol. 12, 555-592.
190 Hardie, G. and Hanks, S. (1995) The Protein Kinase FactsBook. Academic Press, London.
191 Hanks, S.K. et al. (1988) The protein kinase family: conserved features and deduced phylogeny of the catalytic domains. Science 241, 42-52.
192 Holmes, W.E. et al. (1991) Structure and functional expression of a human IL-8 receptor. Science 253, 1278-1280.
193 Barclay, A.N. et al. (1993) The Leucocyte Antigen FactsBook, 1st edn. Academic Press, London.
194 Hanks, S.K. and Hunter, T. (1995) The eukaryotic protein kinase superfamily: kinase (catalytic) domain structure and classification. FASEB J. 9, 576-596.
4 The architecture and interactions of leucocyte surface molecules
INTRODUCTION
The proteins at the surface of leucocytes play key roles in all aspects of leucocyte function, including differentiation and maturation, patterns of migration, the response to foreign antigen, and the control of the immune response via cytokines and interactions with other cell types. In the previous chapter, the types of domains found in proteins at the surface of leucocytes were described. In this chapter we discuss the organization of these proteins on cells and their interactions under five broad topics: (1) the integration of proteins into the membrane; (2) the carbohydrate structures on leucocyte membrane proteins; (3) the functions of the membrane antigens; (4) the types of interactions they mediate; and (5) the architecture of the cell surface, including factors such as the distribution of antigens at the cell surface and their abundance.
INTEGRATION OF PROTEINS INTO THE MEMBRANE
Heterogeneity of integration mechanisms
Since the original models for the integration of proteins within a fluid lipid bilayer were established (see Chapter 2), several different methods of protein attachment have been described 1. Type I and type II proteins have a single transmembrane region, whilst type III and type IV proteins contain multiple transmembrane regions and type V proteins utilize lipid anchors (Fig. 1). Type IV proteins (not shown) are distinguished from type III proteins by the presence of a water-filled transmembrane channel. They often contain several subunits and are usually transport proteins. Type IV proteins are widely distributed on many cell types and, as we are not aware of any examples restricted to leucocytes, they are not discussed further in this review. Two classes of type V proteins, which use lipid to attach to membranes, have been described: one in which cell surface proteins are linked by glycosyl-phosphatidylinositol (GPI) anchors, and a second in which cytoplasmic proteins are linked by lipid moieties such as myristoyl groups. The latter include a large number of signalling proteins such as tyrosine kinases and GTP-binding proteins, but these are beyond the scope of this book and are reviewed elsewhere 2,3.
In addition to these common modes of attachment, some novel membrane anchors have recently been proposed, although not yet in leucocytes. A type VI membrane anchor has been proposed that contains both an uncleaved leader sequence and a GPI anchor; an example is ponticulin in the slime mould Dictyostelium discoideum 4. A comparable orientation is also proposed for CD36 but with two transmembrane regions. X-ray crystallography suggests that the integral membrane glycoprotein prostaglandin H2 synthase-1 integrates into only one leaflet of the lipid bilayer 5. In this case a patch of hydrophobic amino acid side-chains of an α helix integrates with the lipid but is not sufficient to traverse the membrane. This type of protein was termed monotopic 6, but there is now a case for calling it type VII in accordance with the common nomenclature.
[Figure 1 diagram: five modes of membrane attachment drawn across the bilayer, with NH2 and COOH termini marked. Type I, one-pass transmembrane (CD4); Type II, one-pass transmembrane (CD10); Type III, multi-pass transmembrane (CD37); Type V, glycophosphatidylinositol anchor (Thy-1); Type V, cytoplasmic lipid anchor (Lck).]
Figure 1 Modes of integration of proteins into the membrane bilayer.
A novel method of membrane attachment has been proposed for the peripheral myelin protein P0 based on the X-ray crystal structure 7. In addition to a type I transmembrane attachment, a further interaction has been proposed between the lipid bilayer and the single IgSF domain, involving the side-chains of two exposed Trp residues. It should be noted that in most cases there is little biochemical evidence for the assignment of the transmembrane sequences of membrane proteins; these are usually predicted from hydrophobicity analysis. It is likely that some of the assignments given will be refined when more structural data become available. The frequency of each type of membrane attachment found in the leucocyte membrane proteins described in this book is summarized in Table 1.
Table 1 Frequency of types of membrane attachment in leucocyte antigens
Membrane attachment    % of each type
Type I                 69
Type II                12
Type III
  2 pass               <1
  3 pass               <1
  4 pass               5
  5 pass               <1
  7 pass               3
  12 pass              <1
Type V (GPI)           8
Type I transmembrane attachment
The most common mode of membrane integration is the type I single-pass hydrophobic transmembrane sequence (Table 1). As illustrated for CD4 in Fig. 1, type I molecules have the C-terminus of the molecule in the cytoplasm and the N-terminus outside the cell. In the biosynthesis of a type I molecule an N-terminal signal sequence is cleaved as the molecule passes through the bilayer of the endoplasmic reticulum. This signal sequence has a loose pattern of conserved residues. The features of this pattern, together with predictions for signal sequence cleavage points, have been reviewed 8. In the entries for antigens in this volume, consensus rules are used to predict N-termini in cases where these have not been determined by protein sequencing.
Type I cell surface molecules have a transmembrane sequence of about 25 hydrophobic amino acid residues that is usually followed by a cluster of basic amino acids, which are believed to bind to phospholipid head groups inside the membrane bilayer. Amino acids that are usually excluded from transmembrane sequences include Asn, Asp, Glu, Gln, His, Lys and Arg. In the exceptions where the transmembrane sequence contains some charged residues, it is usually associated with the transmembrane sequences of other cell surface proteins to form a multimeric complex in the membrane. The classical example of this is the TCR complex, in which the TCR α and β chains and the CD3 δ, ε, γ and ζ chains all have charged residues in their transmembrane sequences. The presence of a charged residue in a type I transmembrane sequence can reasonably be taken as prima facie evidence that the molecule in question will be part of a multimeric complex.
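The sequence features described above for type I proteins lend themselves to a simple scan. The following Python sketch is illustrative only and is not the hydrophobicity analysis used for the entries in this book. It looks for runs of at least 20 residues (an assumed cut-off, somewhat below the figure of about 25 quoted above) that contain none of the residues listed as usually excluded, and then checks for a cluster of basic residues immediately after the run. The function names, the thresholds and the toy sequence are all hypothetical.

```python
# Residues the text lists as usually excluded from transmembrane sequences.
EXCLUDED = set("NDEQHKR")
# Basic residues counted towards the following cluster (assumption: Lys and Arg only).
BASIC = set("KR")


def hydrophobic_runs(seq, min_len=20):
    """Return (start, end) pairs of runs that contain no EXCLUDED residue.

    Indices are 0-based and end is exclusive. min_len is an assumed cut-off.
    """
    runs = []
    start = None
    for i, aa in enumerate(seq):
        if aa not in EXCLUDED:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_len:
                runs.append((start, i))
            start = None
    if start is not None and len(seq) - start >= min_len:
        runs.append((start, len(seq)))
    return runs


def basic_cluster_follows(seq, end, window=8, min_basic=2):
    """True if at least min_basic Lys/Arg residues occur in the window after the run."""
    return sum(aa in BASIC for aa in seq[end:end + window]) >= min_basic


# Made-up toy sequence: short polar stretch, 22-residue apolar run, basic tail.
toy = "QDNSEK" + "ALIVGGVAGLLLFIGLGIFFCV" + "RCRHRRRQ"
for start, end in hydrophobic_runs(toy):
    print(start, end, basic_cluster_follows(toy, end))
```

A run that is interrupted by a single charged residue is not reported by this scan. In the light of the TCR example above, such an interrupted run would be a candidate for a chain that forms part of a multimeric complex.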
Type II transmembrane attachment
Type II single-pass transmembrane molecules have the opposite orientation to type I molecules (Fig. 1). The N-terminus is found in the cytoplasm and the C-terminus is extracellular, e.g. ref. 9. They usually have a small cytoplasmic region, and the transmembrane sequence often resembles an uncleaved signal sequence for secretion. A high proportion of the proteins whose extracellular domains have enzymatic activity are type II membrane proteins, although overall type II proteins are much less abundant than type I (Table 1).
Type III multipass transmembrane attachment
The type III category of membrane attachment consists of those molecules that cross the bilayer numerous times (Fig. 1). One very large group of cell surface molecules in this category is composed of G protein-linked receptors that pass through the membrane seven times (see Chapter 3), many of which function as receptors for soluble molecules such as prostaglandins and chemokines 10. Structural studies have shown that the seven transmembrane regions are all α helices 11. On leucocytes, members of this group have usually been identified functionally rather than by antigenicity, e.g. the IL-8R (CDw128), C5aR (CD88) and N-formyl peptide receptor
(FPR). A new subgroup of seven-pass transmembrane proteins has recently been described which includes the F4/80 and CD97 antigens; these contain several extracellular EGF domains 12.
One group of multipass proteins commonly found on leucocytes contains four transmembrane regions and is called the transmembrane 4 (TM4) superfamily 13. There are seven members of this superfamily present on leucocytes (Table 1) and a model for one (CD37) is illustrated in Fig. 1. All these molecules probably have both their N- and C-termini inside the cell. This orientation is indicated by the fact that the loop of sequence between TM sequences 3 and 4 of CD53 is known to be extracellular, since it carries an extracellular antigenic determinant 14. The CD20 antigen and the FcεR β chain are also four-pass transmembrane proteins but their sequences are not related to the TM4 superfamily (see Chapter 3) 15.
There is one example of a 12-pass transmembrane protein on leucocytes, the multidrug transporter MDR1 16. CD47 is the only example thus far of a leucocyte protein predicted to contain five transmembrane sequences. CD47 also contains an IgSF domain, which is unusual for a multipass transmembrane protein.
Type V lipid attachment
Glycosyl-phosphatidylinositol (GPI) anchors
This common method of membrane integration utilizes a glycosyl-phosphatidylinositol (GPI) anchor attached to the C-terminal residue of the protein (Fig. 1 and Table 1) 17. The structures for GPI anchors of the Trypanosoma brucei parasite coat protein and the rat Thy-1 antigen 18 are shown in Fig. 2. The backbone components and their linkages have been totally conserved during the evolution of Trypanosoma and Rattus, but side-chain residues can vary between GPI anchors from different species, as illustrated in Fig. 2. Within a species there can be cell type-specific differences, as shown by differences in attached mannose and galactosamine residues between anchors of Thy-1 from brain and thymus. An additional palmitate
[Figure 2 diagram: the GPI anchor drawn from the last amino acid of the protein through ethanolamine, a chain of mannose residues, glucosamine and myoinositol to glycerol and fatty acids. Side-chain residues labelled for Thy-1 include ethanolamine, N-acetylgalactosamine and mannose; those labelled for T. brucei are galactose residues.]
Figure 2 Glycosyl-phosphatidylinositol (GPI) anchors for T. brucei and rat brain Thy-1. The data are from ref. 18.
residue can also be tissue-specific 18. GPI anchors can be cleaved by bacterial phosphatidylinositol phospholipase C (PI-PLC) enzymes 18, and the release of an antigen from the cell surface by PI-PLC is diagnostic for GPI anchor attachment. However, some GPI-anchored molecules are resistant to PI-PLC cleavage; this occurs, for instance, when a palmitate residue is attached to the myoinositol ring, as is found in bovine, but not human, acetylcholinesterase 19.
The presence of a GPI anchor can be indicated by the predicted protein sequence of a molecule. GPI-anchored molecules have a secretion signal sequence at their N-terminus plus another signal sequence at their C-terminus that is cleaved off and replaced by the GPI anchor shortly after biosynthesis of the molecule and its entry into the endoplasmic reticulum. Examples of GPI-signal sequences in cases where the cleavage site is known are given in Fig. 3. Despite some exceptions, the general rules for a GPI-signal sequence are as follows. (1) The presence of a hydrophobic region at the C-terminus of the molecule that is not followed by a cluster of basic residues. (2) At about 7-12 residues before the hydrophobic region, some small amino acids where cleavage of the precursor and attachment of the GPI anchor occur. The features of a GPI-signal sequence are quite similar to those of signal sequences for secretion, and it has been shown that a secretion signal peptide attached to the C-terminus can function as a GPI-signal sequence 20. The hydrophobic region can be 10-20 residues long and may seem indistinguishable from a sequence that might form a transmembrane sequence. In fact, in the case of the CD58 antigen there are two alternatively spliced forms of the molecule that yield either a GPI-anchored or a transmembrane form of the molecule 21,22. The same sequence forms the transmembrane spanning region and the GPI-signal sequence, and the outcome is determined by the fact that the transmembrane form has extra sequence following on from the hydrophobic sequence. It is usual for GPI-signal sequences to lack the basic charged residues after the hydrophobic region that are found in the sequences of type I membrane molecules.
The GPI anchor appears to associate specifically with sphingomyelin lipids 23 and, following biosynthesis, follows a different path of transport to the cell surface compared to some protein-anchored molecules. On polarized cells GPI-anchored molecules are restricted to the apical surface 24. GPI-anchored molecules are excluded from coated pits but can recycle through the cell via small uncoated vesicles 24. Complexes enriched in GPI-anchored proteins and Src kinases can be
Protein                      Residues before and after the cleavage site (↓)
T. brucei VSG MITat 1.4a     CKD ↓ SSILVTKKFALTVVSAAFVALLF
T. brucei VSG MITat 1.1BC    CCD ↓ GSFLVNKKFALMVYDFVSLLAF
T. brucei VSG ILTat 1.1      TGS ↓ NSFVIHKAPLFLAFLLF
T. brucei VSG MITat 1.5b     CRN ↓ GSFLTSKQFALMVSAAFVTLLF
Rat Thy-1                    VKC ↓ GGISLLVQNTSWLLLLLLSLSFLQATDFISL
DAF                          TTS ↓ GTTRLLSGHTCFTLTGLLGTIVTMGLLT
Rat CD48                     ARS ↓ SGVHWIAAWLVVTLSIIPSILLA
Placental ATPase             TTD ↓ AAHPGRSVVPALLPLLAGTLLLLTATP
Scrapie prion                RRS ↓ SAVLFSSPPVILLISFLIFLMVG
CD59                         LEN ↓ GGTSLSEKTVLLLVTPFLAAAWSLHP
CD52                         SPS ↓ ASSNISGGIFLFFVANAIIHLFCFS
CD73                         KFS ↓ TGSHCHGSFSLIFLSLWAVIFVLYQ
Figure 3 Signal sequences for glycosyl-phosphatidylinositol (GPI) anchors. The arrow indicates the position of cleavage of the precursor protein. The lipid is attached to the residue immediately preceding the arrow. The data are from refs. 17, 107-111.
enriched by detergent solubilization. These detergent-insoluble complexes have been equated with caveolae but this seems to be an oversimplification (reviewed in ref. 25).
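The two general rules for a GPI-signal sequence given earlier in this section can be written as a rough sequence check. The Python sketch below is an illustration under stated assumptions, not a validated predictor. It treats a "hydrophobic region" as a run of at least 10 residues free of the residues usually excluded from membrane-spanning sequences, counts Lys and Arg as the basic residues, and takes "small amino acids" to mean Gly, Ala, Ser, Cys, Asp and Asn (the residues preceding the arrow in Fig. 3, plus Gly and Ala, which is an assumption). All names and thresholds are hypothetical. The test input is the CD59 C-terminal sequence as listed in Fig. 3.

```python
# Residues usually excluded from hydrophobic membrane-spanning sequences (from the text).
EXCLUDED = set("NDEQHKR")
# Assumed set of "small" residues: D, S, N and C precede the arrow in Fig. 3; G and A are added.
SMALL = set("GASCDN")


def last_hydrophobic_run(seq, min_len=10):
    """Return (start, end) of the last run of >= min_len non-excluded residues, or None."""
    found = None
    start = None
    for i, aa in enumerate(seq + "K"):  # trailing sentinel closes a run at the C-terminus
        if aa not in EXCLUDED:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_len:
                found = (start, i)
            start = None
    return found


def gpi_signal_check(seq, max_basic=1):
    """Apply rules (1) and (2) to a C-terminal sequence; indices are 0-based."""
    run = last_hydrophobic_run(seq)
    if run is None:
        return None
    start, end = run
    # Rule (1): no cluster of basic residues after the hydrophobic region.
    basic_after = sum(aa in "KR" for aa in seq[end:])
    # Rule (2): small residues 7-12 positions before the hydrophobic region.
    upstream = range(max(0, start - 12), max(0, start - 6))
    sites = [i for i in upstream if seq[i] in SMALL]
    return {
        "hydrophobic_region": run,
        "basic_cluster": basic_after > max_basic,
        "candidate_sites": sites,
        "gpi_like": basic_after <= max_basic and bool(sites),
    }


cd59_tail = "LEN" + "GGTSLSEKTVLLLVTPFLAAAWSLHP"  # as listed in Fig. 3
print(gpi_signal_check(cd59_tail))
```

For the CD59 sequence the candidate list includes index 2, the Asn that immediately precedes the arrow in Fig. 3. A sequence with a basic cluster after its hydrophobic run, as expected for a type I protein, is not reported as GPI-like.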
Fatty acyl or prenyl anchors in cytoplasmic proteins
Many cytoplasmic proteins associate with the lipid bilayer but lack typical hydrophobic transmembrane regions. The commonest examples found in leucocytes and in many other cell types belong to the family of Src-related kinases 26,27. These often have a myristic acid residue at the N-terminus which is necessary for membrane association 28. The myristoylation is stable, but in addition a cysteine near the N-terminus is often reversibly modified with palmitic acid, e.g. in Lck; palmitoylation enhances interactions with GPI-anchored proteins and subcellular compartments 29. The myristoyl membrane interactions may be enhanced by interactions of the polar headgroups of the phospholipids with groups of charged amino acid side-chains at the N-terminus. This interaction can be modified by phosphorylation, the "myristoyl-electrostatic switch", which may provide a reversible modulator of protein-membrane interactions 30. It seems likely that the complexity of post-translational modifications with lipid will be revealed when more biochemical data become available 26.
CARBOHYDRATE STRUCTURES ON LEUCOCYTE MEMBRANE PROTEINS
Major glycoproteins and dimensions of carbohydrate structures
A major feature of cell surfaces is the presence of carbohydrate structures on both lipids and proteins. Most of the leucocyte surface antigens are glycoproteins and in this section we will deal exclusively with the carbohydrates of glycoproteins. However, it should be noted that glycolipids are an abundant component of the cell surface and the diversity of their carbohydrate structures is as great as that of glycoproteins. N-linked glycosylation occurs on Asn residues within the motif Asn-Xaa-Thr or Asn-Xaa-Ser, with the exception of Asn-Pro-Thr/Ser or Asn-Xaa-Thr/Ser-Pro sequences, which are not usually glycosylated (see Chapter 1). O-linked glycosylation occurs at Ser or Thr amino acids within stretches of sequence that include a preponderance of the amino acids Ser, Thr and Pro (see Chapter 1).
The levels of glycosylation of membrane proteins vary considerably, with some expressing very high levels of carbohydrate. This can be seen in Fig. 4, which shows the major molecules that display carbohydrate on thymocytes and lymphocytes. These are the main molecules visualized when cells are labelled with [3H]borohydride in their sialic acid or galactose residues 31,32. For rat thymocytes three main bands are seen and these correspond to Thy-1, CD43 and CD45; for T cells the main bands are CD43 and CD45; whilst for B cells there is only one strong band and this is accounted for by CD45. On T and B cells the CD45 molecule has a variable amount of extra sequence at the N-terminus due to alternative splicing of exons. This extra sequence has an extended structure which is heavily O-glycosylated 33-35. Very few membrane proteins lack carbohydrate completely and it seems likely that these unglycosylated polypeptides are associated with glycoproteins, e.g. the CD3 ε chain is part of a multi-polypeptide complex (i.e. the CD3/TCR complex) and CD81 is probably also associated with glycoproteins.
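The sequence rules for N- and O-linked glycosylation stated above can be expressed as a short scan of a protein sequence. The following Python sketch is a minimal illustration: the N-linked rule is taken directly from the text, whereas the window length and the 50% threshold used for a "preponderance" of Ser, Thr and Pro are assumptions chosen for the example, as are the function names and test sequences. A predicted site is only a potential site; whether it is used depends on the protein and the cell type.

```python
import re

# Asn-Xaa-Ser/Thr, where Xaa is not Pro and the residue after Ser/Thr is not Pro.
# The lookahead lets overlapping sequons be reported.
SEQUON = re.compile(r"(?=N[^P][ST](?!P))")


def n_glycosylation_sites(seq):
    """Return 1-based positions of Asn residues in potential N-linked sequons."""
    return [m.start() + 1 for m in SEQUON.finditer(seq)]


def o_glycosylation_candidates(seq, window=10, min_fraction=0.5):
    """Return 1-based positions of Ser/Thr lying in windows rich in Ser, Thr and Pro.

    window and min_fraction are assumed values, not taken from the text.
    """
    hits = set()
    for start in range(len(seq) - window + 1):
        segment = seq[start:start + window]
        if sum(aa in "STP" for aa in segment) / window >= min_fraction:
            hits.update(start + i + 1 for i, aa in enumerate(segment) if aa in "ST")
    return sorted(hits)


# Made-up sequences for illustration only.
print(n_glycosylation_sites("MNATNPSANGTPQNLSA"))
print(o_glycosylation_candidates("GGGGGSTPSTPSTPSGGGGG"))
```

In the first toy sequence the Asn at position 5 is skipped because Xaa is Pro, and the Asn at position 9 is skipped because the Thr is followed by Pro.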
[Figure 4 diagram: schematic models of CD45 (with a label on the N-terminal region that varies through alternative splicing of exons), CD43 and Thy-1 at the cell surface, with COOH termini marked; scale bar 10 nm.]
Figure 4 The three major heavily glycosylated proteins on rat lymphocytes. Thy-1 is present on rat thymocytes and CD43 on thymocytes and T cells. Thymocytes express CD45 without any of the three alternative segments A, B, C whilst various combinations of these are expressed on B and T cells. For further information see entries in Section II and Fig. 9.
Models such as those in Fig. 4 underestimate the contribution of glycosylation to the glycoprotein structure since the carbohydrates are not drawn to scale. Figures 5 and 6 show typical N- and O-linked carbohydrates with models for some of the structures drawn to scale. It is evident that for many cell surface molecules much of the protein surface must be obscured by the carbohydrate groups. This is illustrated in Fig. 7, which shows the three N-linked carbohydrates of Thy-1 antigen drawn to scale in relation to an immunoglobulin variable domain representing the Thy-1 protein backbone.
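The point about scale can be made with simple arithmetic. The sketch below assumes, as in the model of Fig. 7, that each sugar residue is a sphere 0.608 nm in diameter and that the oligosaccharide is fully extended, and it counts the residues along one arm of the sialylated complex biantennary structure of Fig. 5. The function name is hypothetical and the result is an upper estimate for an extended chain, not a measured dimension.

```python
# Diameter of the sphere used to represent one sugar residue (Fig. 7 caption).
SUGAR_DIAMETER_NM = 0.608


def extended_length_nm(n_residues):
    """Length of a fully extended chain of n touching spheres, in nm."""
    return n_residues * SUGAR_DIAMETER_NM


# Residues along one arm of the sialylated complex biantennary structure (Fig. 5),
# read from the Asn-linked GlcNAc outwards to the terminal sialic acid.
arm = ["GlcNAc", "GlcNAc", "Man", "Man", "GlcNAc", "Gal", "NeuNAc"]
print(round(extended_length_nm(len(arm)), 3))
```

On these assumptions a single arm of seven residues extends further than the 3 nm scale bar of Fig. 7, which is consistent with the statement that much of the protein surface must be obscured by carbohydrate.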
Carbohydrate antigenicity Of the 206 entries in this book only nine antigens are defined by mAbs which recognize carbohydrate epitopes rather than protein epitopes of a glycoprotein or protein molecule. This may be considered surprising given the amount of carbohydrate at the cell surface. Carbohydrate epitopes are not intrinsically non-immunogenic since in other cases where mAbs are raised against cells, the antibodies can be predominantly against carbohydrate epitopes. This is the case for mAbs produced
[Figure 5 diagram: (A) structures of four representative N-linked oligosaccharides, labelled oligomannose, hybrid, sialylated complex biantennary and biantennary lactosaminoglycan, each built on a Manβ1→4GlcNAcβ1→4GlcNAc core; (B) the corresponding structural configurations.]
Figure 5 Diagram to show the relative sizes of four representative N-linked oligosaccharide side-chains (A) and their structural configurations (B). The terminal GlcNAc is linked to an Asn residue on the glycoprotein. In (B) the structures from left to right are: oligomannose, hybrid, sialylated biantennary, and biantennary lactosaminoglycan. The approximate dimensions of these oligosaccharides are shown. This figure is adapted from ref. 112 with permission from the authors and Annual Reviews of Biochemistry.
A. Core structures of O-linked glycans
Core class 1: Galβ1→3GalNAc-Ser/Thr
Core class 2: Galβ1→3(GlcNAcβ1→6)GalNAc-Ser/Thr
Core class 3: GlcNAcβ1→3GalNAc-Ser/Thr
Core class 4: GlcNAcβ1→3(GlcNAcβ1→6)GalNAc-Ser/Thr
Core class 5: GalNAcα1→3GalNAc-Ser/Thr
B. Examples of terminal structures
Blood group Lewis a (Le a): Galβ1→3(Fucα1→4)GlcNAc-
Blood group Lewis b (Le b): Fucα1→2Galβ1→3(Fucα1→4)GlcNAc-
Lewis x (Le x): Galβ1→4(Fucα1→3)GlcNAc-
Sialyl-Lewis x (sLe x): NeuNAcα2→3Galβ1→4(Fucα1→3)GlcNAc-
Polylactosaminoglycan, blood group i: Galβ1→4GlcNAcβ1→3Galβ1→4GlcNAcβ1→3-R
Figure 6 The structure of some typical O-linked oligosaccharides. (A) Examples of the core residues found commonly in O-linked carbohydrate. (B) Examples of terminal residues found on O-linked carbohydrates 113.
against the slime mould Polysphondylium pallidum 36. The main reason that protein epitopes predominate in immunizations between vertebrates is probably that most carbohydrates are shared amongst the higher animals, so that the animal is tolerant to the carbohydrate determinants. It may be that anti-carbohydrate mAbs are raised where this does not hold. For example, a mouse mAb recognizing the human carbohydrate blood group A antigen was one of the first mAbs made against human leucocytes, and mice are negative for blood group A 37,38. In slime moulds the carbohydrate structures seem quite different to vertebrate structures 39,40 and this
[Figure 7 diagram: the Thy-1 protein backbone with the major oligosaccharide drawn at each N-glycosylation site: an oligomannose structure at Asn23, a fucosylated complex structure carrying a repeating Galβ1→4GlcNAc arm at Asn74, and a sialylated biantennary complex structure at Asn98; scale bar 3 nm.]
Figure 7 A model for the Thy-1 antigen from rat thymocytes and the major oligosaccharides present at the three N-glycosylation sites (Asn23, Asn74 and Asn98) 46. The structure is a model based on the α-carbon coordinates of the VL domain of Fab NEW, which Thy-1 resembles in sequence. The three oligosaccharides are shown perpendicular to the protein surface and in an extended conformation. Each sugar residue is represented by a sphere of 0.608 nm in diameter. This figure is adapted from ref. 112 with permission from the authors and Annual Reviews of Biochemistry.
may explain the strong immunogenicity of carbohydrates in immunizations against these organisms. Terminal α-galactose is not found on human glycoproteins but is common in other species, and humans can make a strong response to this carbohydrate; this is an important factor in attempts at organ xenotransplantation 41.
With antibodies against heavily O-glycosylated glycoproteins it is common to find epitopes that are specific to one glycoprotein and yet are apparently dependent on glycosylation 42,43. A common result is that antigenicity may be lost if sialic acid is removed, or that expression of an epitope may differ between cell types, presumably due to differential glycosylation. It might be inferred from this that the epitope consists of both protein and carbohydrate components, but studies on completely unglycosylated forms of the glycoproteins expressed in E. coli make this interpretation unlikely 42,44. In a number of cases the mAbs whose binding is affected by removal of sialic acid also bind to the unglycosylated forms. This suggests that the epitope specificity is due to the protein sequence but that the availability of the epitope can be influenced by the glycosylation state. In the case of CD45, one antibody has been shown to bind to the native glycosylated protein with an affinity different to that for the unglycosylated backbone 44.
Cell type specificity of glycosylation
One of the key points about glycosylation is that the carbohydrate on a particular protein backbone can vary considerably depending on the cell type in which it is expressed. This was first seen for Thy-1 (CD90) antigen from brain and thymus, where it was established that all of the complex N-linked structures differed between the two forms 45,46. These differences were superimposed upon a site-specific pattern in which Asn23 carried mostly oligomannose structures, Asn74 showed the most extended complex structures, and Asn98 carried smaller complex structures in both forms. It seems that a site-specific pattern is dictated by the structure of the molecule and that a cell type-specific pattern is superimposed upon this 46.
Major differences in glycosylation are also established between leucocyte populations. A notable example is the difference in O-linked structures on resting T cells, activated T cells and neutrophils. Most of the O-linked carbohydrate is present on the CD43 antigen, a major glycoprotein on all three cell types 47,48. Resting T cells express simple structures (Fig. 8A) whereas more complicated structures are found on activated T cells (Fig. 8B) and neutrophils (Fig. 8C). These differences may be of functional importance since extended mucin-like proteins such as CD43 (and the CD45 N-terminus) are well suited to display carbohydrates to natural lectins. Carbohydrates carried by glycoproteins and glycolipids have been shown to be important ligands for the selectin (CD62E, CD62L and CD62P) 49,50 and sialoadhesin (CD22, CD33 and sialoadhesin) 51 families of proteins.
The selectin ligands provide a clear example of the importance of cell type-specific glycosylation. Thus GlyCAM-1 is not only expressed in lymph nodes, where it is a ligand for L-selectin, but is also present in milk 52. However, the GlyCAM-1 present in milk lacks the sulfation of the carbohydrate necessary for CD62L binding 52-54. The E-selectin ligand (ESL-1) is widely expressed but only that expressed by myeloid cells is a ligand for CD62E 55. The P-selectin glycoprotein ligand 1 (PSGL-1 or CD162) is also widely distributed but is only an active ligand on a subset of the PSGL-1-expressing leucocytes 56. Sulfation is also important in the PSGL-1 reaction with P-selectin but in this case it
A. Resting T cells, thymocytes, lymphoid and erythroid cell lines: core 1 structures
           NeuNAcα2 ↘
                      GalNAc-Ser/Thr
    NeuNAcα2→3Galβ1 ↗
B. Activated T cells, thymocytes, lymphoid cell lines: core 2 structures
    NeuNAcα2→3Galβ1→4GlcNAcβ1 ↘ 6
                                  GalNAc-Ser/Thr
              NeuNAcα2→3Galβ1 ↗ 3
C. Granulocytes, myeloid cell lines: elongated core 2 structures
    NeuNAcα2→3(Galβ1→4GlcNAcβ1→3)n→Galβ1→4GlcNAcβ1 ↘ 6
                                                       GalNAc-Ser/Thr
                                   NeuNAcα2→3Galβ1 ↗ 3
    n = 0, 1 or 2
Figure 8 Examples of typical heterogeneity of O-linked carbohydrate structures found on different leucocytes 47,48.
is sulfation of tyrosine residues near the N-terminus of the protein rather than carbohydrate that is involved 57,58.
FUNCTIONS OF LEUCOCYTE MEMBRANE ANTIGENS
Leucocytes are characteristically migratory cells which cooperate to recognize and dispose of invasive pathogens or tumour cells. For these functions it is necessary for cell surface antigens to interact with soluble proteins or glycoproteins as well as with the surfaces of other cells and with the extracellular matrix. In addition, many of these interactions will lead to signalling events transduced to the cell interior by proteins interacting with the cytoplasmic regions of the cell surface antigens. These signals lead to differentiation and altered migration of the leucocytes. In the entries in Section II there is a section called "Ligands and associated molecules" where the known molecular interactions are listed. Ligands have been identified for about 50% of leucocyte surface proteins, and in this chapter we concentrate on the interactions of the extracellular parts of leucocyte antigens.
Table 2 summarizes the frequency of different types of functions mediated by membrane proteins of leucocytes. It includes enzymes as a separate small group and these are summarized in Table 3. The leucocyte cell surface must also contain proteins necessary for the metabolism of the cell, such as ion channels, but these molecules are unlikely to be restricted to leucocytes and hence are not classified as leucocyte antigens. One exception is the CD20 antigen, which has ion channel activity 59.
Table 2 Frequency of the roles of leucocyte surface antigens
Enzymatic activity in extracellular domains                    3%
Receptors for cytokines                                        13%
Receptors for other soluble proteins (e.g. Fc receptors)       10%
Receptors for cell surface proteins or extracellular matrix    25%
Others (e.g. ion channels and transporters)                    1%
Unknown                                                        49%
Enzymatic activity present in leucocyte membrane proteins
The number of enzymes identified at the leucocyte cell surface is low (Table 3). Enzymes may only need to be present at low site numbers per cell, because of their catalytic role, compared to proteins mediating adhesion events where multiple interactions involving large surface areas are likely. Thus it is possible that many more enzymes are present but at low levels not detectable by labelling with mAbs and analysis with flow cytofluorography. It is easy to see how important these might be in providing a method for rapidly decreasing cell surface expression of glycoproteins, as shown for CD62L 60. These proteases often show activities typical of metalloproteases and seem likely to be associated with cell surface molecules rather than secreted proteins 61. Proteases also provide a powerful way of modifying the activity of cytokines in the vicinity of cell surfaces. Proteolytic activity has been identified in the ectodomains of several antigens (e.g. CD10; Table 3) but its physiological role is not clear. Indeed the role in the immune system of the activities described in Table 3 is poorly understood (reviewed in ref. 62).
Transmembrane proteins containing cytoplasmic regions with kinase and phosphatase activities are common in biology, although relatively few are restricted to leucocytes (Table 3). However, many of the cytoplasmic regions of transmembrane proteins interact directly with enzymes, for example tyrosine kinases (e.g. CD4 and
Table 3 Enzymatic activities present in leucocyte membrane proteins
Extracellular activities
          Aminopeptidase A, EC 3.4.11.7
CD10      Neutral endopeptidase, EC 3.4.24.11
CD13      Aminopeptidase N, EC 3.4.11.2
CD26      Dipeptidylpeptidase IV, EC 3.4.14.5
CD38      ADP-ribosyl cyclase
CD39      Ecto (Ca2+, Mg2+)-apyrase (ecto-ATPase)
PC-1      5'-nucleotidase phosphodiesterase I, EC 3.1.4.1 and nucleotide pyrophosphatase, EC 3.6.1.9
RT6       NAD glycohydrolase
Intracellular activities
CD45      Protein tyrosine phosphatase, EC 3.1.3.48
CD115     Protein tyrosine kinase, EC 2.7.1.112
CD117     Protein tyrosine kinase, EC 2.7.1.112
CD148     Protein tyrosine phosphatase, EC 3.1.3.48
Ltk       Protein tyrosine kinase, EC 2.7.1.112
See refs 62-64 for details of enzymatic activities.
Table 4 Interactions of leucocyte membrane proteins with soluble proteins
Cell surface antigen    Domain type involved    Ligand
CD14                    Short repeats           LPS binding protein
CD21                    CCP                     Complement C3d
CD23                    Lectin C-type           IgE (not carbohydrate) a
CD71                    No SF                   Transferrin
CD87 (UPAR)             Ly-6                    Urokinase plasminogen activator b
CD117 (c-kit)           IgSF                    Stem cell factor
IL-1R                   IgSF                    IL-1
IL-4R                   CytokineR/Fn3           IL-4
IL-8R                   G protein-coupled R     IL-8
LDLR                    LDLR                    Low-density lipoproteins
Scavenger receptor      No SF                   Acetylated lipoproteins c
TNFR (CD120)            TNFR                    TNF d
a The lectin C-type domain of CD23 interacts with IgE but not through its carbohydrate and hence is an example of a lectin C-type domain binding an IgSF domain.
b The interaction of CD87 has been shown to involve the Ly-6 domain binding to an EGF domain in urokinase plasminogen activator and not the protease domain 66.
c The scavenger receptor cysteine-rich domain is not involved in this binding, as the form produced by alternative splicing that lacks this domain still binds ligands.
d Several members of the TNFSF exist both as soluble and cell surface forms and are also included in Table 6.
CD8) or tyrosine phosphatases (e.g. CD22, FcγRIIB, KIR) and they may also be substrates for enzymes (e.g. CD3 chains). The cytoplasmic regions rarely contain clearly defined domains and may adopt a flexible extended structure. Consistent with this, their interactions usually involve short linear stretches of sequence, such as the ITAM motif (see Chapter 3).
INTERACTIONS OF LEUCOCYTE MEMBRANE PROTEINS
Interactions of leucocyte membrane proteins with soluble protein ligands (Table 4)
This group includes many of the receptors for cytokines. These interactions are generally of high affinity (see below). Apart from TNF and related proteins, the cytokines themselves usually contain domain types not found in cell surface proteins and these are reviewed in ref. 65. Table 4 gives examples of interactions of leucocyte membrane proteins with soluble proteins and the different types of domains involved.
Interactions of leucocyte membrane proteins with carbohydrate ligands (Table 5)

The complexity and abundance of carbohydrate indicates that protein-carbohydrate interactions are likely to be important. A number of examples have been identified involving link, IgSF and lectin C-type domains.
Table 5 Interaction of membrane proteins with carbohydrates. Examples of interactions involving different domain types with carbohydrates

Cell surface antigen    Domain type involved    Ligand
CD22                    IgSF                    Sialoglycoconjugates
CD31                    IgSF                    Glycosaminoglycans
CD44                    Link                    Hyaluronan
CD62                    Lectin C-type           Various carbohydrates including CD15s
Mannose receptor        Lectin C-type           Mannose (in polymeric form)
Interaction of membrane proteins with other membrane-associated ligands (cell adhesion and accessory molecules) (Tables 6 and 7)

A large number of cell surface ligands have been identified for leucocyte surface proteins. Most of these involve heterophilic interactions (Table 6) although there are some homophilic interactions involving IgSF domains (Table 7). IgSF domains are particularly common and interact both with other IgSF domains and a large variety of different domain types. Integrins have various ligands including extracellular matrix proteins and cell surface proteins containing IgSF and cadherin domains.
Table 6 Heterophilic interactions of leucocyte membrane antigens. Interaction of membrane proteins with other membrane components

Cell surface antigen    Domain type involved    Ligand                  Domain type involved
CD2                     IgSF                    CD48, CD58              IgSF
CD6                     SRCR                    CD166 (ALCAM)           IgSF
MHC Class I and II      MHC                     Peptide + TCR           IgSF
CD11a/CD18              Integrin                CD50, CD54, CD102       IgSF
CD31                    IgSF                    αvβ3 integrin           Integrin
CD40                    TNFR                    CD40L (CD154)           TNF (a)
CD55                    CCP                     CD97                    EGF or G-protein-coupled receptor TM7 region (b)
αEβ7                    Integrin                E-cadherin              Cadherin

(a) Several members of the TNFSF exist both as soluble and cell-associated forms and are therefore also included in Table 4.
(b) CD97 is a member of the G protein-coupled receptor superfamily but with EGF domains and some other sequence in the extracellular region. The site of interaction with the CCP domains of CD55 has not been determined 67.
Table 7 Homotypic interactions of membrane proteins

Cell surface antigen    Domain type involved
CD31                    IgSF
CD66a, c                IgSF

There are relatively few examples of homotypic interactions between leucocytes and so far these seem to be confined to proteins containing IgSF domains.
The architecture and interactions of leucocyte surface molecules
Affinity of interactions of cell surface antigens

The affinities of cell surface receptors for cytokines are generally high with slow dissociation rate constants. This would ensure a high level of occupancy of receptors at relatively low concentrations of cytokine and sufficient time of occupancy to permit signal transduction. In contrast, interactions between cell surface antigens often have very low affinities 68. For instance the interaction between CD2 and its ligand in rodents, CD48, has been estimated to have a Kd ≈ 75 µM and a dissociation rate constant of >6 s^-1 (see ref. 68). Several other affinities are in the same range (Table 8). A wide range of affinities has been reported for the TCR interaction with MHC peptide complexes but these are much lower than for cytokines and their receptors. Several integrins can exist in one of two states with different binding affinities. It is possible to induce the high-affinity "activated state" by a number of stimuli including signals from within the cell, a phenomenon termed "inside-outside signalling" 69.

The interactions between proteins on opposing cells during cell contact are clearly different from the interactions between proteins in solution. In the latter, diffusion is unlikely to be rate limiting whereas for proteins tethered to the cell surface the movement perpendicular to the cell will be restricted by the attachment to the membrane and the diffusion in the other two dimensions will be dependent on the mobility of the protein in the lipid bilayer. This in turn will depend on whether the proteins are linked by GPI anchors (more mobile) or transmembrane sequences (less mobile) and whether some or all of the protein interacts with other proteins, e.g. through their cytoplasmic regions with the cytoskeleton or other cytoplasmic proteins (discussed in ref. 68). The local distribution of the interacting molecules and their accessibility will also affect their ability to interact (see below, Architecture of the leucocyte cell surface).
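The contrast between cytokine receptors and adhesion molecules can be made concrete with a short calculation. The sketch below assumes simple 1:1 binding in solution, for which the equilibrium fractional occupancy is [L]/([L] + Kd) and the half-life of a complex is ln 2/koff. The function names and the 1 nM ligand concentration are illustrative assumptions and do not come from the text; the Kd and koff values are the approximate figures quoted for IL-1/IL-1 receptor and CD2/CD48. Because proteins tethered to opposing membranes do not behave like proteins in solution, the adhesion-molecule numbers should be read as orders of magnitude only.

```python
import math


def fractional_occupancy(ligand_molar: float, kd_molar: float) -> float:
    """Equilibrium fraction of receptors occupied, assuming 1:1 binding in solution."""
    return ligand_molar / (ligand_molar + kd_molar)


def half_life_seconds(koff_per_second: float) -> float:
    """Half-life of a complex that dissociates with first-order rate constant koff."""
    return math.log(2) / koff_per_second


# Assumed free ligand concentration (illustrative only): 1 nM.
LIGAND_MOLAR = 1e-9

# Cytokine-type interaction: Ka about 1e10 M^-1 (Kd = 1e-10 M), koff about 1e-4 s^-1.
cytokine_occupancy = fractional_occupancy(LIGAND_MOLAR, 1e-10)
cytokine_half_life = half_life_seconds(1e-4)

# Adhesion-type interaction (CD2/CD48): Kd about 75 uM; koff taken at its lower bound of 6 s^-1.
adhesion_occupancy = fractional_occupancy(LIGAND_MOLAR, 75e-6)
adhesion_half_life = half_life_seconds(6.0)

print(f"cytokine: occupancy {cytokine_occupancy:.3f}, half-life {cytokine_half_life:.0f} s")
print(f"adhesion: occupancy {adhesion_occupancy:.2e}, half-life {adhesion_half_life:.3f} s")
```

With these inputs the cytokine-type receptor is roughly 90% occupied at 1 nM and retains its ligand for around two hours, whereas the CD2/CD48-type interaction is almost unoccupied at the same concentration and lasts about a tenth of a second.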
Table 8 Monomeric affinities and rate constants for leucocyte membrane protein interactions

Interaction                         Temperature (°C)   Ka (M^-1)         kon (M^-1 s^-1)   koff (s^-1)        Refs
Soluble ligands
IL-1 to IL-1 receptor               8                  10^10             ~10^6             ~10^-4             68
Antibody (OX34) to protein (CD2)    37                 2 x 10^9          4 x 10^5          2 x 10^-4          68
Cell surface ligands
CD2 to CD58 (human)                 37                 4-10 x 10^4       ≥4 x 10^5         ≥4                 70
CD2 to CD48 (rat)                   37                 1.1-1.6 x 10^4    >1 x 10^5         ≥6                 68
CD54 to CD11a/CD18 (LFA-1)          37                 1.1 x 10^4        -                 -                  68
CD80 to CD28                        37                 2.5 x 10^5        >7 x 10^5         >1.6               71
CD80 to CTLA-4                      37                 2.5 x 10^6        >9 x 10^5         >0.43              71
T cell receptor to peptide/MHC      ~25                10^4 to 10^7      10^3 to 10^5      10^-3 to 10^-1     72
CD8αβ to MHC Class I                ~25                5 x 10^4          3 x 10^3          0.05               73
CD62E to sialyl Lex                 ~25                10^4              -                 -                  74
CD62L to GlyCAM-1                   37                 10^4              ≥10^5             ≥10                75
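The three quantities tabulated in Table 8 are linked: for a simple 1:1 interaction Ka = kon/koff, and the dissociation constant is Kd = 1/Ka. The following minimal sketch uses hypothetical function names and values as read from Table 8 to check that relation and to convert Ka into the more familiar micromolar Kd. Where the table gives only lower bounds for kon and koff, the ratio is indicative rather than exact.

```python
def association_constant(kon: float, koff: float) -> float:
    """Ka (M^-1) from kon (M^-1 s^-1) and koff (s^-1), assuming 1:1 binding."""
    return kon / koff


def kd_micromolar(ka: float) -> float:
    """Dissociation constant in micromolar from an association constant in M^-1."""
    return 1e6 / ka


# Antibody OX34 binding CD2: kon = 4e5, koff = 2e-4 (Table 8).
ka_ox34 = association_constant(4e5, 2e-4)

# CD8 alpha/beta binding MHC Class I: kon = 3e3, koff = 0.05 (Table 8).
ka_cd8 = association_constant(3e3, 0.05)

# CD80 binding CD28 and CTLA-4: Ka = 2.5e5 and 2.5e6 (Table 8).
kd_cd28 = kd_micromolar(2.5e5)
kd_ctla4 = kd_micromolar(2.5e6)

print(f"OX34/CD2 Ka = {ka_ox34:.1e} M^-1")
print(f"CD8/MHC Class I Ka = {ka_cd8:.1e} M^-1")
print(f"CD80/CD28 Kd = {kd_cd28:.1f} uM, CD80/CTLA-4 Kd = {kd_ctla4:.1f} uM")
```

The OX34 ratio reproduces the tabulated Ka of 2 x 10^9 M^-1, and the CD8αβ ratio (6 x 10^4 M^-1) agrees with the tabulated 5 x 10^4 M^-1 to within rounding. The CD80 affinities correspond to Kd values of about 4 µM for CD28 and 0.4 µM for CTLA-4.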
Multimeric complexes at the leucocyte surface

Most cell surface antigens are probably present as monomers at the cell surface but there are several examples where the antigens associate to form stable complexes with 2-7 different polypeptides as illustrated in Table 9. Some of the chains are linked by disulfide bonds but others form stable complexes without such bonds, e.g. MHC Class II antigens. The largest complexes are the Fc receptors and the B and T cell antigen receptors. The latter contains a total of six or seven different polypeptide chains, namely, a disulfide-linked heterodimer with antigen binding specificity (TCRαβ), a disulfide-linked homodimer (CD3ζ) or heterodimer (CD3ζη) important for signalling, and three non-covalently associated chains, CD3δ, CD3γ and CD3ε. All these chains are required in order to get cell surface expression. The exact stoichiometry is not known but there are probably two copies of CD3ε per complex 76.

In addition to the relatively stable multimers mentioned above there are many examples where weak associations have been described. Typically these associations can only be demonstrated when the membrane proteins are solubilized with very weak detergents such as digitonin and often when highly sensitive detection methods are used, e.g. protein kinase assays. For instance most of the TM4SF proteins seem to be associated with other cell surface proteins 77. The finding that GPI-anchored proteins can give activation signals on crosslinking and can be co-precipitated with cytoplasmic kinases indicated interactions with other components 17. However these must be indirect as these proteins do not traverse the lipid bilayer and are probably a consequence of the co-localization of these proteins in lipid microdomains which are insoluble in the weak detergents 78.
Table 9 Examples of stable multimeric protein complexes at the cell surface

Covalently associated
  Homodimers                        CD8αα, CD28, CD69, CD162, mIg
  Homotrimers                       Scavenger receptor
  Heterodimers                      CD8αβ, CD94/NKG2, TCRαβ
Non-covalently associated
  Homodimers                        CD10, CD26
  Heterodimers                      Integrins, MHC Class II
  Homotrimers                       CD23, CD154
  Three or more different chains    CD3δ, γ, ε/TCR complex, IL-2R

Stoichiometry of interactions of cell surface antigens

One consequence of the presence of multimers at the cell surface is that they provide variation in the types of interactions with ligands. Table 10 gives examples of cell surface molecules that interact with soluble or membrane molecules with different stoichiometries. This variety of interactions may enable different types of responses to ligand interactions and is very common for the cytokine receptors. A minority of cytokines signal via conformational changes in receptors, e.g. IL-8 binds a G protein-coupled receptor. However the majority of signals transduced by cytokines involve the association of subunits, for example, c-kit ligand forms a dimer and binds a monomeric receptor (CD117) which contains five IgSF domains and a tyrosine kinase domain. This is similar to platelet-derived growth factor receptor (CD140) and c-kit ligand probably also signals by dimerizing the CD117 and activating the cytoplasmic tyrosine kinase 79. TNF and several members of the TNFSF are soluble trimeric proteins and signal by binding and hence associating their monomeric receptors (TNFRSF) 80. However, TNFSF members may also exist as trimers at the cell surface and further complexity is indicated by their ability to form heterotrimers, e.g. between lymphotoxin β and lymphotoxin α (TNFβ) 81.

The possible relevance of multimeric complexes in providing different sensitivities to ligand availability is illustrated by the IL-2R which consists of three different polypeptides; one chain gives low-affinity binding of IL-2 (a monomeric cytokine), the second chain increases the affinity of binding and a third chain mediates signalling. Thus the sensitivity of a cell to IL-2 can potentially be controlled by the relative expression of each polypeptide. Several other cytokine receptors have similar arrangements and often share the signalling polypeptide, e.g. CD130. The precise stoichiometry of many of the cell-cell interactions has not been determined and many arrangements are possible due to the presence of dimeric and trimeric proteins (Table 10).

Table 10 Examples of the stoichiometry of interactions of cell surface proteins

Cell surface antigen   Number of polypeptides   Polypeptides                                   Ligand
CD2                    1                        CD2                                            CD48 (monomer)
CD117 (c-kit)          1                        CD117                                          c-kitL (dimer)
CD120 (TNFR)           1                        CD120                                          TNF (trimer)
IL-2R                  3                        CD25, CD122, CD132                             IL-2 (monomer)
TCR/CD3                8                        TCRα, TCRβ, CD3δ, CD3ε(2), CD3γ, CD3ζ(2)       MHC + peptide (monomer)
CD162 (PSGL-1)         2                        Dimer                                          CD62P (monomer)
ARCHITECTURE OF THE LEUCOCYTE CELL SURFACE
The binding or enzymatic activities of leucocyte antigens will be affected by factors such as their abundance, accessibility and distribution on the cell surface. On the basis of the accumulated molecular data given in this book it is possible to gain a view of how the cell surface looks with its various molecular forms. In reviewing this subject attention will be focused mainly on the surface proteins of thymocytes and resting T lymphocytes since the membranes of these cells are probably the best characterized of all the leucocytes.
Abundance of leucocyte surface antigens

What proportion of the T cell surface has been accounted for in terms of characterized molecules? It has been estimated that T and B cells differ only in the expression of about 200-300 different genes (see ref. 82). As only about 3% of the mRNA of lymphocytes is in the membrane bound polysome fraction, many of these T cell-specific molecules will not be at the cell surface. If this is the case, most of the surface molecules are probably already known. We believe this is likely, at least in relation to molecules that are susceptible to easy detection by flow cytometry (~5000 molecules per cell), for the following reasons. First, the major bands on T and B cell surfaces identified by radiolabelling of protein or carbohydrate can be accounted for by known molecules 32. Secondly, with mAbs made against cells from a variety of species, essentially the same antigens have been discovered. If the known antigens were a minor set of a large total pool that remains to be discovered, one might expect much less overlap between antigens identified in immunizations performed between different species or in different ways. Thirdly, estimates of the surface area of a T lymphocyte covered by known cell surface molecules suggest that much of the surface can be accounted for. Relevant figures for rat thymocytes and T and B lymphocytes are given in Table 11 with areas for molecules derived on the basis of the molecular dimensions discussed below. Obviously there may be errors in these figures but these estimates suggest that known molecules could cover up to 60% of the T lymphocyte surface, if one assumes that the membrane surface is smooth. However scanning electron microscopy shows that it is more likely to be covered with microvilli. In the F11 hybrid neuronal line where the surface area was actually measured, it was 3-4-fold greater than that of a smooth sphere 83. If one assumes a similar figure for lymphocytes and takes into account the fact that many of the molecules are probably not layered over the surface but project outwards from the cell, a figure of about 20% is probably the best estimate of the amount of surface covered by the known, well-characterized proteins. It is notable that the majority of this is due to a few major molecules and it is unlikely that further molecules of this abundance remain to be identified. The proteins will not cover the whole surface, as glycolipids are also readily recognized by antibodies reacting with lymphocytes 84,85.

The levels of expression of cell surface antigens often change rapidly on, for instance, activation of T cells. In addition there are cases where intracellular pools can be directed to the surface giving more rapid increase in cell surface expression as illustrated for CD62P on activation of platelets and endothelial cells. The selectins also provide an example where levels of surface expression are modified by enzyme cleavage and this is discussed above in the section on enzymatic activity of cell surface antigens.

Table 11 Site numbers for lymphocyte surface antigens and an estimate of the surface area they may cover

                 Thymocytes               T cells                  B cells
Antigen          Numbers       % area     Numbers       % area     Numbers       % area
mIg              -             -          -             -          7 x 10^4      8.2
MHC Class I      -             -          2 x 10^5      5.8        2 x 10^5      5.8
MHC Class II     -             -          -             -          2.4 x 10^5    7.0
Thy-1            10^6          24         -             -          -             -
CD2              1.4 x 10^4    0.3        1.4 x 10^4    0.3        -             -
CD4              1.5 x 10^4    0.5        3 x 10^4      1.0        -             -
CD8              4 x 10^4      1.3        10^5          3.3        -             -
CD43             10^5          19         1.5 x 10^5    28         -             -
CD44             1.3 x 10^4               *                        *
CD45             7 x 10^4      13         10^5          23         10^5          27
MRC OX2          1.4 x 10^4    0.3        -             -          5 x 10^3      0.08

Values for the site numbers of antigens present on resting rat lymphocytes were determined by binding of [125I]Fab (for CD4 and Thy-1) or by quantitative radioimmunoassay using the numbers obtained with [125I]Fab for CD4 or Thy-1 as standards 86. The dashes indicate that the antigen is not present on that cell type. Where an antigen is present on a subpopulation of cells the number given is calculated per positive cell. Additional data are from refs 87 and 88 and unpublished. The asterisk denotes that the site number for rat CD44 has not been determined on T cells or B cells but from flow cytometry analysis and comparison with other markers, it is probably in the order of 8 x 10^4 on thoracic duct lymphocytes. The percentage of the surface that each antigen may cover was calculated as follows with further information in refs 42, 89 and 90. The surface area of a lymphocyte was assumed to be 120 µm^2, with a volume of 125 µm^3 (ref. 86). For these calculations Thy-1 was assumed to be a sphere of radius 3 nm; CD43 a rod of 45 x 5 nm 42; CD45, thymocyte form 28 x 8 nm, T cell form 35 x 8 nm, and B cell form 41 x 8 nm 90,91; CD4 13 x 3 nm 92; CD2 7 x 3 nm 93; MHC Class I 7 x 5 nm 94; MHC Class II 7 x 5 nm 95. CD8 is assumed to have the same dimensions as CD4, OX2 the same as CD2. The size of IgM is estimated as 3.5 Fab fragments of 8 x 5 nm 96. The molecular size of CD44 has not been estimated but it could be quite large due to its post-translational modifications. These calculations are approximate and should be used as an indicator as to the proportion of the surface covered by an antigen. It is notable that a few major molecules make up the majority of the exposed cell surface proteins. The percentages are probably an overestimate as the surface of the lymphocyte could be considerably larger than the smooth sphere assumed due to the presence of microvilli, e.g. possibly 3-4-fold greater 83, and the cell surface molecules may also project away from the surface at least for some of the time.
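The surface coverage estimates in Table 11 can be reproduced with a few lines of arithmetic. The sketch below is a minimal illustration under stated assumptions: the function and constant names are hypothetical, each rod-shaped molecule is taken to lie flat so that its footprint is the product of its two quoted dimensions, and the spherical Thy-1 is given the footprint of a disc of radius 3 nm. These footprint choices are an interpretation of the Table 11 legend and not a method stated in the text, although they do return the tabulated percentages.

```python
import math

# Smooth-sphere lymphocyte surface area assumed in the Table 11 legend.
CELL_AREA_UM2 = 120.0


def percent_area(sites_per_cell: float, footprint_nm2: float,
                 cell_area_um2: float = CELL_AREA_UM2) -> float:
    """Percentage of the cell surface covered by one antigen."""
    covered_um2 = sites_per_cell * footprint_nm2 / 1e6  # 1 um^2 = 1e6 nm^2
    return 100.0 * covered_um2 / cell_area_um2


# Footprints (nm^2); the "lying flat" geometry is an assumption of this sketch.
THY1_FOOTPRINT_NM2 = math.pi * 3.0 ** 2   # sphere of radius 3 nm seen as a disc
CD43_FOOTPRINT_NM2 = 45.0 * 5.0           # rod of 45 x 5 nm
CD45_T_FOOTPRINT_NM2 = 35.0 * 8.0         # T cell form, 35 x 8 nm

thy1_thymocyte = percent_area(1e6, THY1_FOOTPRINT_NM2)
cd43_t_cell = percent_area(1.5e5, CD43_FOOTPRINT_NM2)
cd45_t_cell = percent_area(1e5, CD45_T_FOOTPRINT_NM2)

# T cell "% area" column of Table 11: MHC Class I, CD2, CD4, CD8, CD43, CD45.
T_CELL_PERCENT = [5.8, 0.3, 1.0, 3.3, 28.0, 23.0]
smooth_total = sum(T_CELL_PERCENT)

# A real surface area 3-4-fold larger than the smooth sphere scales coverage down.
microvilli_total_3x = smooth_total / 3.0
microvilli_total_4x = smooth_total / 4.0

print(f"Thy-1 on thymocytes: {thy1_thymocyte:.1f}%")
print(f"CD43 on T cells: {cd43_t_cell:.1f}%  CD45 on T cells: {cd45_t_cell:.1f}%")
print(f"T cell total: {smooth_total:.1f}% smooth, "
      f"{microvilli_total_4x:.1f}-{microvilli_total_3x:.1f}% with microvilli")
```

The three computed values round to the 24%, 28% and 23% given in Table 11. The T cell column sums to about 60% for a smooth sphere and falls to roughly 15-20% once a 3-4-fold larger membrane area is allowed for, in line with the estimate of about 20% given in the text.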
Dimensions of cell surface glycoproteins

The dimensions of many cell surface glycoproteins can now be accurately assessed on the basis of tertiary structure determination of domains, or electron microscopy studies on whole proteins or their extracellular regions. Some of the leucocyte surface molecules with known dimensions are shown in Fig. 9. It is clear that there is a wide variation in size of leucocyte surface antigens and their degree of glycosylation. This, together with the wide range in their abundance (Table 11), will have implications in cell-cell interactions. The TCR and some of the accessory molecules involved in antigen recognition are relatively small. This is illustrated in Fig. 10 which shows the approximate dimensions of some molecules that are involved in B and T cell activation and adhesion between these cells. The dimensions are based on the data in Fig. 9.

The recent determination of the structure of the TCR/peptide/MHC ternary complex indicates that the distance between opposing cells at the site of T cell-antigen-presenting cell interaction is about 14 nm 97,98. A similar distance (14 nm) has been estimated for the CD2/CD48 complex on the basis of structural and mutagenesis data 93,99 and the CD2/CD58 complex is likely to be similar in size. Structural and mutagenesis data suggest that the CD4/MHC Class II complex will have a similar size 100. Fewer data are available for CD8/MHC Class I and CD28/CD80 interactions but their predicted structures are consistent with sizes similar to the TCR/MHC complex 101,102. The CD40/CD154 interaction probably has similar dimensions based on analogy with the structure of the TNFR/TNF complex 80. This has an overall length of 8.5 nm but there is a short hinge-like sequence in CD154 which may help to span the intercellular distance.

Whereas larger adhesion pairs such as LFA-1/CD54 may mediate the initial contacts between cells it is likely that close membrane approximation requires interactions of the smaller molecules such as CD2, CD28 and CD4 or CD8. Furthermore, large, abundant molecules such as CD43 will either need to lie flat on the membrane or move away from the TCR in order to allow close approximation of the membranes. The finding that T cells from a CD43 knockout animal are easier to stimulate and stickier is consistent with the idea that CD43 provides a barrier to the close interactions needed in T cell activation 103.

It is also interesting to consider how this recognition of antigen by T cells might differ from the triggering of B cells by antigens. Figure 10 also shows the dimensions of some molecules involved in B cell antigen recognition. B cell triggering is potentially quite different to that of T cells since presumably surface Ig is crosslinked to give a triggering reaction via antigen complexes. This activation is controlled by interactions of the complement and Fc receptors giving stimulatory and inhibitory signals respectively 104. In this scheme, B cell triggering via the B cell antigen receptor will occur away from the tight interface between the B cell (acting as an antigen presenting cell) and a T helper cell where T cell activation and lymphokine production occurs.

[Figure 9: schematic drawing. Labelled molecules and approximate heights: CR1 (CD35) (85 nm, 14 more domains indicated); CD45 (L-CA), with extra segments up to about 23 nm in total; CD43 (45 nm); PSGL-1 (50 nm); CD62P (38 nm); LFA-1 (CD11a/CD18) (21 nm); CD54 (19 nm); CD4 (13 nm); TCR (7 nm); CD2 (7 nm); MHC Class I (7 nm); IgM; ganglioside.]

Figure 9 Schematic view of molecules at the surface of leucocytes for which structural data are available from electron microscopy or X-ray crystallography studies. The molecules shown will not all be expressed simultaneously on the same cell and this figure is solely to illustrate the different sizes of common leucocyte surface antigens. The abundance of the molecules varies considerably and this is illustrated in Table 11. The molecules are drawn roughly to size and shape with the approximate height of the molecule from the cell surface indicated. The N-linked and O-linked glycosylation sites are indicated. These are not drawn to scale but are shown to indicate the degree of glycosylation. See Figs 5-7 for an idea of the size and types of glycosylation that occur on leucocyte surface proteins. The overall dimensions of the molecules determined by electron microscopy include the contribution for carbohydrate. The model for ganglioside GM1 is based on the structure and the size of sugar residues (see Fig. 5) 84. The "P" denotes the phosphotyrosine phosphatase domains in the CD45 cytoplasmic domain. The models are based on the following data: separate X-ray crystallography studies of the first two 92,114 and second two domains of CD4 115; X-ray crystallography of CD2 93; X-ray crystallography of MHC Class I 94; X-ray crystallography of TCR/MHC Class I complex 97,98; and electron microscopy of CD35 116, CD43 42, CD45 90,91,117, CD54 118, CD62P 119 and PSGL-1 (CD162) 120. The CD11a/CD18 model is based on the size determined by electron microscopy for another integrin, the fibronectin receptor 121. The IgM model is based on the size determined for Fab fragments of IgG 96.

Figure 10 Models for molecules involved in T and B lymphocyte activation. The dimensions are based on the approximate sizes determined for the molecules in Fig. 9. IgSF domains are indicated by shaded ovals, CCP domains by lightly shaded ovals (CD21) and the fibronectin type III domains in CD45 by clear ovals (the NH2 domain in CD45 shows no sequence similarities and is also indicated by an oval). The CD40/CD154 interaction is based on the structure of TNFR/TNF (see text) and is indicated by the trimeric CD154 binding three copies of the monomeric CD40. See Chapter 3 for more details of the superfamilies and their structures.
Distribution of molecules at cell surfaces

Cell surface molecules are not all evenly distributed over the cell surface and there is evidence that the local distribution can play a crucial role in their function. The initial tethering and rolling of leucocytes to endothelial cells under flow conditions can only be mediated by a subset of adhesion molecules on the surfaces of these cells. Interestingly the glycoproteins on leucocytes which participate in adhesion under flow are found on the tips of microvilli (e.g. CD62L, PSGL-1 (CD162) and α4β1). In contrast, adhesion molecules excluded from the microvilli (e.g. αMβ2) are not able to mediate adhesion under flow although αMβ2 is involved in later adhesion events 105. Furthermore mutant CD62L (L-selectin) molecules that are excluded from microvilli are much less effective at mediating rolling and tethering of leucocytes under flow 106.
References

1 Singer, S.J. (1990) The structure and insertion of integral proteins in membranes. Annu. Rev. Cell Biol. 6, 247-296.
2 Chan, A.C. et al. (1994) The role of protein tyrosine kinases and protein tyrosine phosphatases in T cell antigen receptor signal transduction. Annu. Rev. Immunol. 12, 555-592.
3 Superti-Furga, G. and Courtneidge, S.A. (1995) Structure-function relationships in Src family and related protein tyrosine kinases. Bioessays 17, 321-330.
4 Hitt, A.L. et al. (1994) Ponticulin is the major high affinity link between the plasma membrane and the cortical actin network in Dictyostelium. J. Cell Biol. 126, 1433-1444.
5 Picot, D. et al. (1994) The X-ray crystal structure of the membrane protein prostaglandin H2 synthase-1. Nature 367, 243-249.
6 Blobel, G. (1980) Intracellular protein topogenesis. Proc. Natl Acad. Sci. USA 77, 1496-1500.
7 Shapiro, L. et al. (1996) Crystal structure of the extracellular domain from P0, the major structural protein of peripheral nerve myelin. Neuron 17, 435-449.
8 von Heijne, G. (1986) A new method for predicting signal sequence cleavage sites. Nucleic Acids Res. 14, 4683-4690.
9 Ogata, S. et al. (1989) Primary structure of rat liver dipeptidyl peptidase IV deduced from its cDNA and identification of the NH2-terminal signal sequence as the membrane-anchoring domain. J. Biol. Chem. 264, 3596-3601.
10 Dohlman, H.G. et al. (1991) Model systems for the study of seven-transmembrane-segment receptors. Annu. Rev. Biochem. 60, 653-688.
11 Havelka, W.A. et al. (1995) Three-dimensional structure of halorhodopsin at 7 Å resolution. J. Mol. Biol. 247, 726-738.
12 McKnight, A.J. and Gordon, S. (1996) EGF-TM7: a novel subfamily of seven-transmembrane-region leukocyte cell surface molecules. Immunol. Today 17, 283-287.
13 Wright, M.D. et al. (1993) Gene structure, chromosomal location and protein sequence of mouse CD53: evidence that the transmembrane 4 superfamily arose by gene duplication. Int. Immunol. 5, 209-216.
14 Tomlinson, M.G. et al. (1993) Epitope mapping of anti-rat CD53 monoclonal antibodies. Implications for the membrane orientation of the transmembrane 4 superfamily. Eur. J. Immunol. 23, 136-140.
15 Hupp, K. et al. (1989) Gene mapping of the three subunits of the high affinity FcR for IgE to mouse chromosomes 1 and 19. J. Immunol. 143, 3787-3791.
16 Gottesman, M.N. and Pastan, I. (1988) The multidrug transporter, a double-edged sword. J. Biol. Chem. 263, 12163-12166.
17 Ferguson, M.A. and Williams, A.F. (1988) Cell-surface anchoring of proteins via glycosyl-phosphatidylinositol structures. Annu. Rev. Biochem. 57, 285-320.
18 Homans, S.W. et al. (1988) Complete structure of the glycosyl phosphatidylinositol membrane anchor of rat brain Thy-1 glycoprotein. Nature 333, 269-272.
19 Roberts, W.L. et al. (1988) Lipid analysis of the glycoinositol phospholipid membrane anchor of human erythrocyte acetylcholinesterase. Palmitoylation of inositol results in resistance to phosphatidylinositol-specific phospholipase C. J. Biol. Chem. 263, 18766-18775.
20 Caras, I.W. and Weddell, G.N. (1989) Signal peptide for protein secretion directing glycophospholipid membrane anchor attachment. Science 243, 1196-1198.
21 Wallner, B.P. et al. (1987) Primary structure of lymphocyte function-associated antigen 3 (LFA-3). The ligand of the T lymphocyte CD2 glycoprotein. J. Exp. Med. 166, 923-932.
22 Seed, B. (1987) An LFA-3 cDNA encodes a phospholipid-linked membrane protein homologous to its receptor CD2. Nature 329, 840-842.
23 Brown, D.A. and Rose, J.K. (1992) Sorting of GPI-anchored proteins to glycolipid-enriched membrane subdomains during transport to the apical cell surface. Cell 68, 533-544.
24 Lisanti, M.P. et al. (1989) A glycophospholipid membrane anchor acts as an apical targeting signal in polarized epithelial cells. J. Cell Biol. 109, 2145-2156.
25 Parton, R. and Simons, K. (1995) Digging into Caveolae. Science 269, 1398-1399.
26 Resh, M.D. (1994) Myristylation and palmitylation of Src family members: The fats of the matter. Cell 76, 411-413.
27 Shenoy-Scaria, A.M. et al. (1994) Cysteine3 of Src family protein tyrosine kinases determines palmitoylation and localization in caveolae. J. Cell Biol. 126, 353-363.
28 Nadler, M. et al. (1993) Treatment of T cells with 2-hydroxymyristic acid inhibits the myristoylation and alters the stability of p56(lck). Biochemistry 32, 9250-9255.
29 Shenoy-Scaria, A.M. et al. (1993) Palmitylation of an amino-terminal cysteine motif of protein tyrosine kinases p56(lck) and p59(fyn) mediates interaction with glycosyl-phosphatidylinositol-anchored proteins. Mol. Cell. Biol. 13, 6385-6392.
30 McLaughlin, S. and Aderem, A. (1995) The myristoyl-electrostatic switch: a modulator of reversible protein-membrane interactions. Trends Biochem. Sci. 20, 272-276.
31 Gahmberg, C.G. et al. (1976) Characterization of surface glycoproteins of mouse lymphoid cells. J. Cell Biol. 68, 642-653.
32 Woollett, G.R. et al. (1985) Molecular and antigenic heterogeneity of the rat leukocyte-common antigen from thymocytes and T and B lymphocytes. Eur. J. Immunol. 15, 168-173. 33 Barclay, A.N. et al. (1987) Lymphocyte specific heterogeneity in the rat leucocyte common antigen (T200) is due to differences in polypeptide sequences near the NH2-terminus. EMBO J. 6, 1259-1264. 34 Jackson, D.I. and Barclay, A.N. (1989)The extra segments of sequence in rat leucocyte common antigen (L-CA) are derived by alternative splicing of only three exons and show extensive O-linked glycosylation. Immunogenetics 29, 281-287. 3s Ralph, S.J. et al. (1987) Structural variants of human T200 glycoprotein (leukocyte-common antigen). EMBO J. 6, 1251-1257. 36 Toda, K. et al. (1984) Monoclonal anti-glycoprotein antibody that blocks cell adhesion in Polysphondylium pallidum. Eur. J. Biochem. 140, 73-81. 37 Barnstable, C.J. et al. (1978) Production of monoclonal antibodies to group A erythrocytes, HLA and other human cell surface antigens - new tools for genetic analysis. Cell 14, 9-20. 38 Gooi, H.C. et al. (1985) Differing reactions of monoclonal anti-A antibodies with oligosaccharides related to blood group A. J. Biol. Chem. 260, 13218-13224. 39 Sharkey, D.J. and Kornfeld, R. (1991) Developmental regulation of processing alpha-mannosidases and "intersecting" N-acetylglucosaminyltransferase in Dictyostelium discoideum. J. Biol. Chem. 266, 18477-18484. 4o Couso, R. et al. (1987)The high mannose oligosaccharides of Dictyostelium discoideum glycoproteins contain a novel intersecting N-acetylglucosamine residue. J. Biol. Chem. 262, 4521-4527. 41 Galili, U. et al. (1993) One percent of human circulating B lymphocytes are capable of producing the natural anti-Gal antibody. Blood 82, 2485-2493. 42 Cyster, J.G. et al. (1991)The dimensions of the T lymphocyte glycoprotein leukosialin and identification of linear protein epitopes that can be modified by glycosylation. EMBO J. 10, 893-902. 
43 O'Connell, P.J. et al. (1991) Variable O-glycosylation of CD 13 (aminopeptidase N). J. Biol. Chem. 266, 4593-4597. 44 Cyster, J.G. et al. (1994) Antigenic determinants encoded by alternatively spliced exons of CD45 are determined by the polypeptide but influenced by glycosylation. Int. Immunol. 6, 1875-1881. 4s Barclay, A.N. et al. (1976) Chemical characterization of the Thy-1 glycoproteins from the membranes of rat thymocytes and brain. Nature 263, 563-567. 46 Parekh, R.B. et al. (1987) Tissue-specific N-glycosylation, site-specific oligosaccharide patterns and lentil lectin recognition of rat Thy-1. EMBO J. 6, 1233-1244. 47 Fukuda, M. et al. (1986) Structures of O-linked oligosaccharides isolated from normal granulocytes, chronic myelogenous leukemia cells, and acute myelogenous leukemia cells. J. Biol. Chem. 261, 12796-12806. 48 Carlsson, S.R. et al. (1986) Structural variations of O-linked oligosaccharides present in leukosialin isolated from erythroid, myeloid, and T-lymphoid cell lines. J. Biol. Chem. 261, 12787-12795. 49 Varki, A. (1994) Selectin ligands. Proc. Natl Acad. Sci. USA 91, 7390-7397. so Lasky, L.A. (1995) Selectin-carbohydrate interactions and the initiation of the inflammatory response. Annu. Rev. Biochem. 64, 113-139. sl Powell, L. and Varki, A. (1995)I-type lectins. J. Biol. Chem. 270, 14243-14246.
s2 Dowbenko, D. et al. (1993)Glycosylation-dependent cell adhesion molecule 1 (GlyCAM-1) mucin is expressed by lactating m a m m a r y gland epithelial cells and is present in milk. J. Clin. Invest. 92, 952-960. s3 Hemmerich, S. et al. (1995) Structure of the O-glycans in GlyCAM-1, an endothelial-derived ligand for L-selectin. J. Biol. Chem. 270, 12035-12047. s4 Rosen, S. and Bertozzi, C. (1996) Leukocyte adhesion: Two selectins converge on sulfate. Curr. Biol. 6, 261-264. ss Steegmaier, M. et al. (1995)The E-selectin-ligand ESL-1 is a variant of a receptor for fibroblast growth factor. Nature 373, 615-620. s6 Vachino, G. et al. (1995) P-selectin glycoprotein ligand-1 is the major counter-receptor for P-selectin on stimulated T cells and is widely distributed in non-functional form on many lymphocytic cells. J. Biol. Chem. 270, 21966-21974. s7 Pouyani, T. and Seed, B. (1995) PSGL-1 recognition of P-selectin is controlled by a tyrosine sulfation concensus at the PSGL-1 amino terminus. Cell 83, 333-343. s8 Sako, D. et al. (1995)A sulfated peptide segment at the amino terminus of PSGL-1 is critical for P-selectin binding. Cell 83, 323-331. s9 Kanzaki, M. et al. (1995) Expression of calcium-permeable cation channel CD20 accelerates progression through the G(1) phase in Balb/c 3T3 cells. J. Biol. Chem. 270, 13099-13104. 6o Chen, A. et al. (1995) Structural requirements regulate endoproteolytic release of the L-selectin receptor from the surface of leukocytes. J. Exp. Med. 182, 519-530. 61 Arribas, J. et al. (1996) Diverse cell surface protein ectodomains are shed by a system sensitive to metallopreotease inhibitors. J. Biol. Chem. 271, 11376-11382. 62 Shipp, M.A. and Look, A.T. (1993)Hematopoietic differentiation antigens that are membrane-associated enzymes: Cutting is the key! Blood 82, 1052-1070. 63 Deterre, P. et al. (1996) Coordinated regulation in human T cells of nucleotidehydrolyzing ecto-enzymatic activities, including CD38 and PC-1. 
Possible role in the recycling of nicotinamide adenine dinucleotide metabolites. J. Immunol. 157, 1381-1388. 64 Wang, F. and Guidotti, G. (1996)CD39 is an ecto (Ca ?++,Mg~+§ J. Biol. Chem. 271, 9898-9901. 6s Callard, R. and Gearing, A. (1994)The Cytokine Factsbook. Academic Press, London. 66 Magdolen, V. et al. (1996) Systematic mutational analysis of the receptor-binding region of the human urokinase-type plasminogen activator. Eur. J. Biochem. 23 7, 743-751. 67 Hamann, J. et al. (1996) The seven-span transmembrane receptor CD97 has a cellular ligand (CD55, DAF). J. Exp. Med. 184, 1-5. 68 van der Merwe, P.A. and Barclay, A.N. (1994)Transient inter-cellular adhesion the importance of weak protein-protein interactions. Trends Biochem. Sci. 19, 354-358. 69 Schwartz, M. et al. (1995) Integrins: Emerging paradigms of signal transduction. Annu. Rev. Cell Dev. Biol. 11,549-599. 7o van der Merwe, P.A. et al. (1994) The human cell adhesion molecule CD2 binds CD58 (LFA-3) with a very low affinity and an extremely fast dissociation rate but does not bind CD48 or CD59. Biochemistry 33, 10149-10160. 71 van der Merwe, P. et al. (1997)CD80 (B7-1)binds both CD28 and CTLA-4 with a low affinity and very fast kinetics. J. Exp. Med. 185, 393-403.
72 Fremont, D. et al. (1996) Biophysical studies of T cell receptors and their ligands. Curr. Opin. Immunol. 8, 93-100. 73 Garcia, K. et al. (1996) CD8 enhances formation of stable T cell receptor/MHC class I molecule complexes. Nature 384, 577-581. 74 Jacob, G.S. et al. (1995)Binding of sialyl Lewis X to E-selectin as measured by fluorescence polarization. Biochemistry 34, 1210-1217. 7s Nicholson, M. et al. unpublished data 76 Malissen, B. and Malissen, M. (1996) Functions of TCR and pre-TCR subunits: lessons from gene ablation. Curr. Opin. Immunol. 8, 383-393. 77 Wright, M.D. and Tomlinson, M.G. (1994)The ins and outs of the transmembrane 4 superfamily. Immunol. Today 15, 588-594. zs Cerny, J. et al. (1996) Noncovalent associations of T lymphocyte surface proteins. Eur. J. Immunol. 26, 2335-2343. 79 Arakawa, T. et al. (1991) Glycosylated and unglycosylated recombinant-derived human stem cell factors are dimeric and have extensive regular secondary structure. J. Biol. Chem. 266, 18942-18948. s o Banner, D.W. et al. (1993)Crystal structure of the soluble human 55kd TNF receptor-human TNFfl complex: implications for TNF receptor activation. Cell 73, 431-445. sl Browning, J. et al. (1993) Lymphotoxin fl, a novel member of the TNF family that forms a heteromeric complex with lymphotoxin on the cell surface. Cell 72, 847-856. s2 Hedrick, S.M. et al. (1984) Isolation of cDNA clones encoding T cell-specific membrane-associated proteins. Nature 308, 149-153. s3 Yang, P. et al. (1992) Intercellular space is affected by the polysialic acid content of NCAM. J. Cell Biol. 116, 1487-1496. s4 Stein-Douglas, K.E. et al. (1976) Gangliosides as markers of murine subpopulations. J. Exp. Med. 143, 822-832. ss Hershey, P. et al. (1989) Augmentation of lymphocyte responses by monoclonal antibodies to the gangliosides GD3 and GD2: the role of protein kinase C, cyclic nucleotides and intracellular calcium. Cell. Immunol. 119, 263-278. s6 Williams, A.F. and Barclay, A.N. 
(1986)Glycoprotein antigens of the lymphocyte surface and their purification by antibody affinity chromatography. In Handbook of Experimental Immunology (Weir, D.M. ed.), Blackwell Scientific Publications, Oxford, pp. 22.1-22.24. s7 Paterson, D.J. et al. (1987)Antigens of activated rat T lymphocytes including a molecule of 50,000 Mr detected only on CD4 positive T blasts. Mol. Immunol. 24, 1281-1290. ss Clark, S.J. et al. (1988) Activation of rat T lymphocytes by anti-CD2 monoclonal antibodies. J. Exp. Med. 167, 1861-1872. s9 Williams, A.F. et al. (1987)Similarities in sequences and cellular expression between rat CD2 and CD4 antigens. J. Exp. Med. 165, 368-380. 90 Woollett, G.R. et al. (1985)Visualisation by low-angle shadowing of the leucocytecommon antigen. A major cell surface glycoprotein of lymphocytes. EMBO J. 4, 2827-2830. 91 McCall, M.N. et al. (1992) Epression of soluble forms of rat CD45. Analysis by electron microscopy and use in epitope mapping of anti-CD45R monoclonal antibodies. Immunology 76, 310-317. 92 Wang, J. et al. (1990) Atomic structure of a fragment of human CD4 containing two immunoglobulin-like domains. Nature 348, 411-418.
93 Jones, E.Y. et al. (19920) Crystal structure of a soluble form of the cell adhesion molecule CD2 at 2.8 A. Nature 360, 232-239. 94 Bjorkman, P.J. et al. (1987) Structure of the human class I histocompatibility antigen, HLA-A2. Nature 329, 506-512. 9s Brown, J.H. et al. (1993)Three-dimensional structure of the human class II histocompatibility antigen HLA-DR1. Nature 364, 33-39. 96 Amzel, L.M. and Poljak, R.J. (1979)Three-dimensional structure of immunoglobulins. Annu. Rev. Biochem. 48, 961-997. 97 Garcia, K. et al. (1996) An c~flT cell receptor structure at 2.5 A and its orientation in the TCR-MHC complex. Science 274, 209-219. 98 Garboczi, D. et al. (1996) Structure of the complex between T-cell receptor, viral peptide and HLA-A2. Nature 384, 134-141. 99 van der Merwe, P.A. et al. (1995)Topology of the CD2-CD48 cell-adhesion molecule complex: implications for antigen recognition by T cells. Curr. Biol. 5, 74-84. loo Doyle, C. and Strominger, J.L. (1987) Interaction between CD4 and class II MHC molecules mediates cell adhesion. Nature 330, 256-259. lol Norment, A.M. et al. (1988) Cell-cell adhesion mediated by CD8 and MHC class I molecules. Nature 336, 79-81. lo2 Linsley, P.S. et al. (1990) T-cell antigen CD28 mediates adhesion with B cells by interacting with activation antigen B7/BB-1. Proc. Natl Acad. Sci. USA 87, 5031-5035. loa Manjunath, N. et al. (1995) Negative regulation of T-cell adhesion and activation by CD43. Nature 377, 535-538. lo4 Doody, G. et al. (1996) Activation of B lymphocytes: integrating signals from CD19, CD22 and FcTRIIbl. Curr. Opin. Immunol. 8, 378-382. los Erlandsen, S.L. et al. (1993)Detection and spatial distribution of the beta 2 integrin (Mac-1) and L-selectin (LECAM-1) adherence receptors on human neutrophils by high-resolution field emission SEM. J. Histochem. Cytochem. 41, 327-333. lo6 von-Andrian, U.H. et al. (1995) A central role for microvillous receptor presentation in leukocyte adhesion under flow. Cell 82, 989-999. 
lo7 Misumi, Y. et al. (1990)Primary structure of human placental 5'-nucleotidase and identification of the glycolipid anchor in the mature form. Eur. J. Biochem. 191, 563-569. l o 8 Stahl, N. and Prusiner, S.B. (1991) Prions and prion proteins. FASEB. J. 5, 2799-2807. lo9 Gerber, L.D. et al. (1992) Phosphatidylinositol glycan (PI-G) anchored membraneamino-acid-requirements adjacent to the site of cleavage and. J. Biol. Chem. 267, 12168-12173. 11o Sugita, Y. et al. (1993)Determination of carboxyl-terminal residue and disulfide bonds of MACIF (CD59), a glycosyl-phosphatidylinositol-anchored protein. J. Biochem. 114, 473-477. 111 Xia, M.Q. et al. (1993) Structure of the CAMPATH-1 antigen, a glycosylphosphatidylinositol-anchored glycoprotein which is an exceptionally good target for complement lysis. Biochem. J. 293, 633-640. 112 Rademacher, T.W. et al. (1988) Glycobiology. Annu. Rev. Biochem. 57, 785-838. 11a Schachter, H. and Brockhausen, I. (1989) The biosynthesis of branched O-glycans. Symp. Soc. Exp. Biol. 43, 1-26.
114 Ryu, S.E. et al. (1990) Crystal structure of an HIV-binding recombinant fragment of human CD4. Nature 348, 419-426. 115 Brady, R.L. et al. (1993) Crystal structure of domains 3 and 4 of rat CD4: relationship to the NH2-terminal domains. Science 260, 979-983. 116 Weisman, H.F. et al. (1990) Soluble human complement receptor type 1: in vivo inhibitor of complement suppressing post-ischemic myocardial inflammation and necrosis. Science 249, 146-151. 117 Barford, D. et al. (1994) Crystal-structure of human protein-tyrosine-phosphatase 1B. Science 263, 1397-1404. 118 Kirchhausen, T. et al. (1993) Location of the domains of ICAM-1 by immunolabeling and single-molecule electron microscopy. J. Leukocyte Biol. 53, 342-346. 119 Ushiyama, S. et al. (1993) Structural and functional characterization of monomeric soluble P-selectin and comparison with membrane P-selectin. J. Biol. Chem. 268, 15229-15237. 120 Li, F. et al. (1996) Visualization of P-selectin glycoprotein ligand-1 as a highly extended molecule and mapping of protein epitopes for monoclonal antibodies. J. Biol. Chem. 271, 6342-6348. 121 Nermut, M.V. et al. (1988) Electron microscopy and structural model of human fibronectin receptor. EMBO J. 7, 4093-4099.
Section II
THE LEUCOCYTE ANTIGENS
CD1
T6
Molecular weights
Polypeptide (CD1a)      35 323
SDS-PAGE
  reduced               43-49 kDa
  unreduced             43-49 kDa
Carbohydrate
  N-linked sites        4
  O-linked              nil
Human gene location and size
1q22-23; five genes within 190 kb 1
[Figure: domains and exon boundaries of CD1a (labels include C1, TM and CY)]
Tissue distribution The CD1 antigens are expressed on cortical thymocytes, and expression is inversely correlated with that of TCR and MHC Class I 1,2. CD1 antigens are also expressed on some dendritic cells and cytokine-activated monocytes 1,2. CD1c is expressed on B cells 1,2. CD1d, but not CD1a, b or c, is expressed on intestinal epithelium 1,2. Different CD1 molecules can be coexpressed on the same cell 1. Surface expression has been demonstrated for human CD1a, b, c and d 1.
Structure CD1 has a domain organization similar to that of MHC Class I and is expressed in association with β2-microglobulin 1,2. It shows comparable levels of sequence similarity to both MHC Class I and Class II. The CD1 genes form a multigene family with five genes in human, two in mouse (both homologues of human CD1d) and eight in rabbit 1. However, unlike MHC Class I, CD1 does not show significant polymorphism 1,2. The N-terminus of human CD1a has been determined 1.
Ligands and associated molecules Some T cells recognize antigen in a CD1-restricted manner 2. Mouse NK1+ T lymphocytes, which have a restricted TCR repertoire, recognize murine CD1, although there are conflicting data 3,4. Antigens presented by human CD1b were identified as microbial lipids 1,2,5. A recombinant form of murine CD1 bound synthetic peptides with micromolar affinities 6. It showed a preference for longer peptides than seen for Class I, and in this respect resembles Class II peptide binding.
Function As suggested by its similarity to MHC antigens, and as inferred above, there is evidence for roles in presentation of lipids and peptides to T cells 2. CD1 has a role in positive selection of NK1+ T cells 7. The pathway for presentation of exogenous antigen by CD1 is different from that for MHC Class I and II 1,2.
Comments CD1 genes, except CD1b, are transcribed in the same direction and all lack classical promoter elements 1.
Database accession numbers
               PIR       SWISSPROT   EMBL/GENBANK   REFERENCE
Human CD1a     A02242    P06126      X04450         8
Mouse CD1.1    S01297    P11609      X13170         9
Rat CD1                              D26439         10
Amino acid sequence of human CD1a
MLFLLLPLLA VLPGDGNA                                       -1
DGLKEPLSFH VIWIASFYNH SWKQNLVSGW LSDLQTHTWD SNSSTIVFLW    50
PWSRGNFSNE EWKELETLFR IRTIRSFEGI RRYAHELQFE YPFEIQVTGG   100
CELHSGKVSG SFLQLAYQGS DFVSFQNNSW LPYPVAGNMA KHFCKVLNQN   150
QHENDITHNL LSDTCPRFIL GLLDAGKAHL QRQVKPEAWL SHGPSPGPGH   200
LQLVCHVSGF YPKPVWVMWM RGEQEQQGTQ RGDILPSADG TWYLRATLEV   250
AAGEAADLSC RVKHSSLEGQ DIVLYWEHHS SVGFIILAVI VPLLLLIGLA   300
LWFRKRCFC                                                309
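The CD1a entry lists four N-linked sites. As a rough cross-check, the mature sequence can be scanned for the standard N-X-S/T sequon (X not Pro). The sketch below is illustrative only: the function name is hypothetical, the sequence is transcribed from the listing in this entry with position 1 taken as the first residue after the signal peptide, and a sequon match only marks a potential site, not proof of glycosylation.

```python
import re

# Mature human CD1a (signal peptide removed), transcribed from this entry.
# Position 1 is the first residue after the signal sequence.
CD1A_MATURE = (
    "DGLKEPLSFHVIWIASFYNHSWKQNLVSGWLSDLQTHTWDSNSSTIVFLW"
    "PWSRGNFSNEEWKELETLFRIRTIRSFEGIRRYAHELQFEYPFEIQVTGG"
    "CELHSGKVSGSFLQLAYQGSDFVSFQNNSWLPYPVAGNMAKHFCKVLNQN"
    "QHENDITHNLLSDTCPRFILGLLDAGKAHLQRQVKPEAWLSHGPSPGPGH"
    "LQLVCHVSGFYPKPVWVMWMRGEQEQQGTQRGDILPSADGTWYLRATLEV"
    "AAGEAADLSCRVKHSSLEGQDIVLYWEHHSSVGFIILAVIVPLLLLIGLA"
    "LWFRKRCFC"
)

# Asn followed by any residue except Pro, then Ser or Thr.
# The lookahead keeps adjacent candidate sites from masking each other.
SEQUON = re.compile(r"N(?=[^P][ST])")


def n_linked_sequons(mature_seq):
    """Return 1-based positions of Asn residues that start an N-X-S/T sequon."""
    return [m.start() + 1 for m in SEQUON.finditer(mature_seq)]


if __name__ == "__main__":
    print(n_linked_sequons(CD1A_MATURE))  # [19, 42, 56, 127]
```

The four positions found agree with the count given under Carbohydrate for this entry.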
References
1 Porcelli, S.A. (1995) Adv. Immunol. 59, 1-98.
2 Melian, A. et al. (1996) Curr. Opin. Immunol. 8, 82-88.
3 Bendelac, A. et al. (1995) Science 268, 863-865.
4 Bendelac, A. (1995) Curr. Opin. Immunol. 7, 367-374.
5 Sieling, P.A. et al. (1995) Science 269, 227-230.
6 Castano, A.R. et al. (1995) Science 269, 223-226.
7 Bendelac, A. (1995) J. Exp. Med. 182, 2091-2096.
8 Martin, L.H. et al. (1987) Proc. Natl Acad. Sci. USA 84, 9189-9193.
9 Bradbury, A. et al. (1988) EMBO J. 7, 3081-3086.
10 Ichimiya, S. et al. (1994) J. Immunol. 153, 1112-1123.
CD2
T11, LFA-2
Molecular weights
Polypeptide             36 844
SDS-PAGE
  reduced               45-58 kDa
  unreduced             45-58 kDa
Carbohydrate
  N-linked sites        3
  O-linked              nil
Human gene location and size
1p13.1; ~12 kb 1,2
[Figure: domains and exon boundaries of CD2 (labels include V and C2)]
Tissue distribution CD2 is expressed on virtually all thymocytes, T lymphocytes and NK cells. CD2 is also expressed on mouse B cells and on rat and sheep splenic macrophages 1,3,4.
Structure CD2 is the best-characterized member of a family of structurally related cell surface IgSF molecules which includes CD48, CD58, 2B4, Ly-9 and CD150 5. The gene for CD58 lies 60-250 kb from the CD2 gene. The genes encoding CD48, Ly-9 and 2B4 also lie close together but at a different locus (1q21-23). These two loci encoding CD2-related molecules appear to have arisen by duplication of an entire chromosomal region 6,7. The structure of the entire extracellular portion of CD2 has been determined by X-ray diffraction 5. As predicted for other members of this family, it contains a membrane-distal V-set domain, lacking the canonical inter-β-sheet disulfide, linked by a somewhat flexible segment to a membrane-proximal C2-set domain with an additional disulfide 5. The cytoplasmic domain is rich in basic and proline residues and is highly conserved across species 1,3,5. The V-set domain N-linked glycan has been proposed to be essential for maintaining the structure of CD2, but this is controversial 5. The N-terminus of the mature polypeptide has been established by protein sequencing 1.
Ligands and associated molecules The major ligand for the extracellular portion of human CD2 is CD58 1. No CD58 homologue has been identified in the rat or mouse, and CD48 appears to be the major ligand in these species 5. Human CD2 has also been reported to bind CD48 and CD59, but this is controversial 5. CD58 and CD48 both bind to the GFCC'C" β sheet of the membrane-distal IgSF domain of CD2 5. Structural and mutagenesis studies suggest that CD2 interacts with CD48 and CD58 in a head-to-head orientation, with the complex predicted to span ~134 Å 5, similar to the dimensions of a T cell receptor/peptide/MHC complex (also ~134 Å) 8. CD2 binds CD58 in solution with a very low affinity (Kd 9-22 μM) and dissociates rapidly (koff > 4 s⁻¹) 5. Membrane-attached CD58 binds half-maximally to cell surface CD2 at a surface density of ~10-20 molecules/μm², and there is rapid exchange of bound and free CD58 at the contact interface 9. Immunoprecipitation studies indicate an association between CD2 and several transmembrane (T cell receptor/CD3 complex, CD5, CD45 and CD53) and cytoplasmic (Lck, Fyn, phosphatidylinositol 3-kinase, tubulin) proteins 5,10,11. The interaction with Lck involves two of the proline-rich regions in CD2, which bind the SH3 domain of Lck 12.
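To make the solution binding figures quoted for CD2 and CD58 more concrete (Kd 9-22 μM, koff > 4 s⁻¹), the short sketch below converts them into a complex half-life and an implied association rate constant. It assumes simple 1:1 binding, for which t½ = ln 2 / koff and Kd = koff / kon; the function names are illustrative. Because koff is given only as a lower bound, the half-life is an upper bound and the kon values are lower bounds.

```python
import math


def half_life_s(koff_per_s):
    """Half-life of a 1:1 complex in seconds: t1/2 = ln 2 / koff."""
    return math.log(2) / koff_per_s


def kon_per_molar_per_s(kd_molar, koff_per_s):
    """Association rate constant implied by Kd = koff / kon (M^-1 s^-1)."""
    return koff_per_s / kd_molar


KOFF_LOWER_BOUND = 4.0        # s^-1; the entry states koff > 4 s^-1
KD_RANGE_M = (9e-6, 22e-6)    # 9-22 uM, converted to molar

if __name__ == "__main__":
    # Upper bound on the half-life of the CD2-CD58 complex in solution.
    print(f"t1/2 < {half_life_s(KOFF_LOWER_BOUND):.2f} s")
    # Lower bounds on kon at each end of the reported Kd range.
    for kd in KD_RANGE_M:
        print(f"Kd = {kd:.1e} M -> kon > {kon_per_molar_per_s(kd, KOFF_LOWER_BOUND):.2e} /M/s")
```

Under these assumptions an individual complex lasts a fraction of a second, which fits the rapid exchange of bound and free CD58 described for the contact interface.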
Function The interaction between CD2 and its ligands, CD48 and CD58, enhances T cell Ag recognition 1,3,5. This is partly a consequence of improved adhesion between T cells and antigen-presenting cells or target cells, but may also be the result of signals transmitted through the CD2 cytoplasmic domain 1,3,4. No abnormality has been detected in CD2-deficient mice 5,13.
Database accession numbers
          PIR       SWISSPROT   EMBL/GENBANK   REFERENCE
Human     A28967    P06729      M16445         1
Rat       A33071    P08921      X05111         14
Mouse     B28967    P08920      Y00023         15
Amino acid sequence of human CD2
MSFPCKFVAS FLLIFNVSSK GAVS                                -1
KEITNALETW GALGQDINLD IPSFQMSDDI DDIKWEKTSD KKKIAQFRKE    50
KETFKEKDTY KLFKNGTLKI KHLKTDDQDI YKVSIYDTKG KNVLEKIFDL   100
KIQERVSKPK ISWTCINTTL TCEVMNGTDP ELNLYQDGKH LKLSQRVITH   150
KWTTSLSAKF KCTAGNKVSK ESSVEPVSCP EKGLDIYLII GICGGGSLLM   200
VFVALLVFYI TKRKKQRSRR NDEELETRAH RVATEERGRK PQQIPASTPQ   250
NPATSQHPPP PPGHRSQAPS HRPPPPGHRV QHQPQKRPPA PSGTQVHQQK   300
GPPLPRPRVQ PKPPHGAAEN SLSPSSN                            327
References
1 Moingeon, P. et al. (1989) Immunol. Rev. 111, 111-144.
2 Mitchell, E.L.D. et al. (1995) Cytogenet. Cell Genet. 70, 183-185.
3 Bierer, B.E. and Burakoff, S.J. (1989) Immunol. Rev. 111, 267-294.
4 Beyers, A.D. et al. (1989) Immunol. Rev. 111, 59-77.
5 Davis, S.J. and van der Merwe, P.A. (1996) Immunol. Today 17, 177-187.
6 Wong, Y.W. et al. (1990) J. Exp. Med. 171, 2115-2130.
7 Kingsmore, S.F. et al. (1995) Immunogenetics 42, 59-62.
8 Garboczi, D.N. et al. (1996) Nature 384, 134-141.
9 Dustin, M.L. et al. (1996) J. Cell Biol. 132, 465-474.
10 Bell, G.M. et al. (1992) J. Exp. Med. 175, 527-536.
11 Offringa, R. and Bierer, B.E. (1993) J. Biol. Chem. 268, 4979-4988.
12 Bell, G.M. et al. (1996) J. Exp. Med. 183, 169-178.
13 Killeen, N. et al. (1992) EMBO J. 11, 4329-4336.
14 Williams, A.F. et al. (1987) J. Exp. Med. 165, 368-380.
15 Sewell, W.A. et al. (1987) Eur. J. Immunol. 17, 1015-1020.
CD3/TCR
T cell receptor complex
[Figure: schematic of the CD3/TCR complex]
            Molecular weights   Carbohydrate             Gene location and size
            (reduced)           (N-linked sites only)
TCRα        45-60 kDa           5                        14q11.2; >800 kb 1
TCRβ        40-50 kDa           2                        7q35; 685 kb 2
TCRγ        45-60 kDa           4                        7p15; 160 kb 3
TCRδ        40-60 kDa           2                        14q11.2; >195 kb 4,5
pTCRα       33 kDa              2                        6p21.2-p12 6
CD3γ        25-28 kDa           2                        11q23; 9 kb 7
CD3δ        20 kDa              2                        11q23; 3.7 kb 8
CD3ε        20 kDa              none                     11q23; 13 kb 9
ζ chain     16 kDa              none                     1q22-q23
η chain     22 kDa              none                     1q22-q23
The CD3/T cell receptor (TCR) gene organization, structure and function have been extensively reviewed (e.g. see ref. 10) and only a brief overview is given here.
Tissue distribution Expressed during thymopoiesis and on mature T cells in the periphery (reviewed in refs 9 and 10). Less than 10% of human peripheral T cells express the γ/δ TCR complex, but in the mouse the great majority of T cells present in some epithelial tissues are γ/δ+ and have limited receptor diversity 11. Pre-TCRα (pTCRα) is expressed in immature but not mature T cells 6.
Structure CD3/TCR consists of both IgSF and non-IgSF proteins 1~. The stoichiometry of a CD3/TCR complex is not established, but the complex is generally thought to contain a TCR heterodimer, two CD3ε chains, a CD3γ chain, a CD3δ chain and a ζ homodimer 13. The α/β and γ/δ heterodimers are clonotypic and consist of Ig-like variable and constant domains 9,14,15. The α/β heterodimer has been crystallized and the structure confirms the predicted IgSF domains 14,15. In immature T cells, pTCRα, which has a single conserved IgSF domain in its extracellular region, is expressed instead of TCRα 6,13. The transmembrane domains of the clonotypic and invariant chains contain oppositely charged amino acids 10. The ζ chain forms disulfide-linked homodimers or, less frequently, heterodimers with its splicing variant, the η chain 12. The ζ chain is related to the γ chain of the IgE Fc receptor and can also associate with the Fc receptor CD16 12. The cytoplasmic domains of the CD3 and ζ chains contain ITAM motifs 12.
Ligands and associated molecules The α/β and γ/δ heterodimers recognize peptide antigen bound to MHC antigens 14,15. The affinity of the interaction between the TCR and the MHC/peptide complex is in the range 10⁻⁷ to 10⁻⁴ M 16. The cocrystal structure of TCR and peptide-MHC reveals that the TCR VDJC junction, which is equivalent to the third complementarity-determining region (CDR3) of antibodies, interacts directly with the peptide and that the CDR1- and CDR2-like regions of the TCR contact peptide and the MHC antigen 13. Superantigen binds to non-polymorphic regions of TCR Vβ 15. Intracellularly, Fyn is associated with the CD3/TCR complex 12,17. Phosphorylated ITAM motifs of the CD3 and ζ chains bind to SH2 domains of intracellular signalling molecules, e.g. phosphorylated ζ chains bind to ZAP-70 12,16-18.
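The ITAM motifs mentioned for the CD3 and ζ chains can be located with a simple pattern search. The sketch below assumes the commonly quoted ITAM consensus YxxL/I-x(6-8)-YxxL/I, which is not spelled out in this entry, and applies it to the mature ζ chain sequence as listed here; the function name is illustrative and the reported positions use mature-protein numbering.

```python
import re

# Mature human zeta chain (signal peptide removed), transcribed from this entry.
ZETA_MATURE = (
    "QSFGLLDPKLCYLLDGILFIYGVILTALFLRVKFSRSAEPPAYQQGQNQL"
    "YNELNLGRREEYDVLDKRRGRDPEMGGKPRRKNPQEGLYNELQKDKMAEA"
    "YSEIGMKGERRRGKGHDGLYQGLSTATKDTYDALHMQALPPR"
)

# Assumed consensus: Y-x-x-(L/I), a spacer of 6-8 residues, then Y-x-x-(L/I).
ITAM = re.compile(r"Y..[LI].{6,8}Y..[LI]")


def find_itams(mature_seq):
    """Return (1-based start position, matched segment) for each ITAM-like match."""
    return [(m.start() + 1, m.group()) for m in ITAM.finditer(mature_seq)]


if __name__ == "__main__":
    for start, segment in find_itams(ZETA_MATURE):
        print(start, segment)
    # 51 YNELNLGRREEYDVL
    # 89 YNELQKDKMAEAYSEI
    # 120 YQGLSTATKDTYDAL
```

All three matches lie in the cytoplasmic portion of the ζ chain. The same pattern could be run over the CD3γ, CD3δ and CD3ε sequences, though a regex hit is only a candidate motif.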
Function Recognition of antigen leads to signal transduction mediated by the invariant chains and subsequently T cell activation 17. Consequences of binding by TCR depend on antigen density and affinity of TCR for antigen and may result in unresponsiveness. Signal transduction involves tyrosine kinase and phospholipase C activation followed by phosphoinositide turnover and activation of several second messenger pathways 17,18.
Diversity and ontogeny Although TCR genes contain fewer variable-region segments 19 than antibody genes, the potential repertoire can be argued to be higher than that of antibodies due to the relative abundance of J-region segments and greater flexibility in the joining of variable (V)-, diversity (D)- and joining (J)-segments 9. Rearrangement of the TCR genes is similar to that of antibody genes and occurs at the CD4-/CD8- stage of thymic development, but the TCR genes do not undergo somatic mutation after rearrangement 8. Selection of the receptor repertoire takes place in the thymic cortex while the thymocytes coexpress CD4 and CD8 and low levels of the TCR. Mature thymocytes expressing CD4 or CD8 and high levels of the TCR complex then leave the medulla for the periphery. Mice deficient in components of the CD3/TCR complex are arrested in development of their T cell repertoire 13.
Database accession numbers
                     PIR       SWISSPROT   EMBL/GENBANK   REFERENCE
Human pTCRα chain                          U36759         6
Human CD3γ           A25468    P09693      X04145         20
Human CD3δ           A02245    P04234      X03934         21
Human CD3ε           A25769    P07766      X03884         22
Human ζ chain        A31768    P20963      J04132         23
Human η chain                              M33158         24
Amino acid sequences of the invariant chains
(The sequences of the clonotypic chains are reviewed in refs 9 and 19.)

pTCRα
MAGTWLLLLL ALGCPA                                         -1
LPTGVGGTPF PSLAPPIMLL VDGKQQMVVV CLVLDVAPPG LDSPIWFSAG    50
NGSALDAFTY GPSPATDGTW TNLAHLSLPS EELASWEPLV CHTGPGAEGH   100
SRSTQPMHLS GEASTARTCP QEPLRGTPGG ALWLGVLRLL LFKLLLFDLL   150
LTCSCLCDPA GPLPSPATTT RLRALGSHRL HPATETGGRE ATSSPRPQPR   200
DRRWGDTPPG RKPGSPVWGE GSYLSSYPTC PAQAWCSRSR LRAPSSSLGA   250
FFRGDLPPPL QAGAA                                         265

CD3γ
MEQGKGLAVL ILAIILLQGT LA                                  -1
QSIKGNHLVK VYDYQEDGSV LLTCDAEAKN ITWFKDGKMI GFLTEDKKKW    50
NLGSNAKDPR GMYQCKGSQN KSKPLQVYYR MCQNCIELNA ATISGFLFAE   100
IVSIFVLAVG VYFIAGQDGV RQSRASDKQT LLPNDQLYQP LKDREDDQYS   150
HLQGNQLRRN                                               160

CD3δ
MEHSTFLSGL VLATLLSQVS P                                   -1
FKIPIEELED RVFVNCNTSI TWVEGTVGTL LSDITRLDLG KRILDPRGIY    50
RCNGTDIYKD KESTVQVHYR MCQSCVELDP ATVAGIIVTD VIATLLLALG   100
VFCFAGHETG RLSGAADTQA LLRNDQVYQP LRDRDDAQYS HLGGNWARNK   150

CD3ε
MQSGTHWRVL GLCLLSVGVW GQ                                  -1
DGNEEMGGIT QTPYKVSISG TTVILTCPQY PGSEILWQHN DKNIGGDEDD    50
KNIGSDEDHL SLKEFSELEQ SGYYVCYPRG SKPEDANFYL YLRARVCENC   100
MEMDVMSVAT IVIVDICITG GLLLLVYYWS KNRKAKAKPV TRGAGAGGRQ   150
RGQNKERPPP VPNPDYEPIR KGQRDLYSGL NQRRI                   185

ζ chain
MKWKALFTAA ILQAQLPITE A                                   -1
QSFGLLDPKL CYLLDGILFI YGVILTALFL RVKFSRSAEP PAYQQGQNQL    50
YNELNLGRRE EYDVLDKRRG RDPEMGGKPR RKNPQEGLYN ELQKDKMAEA   100
YSEIGMKGER RRGKGHDGLY QGLSTATKDT YDALHMQALP PR           142

η chain
MKWKALFTAA ILQAQLPITE A                                   -1
QSFGLLDPKL CYLLDGILFI YGVILTALFL RVKFSRSAEP PAYQQGQNQL    50
YNELNLGRRE EYDVLDKRRG RDPEMGGKPR RKNPQEGLYN ELQKDKMAEA   100
YSEIGMKGER RRGKGHDGLY QDSHFQAVQF GNRREREGSE LTRTLGLRAR   150
PKGESTQQSS QSCASVFSIP TLWSPWPPSS SSQL                    184
References
1 Wilson, R.K. et al. (1988) Immunol. Rev. 101, 149-172.
2 Rowen, L. et al. (1996) Science 272, 1755-1762.
3 Lefranc, M.-P. et al. (1989) Eur. J. Immunol. 19, 989-994.
4 Satyanarayana, K. et al. (1988) Proc. Natl Acad. Sci. USA 85, 8166-8170.
5 Iwashima, M. et al. (1988) Proc. Natl Acad. Sci. USA 85, 8161-8165.
6 Del Porto, P. et al. (1995) Proc. Natl Acad. Sci. USA 92, 12105-12109.
7 Tunnacliffe, A. et al. (1987) EMBO J. 6, 2953-2957.
8 Tunnacliffe, A. et al. (1986) EMBO J. 5, 1245-1252.
9 Clevers, H.C. et al. (1988) Proc. Natl Acad. Sci. USA 85, 8156-8160.
10 Davis, M.M. (1990) Annu. Rev. Biochem. 59, 475-496.
11 Allison, J.P. and Havran, W.L. (1991) Annu. Rev. Immunol. 9, 679-705.
12 Chan, A.C. et al. (1994) Annu. Rev. Immunol. 12, 555-592.
13 Malissen, B. and Malissen, M. (1996) Curr. Opin. Immunol. 8, 383-393.
14 Garcia, K.C. et al. (1996) Science 274, 209-219.
15 Garboczi, D.N. et al. (1996) Nature 384, 134-141.
16 Fremont, D.H. et al. (1996) Curr. Opin. Immunol. 8, 93-100.
17 Weiss, A. and Littman, D.R. (1994) Cell 76, 263-274.
18 Wange, R. and Samelson, L.E. (1996) Immunity 5, 197-205.
19 Immunogenetics (1995) 42, 1-540.
20 Krissansen, G.W. et al. (1986) EMBO J. 5, 1799-1808.
21 van den Elsen, P. et al. (1984) Nature 312, 413-418.
22 Gold, D.P. et al. (1986) Nature 321, 431-434.
23 Weissman, A.M. et al. (1988) Proc. Natl Acad. Sci. USA 85, 9709-9713.
24 Jin, Y.-J. et al. (1990) Proc. Natl Acad. Sci. USA 87, 3319-3323.
CD4
T4, L3T4 (mouse), W3/25 (rat)
Molecular weights
Polypeptide             48 400
SDS-PAGE
  reduced               55 kDa
  unreduced             55 kDa
Carbohydrate
  N-linked sites        2
  O-linked              nil
Human gene location and size
12pter-p12; 33 kb 1
[Figure: domains and exon boundaries of CD4]
Tissue distribution CD4 is expressed on most thymocytes and approximately two-thirds of peripheral blood T cells, which constitute the CD8- cells 1. In human and rat but not in mouse, CD4 is expressed on monocytes and macrophages 1.
Structure The extracellular domain is made up of four IgSF domains. The structures of the N-terminal two domains and, separately, the membrane-proximal two domains have been determined by X-ray crystallography, confirming that they are Ig-like 2,3. Domain 2 is characterized by an unusual disulfide within one β sheet, and domain 3 lacks a disulfide in the position conserved in most IgSF domains. Cat CD4 shows some unusual features, with 17 residues inserted between domains 1 and 2 4. In cat CD4 there is an additional Cys in domain 1, the Cys in the unusual β strand C position in domain 2 is replaced with a Trp, and there is an extra Cys in β strand F 4. The position of the N-terminus has been established for the rat homologue 5. CD4 shows particularly close similarities in overall structure to the LAG-3 protein (see page 531). The cytoplasmic domain of CD4 is phosphorylated at Ser residues 408, 415 and 431 when T cells are activated by antigen or phorbol esters 6.
Ligands and associated molecules CD4 domains 1 and 2 bind to MHC Class II antigen 2,3. There is evidence that CD4 domains 3 and 4 are involved in cis interactions with the CD3/TCR complex 7. The cytoplasmic domain interacts with a lymphocyte-specific tyrosine kinase called Lck through a CXCP motif 2,8,9. CD4 is a receptor for HIV-1, and the viral gp120 protein binds to a region of the N-terminal domain 2,3.
Function CD4 is an accessory molecule in the recognition of foreign antigens in association with MHC Class II antigens by T cells 1,2,3. Interactions with MHC Class II and with Lck have been shown to have a role in CD4 function 7,10. MAbs against CD4 inhibit T cell functions in vivo and in vitro 1.
Database accession numbers
          PIR       SWISSPROT   EMBL/GENBANK   REFERENCE
Human     A02109    P01730      M12807         1
Rat       A27449    P05540      M15768         5
Mouse     A02110    P06332      M13816         1

Amino acid sequence of human CD4
MNRGVPFRHL LLVLQLALLP AATQG                               -1
KKVVLGKKGD TVELTCTASQ KKSIQFHWKN SNQIKILGNQ GSFLTKGPSK    50
LNDRADSRRS LWDQGNFPLI IKNLKIEDSD TYICEVEDQK EEVQLLVFGL   100
TANSDTHLLQ GQSLTLTLES PPGSSPSVQC RSPRGKNIQG GKTLSVSQLE   150
LQDSGTWTCT VLQNQKKVEF KIDIVVLAFQ KASSIVYKKE GEQVEFSFPL   200
AFTVEKLTGS GELWWQAERA SSSKSWITFD LKNKEVSVKR VTQDPKLQMG   250
KKLPLHLTLP QALPQYAGSG NLTLALEAKT GKLHQEVNLV VMRATQLQKN   300
LTCEVWGPTS PKLMLSLKLE NKEAKVSKRE KAVWVLNPEA GMWQCLLSDS   350
GQVLLESNIK VLPTWSTPVQ PMALIVLGGV AGLLLFIGLG IFFCVRCRHR   400
RRQAERMSQI KRLLSEKKTC QCPHRFQKTC SPI                     433

References
1 Parnes, J.R. (1989) Adv. Immunol. 44, 265-311.
2 Littman, D.R. (ed.) (1996) The CD4 Molecule. Curr. Top. Microbiol. Immunol. 205.
3 Sakihama, T. et al. (1995) Immunol. Today 16, 581-587.
4 Norimine, J. et al. (1992) Immunology 75, 74-79.
5 Clark, S.J. et al. (1987) Proc. Natl Acad. Sci. USA 84, 1649-1653.
6 Shin, J. et al. (1990) EMBO J. 9, 425-434.
7 Vignali, D.A.A. et al. (1996) J. Exp. Med. 183, 2097-2107.
8 Zamoyska, R. (1994) Immunity 1, 243-246.
9 Turner, J.M. et al. (1990) Cell 60, 755-757.
10 Itano, A. et al. (1996) J. Exp. Med. 183, 731-741.
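Residue numbers quoted in the CD4 entry (Ser 408, 415 and 431; the CXCP motif that binds Lck) refer to the mature polypeptide, with the signal peptide numbered up to -1. The sketch below illustrates that convention on the last two rows of the human CD4 sequence listing (residues 351-433). The helper names and the choice to store only this C-terminal segment are illustrative assumptions.

```python
import re

# Residues 351-433 of mature human CD4, transcribed from the last two rows of
# the sequence listing in the CD4 entry (mature-protein numbering).
SEGMENT_START = 351
CD4_C_TERMINAL = (
    "GQVLLESNIKVLPTWSTPVQPMALIVLGGVAGLLLFIGLGIFFCVRCRHR"
    "RRQAERMSQIKRLLSEKKTCQCPHRFQKTCSPI"
)


def residue_at(position):
    """Return the residue at a mature-protein position within the stored segment."""
    return CD4_C_TERMINAL[position - SEGMENT_START]


def motif_position(pattern):
    """Return the mature-protein position of the first regex match, or None."""
    match = re.search(pattern, CD4_C_TERMINAL)
    return None if match is None else match.start() + SEGMENT_START


if __name__ == "__main__":
    # The three phosphorylated positions quoted in the entry should all be Ser.
    print([residue_at(p) for p in (408, 415, 431)])  # ['S', 'S', 'S']
    # CXCP motif (Cys, any residue, Cys, Pro).
    print(motif_position(r"C.CP"))                   # 420
```

With this numbering the three quoted positions are all serines and the CXCP motif starts at Cys 420, which supports the row assignment used in the sequence listing.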
CD5
T1, Leu-1, Ly-1
Molecular weights
Polypeptide             52 163
SDS-PAGE
  reduced               67 kDa
  unreduced             67 kDa
Carbohydrate
  N-linked sites        2
  O-linked              +
Human gene location
11q13
[Figure: domain diagram of CD5 (S, three SC domains, TM, CY)]
Tissue distribution CD5 is expressed on all mature T cells, most thymocytes 1,2 and on a subset of mature B cells 3.
Structure The extracellular region consists of three scavenger receptor cysteine-rich domains 4. Domain 1 is separated from domain 2 by a connecting peptide rich in Thr and Pro residues which contains O-linked carbohydrate (McAlister, M. et al., unpublished). The cytoplasmic domain contains an ITAM-like motif 5 (and see Chapter 3).
Ligands and associated molecules CD5 coprecipitates with the TCR and, more directly, with Lck 5. A report that CD5 purified from cells bound to CD72 has not been substantiated by functional experiments 6,7 or biochemical studies (Brown, M.H., unpublished).
Function A role for CD5 in signal transduction is postulated based on stimulatory effects of mAbs 5,6. CD5 is phosphorylated on tyrosine residues on T cell activation 5. Thymocytes from CD5 knockout (CD5-/-) mice gave increased responses to receptor-mediated stimulation 6. A role for CD5 in thymocyte selection is suggested by altered expression of TCR when CD5-/- mice were crossed with TCR transgenic mice 6. Inhibition of T-B interaction by a mouse CD5 mAb is consistent with a role in cell-cell recognition 7.
Database accession numbers
          PIR       SWISSPROT   EMBL/GENBANK   REFERENCE
Human     A26396    P06127      X04391         8
Mouse     A29079    P13379      M15177         1
Rat                 P51882      D10728         2
Amino acid sequence of human CD5
MPMGSLQPLA TLYLLGMLVA SCLG                                -1
RLSWYDPDFQ ARLTRSNSKC QGQLEVYLKD GWHMVCSQSW GRSSKQWEDP    50
SQASKVCQRL NCGVPLSLGP FLVTYTPQSS IICYGQLGSF SNCSHSRNDM   100
CHSLGLTCLE PQKTTPPTTR PPPTTTPEPT APPRLQLVAQ SGGQHCAGVV   150
EFYSGSLGGT ISYEAQDKTQ DLENFLCNNL QCGSFLKHLP ETEAGRAQDP   200
GEPREHQPLP IQWKIQNSSC TSLEHCFRKI KPQKSGRVLA LLCSGFQPKV   250
QSRLVGGSSI CEGTVEVRQG AQWAALCDSS SARSSLRWEE VCREQQCGSV   300
NSYRVLDAGD PTSRGLFCPH QKLSQCHELW ERNSYCKKVF VTCQDPNPAG   350
LAAGTVASII LALVLLVVLL VVCGPLAYKK LVKKFRQKKQ RQWIGPTGMN   400
QNMSFHRNHT ATVRSHAENP TASHVDNEYS QPPRNSRLSA YPALEGVLHR   450
SSMQPDNSSD SDYDLHGAQR L                                  471
References
1 Huang, H-J.S. et al. (1987) Proc. Natl Acad. Sci. USA 84, 204-208.
2 Murakami, T. and Matsuura, A. (1992) Sapporo Med. J. 61, 13-26.
3 Kantor, A.B. and Herzenberg, L.A. (1991) Annu. Rev. Immunol. 11, 501-538.
4 Resnick, D. et al. (1994) Trends Biochem. Sci. 19, 5-8.
5 Raab, M. et al. (1994) Mol. Cell. Biol. 14, 2862-2870.
6 Tarakhovsky, A. et al. (1995) Science 269, 535-537.
7 Muthukkumar, S. and Bondada, S. (1995) Int. Immunol. 7, 305-315.
8 Jones, N.H. et al. (1986) Nature 323, 346-349.
CD6
T12, Tp120
Molecular weights
Polypeptide 69 365
SDS-PAGE
reduced 100-130 kDa
unreduced 117 kDa
Carbohydrate
N-linked sites 8
O-linked +
Human gene location 11
Domains
[Domain diagram]
Tissue distribution
CD6 is expressed on peripheral blood T cells and medullary thymocytes 1. It is also expressed by B cell chronic lymphocytic leukaemias and has been found in brain 1.
Structure
The extracellular region contains three scavenger receptor cysteine-rich domains and thus resembles another T cell molecule, CD5 2. CD6 has a short membrane-proximal stalk 1 and contains polysulfated O-linked oligosaccharides 3. The large cytoplasmic domain contains potential SH2, SH3, PKC and casein kinase-2 binding sites and may be alternatively spliced 4-6.
Ligands and associated molecules
The ligand for CD6 is CD166, also called activated leucocyte cell adhesion molecule. CD166 contains five IgSF domains and is present on thymic epithelial cells and activated lymphocytes, monocytes and neural cells 7,8. The interaction involves the N-terminal domain of CD166 and the membrane-proximal scavenger receptor cysteine-rich domain of CD6 8,9.
Function
Through its interaction with CD166 in the thymus, CD6 may have a role in T cell development. A role for CD6 in signal transduction is postulated based on stimulatory effects of mAbs 1,3,7. CD6 is hyperphosphorylated on serine and phosphorylated on tyrosine residues on activation 3-5.
Database accession numbers
        PIR      SWISSPROT   EMBL/GENBANK   REFERENCE
Human   S26741   P30203      X60992         1,5
Mouse                        U37543         4
Amino acid sequence of human CD6
MWLFFGITGL LTAALSGHPS PAPP                              -1
DQLNTSSAES ELWEPGERLP VRLTNGSSSC SGTVEVRLEA SWEPACGALW  50
DSRAAEAVCR ALGCGGAEAA SQLAPPTPEL PPPPAAGNTS VAANATLAGA  100
PALLCSGAEW RLCEVVEHAC RSDGRRARVT CAENRALRLV DGGGACAGRV  150
EMLEHGEWGS VCDDTWDLED AHVVCRQLGC GWAVQALPGL HFTPGRGPIH  200
RDQVNCSGAE AYLWDCPGLP GQHYCGHKED AGVVCSEHQS WRLTGGADRC  250
EGQVEVHFRG VWNTVCDSEW YPSEAKVLCQ SLGCGTAVER PKGLPHSLSG  300
RMYYSCNGEE LTLSNCSWRF NNSNLCSQSL AARVLCSASR SLHNLSTPEV  350
PASVQTVTIE SSVTVKIENK ESRELMLLIP SIVLGILLLG SLIFIAFILL  400
RIKGKYALPV MVNHQHLPTT IPAGSNSYQP VPITIPKEVF MLPIQVQAPP  450
PEDSDSGSDS DYEHYDFSAQ PPVALTTFYN SQRHRVTDEE VQQSRFQMPP  500
LEEGLEELHA SHIPTANPGH CITDPPSLGP QYHPRSNSES STSSGEDYCN  550
SPKSKLPPWN PQVFSSERSS FLEQPPNLEL AGTQPAFSAG PPADDSSSTS  600
SGEWYQNFQP PPQPPSEEQF GCPGSPSPQP DSTDNDDYDD ISAA        644
References
1 Aruffo, A. et al. (1991) J. Exp. Med. 174, 949-952.
2 Resnick, D. et al. (1994) Trends Biochem. Sci. 19, 5-8.
3 Swack, J.A. et al. (1991) J. Biol. Chem. 266, 7137-7143.
4 Robinson, W.H. et al. (1995) J. Immunol. 155, 4739-4748.
5 Robinson, W.H. et al. (1995) Eur. J. Immunol. 25, 2765-2769.
6 Whitney, G. et al. (1995) Mol. Immunol. 32, 89-92.
7 Bowen, M.A. et al. (1995) J. Exp. Med. 181, 2213-2220.
8 Whitney, G.S. et al. (1995) J. Biol. Chem. 270, 18187-18190.
9 Bajorath, J. et al. (1995) Protein Science 4, 1644-1647.
CD7
gp40, Tp41
Molecular weights
Polypeptide 22 919
SDS-PAGE
reduced 40 kDa
unreduced 38 kDa
Carbohydrate
N-linked sites 2
O-linked probable +
Human gene size and location 17q25.2-q25.3; 3.5 kb 1
Domain
[Domain diagram with exon boundaries]
Tissue distribution
CD7 is the earliest marker antigen expressed in the T lineage, being found on T cell precursors in fetal liver and thorax prior to thymic colonization and in thymus and bone marrow 2,3. CD7 is expressed on pluripotential haematopoietic cells, most human thymocytes and a major subset of peripheral blood T cells and NK cells 2,3. CD7 is a marker for pluripotential stem cell leukaemias and T cell acute lymphocytic leukaemia 2.
Structure
A single IgSF domain is separated from the transmembrane sequence by four repeats containing a high proportion of Pro, Ser and Thr residues 4. This region is likely to be O-glycosylated and to form an extended structure like that proposed for the CD8 hinge. The gene for CD7 is similar to the murine Thy-1 gene, which places it in a class of tissue-specific genes whose promoters lack a TATA element 1,3,5.
Ligands and associated molecules
No ligand for the extracellular region of CD7 has been identified. On crosslinking with mAb, CD7 associates with PI 3-kinase, possibly through a YXXM motif in the cytoplasmic domain 7.
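The YXXM motif mentioned above (Tyr, any two residues, Met) can be located with a simple pattern search. The following is a minimal sketch only: the function name `find_yxxm` is hypothetical, and the ten-residue fragment is taken from the end of the human CD7 sequence row that finishes at position 200 in this entry; treating it as part of the cytoplasmic domain is an assumption of the example.

```python
import re

# Residues 191-200 of the human CD7 sequence as printed in this entry.
# Assumption: this stretch lies in the cytoplasmic domain.
CD7_FRAGMENT = "SAACVVYEDM"


def find_yxxm(seq):
    """Return (1-based position, motif) for each Y-x-x-M match.

    A lookahead is used so that overlapping matches are also reported.
    """
    return [(m.start() + 1, m.group(1)) for m in re.finditer(r"(?=(Y..M))", seq)]


print(find_yxxm(CD7_FRAGMENT))
```

A match only marks a candidate site; whether the tyrosine is phosphorylated and recruits PI 3-kinase has to be established experimentally.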
Function
The function of CD7 is unknown. A proposal that CD7 was an IgM receptor was not confirmed in expression studies using the cDNA clone 4. CD7 mAbs co-stimulate T cell proliferation and induce second messengers 6,7. Soluble recombinant CD7 has been reported to inhibit antigen-specific T cell proliferation and a mixed lymphocyte reaction 6.
Database accession numbers
        PIR      SWISSPROT   EMBL/GENBANK   REFERENCE
Human   S03520   P09564      X06180         4
Mouse            P50283      D10329         5
Amino acid sequence of human CD7
MAGPPRLLLL PLLLALARGL PGALA                             -1
AQEVQQSPHC TTVPVGASVN ITCSTSGGLR GIYLRQLGPQ PQDIIYYEDG  50
VVPTTDRRFR GRIDFSGSQD NLTITMHRLQ LSDTGTYTCQ AITEVNVYGS  100
GTLVLVTEEQ SQGWHRCSDA PPRASALPAP PTGSALPDPQ TASALPDPPA  150
ASALPAALAV ISFLLGLGLG VACVLARTQI KKLCSWRDKN SAACVVYEDM  200
SHSRCNTLSS PNQYQ                                        215
References
1 Schanberg, L.E. et al. (1991) Proc. Natl Acad. Sci. USA 88, 603-607.
2 Haynes, B.F. et al. (1989) Immunol. Today 10, 87-91.
3 Schanberg, L.E. et al. (1995) J. Immunol. 155, 2407-2418.
4 Aruffo, A. and Seed, B. (1987) EMBO J. 6, 3313-3316.
5 Yoshikawa, K. et al. (1995) Immunogenetics 41, 159-161.
6 Leta, E. et al. (1995) Cell. Immunol. 165, 101-109.
7 Ward, S. et al. (1995) Eur. J. Immunol. 25, 502-507.
CD8
T8, Lyt2/3 (mouse)
Molecular weights
Polypeptide
α 23 552
β1 21 351
SDS-PAGE
reduced α 32-34 kDa
reduced β 32-34 kDa
unreduced 68 kDa plus higher multimers
Carbohydrate
N-linked sites α nil, β 1
O-linked α +, β +
Human gene location and size
α chain: 2p12; 7 kb 1
β chain: 2; >15 kb 1
Domains
[Domain diagrams for CD8α and CD8β with exon boundaries]
Tissue distribution
CD8 is expressed on most thymocytes and approximately one-third of peripheral blood T cells, which constitute the CD4- cells 1. CD8αβ heterodimers are expressed only on TCRαβ cells whereas CD8α homodimers can be expressed on αβ and γδ T cells and some NK cells 1-3.
Structure
CD8 is expressed as a heterodimer of CD8α and CD8β or as a CD8α homodimer 1-3. CD8α is required for expression of CD8β 3. The IgSF domains of CD8α and CD8β are separated from transmembrane sequences by hinge regions rich in Pro, Ser and Thr residues containing O-linked carbohydrate 1-4, with four sites identified in rat CD8α 4. The N-terminus of the mature polypeptide has been established by protein sequencing 5. Alternative splicing gives rise to a soluble form of CD8α 1 and a soluble form is predicted for CD8β 2. An alternatively spliced form of mouse CD8α, called CD8α', has a shortened cytoplasmic domain 1. Partial genomic structure of human CD8β shows it has a similar organization to the mouse CD8β gene 1. The X-ray crystal structure of the IgSF domain of human CD8α did not contain the abnormal disulfide bond found in biochemical analysis of mouse and rat CD8 6. In the mouse the genes for the α and β chains are only 36 kb apart and are closely linked to the Igκ gene locus 1.
Ligands and associated molecules
The IgSF domain of CD8α binds to the α3 domain of MHC Class I 3. Like CD4, a CXCP motif in the cytoplasmic domain of CD8α mediates binding to the tyrosine kinase Lck 3.
Function
CD8 acts as a co-receptor with MHC Class I-restricted TCRs in antigen recognition 1,3,7. Analysis of mice lacking CD8α or CD8β shows that the co-receptor function of CD8 is important for selection of MHC Class I-restricted CD8+ T cells during development 3.
Database accession numbers
           PIR      SWISSPROT   EMBL/GENBANK   REFERENCE
Human α    A01999   P01732      M27161         5
Human β1   S01649   P10966      X13444         2
Human β2   S01873   P14860      X13445         2
Rat α      A24637   P07725      X03015         1
Rat β      A24184   P05541      X04310         1
Mouse α    A01998   P01731      M12825         1
Mouse β    A27619   P10300      X07698         1
Amino acid sequence of human CD8α
MALPVTALLL PLALLLHAAR P                                 -1
SQFRVSPLDR TWNLGETVEL KCQVLLSNPT SGCSWLFQPR GAAASPTFLL  50
YLSQNKPKAA EGLDTQRFSG KRLGDTFVLT LSDFRRENEG YYFCSALSNS  100
IMYFSHFVPV FLPAKPTTTP APRPPTPAPT IASQPLSLRP EACRPAAGGA  150
VHTRGLDFAC DIYIWAPLAG TCGVLLLSLV ITLYCNHRNR RRVCKCPRPV  200
VKSGDKPSLS ARYV                                         214
Amino acid sequence of human CD8β1
MRPRLWLLLA AQLTVLHGNS V                                 -1
LQQTPAYIKV QTNKMVMLSC EAKISLSNMR IYWLRQRQAP SSDSHHEFLA  50
LWDSAKGTIH GEEVEQEKIA VFRDASRFIL NLTSVKPEDS GIYFCMIVGS  100
PELTFGKGTQ LSVVDFLPTT AQPTKKSTLK KRVCRLPRPE TQKGPLCSPI  150
TLGLLVAGVL VLLVSLGVAI HLCCRRRRAR LRFMKQFYK              189
References
1 Parnes, J.R. (1989) Adv. Immunol. 44, 265-311.
2 Norment, A.M. and Littman, D.R. (1988) EMBO J. 7, 3433-3439.
3 Zamoyska, R. (1994) Immunity 1, 243-246.
4 Classon, B.J. et al. (1992) Int. Immunol. 4, 215-225.
5 Littman, D.R. et al. (1985) Cell 40, 237-246.
6 Leahy, D.J. et al. (1992) Cell 68, 1145-1162.
7 Luescher, I.F. et al. (1995) Nature 373, 353-356.
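The N-linked site counts given for CD8 (nil for the α chain, 1 for the β chain) can be checked against the sequences by scanning for the consensus sequon Asn-X-Ser/Thr, where X is any residue except Pro. The sketch below is illustrative only: the function name `n_linked_sequons` is hypothetical, the two strings are the mature chains (signal sequences removed) as printed in this entry, and the scan ignores membrane topology, so a sequon in a cytoplasmic segment would also be reported.

```python
import re

# Mature human CD8 chains (signal sequences removed), as printed in this entry.
CD8_ALPHA = (
    "SQFRVSPLDRTWNLGETVELKCQVLLSNPTSGCSWLFQPRGAAASPTFLL"
    "YLSQNKPKAAEGLDTQRFSGKRLGDTFVLTLSDFRRENEGYYFCSALSNS"
    "IMYFSHFVPVFLPAKPTTTPAPRPPTPAPTIASQPLSLRPEACRPAAGGA"
    "VHTRGLDFACDIYIWAPLAGTCGVLLLSLVITLYCNHRNRRRVCKCPRPV"
    "VKSGDKPSLSARYV"
)
CD8_BETA = (
    "LQQTPAYIKVQTNKMVMLSCEAKISLSNMRIYWLRQRQAPSSDSHHEFLA"
    "LWDSAKGTIHGEEVEQEKIAVFRDASRFILNLTSVKPEDSGIYFCMIVGS"
    "PELTFGKGTQLSVVDFLPTTAQPTKKSTLKKRVCRLPRPETQKGPLCSPI"
    "TLGLLVAGVLVLLVSLGVAIHLCCRRRRARLRFMKQFYK"
)


def n_linked_sequons(seq):
    """Return 1-based positions of Asn residues in N-X-S/T sequons (X is not Pro).

    The lookahead keeps matches from consuming residues, so overlapping
    sequons such as N-N-S-S are all found.
    """
    return [m.start() + 1 for m in re.finditer(r"N(?=[^P][ST])", seq)]


print("CD8 alpha:", n_linked_sequons(CD8_ALPHA))
print("CD8 beta: ", n_linked_sequons(CD8_BETA))
```

The α chain contains an Asn-Pro-Thr triplet, which the proline rule excludes. A sequon is only a potential site, so the scan gives an upper bound on glycosylation rather than evidence of it.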
CD9
MRP-1, DRAP27 (monkey)
Molecular weights
Polypeptide 25 277
SDS-PAGE
reduced 22-27 kDa
unreduced 22-27 kDa
Carbohydrate
N-linked sites 1
O-linked nil
Human gene location 12p13; >20 kb 1
Tissue distribution
CD9 has a broad tissue distribution. Amongst haematopoietic cells, CD9 is mainly expressed by platelets, lymphoid progenitor cells and activated lymphocytes. CD9 is expressed to a lesser extent by eosinophils, granulocytes, monocytes and macrophages 2.
Structure
CD9 is a member of the TM4 superfamily and is predicted to have four transmembrane regions, short cytoplasmic N- and C-termini, and two extracellular regions (reviewed in ref. 3). CD9 is highly conserved amongst vertebrates, the human CD9 protein sharing 65% and 59% sequence identity with the shark and hagfish CD9 homologues, respectively (Tomlinson, Flajnik and Barclay, unpublished).
Ligands and associated molecules
CD9 can associate in non-covalent complexes with the TM4SF molecules CD63 and CD81 and the integrins CD29/CD49c (VLA-3) and CD29/CD49f (VLA-6) 4. In monkey kidney Vero cells CD9 associates non-covalently with CD29/CD49c (VLA-3) and the membrane-anchored heparin-binding EGF-like growth factor at cell-cell contact sites 5. CD9 associates non-covalently with the CD29 (β1) integrin subunit in a pre-B cell and a megakaryocytic cell line 6. No extracellular ligand has been identified for CD9.
Function
A role for CD9 in cell adhesion through the regulation of integrin function and the activation of protein tyrosine kinases seems likely 5-9. CD9 mAbs promote pre-B cell adhesion in vitro via the integrins CD29/CD49d (VLA-4) and CD29/CD49e (VLA-5) 7. Studies using CD9-transfected cells showed that CD29 (β1) integrin-dependent motility of a B cell line was enhanced by CD9 expression and was dependent on protein tyrosine kinases, and another study showed that tumour cell motility and metastasis were suppressed by CD9 expression 3,8. CD9 mAbs are potent activators of platelet aggregation and induce activation of the non-receptor protein tyrosine kinase Syk 9. In mouse T cells, a CD9 mAb has been shown to deliver a potent co-stimulatory signal 10.
Database accession numbers
        PIR      SWISSPROT   EMBL/GENBANK   REFERENCE
Human   A40402   P21926      M38690         11,12
Rat     S39262   P40241      X76489         13
Mouse            P40240      L08115         14
Amino acid sequence of human CD9
MPVKGGTKSI KYLLFGFNFI FWLAGIAVLA IGLWLRFDSQ TKSIFEQETN  50
NNNSSFYTGV YILIGAAALM MLVGFLGCCG AVQESQCMLG LFFGFLLVIF  100
AIEIAAAIWG YSHKDEVIKE VQEFYKDTYN KLKTKDEPQR ETLKAIHYAL  150
NCCGLAGGVE QFISDICPKK DVLETFTVKS CPDAIKEVFD NKFHIIGAVG  200
IGIAVVMIFG MIFSMILCCA IRRNREMV                          228
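Each entry lists a "Polypeptide" molecular weight alongside the sequence. Assuming that figure is the average mass of the unmodified chain, it can be estimated by summing standard average residue masses and adding one water molecule for the free termini. The sketch below is not taken from the book: `polypeptide_mass` is a hypothetical helper, and the result for a full sequence may differ from the listed value depending on which residues (signal sequence, initiator Met) are counted.

```python
# Standard average residue masses in Da (amino acid minus one water).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.0153  # one water per chain for the free N- and C-termini


def polypeptide_mass(seq):
    """Average mass (Da) of an unmodified polypeptide given in one-letter code."""
    seq = seq.replace(" ", "").upper()
    return sum(RESIDUE_MASS[residue] for residue in seq) + WATER


# First block of the human CD9 sequence printed above, used only as a short demo.
print(round(polypeptide_mass("MPVKGGTKSI"), 1))
```

Passing the complete sequence, with or without spaces between the blocks of ten, gives a value that can be compared with the listed polypeptide weight. Carbohydrate is not included, which is one reason the SDS-PAGE estimates of glycoproteins are higher.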
References
1 Rubinstein, E. et al. (1993) Genomics 16, 132-138.
2 Jennings, L.K. et al. (1995) Leucocyte Typing V, 1249-1251.
3 Wright, M.D. and Tomlinson, M.G. (1994) Immunol. Today 15, 588-594.
4 Berditchevski, F. et al. (1996) Mol. Biol. Cell 7, 193-207.
5 Nakamura, K. et al. (1995) J. Cell Biol. 129, 1691-1705.
6 Rubinstein, E. et al. (1994) Eur. J. Immunol. 24, 3005-3013.
7 Masellis-Smith, A. and Shaw, A.R.E. (1994) J. Immunol. 152, 2768-2777.
8 Shaw, A.R.E. et al. (1995) J. Biol. Chem. 270, 24092-24099.
9 Ozaki, Y. et al. (1995) J. Biol. Chem. 270, 15119-15124.
10 Tai, X.-G. et al. (1996) J. Exp. Med. 184, 753-758.
11 Lanza, F. et al. (1991) J. Biol. Chem. 266, 10638-10645.
12 Boucheix, C. et al. (1991) J. Biol. Chem. 266, 117-122.
13 Kaprielian, Z. et al. (1995) J. Neurosci. 15, 562-573.
14 Rubinstein, E. et al. (1993) Thromb. Res. 71, 377-383.
CD10
Common acute lymphoblastic leukaemia antigen (CALLA)
Other names
Neutral endopeptidase (EC 3.4.24.11) (NEP)
Neprilysin
Enkephalinase
gp100
Molecular weights
Polypeptide 85 607
SDS-PAGE
reduced 100 kDa
unreduced 100 kDa
Carbohydrate
N-linked sites 6
O-linked unknown
Human gene location and size 3q21-q27; >80 kb 1
Tissue distribution
Human CD10 is expressed on early B and T lymphoid precursors, B blasts, some granulocytes and bone marrow stromal cells. CD10 is also expressed on various epithelia (with especially high expression on brush border of kidney and gut), some smooth muscle and myoepithelial cells, brain and fibroblasts. CD10 is widely used as a marker of common (pre-B) acute lymphocytic leukaemias and certain lymphomas 2. In contrast to the human, mouse CD10 is absent from B cells, T cells and granulocytes, but the expression pattern is similar to the human on bone marrow stromal cells and non-lymphoid tissues 3.
Structure
CD10 is a member of a group of type II membrane metalloproteases that includes the leucocyte antigens CD13, CD26, CD73 and BP-1 2. The CD10 glycoprotein has a short N-terminal cytoplasmic tail, a transmembrane region that functions as a signal peptide, and a large C-terminal extracellular region that contains six N-linked glycosylation sites. The extracellular domain also contains 12 cysteines which form disulfide bonds that are required for enzyme activity, and the characteristic pentapeptide motif (HEI/L/MXH) associated with zinc binding and catalytic activity in a number of zinc-dependent metalloproteases 2. Other amino acids required for zinc binding, substrate binding and catalytic activity have been identified by alignment with the well-characterized bacterial metalloprotease thermolysin and by site-directed mutagenesis 2,4. Exons 1 and 2, encoding the 5' untranslated region, can be alternatively spliced to yield three CD10 cDNAs with unique 5' untranslated sequences 5.
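The pentapeptide motif described above (His, Glu, then Ile, Leu or Met, then any residue, then His) can be written as a regular expression and searched for in the sequence. The sketch below is illustrative: `find_zinc_motif` is a hypothetical helper, the 30-residue fragment is copied from the human CD10 sequence in this entry, and the offset of 571 for its first residue is derived here from the sequence layout rather than stated in the text.

```python
import re

# Three consecutive ten-residue blocks from the human CD10 sequence in this entry.
FRAGMENT = "SLNYGGIGMVIGHEITHGFDDNGRNFNKDG"
FRAGMENT_START = 571  # assumed position of the first fragment residue

# H-E-[I/L/M]-X-H, the zinc-binding pentapeptide described in the text.
ZINC_MOTIF = re.compile(r"HE[ILM].H")


def find_zinc_motif(seq, first_position=1):
    """Return (position of the first His, matched pentapeptide) for each match."""
    return [(m.start() + first_position, m.group()) for m in ZINC_MOTIF.finditer(seq)]


print(find_zinc_motif(FRAGMENT, FRAGMENT_START))
```

The same pattern can be run over the complete sequence with `first_position=1`. A match identifies the motif only; the additional zinc-binding and catalytic residues mentioned in the text were assigned by alignment and mutagenesis, not by pattern search.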
Function
CD10 is a zinc-binding metalloprotease which is thought to downregulate cellular responses to peptide hormones. By hydrolysing peptide bonds on the amino side of hydrophobic amino acids, CD10 reduces the local concentration of peptide available for receptor binding and signal transduction 2. CD10 can cleave a variety of biologically active peptides including opioid enkephalins, fMLP, substance P, bombesin-like peptides, atrial natriuretic factor, endothelin, oxytocin, bradykinin and angiotensins I and II 2. CD10 on neutrophils limits their inflammatory responses by degrading peptides such as fMLP, substance P and enkephalins 2. CD10 on bone marrow stromal cells appears to regulate B cell development, since inhibition of CD10 enzyme activity in vivo enhances B cell maturation 6. Targeted disruption of the gene for CD10 suggests a role in the modulation of septic shock, since CD10-deficient mice, which are otherwise grossly normal, exhibit enhanced lethality to endotoxin 7.
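The cleavage rule stated above (hydrolysis on the amino side of hydrophobic residues) can be turned into a simple scan that lists candidate scissile bonds in a peptide. This is a rough sketch under stated assumptions: the set of residues counted as hydrophobic is chosen for illustration and is not given in the text, `candidate_bonds` is a hypothetical helper, and the example peptide is Leu-enkephalin (YGGFL), one of the opioid enkephalins.

```python
# Assumption: residues treated as hydrophobic for this illustration.
HYDROPHOBIC = set("FLIVMWY")


def candidate_bonds(peptide):
    """List bonds lying on the amino side of a hydrophobic residue.

    Each bond is returned as (i, i + 1), the 1-based positions of the two
    residues it joins; the hydrophobic residue is the second of the pair.
    """
    return [(i, i + 1) for i in range(1, len(peptide)) if peptide[i] in HYDROPHOBIC]


LEU_ENKEPHALIN = "YGGFL"
for first, second in candidate_bonds(LEU_ENKEPHALIN):
    print(f"{LEU_ENKEPHALIN[first - 1]}{first}-{LEU_ENKEPHALIN[second - 1]}{second}")
```

The rule over-predicts: it lists every bond that satisfies the residue criterion, whereas the enzyme's actual preference also depends on peptide length and neighbouring residues. The output is best read as a list of candidates to test.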
Database accession numbers
        PIR      SWISSPROT   EMBL/GENBANK   REFERENCE
Human   A41387   P08473      Y00811         8
Rat     A29295   P07861      M15944         9
Mouse                        M81591         4
Amino acid sequence of human CD10
GKSESQMDIT DINTPKPKKK QRWTRLEISL SVLVLLLTII AVRMIALYAT  50
YDDGICKSSD CIKSAARLIQ NMDATTEPCR DFFKYACGGW LKRNVIPETS  100
SRYGNFDILR DELEVVLKDV LQEPKTEDIV AVQKAKALYR SCINESAIDS  150
RGGEPLLKLL PDIYGWPVAT ENWEQKYGAS WTAEKAIAQL NSKYGKKVLI  200
NLFVGTDDKN SVNHVIHIDQ PRLGLPSRDY YECTGIYKEA CTAYVDFMIS  250
VARLIRQEER LPIDENQLAL EMNKVMELEK EIANATAKPE DRNDPMLLYN  300
KMRLAQIQNN FSLEINGKPF SWLNFTNEIM STVNISITNE EDVVVYAPEY  350
LTKLKPILTK YSARDLQNLM SWRFIMDLVS SLSRTYKESR NAFRKALYGT  400
TSETATWRRC ANYVNGNMEN AVGRLYVEAA FAGESKHVVE DLIAQIREVF  450
IQTLDDLTWM DAETKKRAEE KALAIKERIG YPDDIVSNDN KLNNEYLELN  500
YKEDEYFENI IQNLKFSQSK QLKKLREKVD KDEWISGAAV VNAFYSSGRN  550
QIVFPAGILQ PPFFSAQQSN SLNYGGIGMV IGHEITHGFD DNGRNFNKDG  600
DLVDWWTQQS ASNFKEQSQC MVYQYGNFSW DLAGGQHLNG INTLGENIAD  650
NGGLGQAYRA YQNYIKKNGE EKLLPGLDLN HKQLFFLNFA QVWCGTYRPE  700
YAVNSIKTDV HSPGNFRIIG TLQNSAEFSE AFHCRKNSYM NPEKKCRVW   749
References
1 D'Adamio, L. et al. (1989) Proc. Natl Acad. Sci. USA 86, 7103-7107.
2 Shipp, M.A. and Look, A.T. (1993) Blood 82, 1052-1070.
3 Kalled, S.L. et al. (1995) Eur. J. Immunol. 25, 677-687.
4 Chen, C.-Y. et al. (1992) J. Immunol. 148, 2817-2825.
5 Ishimaru, F. and Shipp, M.A. (1995) Blood 85, 3199-3207.
6 Salles, G. et al. (1993) Proc. Natl Acad. Sci. USA 90, 7618-7622.
7 Lu, B. et al. (1995) J. Exp. Med. 181, 2271-2275.
8 Letarte, M. et al. (1988) J. Exp. Med. 168, 1247-1253.
9 Malfroy, B. et al. (1987) Biochem. Biophys. Res. Commun. 144, 59-66.
CD11a
LFA-1 α subunit, integrin αL subunit
Molecular weights
Polypeptide 126 195
SDS-PAGE
reduced 180 kDa
unreduced 170 kDa
Carbohydrate
N-linked sites 12
O-linked sites unknown
Human gene location and size 16p11-13.1; >32 kb 1
[Diagram: CD11a/CD18]
Tissue distribution
CD11a is expressed on lymphocytes, granulocytes, monocytes and macrophages, with increased levels on memory T cells 1-4.
Structure
CD11a (integrin αL subunit) combines with CD18 (integrin β2 subunit) to form the integrin LFA-1 (αLβ2, CD11a/CD18). The αL subunit belongs to the subclass of integrin α subunits with an I-domain near the N-terminus 5. A crystal structure of the I-domain has been obtained 6.
Ligands and associated molecules
The three ligands for LFA-1 are CD54 (ICAM-1), CD102 (ICAM-2) and CD50 (ICAM-3), each of which contains IgSF domains 7,8.
Function
CD11a/CD18 (LFA-1) was first described as an accessory molecule in cytotoxic lymphocyte killing 9. It was subsequently found to mediate lymphocyte adhesion to many cells including endothelium 2-4,7,8. The avidity of CD11a/CD18 for its ligands is transiently upregulated on T cells upon activation. This may involve both aggregation of CD11a/CD18 on the cell surface and conformational change of the CD11a/CD18 antigen 8,10,11. CD11a/CD18 has been shown to bind bacterial lipopolysaccharides 12. The binding activity of CD11a/CD18 involves the I-domain of CD11a 13. CD11a/CD18 is not expressed on leucocytes of patients with leucocyte adhesion deficiency (see CD18).
Database accession numbers
        PIR      SWISSPROT   EMBL/GENBANK   REFERENCE
Human   S03308   P20701      Y00796         5
Mouse            P24063      M60778         14
Amino acid sequence of human CD11a
MKDSCITVMA MALLSGFFFF APASS                             -1
YNLDVRGARS FSPPRAGRHF GYRVLQVGNG VIVGAPGEGN STGSLYQCQS  50
GTGHCLPVTL RGSNYTSKYL GMTLATDPTD GSILACDPGL SRTCDQNTYL  100
SGLCYLFRQN LQGPMLQGRP GFQECIKGNV DLVFLFDGSM SLQPDEFQKI  150
LDFMKDVMKK LSNTSYQFAA VQFSTSYKTE FDFSDYVKWK DPDALLKHVK  200
HMLLLTNTFG AINYVATEVF REELGARPDA TKVLIIITDG EATDSGNIDA  250
AKDIIRYIIG IGKHFQTKES QETLHKFASK PASEFVKILD TFEKLKDLFT  300
ELQKKIYVIE GTSKQDLTSF NMELSSSGIS ADLSRGHAVV GAVGAKDWAG  350
GFLDLKADLQ DDTFIGNEPL TPEVRAGYLG YTVTWLPSRQ KTSLLASGAP  400
RYQHMGRVLL FQEPQGGGHW SQVQTIHGTQ IGSYFGGELC GVDVDQDGET  450
ELLLIGAPLF YGEQRGGRVF IYQRRQLGFE EVSELQGDPG YPLGRFGEAI  500
TALTDINGDG LVDVAVGAPL EEQGAVYIFN GRHGGLSPQP SQRIEGTQVL  550
SGIQWFGRSI HGVKDLEGDG LADVAVGAES QMIVLSSRPV VDMVTLMSFS  600
PAEIPVHEVE CSYSTSNKMK EGVNITICFQ IKSLYPQFQG RLVANLTYTL  650
QLDGHRTRRR GLFPGGRHEL RRNIAVTTSM SCTDFSFHFP VCVQDLISPI  700
NVSLNFSLWE EEGTPRDQRA QGKDIPPILR PSLHSETWEI PFEKNCGEDK  750
KCEANLRVSF SPARSRALRL TAFASLSVEL SLSNLEEDAY WVQLDLHFPP  800
GLSFRKVEML KPHSQIPVSC EELPEESRLL SRALSCNVSS PIFKAGHSVA  850
LQMMFNTLVN SSWGDSVELH ANVTCNNEDS DLLEDNSATT IIPILYPINI  900
LIQDQEDSTL YVSFTPKGPK IHQVKHMYQV RIQPSIHDHN IPTLEAVVGV  950
PQPPSEGPIT HQWSVQMEPP VPCHYEDLER LPDAAEPCLP GALFRCPVVF  1000
RQEILVQVIG TLELVGEIEA SSMFSLCSSL SISFNSSKHF HLYGSNASLA  1050
QVVMKVDVVY EKQMLYLYVL SGIGGLLLLL LIFIVLYKVG FFKRNLKEKM  1100
EAGRGVPNGI PAEDSEQLAS GQEAGDPGCL KPLHEKDSES GGGKD       1145
References
1 Larson, R.S. and Springer, T.A. (1990) Immunol. Rev. 114, 181-217.
2 Pigott, R. and Power, C. (1993) The Adhesion Molecule FactsBook. Academic Press, London.
3 Patarroyo, M. et al. (1990) Immunol. Rev. 114, 67-108.
4 Rieu, P. and Arnaout, M.A. (1995) In Adhesion Molecules and the Lung (Ward, P. and Lenfart, C., eds), vol. 89, pp. 1-42. Marcel Dekker, New York.
5 Larson, R.S. et al. (1989) J. Cell Biol. 108, 703-712.
6 Qu, A. and Leahy, D.J. (1995) Proc. Natl Acad. Sci. USA 92, 10277-10281.
7 Springer, T.A. (1990) Nature 346, 425-433.
8 Springer, T.A. (1994) Cell 76, 301-314.
9 Davignon, D. et al. (1981) Proc. Natl Acad. Sci. USA 78, 4535-4539.
10 Dustin, M.L. and Springer, T.A. (1991) Annu. Rev. Immunol. 9, 27-66.
11 Lub, M. et al. (1995) Immunol. Today 16, 479-483.
12 Wright, S.D. and Jong, M.T.C. (1986) J. Exp. Med. 164, 1876-1888.
13 Landis, R.C. et al. (1994) J. Cell Biol. 126, 529-537.
14 Kaufman, Y. et al. (1991) J. Immunol. 149, 369-374.
CD11b
Mac-1 (Mo-1, CR3) α subunit, integrin αM subunit
Molecular weights
Polypeptide 125 600
SDS-PAGE
reduced 170 kDa
unreduced 165 kDa
Carbohydrate
N-linked sites 19
O-linked sites unknown
Human gene location 16p11-13.1
[Diagram: CD11b/CD18]
Tissue distribution
CD11b is expressed mainly on myeloid and NK cells 1-4.
Structure
CD11b (integrin αM subunit) combines with CD18 (integrin β2 subunit) to form the integrin Mac-1 (αMβ2, CD11b/CD18). The αM subunit belongs to the subclass of integrin α subunits with an I-domain near the N-terminus 5-7. Two crystal structures of the I-domain, possibly representing active and inactive conformations, have been obtained 8,9. The N-terminal sequence of CD11b has been determined 10,11.
Ligands and associated molecules
Ligands reported for CD11b/CD18 include the complement fragment iC3b, CD54 (ICAM-1), fibrinogen, factor X, CD23, the neutrophil inhibitory factor (NIF) of canine hookworm, heparin, β-glucan, and bacterial lipopolysaccharide 12-19. It has also been reported that CD11b/CD18 binds denatured proteins 14,20. There is increasing evidence that CD11b/CD18 is associated with many membrane proteins at the cell surface including CD14, CD87 21, and the Fcγ receptors CD16 and CD32 21,22.
Function
CD11b/CD18 (Mac-1) is also known as the complement receptor type 3 (CR3) because of its binding to the iC3b complement fragment on opsonized targets 12,13. It also mediates the subsequent ingestion process 23. CD11b/CD18 is also important in the transendothelial migration of monocytes and neutrophils 24. Its association with other membrane proteins may account for the many signalling functions of CD11b/CD18. Most binding activities of CD11b/CD18 involve the I-domain of CD11b 14,15. CD11b/CD18 is not expressed on leucocytes of patients with leucocyte adhesion deficiency (see CD18).
Database accession numbers
        PIR      SWISSPROT   EMBL/GENBANK   REFERENCE
Human   A31108   P11215      J03925         5
                             M18044         6
                             J04145         7
Mouse   S00551   P05555      X07640         25
Amino acid sequence of human CD11b
MALRVLLLTA LTLCHG                                       -1
FNLDTENAMT FQENARGFGQ SVVQLQGSRV VVGAPQEIVA ANQRGSLYQC  50
DYSTGSCEPI RLQVPVEAVN MSLGLSLAAT TSPPQLLACG PTVHQTCSEN  100
TYVKGLCFLF GSNLRQQPQK FPEALRGCPQ EDSDIAFLID GSGSIIPHDF  150
RRMKEFVSTV MEQLKKSKTL FSLMQYSEEF RIHFTFKEFQ NNPNPRSLVK  200
PITQLLGRTH TATGIRKVVR ELFNITNGAR KNAFKILVVI TDGEKFGDPL  250
GYEDVIPEAD REGVIRYVIG VGDAFRSEKS RQELNTIASK PPRDHVFQVN  300
NFEALKTIQN QLREKIFAIE GTQTGSSSSF EHEMSQEGFS AAITSNGPLL  350
STVGSYDWAG GVFLYTSKEK STFINMTRVD SDMNDAYLGY AAAIILRNRV  400
QSLVLGAPRY QHIGLVAMFR QNTGMWESNA NVKGTQIGAY FGASLCSVDV  450
DSNGSTDLVL IGAPHYYEQT RGGQVSVCPL PRGQRARWQC DAVLYGEQGQ  500
PWGRFGAALT VLGDVNGDKL TDVAIGAPGE EDNRGAVYLF HGTSGSGISP  550
SHSQRIAGSK LSPRLQYFGQ SLSGGQDLTM DGLVDLTVGA QGHVLLLRSQ  600
PVLRVKAIME FNPREVARNV FECNDQVVKG KEAGEVRVCL HVQKSTRDRL  650
REGQIQSVVT YDLALDSGRP HSRAVFNETK NSTRRQTQVL GLTQTCETLK  700
LQLPNCIEDP VSPIVLRLNF SLVGTPLSAF GNLRPVLAED AQRLFTALFP  750
FEKNCGNDNI CQDDLSITFS FMSLDCLVVG GPREFNVTVT VRNDGEDSYR  800
TQVTFFFPLD LSYRKVSTLQ NQRSQRSWRL ACESASSTEV SGALKSTSCS  850
INHPIFPENS EVTFNITFDV DSKASLGNKL LLKANVTSEN NMPRTNKTEF  900
QLELPVKYAV YMVVTSHGVS TKYLNFTASE NTSRVMQHQY QVSNLGQRSL  950
PISLVFLVPV RLNQTVIWDR PQVTFSENLS STCHTKERLP SHSDFLAELR  1000
KAPVVNCSIA VCQRIQCDIP FFGIQEEFNA TLKGNLSFDW YIKTSHNHLL  1050
IVSTAEILFN DSVFTLLPGQ GAFVRSQTET KVEPFEVPNP LPLIVGSSVG  1100
GLLLLALITA ALYKLGFFKR QYKDMMSEGG PPGAEPQ                1137
References
1 Pigott, R. and Power, C. (1993) The Adhesion Molecule FactsBook. Academic Press, London.
2 Larson, R.S. and Springer, T.A. (1990) Immunol. Rev. 114, 181-217.
3 Patarroyo, M. et al. (1990) Immunol. Rev. 114, 67-108.
4 Rieu, P. and Arnaout, M.A. (1995) In Adhesion Molecules and the Lung (Ward, P. and Lenfart, C., eds), vol. 89, pp. 1-42. Marcel Dekker, New York.
5 Corbi, A.L. et al. (1988) J. Biol. Chem. 263, 12403-12411.
6 Arnaout, M.A. et al. (1988) J. Cell Biol. 106, 2153-2158.
7 Hickstein, D.D. et al. (1989) Proc. Natl Acad. Sci. USA 86, 257-261.
8 Lee, J.O. et al. (1995) Cell 80, 631-638.
9 Lee, J.O. et al. (1995) Structure 3, 1333-1341.
10 Pierce, M.W. et al. (1986) Biochim. Biophys. Acta 87, 368-371.
11 Miller, L.J. et al. (1987) J. Immunol. 138, 2381-2383.
12 Beller, D.I. et al. (1982) J. Exp. Med. 156, 1000-1009.
13 Wright, S.D. et al. (1983) Proc. Natl Acad. Sci. USA 80, 5699-5703.
14 Zhang, L. and Plow, E.F. (1996) J. Biol. Chem. 271, 18211-18216.
15 Diamond, M.S. et al. (1993) J. Cell Biol. 120, 1031-1043.
16 Lecoanet-Henchoz, S. et al. (1995) Immunity 3, 119-125.
17 Diamond, M.S. et al. (1995) J. Cell Biol. 130, 1473-1482.
18 Thornton, B.P. et al. (1996) J. Immunol. 156, 1235-1246.
19 Wright, S.D. and Jong, M.T.C. (1986) J. Exp. Med. 164, 1876-1888.
20 Davis, G.E. (1992) Exp. Cell Res. 200, 242-252.
21 Petty, H.R. and Todd, R.F. III. (1996) Immunol. Today 17, 209-212.
22 Annendov, A. et al. (1996) Eur. J. Immunol. 26, 207-212.
23 Gresham, H.D. et al. (1991) J. Clin. Invest. 88, 588-596.
24 Springer, T.A. (1994) Cell 76, 301-314.
25 Pytela, R. (1988) EMBO J. 7, 1371-1378.
CD11c
p150,95 α subunit, integrin αX subunit
Molecular weights
Polypeptide 125 897
SDS-PAGE
reduced 150 kDa
unreduced 145 kDa
Carbohydrate
N-linked sites 8
O-linked sites unknown
Human gene location and size 16p11-13.1; 25 kb 1
[Diagram: CD11c/CD18]
Tissue distribution
CD11c is expressed mainly on myeloid cells with high levels on tissue macrophages. However, it is also found on NK cells, activated T cells and lymphoid cell lines including hairy cell leukaemias 1-4.
Structure
CD11c (integrin αX subunit) combines with CD18 (integrin β2 subunit) to form the integrin p150,95 (αXβ2, CD11c/CD18). The αX subunit belongs to the subclass of integrin α subunits with an I-domain near the N-terminus 5. The N-terminal sequence of CD11c has been determined 6.
Ligands and associated molecules
CD11c/CD18 (p150,95) binds multiple ligands including the complement fragment iC3b, CD54 (ICAM-1), fibrinogen and bacterial lipopolysaccharide 7-12. It has also been reported that CD11c/CD18 binds denatured proteins 13.
Function
CD11c/CD18 has been reported to play important roles in cytotoxic T cell killing, and in neutrophil and monocyte adhesion to endothelium, although its ligands in these two cases have not been identified 14,15. An antibody against rabbit CD11c/CD18 has been reported to induce T cell aggregation 16. CD11c/CD18 is not expressed on leucocytes of patients with leucocyte adhesion deficiency (see CD18).
Database accession numbers
        PIR      SWISSPROT   EMBL/GENBANK   REFERENCE
Human   A36584   P20702      M81695         5
                             Y00093
Amino acid sequence of human CD11c
MTRTRAALLL FTALATSLG                                    -1
FNLDTEELTA FRVDSAGFGD SVVQYANSWV VVGAPQKITA ANQTGGLYQC  50
GYSTGACEPI GLQVPPEAVN MSLGLSLAST TSPSQLLACG PTVHHECGRN  100
MYLTGLCFLL GPTQLTQRLP VSRQECPRQE QDIVFLIDGS GSISSRNFAT  150
MMNFVRAVIS QFQRPSTQFS LMQFSNKFQT HFTFEEFRRT SNPLSLLASV  200
HQLQGFTYTA TAIQNVVHRL FHASYGARRD ATKILIVITD GKKEGDSLDY  250
KDVIPMADAA GIIRYAIGVG LAFQNRNSWK ELNDIASKPS QEHIFKVEDF  300
DALKDIQNQL KEKIFAIEGT ETTSSSSFEL EMAQEGFSAV FTPDGPVLGA  350
VGSFTWSGGA FLYPPNMSPT FINMSQENVD MRDSYLGYST ELALWKGVQS  400
LVLGAPRYQH TGKAVIFTQV SRQWRMKAEV TGTQIGSYFG ASLCSVDVDT  450
DGSTDLVLIG APHYYEQTRG GQVSVCPLPR GWRRWWCDAV LYGEQGHPWG  500
RFGAALTVLG DVNGDKLTDV VIGAPGEEEN RGAVYLFHGV LGPSISPSHS  550
QRIAGSQLSS RLQYFGQALS GGQDLTQDGL VDLAVGARGQ VLLLRTRPVL  600
WVGVSMQFIP AEIPRSAFEC REQVVSEQTL VQSNICLYID KRSKNLLGSR  650
DLQSSVTLDL ALDPGRLSPR ATFQETKNRS LSRVRVLGLK AHCENFNLLL  700
PSCVEDSVTP ITLRLNFTLV GKPLLAFRNL RPMLAALAQR YFTASLPFEK  750
NCGADHICQD NLGISFSFPG LKSLLVGSNL ELNAEVMVWN DGEDSYGTTI  800
TFSHPAGLSY RYVAEGQKQG QLRSLHLTCD SAPVGSQGTW STSCRINHLI  850
FRGGAQITFL ATFDVSPKAV LGDRLLLTAN VSSENNTPRT SKTTFQLELP  900
VKYAVYTVVS SHEQFTKYLN FSESEEKESH VAMHRYQVNN LGQRDLPVSI  950
NFWVPVELNQ EAVWMDVEVS HPQNPSLRCS SEKIAPPASD FLAHIQKNPV  1000
LDCSIAGCLR FRCDVPSFSV QEELDFTLKG NLSFGWVRQI LQKKVSVVSV  1050
AEITFDTSVY SQLPGQEAFM RAQTTTVLEK YKVHNPTPLI VGSSIGGLLL  1100
LALITAVLYK VGFFKRQYKE MMEEANGQIA PENGTQTPSP PSEK        1144
References
1 Larson, R.S. and Springer, T.A. (1990) Immunol. Rev. 114, 181-217.
2 Pigott, R. and Power, C. (1993) The Adhesion Molecule FactsBook. Academic Press, London.
3 Patarroyo, M. et al. (1990) Immunol. Rev. 114, 67-108.
4 Rieu, P. and Arnaout, M.A. (1995) In Adhesion Molecules and the Lung (Ward, P. and Lenfart, C., eds), vol. 89, pp. 1-42. Marcel Dekker, New York.
5 Corbi, A.L. et al. (1987) EMBO J. 6, 4023-4028.
6 Miller, L.J. et al. (1987) J. Immunol. 138, 2381-2383.
7 Myones, B.L. et al. (1988) J. Clin. Invest. 82, 640-651.
8 de Fougerolles, A.R. et al. (1995) Eur. J. Immunol. 25, 1008-1012.
9 Loike, J.D. et al. (1991) Proc. Natl Acad. Sci. USA 88, 1044-1048.
10 Postigo, A.A. et al. (1991) J. Exp. Med. 174, 1313-1322.
11 Wright, S.D. and Jong, M.T.C. (1986) J. Exp. Med. 164, 1876-1888.
12 Ingalls, R.R. and Golenbock, D.T. (1995) J. Exp. Med. 181, 1473-1479.
13 Davis, G.E. (1992) Exp. Cell Res. 200, 242-252.
14 Keizer, G.D. et al. (1987) J. Immunol. 138, 3130-3136.
15 Stacker, S.A. and Springer, T.A. (1991) J. Immunol. 146, 648-655.
16 Blackford, J. et al. (1996) Eur. J. Immunol. 26, 525-531.
Integrin αD subunit
CD11d
Molecular weights
Polypeptide 125 096
SDS-PAGE
reduced 150 kDa
Carbohydrate
N-linked sites 10
O-linked sites unknown
[Diagram: αD/β2 (CD18)]
Tissue distribution
The integrin αD subunit is expressed at moderate levels on peripheral blood leucocytes. It is expressed strongly on specialized cells in tissues, for example splenic red pulp macrophages and foamy macrophages in aortic fatty streaks 1.
Structure
The integrin αD subunit combines with CD18 (integrin β2 subunit) to form the integrin αDβ2. The αD subunit belongs to the subclass of integrin α subunits with an I-domain near the N-terminus 1. The N-terminus has been determined by protein sequencing 1.
Ligands and associated molecules
The integrin αDβ2 binds CD50 (ICAM-3), but not CD54 (ICAM-1) or CD106 (VCAM-1) 1.
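The ligand lists given in the CD11a, CD11b, CD11c and integrin αD entries can be collected into a small lookup table, which makes it easy to ask which β2 integrins share a given ligand. The sketch below is illustrative only: the dictionary keys and the helper `integrins_binding` are hypothetical names, and the table holds only ligands named in these entries, so absence from it does not mean that binding has been excluded.

```python
# Ligands named in the "Ligands and associated molecules" and "Function"
# sections of the four entries; ICAM-1 = CD54, ICAM-2 = CD102, ICAM-3 = CD50.
LIGANDS = {
    "CD11a/CD18": {"CD54", "CD102", "CD50", "lipopolysaccharide"},
    "CD11b/CD18": {
        "iC3b", "CD54", "fibrinogen", "factor X", "CD23", "NIF", "heparin",
        "beta-glucan", "lipopolysaccharide", "denatured proteins",
    },
    "CD11c/CD18": {
        "iC3b", "CD54", "fibrinogen", "lipopolysaccharide", "denatured proteins",
    },
    "CD11d/CD18": {"CD50"},
}


def integrins_binding(ligand):
    """Return the heterodimers whose listed ligands include the given one."""
    return sorted(name for name, ligands in LIGANDS.items() if ligand in ligands)


print("CD54 (ICAM-1):", integrins_binding("CD54"))
print("CD50 (ICAM-3):", integrins_binding("CD50"))
print("iC3b:", integrins_binding("iC3b"))
```

Keeping the ligands as sets also allows direct comparison, for example `LIGANDS["CD11b/CD18"] & LIGANDS["CD11c/CD18"]` for the ligands reported for both complement receptors.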
Database accession numbers
        PIR      SWISSPROT   EMBL/GENBANK   REFERENCE
Human                        U37028         1
Amino acid sequence of human integrin αD subunit
MTFGTVLLLS VLASYHG                                       -1
FNLDVEEPTI FQEDAGGFGQ SVVQFGGSRL VVGAPLEVVA ANQTGRLYDC   50
AAATGMCQPI PLHIRPEAVN MSLGLTLAAS TNGSRLLACG PTLHRVCGEN   100
SYSKGSCLLL GSRWEIIQTV PDATPECPHQ EMDIVFLIDG SGSIDQNDFN   150
QMKGFVQAVM GQFEGTDTLF ALMQYSNLLK IHFTFTQFRT SPSQQSLVDP   200
IVQLKGLTFT ATGILTVVTQ LFHHKNGARK SAKKILIVIT DGQKYKDPLE   250
YSDVIPQAEK AGIIRYAIGV GHAFQGPTAR QELNTISSAP PQDHVFKVDN   300
FAALGSIQKQ LQEKIYAVEG TQSRASSSFQ HEMSQEGFST ALTMDGLFLG   350
AVGSFSWSGG AFLYPPNMSP TFINMSQENV DMRDSYLGYS TELALWKGVQ   400
NLVLGAPRYQ HTGKAVIFTQ VSRQWRKKAE VTGTQIGSYF GASLCSVDVD   450
SDGSTDLILI GAPHYYEQTR GGQVSVCPLP RGQRVQWQCD AVLRGEQGHP   500
WGRFGAALTV LGDVNEDKLI DVAIGAPGEQ ENRGAVYLFH GASESGISPS   550
HSQRIASSQL SPRLQYFGQA LSGGQDLTQD GLMDLAVGAR GQVLLLRSLP   600
VLKVGVAMRF SPVEVAKAVY RCWEEKPSAL EAGDATVCLT IQKSSLDQLG   650
DIQSSVRFDL ALDPGRLTSR AIFNETKNPT LTRRKTLGLG IHCETLKLLL   700
PDCVEDVVSP IILHLNFSLV REPIPSPQNL RPVLAVGSQD LFTASLPFEK   750
NCGQDGLCEG DLGVTLSFSG LQTLTVGSSL ELNVIVTVWN AGEDSYGTVV   800
SLYYPAGLSH RRVSGAQKQP HQSALRLACE TVPTEDEGLR SSRCSVNHPI   850
FHEGSNGTFI VTFDVSYKAT LGDRMLMRAS ASSENNKASS SKATFQLELP   900
VKYAVYTMIS RQEESTKYFN FATSDEKKMK EAEHRYRVNN LSQRDLAISI   950
NFWVPVLLNG VAVWDVVMEA PSQSLPCVSE RKPPQHSDFL TQISRSPMLD   1000
CSIADCLQFR CDVPSFSVQE ELDFTLKGNL SFGWVRETLQ KKVLVVSVAE   1050
ITFDTSVYSQ LPGQEAFMRA QMEMVLEEDE VYNAIPIIMG SSVGALLLLA   1100
LITATLYKLG FFKRHYKEML EDKPEDTATF SGDDFSCVAP NVPLS        1145
Reference 1 Van der Vieren, M. et al. (1995) Immunity 3, 683-690.
CDw12
Molecular weights
SDS-PAGE reduced      90-120 kDa
SDS-PAGE unreduced    150-160 kDa
Tissue distribution CDw12 is expressed on monocytes, granulocytes, NK cells and platelets 1-3.
Structure Unknown.
Function Unknown.
References
1 Knapp, W. (1989) Leucocyte Typing IV, 781.
2 Todd, R.F. (1995) Leucocyte Typing V, 771.
3 van der Schoot, C.E. et al. (1989) Leucocyte Typing IV, 868-878.
CD13
Aminopeptidase N (EC 3.4.11.2), gp150, p161 (mouse)
Molecular weights
Polypeptide           109 512
SDS-PAGE reduced      150-170 kDa
SDS-PAGE unreduced    150-170 kDa
Carbohydrate
N-linked sites        11
O-linked              + abundant
Human gene location and size
15q25-q26; 20 kb 1
Tissue distribution CD13 is expressed by granulocytes and monocytes and their precursors. CD13 is a marker for most acute myeloid leukaemias and a smaller proportion of acute lymphoid leukaemias. Various non-haematopoietic cells express CD13, including epithelial cells from renal proximal tubules and intestinal brush border, endothelial cells, fibroblasts, brain cells, bone marrow stromal cells, osteoclasts and cells lining the bile canaliculi a.
Structure CD13 is a member of a group of type II integral membrane metalloproteases that includes the leucocyte antigens CD10, CD26, CD73 and BP-1 2. In common with CD10, the expression of CD13 appears to be controlled by distinct promoters in different cell types, and several CD13 transcripts have been identified that differ only in their 5' untranslated region a. The CD13 glycoprotein has a short N-terminal cytoplasmic tail, a transmembrane region that functions as a signal peptide, and a large C-terminal extracellular region that contains 11 N-linked glycosylation sites and also O-linked glycosylation. The extracellular domain contains the characteristic pentapeptide motif (His-Glu-Ile/Leu/Met-Xaa-His) associated with zinc binding and catalytic activity in a number of zinc-dependent metalloproteases. CD13 is expressed as a non-covalently linked homodimer 2.
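The zinc-binding pentapeptide described above can be located in the human CD13 sequence printed in this entry with a simple pattern search. The following is a minimal sketch, assuming the motif is encoded as the regular expression HE[ILM].H and scanning only residues 351-400 of that sequence; the names FRAGMENT, FRAGMENT_START and find_zinc_motif are illustrative and do not come from the source.

```python
import re

# Residues 351-400 of human CD13, copied from the sequence in this entry.
FRAGMENT = "AGAMENWGLVTYRENSLLFDPLSSSSSNKERVVTVIAHELAHQWFGNLVT"
FRAGMENT_START = 351  # sequence position of the first residue of FRAGMENT

# His-Glu-(Ile/Leu/Met)-Xaa-His written as a regular expression
# (an assumed encoding of the motif quoted in the text).
ZINC_MOTIF = re.compile(r"HE[ILM].H")


def find_zinc_motif(seq, start=1):
    """Return (position, matched residues) for every motif occurrence."""
    return [(m.start() + start, m.group()) for m in ZINC_MOTIF.finditer(seq)]


print(find_zinc_motif(FRAGMENT, FRAGMENT_START))
```

On this fragment the search reports a single match, HELAH, beginning at His388. A pattern match only flags a candidate site; it says nothing about whether the residues actually coordinate zinc.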
Ligands and associated molecules CD13 is a receptor for coronaviruses, RNA viruses that cause respiratory disease in humans and several species of animals 2. The binding site on CD13 for the swine coronavirus TGEV (transmissible gastroenteritis virus) is distinct from the enzymatic site 3.
Function CD13 is a zinc-binding metalloprotease which plays a role in cell surface antigen presentation by trimming the N-terminal amino acids from MHC Class II-bound peptides 4. CD13 ectopeptidase activity is also thought to downregulate cellular responses to peptide hormones by reducing the local concentration of peptide available for receptor binding 2. Neutral amino acids are preferentially cleaved by CD13, although basic and acidic residues can also be removed. Peptide substrates for CD13 include opioid peptides and enkephalins in the brain, the phagocytosis-stimulating tetrapeptide tuftsin, and the neutrophil chemoattractant fMLP. CD13 appears to act in concert with another metalloprotease, CD10, in the hydrolysis of these peptides 2. Unlike CD10, CD13 activity is inhibited by the peptide hormones substance P and bradykinin 5. CD13 is upregulated by the anti-inflammatory cytokine IL-4, which suggests a possible indirect mechanism of IL-4 action through the modulation of cell surface antigen processing and/or bioactive peptides 6. CD13 also appears to play a role, by a mechanism that is unclear, in the infection of cells by human cytomegalovirus (CMV), a herpesvirus 7.
Database accession numbers
Species         Human; Rat; Mouse
PIR             S01658; A32852
SWISSPROT       P15144; P15684
EMBL/GENBANK    X13276; M25073; U77083
REFERENCE       8,9; 10; 11
Amino acid sequence of human CD13
MAKGFYISKS LGILGILLGV AAVCTIIALS VVYSQEKNKN ANSSPVASTT   50
PSASATTNPA SATTLDQSKA WNRYRLPNTL KPDSYQVTLR PYLTPNDRGL   100
YVFKGSSTVR FTCKEATDVI IIHSKKLNYT LSQGHRVVLR GVGGSQPPDI   150
DKTELVEPTE YLVVHLKGSL VKDSQYEMDS EFEGELADDL AGFYRSEYME   200
GNVRKVVATT QMQAADARKS FPCFDEPAMK AEFNITLIHP KDLTALSNML   250
PKGPSTPLPE DPNWNVTEFH TTPKMSTYLL AFIVSEFDYV EKQASNGVLI   300
RIWARPSAIA AGHGDYALNV TGPILNFFAG HYDTPYPLPK SDQIGLPDFN   350
AGAMENWGLV TYRENSLLFD PLSSSSSNKE RVVTVIAHEL AHQWFGNLVT   400
IEWWNDLWLN EGFASYVEYL GADYAEPTWN LKDLMVLNDV YRVMAVDALA   450
SSHPLSTPAS EINTPAQISE LFDAISYSKG ASVLRMLSSF LSEDVFKQGL   500
ASYLHTFAYQ NTIYLNLWDH LQEAVNNRSI QLPTTVRDIM NRWTLQMGFP   550
VITVDTSTGT LSQEHFLLDP DSNVTRPSEF NYVWIVPITS IRDGRQQQDY   600
WLIDVRAQND LFSTSGNEWV LLNLNVTGYY RVNYDEENWR KIQTQLQRDH   650
SAIPVINRAQ IINDAFNLAS AHKVPVTLAL NNTLFLIEER QYMPWEAALS   700
SLSYFKLMFD RSEVYGPMKN YLKKQVTPLF IHFRNNTNNW REIPENLMDQ   750
YSEVNAISTA CSNGVPECEE MVSGLFKQWM ENPNNNPIHP NLRSTVYCNA   800
IAQGGEEEWD FAWEQFRNAT LVNEADKLRA ALACSKELWI LNRYLSYTLN   850
PDLIRKQDAT STIISITNNV IGQGLVWDFV QSNWKKLFND YGGGSFSFSN   900
LIQAVTRRFS TEYELQQLEQ FKKDNEETGF GSGTRALEQA LEKTKANIKW   950
VKENKEVVLQ WFTENSK                                       967
References
1 Look, A.T. et al. (1986) J. Clin. Invest. 78, 914-921.
2 Shipp, M.A. and Look, A.T. (1993) Blood 82, 1052-1070.
3 Delmas, B. et al. (1994) J. Virol. 68, 5216-5224.
4 Larsen, S.L. et al. (1996) J. Exp. Med. 184, 183-189.
5 Xu, Y. et al. (1995) Biochem. Biophys. Res. Commun. 208, 664-674.
6 van Hal, P.T.W. et al. (1994) J. Immunol. 153, 2718-2728.
7 Giugni, T.D. et al. (1996) J. Infect. Dis. 173, 1062-1071.
8 Look, A.T. et al. (1989) J. Clin. Invest. 83, 1299-1307.
9 Olsen, J. et al. (1988) FEBS Lett. 238, 307-314.
10 Watt, V.M. and Yip, C.C. (1989) J. Biol. Chem. 264, 5480-5487.
11 Chen, H. et al. (1996) J. Immunol. 157, 2593-2600.
CD14
Molecular weights
Polypeptide         35 773
SDS-PAGE reduced    53-55 kDa
Carbohydrate
N-linked sites      4
O-linked            unknown
Human gene location and size
5q31; 1.5 kb 1
Tissue distribution In human and mouse, CD14 is predominantly expressed on cells of the myelomonocytic lineage including monocytes, macrophages and Langerhans cells 1-3. CD14 is expressed at lower levels on neutrophils, but can be induced with certain cytokines or fMLP 4,5. IFNγ or IL-13 treatment decreases monocyte expression. The antigen has also been detected, at low levels, on human B cells 6.
Structure CD14 is a GPI-linked glycoprotein 7. The extracellular region contains 10 repeats with some similarities to the leucine-rich glycoprotein (LRG) repeats 2. However, they show neither the regular size nor enough of the sequence patterns characteristic of LRG repeats to be included in this family (see Chapter 3). Soluble forms of CD14 are present in normal serum, in the urine of nephrotic patients and in the culture media of cells expressing CD14 7,8. The N-terminus of the mature protein has been determined by amino acid sequence analysis 1,7.
Ligands and associated molecules CD14 is a receptor for the complex of lipopolysaccharide (LPS) and the LPS-binding protein (LBP) 9.
Function CD14 may be involved in the clearance of Gram-negative pathogens opsonized with LBP. TNFα synthesis induced by LPS in monocytes and macrophages can be blocked by anti-CD14 mAbs 9. The interaction of CD14 with the LPS-LBP complex causes an increase in the adhesive activity of CR3 (CD11b/CD18) on neutrophils 5. Transgenic mice overexpressing human CD14 show increased susceptibility to endotoxin shock 10, whereas CD14-deficient mice are highly resistant to either live Gram-negative bacteria or LPS 11. CD14-deficient mice also show dramatically reduced levels of bacteraemia following in vivo challenge with E. coli, suggesting a role for CD14 in dissemination of Gram-negative bacteria.
Comments The CD14 gene contains a single intron after the initiation codon 1,12. CD14 maps within a chromosomal region containing other genes encoding growth factors (e.g. IL-3, IL-4, IL-5, IL-9, GM-CSF) and receptors (e.g. M-CSFR, PDGFR, α1- and β2-adrenergic receptors) 13. Deletions in this region are frequently found in myeloid leukaemias 13.
Database accession numbers
Species         Human; Mouse
PIR             A27637; S03605
SWISSPROT       P08571; P10810
EMBL/GENBANK    X06882; M34510
REFERENCE       12; 2
Amino acid sequence of human CD14
MERASCLLLL LLPLVHVSA                                     -1
TTPEPCELDD EDFRCVCNFS EPQPDWSEAF QCVSAVEVEI HAGGLNLEPF   50
LKRVDADADP RQYADTVKAL RVRRLTVGAA QVPAQLLVGA LRVLAYSRLK   100
ELTLEDLKIT GTMPPLPLEA TGLALSSLRL RNVSWATGRS WLAELQQWLK   150
PGLKVLSIAQ AHSPAFSYEQ VRAFPALTSL DLSDNPGLGE RGLMAALCPH   200
KFPAIQNLAL RNTGMETPTG VCAALAAAGV QPHSLDLSHN SLRATVNPSA   250
PRCMWSSALN SLNLSFAGLE QVPKGLPAKL RVLDLSCNRL NRAPQPDELP   300
EVDNLTLDGN PFLVPGTALP HEGSMNSGVV PAC                     333
ARSTLSVGVS GTLVLLQGAR GFA                                +23
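The figure of four N-linked sites quoted for CD14 can be cross-checked against the mature sequence printed in this entry. The sketch below assumes that potential N-linked sites are counted with the standard sequon rule Asn-Xaa-Ser/Thr, where Xaa is any residue except Pro; the names CD14, SEQUON and n_linked_sequons are illustrative and are not taken from the source.

```python
import re

# Mature human CD14 (residues 1-333) in 10-residue blocks, copied from this entry.
# The signal peptide and the C-terminal +23 segment are left out.
CD14 = "".join("""
TTPEPCELDD EDFRCVCNFS EPQPDWSEAF QCVSAVEVEI HAGGLNLEPF
LKRVDADADP RQYADTVKAL RVRRLTVGAA QVPAQLLVGA LRVLAYSRLK
ELTLEDLKIT GTMPPLPLEA TGLALSSLRL RNVSWATGRS WLAELQQWLK
PGLKVLSIAQ AHSPAFSYEQ VRAFPALTSL DLSDNPGLGE RGLMAALCPH
KFPAIQNLAL RNTGMETPTG VCAALAAAGV QPHSLDLSHN SLRATVNPSA
PRCMWSSALN SLNLSFAGLE QVPKGLPAKL RVLDLSCNRL NRAPQPDELP
EVDNLTLDGN PFLVPGTALP HEGSMNSGVV PAC
""".split())

# Asn-Xaa-Ser/Thr with Xaa != Pro. The lookahead keeps matches from
# consuming residues, so overlapping sequons would also be reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def n_linked_sequons(seq):
    """Return the 1-based positions of Asn residues that start a sequon."""
    return [m.start() + 1 for m in SEQUON.finditer(seq)]


print(len(CD14), n_linked_sequons(CD14))
```

The scan finds sequons at Asn18, Asn132, Asn263 and Asn304, which agrees with the four sites listed. A sequon marks only a potential site; whether it is glycosylated depends on its location and on the cell.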
References
1 Goyert, S.M. et al. (1988) Science 239, 497-500.
2 Ferrero, E. et al. (1990) J. Immunol. 145, 331-336.
3 Gadd, S. (1989) Leucocyte Typing IV, 787-789.
4 Goyert, S.M. et al. (1989) Leucocyte Typing IV, 789-794.
5 Wright, S.D. et al. (1991) J. Exp. Med. 173, 1281-1286.
6 Labeta, M.O. et al. (1991) Mol. Immunol. 28, 115-122.
7 Haziot, A. et al. (1988) J. Immunol. 141, 547-552.
8 Bazil, V. and Strominger, J.L. (1991) J. Immunol. 147, 1567-1574.
9 Wright, S.D. et al. (1990) Science 249, 1431-1433.
10 Ferrero, E. et al. (1993) Proc. Natl Acad. Sci. USA 90, 2380-2384.
11 Haziot, A. et al. (1996) Immunity 4, 407-414.
12 Ferrero, E. and Goyert, S.M. (1988) Nucleic Acids Res. 16, 4173.
13 Le Beau, M.M. et al. (1993) Proc. Natl Acad. Sci. USA 90, 5484-5488.
CD15
Lewis x (Lex), 3-fucosyl-N-acetyl-lactosamine (3-FAL)
Tissue distribution CD15 is expressed on neutrophils, eosinophils and monocytes, but not platelets, lymphocytes or erythrocytes. It is also present in embryonic tissues and adenocarcinomas, myeloid leukaemias and Reed-Sternberg cells 1,2.
Structure CD15 antibodies recognize the terminal trisaccharide structure Galβ1→4[Fucα1→3]GlcNAc, which is also referred to as the Lewis x (Lex) antigen. This structure is found on a variety of glycoproteins and glycolipids at the cell surface 3-5. For example, CD15 is carried by the CD11/CD18 and CD66 glycoproteins 1,6. The majority of the CD15 antibodies are IgM, and they do not crossreact with the sialylated form of CD15 (CD15s, sLex) 2.
Function CD15 antibodies have been shown to affect various cell activities. However, it is difficult to distinguish between effects on the CD15 structure itself and effects mediated by proteins which happen to carry the CD15 epitope 2,7. CD15 antibodies can mediate complement activation and may have potential therapeutic value in the killing of CD15-expressing tumour cells 2.
References
1 Stocks, S.C. et al. (1990) Biochem. J. 268, 275-280.
2 Ball, E.D. (1995) Leucocyte Typing V, 790-794.
3 Huang, L.C. et al. (1983) Blood 61, 1020-1023.
4 Buescher, E.F. et al. (1984) Leucocyte Typing IV, 807-811.
5 Spooncer, E. et al. (1984) J. Biol. Chem. 259, 4792-4801.
6 Skubitz, K.M. and Snook, R.W. (1987) J. Immunol. 139, 1631-1639.
7 Forsyth, K.D. et al. (1989) Eur. J. Immunol. 19, 1331-1334.
CD15s
Sialyl Lewis x (sLex)
Tissue distribution CD15s is expressed on neutrophils, basophils and monocytes, but its detection on lymphocytes varies with the antibody used 1,2. CD15s is also present on high endothelial venules and subcapsular sinus cells in lymph nodes 3.
Structure The CD15s antigen is the sialylated form of CD15 with the structure NeuAcα2→3Galβ1→4[Fucα1→3]GlcNAcβ. It is found at the non-reducing termini of N-linked or O-linked oligosaccharides on glycoproteins as well as on glycosphingolipids 4-6. CD15s is not synthesized by the direct sialylation of CD15 but is instead synthesized by fucosylation of NeuAcα2→3Galβ1→4GlcNAcβ-R by the fucosyl transferase VII (FucT-VII). FucT-VII is distinct from FucT-IV, which fucosylates Galβ1→4GlcNAcβ-R to yield CD15 7.
Ligands and associated molecules The selectins (CD62E, CD62P and CD62L) bind CD15s. However, although CD15s is carried by many glycoproteins, selectins appear to bind preferentially to a limited number of cell surface glycoproteins (see CD62E, CD62L and CD62P), suggesting either that recognition depends on the protein backbone or that these glycoproteins carry rare carbohydrate structures related to CD15s which are better ligands than CD15s itself 4-6,8.
Function The importance of CD15s in health is illustrated by the disease leucocyte adhesion deficiency type II 9, in which a defect in fucosylation results in decreased levels of CD15s and CD15. Selectin-mediated cell adhesion is impaired in these patients 9,10. Mice deficient in the enzyme FucT-VII exhibit leucocytosis, impaired leucocyte extravasation into inflamed tissues, and defective lymphocyte homing 7.
References
1 Kannagi, R. and Magnani, J.L. (1995) Leucocyte Typing V, 1529-1531.
2 Bochner, B.S. et al. (1996) J. Immunol. 157, 844-850.
3 Magnani, J.L. (1995) Leucocyte Typing V, 1524-1529.
4 Feizi, T. (1993) Curr. Opin. Struct. Biol. 3, 701-710.
5 Varki, A. (1994) Proc. Natl Acad. Sci. USA 91, 7390-7397.
6 Sears, P. and Wong, C.H. (1996) Proc. Natl Acad. Sci. USA 93, 12086-12093.
7 Maly, P. et al. (1996) Cell 86, 643-653.
8 Tu, L. et al. (1996) J. Immunol. 157, 3995-4004.
9 Etzioni, A. et al. (1992) New Engl. J. Med. 327, 301-314.
10 Phillips, M.L. et al. (1995) J. Clin. Invest. 2898-2906.
CD16
FcγRIII
Molecular weights
Polypeptide, TM form               27 268
Polypeptide, GPI-linked form       21 090
SDS-PAGE reduced                   50-80 kDa
Carbohydrate
N-linked sites, TM form            5
N-linked sites, GPI-linked form    6
O-linked                           unknown
Human gene location and size
Transmembrane form: 1q23; 9 kb 1,2
GPI-linked form: 1q23; 9 kb 1,2
[Figure: domain and exon-boundary diagrams of the TM and GPI-linked forms]
Tissue distribution In humans the transmembrane (TM) form of CD16 is expressed on NK cells, macrophages and mast cells, whilst the GPI-linked form is expressed on neutrophils. In the mouse no GPI-linked form of CD16 has been identified and the transmembrane form is expressed on macrophages, NK cells, neutrophils, myeloid precursors and the majority of early CD4-CD8-TCR- fetal thymocytes 3,4.
Structure There are two distinct forms of human CD16 encoded by two linked genes: a transmembrane form with a 25 amino acid cytoplasmic tail and a glycosylphosphatidylinositol (GPI)-linked form 1,2,5. Their extracellular sequences differ by only six amino acids and site-directed mutagenesis has shown that amino acid Ser186 (mature protein numbering) determines the attachment of the GPI anchor &6. The extracellular region of CD16 comprises two C2-set IgSF domains 5.
Ligands and associated molecules The transmembrane form is non-covalently associated with the FcεRI γ chain or the TCR ζ chain 6,7. CD16 on mast cells is also associated with the β chain of the FcεRI 8.
Function CD16 is a low-affinity receptor for aggregated IgG 3. The transmembrane form binds IgG complexed to antigens and mediates phagocytosis and antibody-dependent cellular cytotoxicity. On NK cells, signal transduction by CD16 is mediated through the γ chain, and crosslinking of CD16 with immune complexes or CD16-specific mAbs induces calcium mobilization and hydrolysis of membrane phosphoinositides 9. In contrast, the GPI-linked form on neutrophils binds to ligands but is unable to induce any signal or functional effect 3. Targeted disruption of the mouse FcR γ chain common to CD16 and FcεRI results in immunocompromised mice 10.
Comments CD16 (FcγRIII) has structural similarity to CD64 (FcγRI), CD32 (FcγRII) and FcεRIα 3. The sequence LFAVDTGL is completely conserved in the transmembrane regions of CD16 and FcεRIα from human, mouse and rat 11. A family of CD16 isoforms has been described in the rat 11. Patients with paroxysmal nocturnal haemoglobinuria lack the GPI-linked form of CD16, and other GPI-linked cell surface molecules. This disease is characterized by the presence of circulating immune complexes and a susceptibility to bacterial infections 12.
Database accession numbers
Species         Human; Mouse; Rat
PIR             JL0107; S29360
SWISSPROT       P08637; P08508; P27645
EMBL/GENBANK    X16863, X52645; M14215; M64368-M64370
REFERENCE       1; 13; 11
Amino acid sequence of human CD16 TM form
MWQLLLPTAL LLLVSAG                                       -1
MRTEDLPKAV VFLEPQWYRV LEKDSVTLKC QGAYSPEDNS TQWFHNESLI   50
SSQASSYFID AATVDDSGEY RCQTNLSTLS DPVQLEVHIG WLLLQAPRWV   100
FKEEDPIHLR CHSWKNTALH KVTYLQNGKG RKYFHHNSDF YIPKATLKDS   150
GSYFCRGLFG SKNVSSETVN ITITQGLAVS TISSFFPPGY QVSFCLVMVL   200
LFAVDTGLYF SVKTNIRSST RDWKDHKFKW RKDPQDK                 237
Amino acid sequence of human CD16 GPI-linked form
MWQLLLPTAL LLLVSAG                                       -1
MRTEDLPKAV VFLEPQWYSV LEKDSVTLKC QGAYSPEDNS TQWFHNESLI   50
SSQASSYFID AATVNDSGEY RCQTNLSTLS DPVQLEVHIG WLLLQAPRWV   100
FKEEDPIHLR CHSWKNTALH KVTYLQNGKD RKYFHHNSDF HIPKATLKDS   150
GSYFCRGLVG SKNVSSETVN ITITQGLAVS TISSFS                  186
PPGYQVSFCL VMVLLFAVDT GLYFSVKTNI                         +30
There are two alleles of the GPI-linked form of CD16 3. The sequence shown above represents the product of the NA-2 allele 1.
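The statement that the two extracellular sequences differ at only six positions can be checked directly against the sequences printed in this entry. The sketch below compares residues 1-186 of the mature TM form with the mature GPI-linked form (NA-2 allele, as printed); the names TM_FORM, GPI_FORM and substitutions are illustrative and do not come from the source.

```python
# Residues 1-186 of the mature proteins in 10-residue blocks, copied from this entry.
TM_FORM = "".join("""
MRTEDLPKAV VFLEPQWYRV LEKDSVTLKC QGAYSPEDNS TQWFHNESLI
SSQASSYFID AATVDDSGEY RCQTNLSTLS DPVQLEVHIG WLLLQAPRWV
FKEEDPIHLR CHSWKNTALH KVTYLQNGKG RKYFHHNSDF YIPKATLKDS
GSYFCRGLFG SKNVSSETVN ITITQGLAVS TISSFF
""".split())

GPI_FORM = "".join("""
MRTEDLPKAV VFLEPQWYSV LEKDSVTLKC QGAYSPEDNS TQWFHNESLI
SSQASSYFID AATVNDSGEY RCQTNLSTLS DPVQLEVHIG WLLLQAPRWV
FKEEDPIHLR CHSWKNTALH KVTYLQNGKD RKYFHHNSDF HIPKATLKDS
GSYFCRGLVG SKNVSSETVN ITITQGLAVS TISSFS
""".split())


def substitutions(a, b):
    """Return (position, residue in a, residue in b) wherever the two differ."""
    return [(i, x, y) for i, (x, y) in enumerate(zip(a, b), start=1) if x != y]


for position, tm_residue, gpi_residue in substitutions(TM_FORM, GPI_FORM):
    print(position, tm_residue, gpi_residue)
```

With the sequences as printed, the comparison lists six substitutions, at positions 19, 65, 130, 141, 159 and 186, the last being the Phe/Ser difference at residue 186 discussed under Structure. Because the GPI-linked sequence is that of one allele, a comparison against the other allele may give a different list.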
References
1 Ravetch, J.V. and Perussia, B. (1989) J. Exp. Med. 170, 481-497.
2 Qiu, W.Q. et al. (1990) Science 248, 732-735.
3 Ravetch, J.V. and Kinet, J.-P. (1991) Annu. Rev. Immunol. 9, 457-492.
4 Rodewald, H.-R. et al. (1992) Cell 69, 139-150.
5 Simmons, D. and Seed, B. (1988) Nature 333, 568-570.
6 Hibbs, M.L. et al. (1989) Science 246, 1608-1611.
7 Lanier, L.L. et al. (1989) Nature 342, 803-806.
8 Kurosaki, T. et al. (1992) J. Exp. Med. 175, 447-451.
9 Wirthmueller, U. et al. (1992) J. Exp. Med. 175, 1381-1390.
10 Takai, T. et al. (1994) Cell 76, 519-529.
11 Farber, D.L. and Sears, D.W. (1991) J. Immunol. 146, 4352-4361.
12 Selvaraj, P. et al. (1988) Nature 333, 565-567.
13 Ravetch, J.V. et al. (1986) Science 234, 718-725.
CDw17
Lactosylceramide (LacCer)
Tissue distribution CDw17 is expressed on human neutrophils, basophils, monocytes and platelets 1,2. It is also found on post-proliferative granulocytes in the bone marrow 2. Between 40 and 80% of CD19+ peripheral B lymphocytes bind CDw17 mAbs, whereas other lymphocytes are negative 2. CDw17 is also expressed on tonsillar CD45+ dendritic cells, epithelial cells 2, and on endothelial cells within the intestinal epithelium 3.
Structure CDw17 antibodies recognize the lactosyldisaccharide group (LacCer or Galβ1→4Glcβ1→1Cer) of the glycosphingolipid lactosylceramide 4,5. The antigen is not known to be associated with glycoproteins 1,2.
Ligands and associated molecules The GM3 ganglioside on tumour cell lines has been shown to bind to CDw17-coated surfaces 6.
Function CDw17 is the most abundant glycosphingolipid present on neutrophils 7. The majority of LacCer on neutrophils is contained in intracellular granules, where it has been proposed to participate in the exocytosis or packaging of granule contents 1,8. Surface expression of CDw17 is markedly decreased after activation with a number of stimuli. Downregulation is associated with membrane internalization and granule exocytosis, but not with superoxide production 1. Treatment of neutrophils with CDw17 mAbs induces a moderate release of calcium ions into the cytoplasm and stimulates a strong oxidative burst, but has little effect on CD11b and CD67 surface expression 9. However, upregulation of CDw17 expression is associated with the activation of platelets 10.
References
1 Symington, F.W. (1989) J. Immunol. 142, 2784-2790.
2 Thompson, J.S. and Lund-Johansen, F. (1995) Leucocyte Typing V, 822-823.
3 Karlsson, K. (1989) Annu. Rev. Biochem. 58, 309-350.
4 Symington, F.W. et al. (1984) J. Biol. Chem. 259, 6008-6012.
5 Kniep, B. et al. (1989) Leucocyte Typing IV, 877-879.
6 Kojima, N. and Hakomori, S. (1991) J. Biol. Chem. 266, 17552-17558.
7 Symington, F.W. et al. (1985) J. Immunol. 134, 2498-2506.
8 Symington, F.W. et al. (1987) J. Biol. Chem. 262, 11356-11363.
9 Lund-Johansen, F. et al. (1992) J. Immunol. 148, 3221-3229.
10 Michelson, A.D. et al. (1995) Leucocyte Typing V, 1207-1210.
CD18
Integrin β2 subunit
Molecular weights
Polypeptide             82 573
SDS-PAGE reduced        95 kDa
SDS-PAGE non-reduced    90 kDa
Carbohydrate
N-linked sites          6
O-linked sites          unknown
Human gene location and size
21q22.3; ~40 kb 1
[Figure: CD11b/CD18]
Tissue distribution CD18 is expressed on all leucocytes 2-5. See entries for CD11 antigens and the integrin αD subunit for details of the expression of different CD18 (β2 integrin) complexes.
Structure CD18 (integrin β2 subunit) combines with the CD11a-c (αL, αM and αX) subunits and the αD integrin subunit to form the integrins LFA-1 (αLβ2, CD11a/CD18), Mac-1 (αMβ2, CD11b/CD18), p150,95 (αXβ2, CD11c/CD18) and αDβ2. The N-terminus of CD18 is blocked 6,7.
Function CD18 may be important in the regulation of ligand binding activities of various CD11/CD18 complexes. CD18 has been shown to interact with several cytoskeletal proteins including α-actinin 8 and filamin (ABP-280) 9, and the cytoplasmic regulatory molecule cytohesin-1 10. There are also a number of antibodies that induce high-affinity ligand binding by the CD11/CD18 antigens 11,12. Leucocyte adhesion deficiency results from defects in the CD18 gene, leading to diminished expression of all CD18-containing integrins. The deficiency is very heterogeneous with varying degrees of clinical severity 13,14.
Database accession numbers
Species         Human; Mouse
PIR             A25967; S04847
SWISSPROT       P05107; P11835
EMBL/GENBANK    M15395, X64072; X14951
REFERENCE       6, 7; 15
Amino acid sequence of human CD18
MLGLRPPLLA LVGLLSLGCV LS                                 -1
QECTKFKVSS CRECIESGPG CTWCQKLNFT GPGDPDSIRC DTRPQLLMRG   50
CAADDIMDPT SLAETQEDHN GGQKQLSPQK VTLYLRPGQA AAFNVTFRRA   100
KGYPIDLYYL MDLSYSMLDD LRNVKKLGGD LLRALNEITE SGRIGFGSFV   150
DKTVLPFVNT HPDKLRNPCP NKEKECQPPF AFRHVLKLTN NSNQFQTEVG   200
KQLISGNLDA PEGGLDAMMQ VAACPEEIGW RNVTRLLVFA TDDGFHFAGD   250
GKLGAILTPN DGRCHLEDNL YKRSNEFDYP SVGQLAHKLA ENNIQPIFAV   300
TSRMVKTYEK LTEIIPKSAV GELSEDSSNV VHLIKNAYNK LSSRVFLDHN   350
ALPDTLKVTY DSFCSNGVTH RNQPRGDCDG VQINVPITFQ VKVTATECIQ   400
EQSFVIRALG FTDIVTVQVL PQCECRCRDQ SRDRSLCHGK GFLECGICRC   450
DTGYIGKNCE CQTQGRSSQE LEGSCRKDNN SIICSGLGDC VCGQCLCHTS   500
DVPGKLIYGQ YCECDTINCE RYNGQVCGGP GRGLCFCGKC RCHPGFEGSA   550
CQCERTTEGC LNPRRVECSG RGRCRCNVCE CHSGYQLPLC QECPGCPSPC   600
GKYISCAECL KFEKGPFGKN CSAACPGLQL SNNPVKGRTC KERDSEGCWV   650
AYTLEQQDGM DRYLIYVDES RECVAGPNIA AIVGGTVAGI VLIGILLLVI   700
WKALIHLSDL REYRRFEKEK LKSQWNNDNP LFKSATTTVM NPKFAES      747
References
1 Weitzman, J.B. et al. (1991) FEBS Lett. 294, 97-103.
2 Pigott, R. and Power, C. (1993) The Adhesion Molecule FactsBook. Academic Press, London.
3 Larson, R.S. and Springer, T.A. (1990) Immunol. Rev. 114, 181-217.
4 Patarroyo, M. et al. (1990) Immunol. Rev. 114, 67-108.
5 Rieu, P. and Arnaout, M.A. (1995) In Adhesion Molecules and the Lung (Ward, P. and Lenfart, C., eds), vol. 89, pp. 1-42. Marcel Dekker, New York.
6 Kishimoto, T.K. et al. (1987) Cell 48, 681-690.
7 Law, S.K.A. et al. (1987) EMBO J. 6, 915-919.
8 Pavalko, F.M. and LaRoche, S.M. (1993) J. Immunol. 151, 3795-3807.
9 Sharma, C.P. et al. (1995) J. Immunol. 154, 3461-3470.
10 Kolanus, W. et al. (1996) Cell 86, 233-242.
11 Ortlepp, S. et al. (1995) Eur. J. Immunol. 25, 637-643.
12 Petruzzelli, L. et al. (1995) J. Immunol. 155, 854-866.
13 Kishimoto, T.K. et al. (1987) Cell 50, 193-202.
14 Arnaout, M.A. (1990) Immunol. Rev. 114, 145-180.
15 Wilson, R.W. et al. (1989) Nucleic Acids Res. 17, 5397.
CD19
B4, Leu-12
Molecular weights
Polypeptide         59 154
SDS-PAGE reduced    95 kDa
Carbohydrate
N-linked sites      5
O-linked            unknown
Human gene location and size
16p11.2; 8 kb 1
[Figure: domain and exon-boundary diagram, annotated "Encoded by 9 exons"]
Tissue distribution CD19 is expressed on B lineage cells with the exception of plasma cells. CD19 is also present on follicular dendritic cells 2.
Structure The extracellular region of CD19 consists of two C2-set IgSF domains separated by a region of 67 residues with no significant sequence similarity to any known protein 3. The large cytoplasmic domain is highly conserved between species 4 and contains several potential phosphorylation sites on Ser/Thr and Tyr residues 3. Phosphorylation of tyrosine residues in the YEXM motifs creates binding sites for the SH2 domains of phosphatidylinositol 3-kinase and non-receptor protein tyrosine kinases 5.
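The YEXM motifs mentioned above can be located in the C-terminal part of the human CD19 sequence printed in this entry. The sketch below assumes the motif is encoded as the regular expression YE.M and scans only residues 451-540 of the mature protein, so it makes no claim about the rest of the cytoplasmic region; the names CD19_C_TERMINUS, FIRST_RESIDUE and find_yexm are illustrative and are not taken from the source.

```python
import re

# Residues 451-540 of mature human CD19 in 10-residue blocks, copied from this entry.
CD19_C_TERMINUS = "".join("""
ELTQPVARTM DFLSPHGSAW DPSREATSLG SQSYEDMRGI LYAAPQLHSI
RGQPGPNHEE DADSYENMDN PDGPDPAWGG GGRMGTWSTR
""".split())
FIRST_RESIDUE = 451  # mature-protein position of the first residue above

# Tyr-Glu-Xaa-Met written as a regular expression (assumed encoding of "YEXM").
YEXM = re.compile(r"YE.M")


def find_yexm(seq, start=1):
    """Return (position of the Tyr, matched residues) for each YEXM motif."""
    return [(m.start() + start, m.group()) for m in YEXM.finditer(seq)]


print(find_yexm(CD19_C_TERMINUS, FIRST_RESIDUE))
```

In this segment the search reports two motifs, YEDM at Tyr484 and YENM at Tyr515 (mature protein numbering as used in the printed sequence). Whether a given tyrosine is phosphorylated cannot be inferred from the sequence alone.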
Ligands and associated molecules CD19 is a component of the CD19/CD21/CD81/leu-13 signalling complex (reviewed in refs 2, 5 and 6). The CD19/CD21 interaction has a 1:1 stoichiometry and is mediated by both the extracellular and transmembrane regions of CD19, whereas the CD19/CD81 interaction involves the CD19 extracellular region only 5. CD19 links this complex to cytoplasmic signal transduction pathways. The extensive cytoplasmic region of CD19 can associate with phosphatidylinositol 3-kinase, Vav and the Src family protein tyrosine kinases Lyn and Fyn (reviewed in refs 6 and 7).
Function The CD19/CD21/CD81/leu-13 signalling complex modulates the threshold for the B cell antigen receptor 6,7. The functions of CD81 and leu-13, the expression of which is not restricted to B cells, are not clear. However, CD21 (complement receptor 2) binds fragments of C3 that have been covalently attached to glycoconjugates by complement activation. This enables CD19, plus associated intracellular signalling molecules, to be crosslinked to the B cell antigen receptor after preimmune recognition of an immunogen by the complement system, thus reducing the number of B cell receptor molecules which must be ligated to enable B cell activation 6,7. This mechanism may be particularly important for the B cell during the primary immune response, prior to affinity maturation, when the low-affinity B cell antigen receptor must respond to low concentrations of antigen. This coreceptor role for CD19 is supported by data from CD19 knockout and transgenic mice 6,7.
Database accession numbers
Species         Human; Mouse
PIR             JL0074; B45808
SWISSPROT       P15391; P25918
EMBL/GENBANK    M28170; M62542
REFERENCE       3; 4
Amino acid sequence of human CD19
MPPPRLLFFL LFLTPM                                        -1
EVRPEEPLVV KVEEGDNAVL QCLKGTSDGP TQQLTWSRES PLKPFLKLSL   50
GLPGLGIHMR PLASWLFIFN VSQQMGGFYL CQPGPPSEKA WQPGWTVNVE   100
GSGELFRWNV SDLGGLGCGL KNRSSEGPSS PSGKLMSPKL YVWAKDRPEI   150
WEGEPPCVPP RDSLNQSLSQ DLTMAPGSTL WLSCGVPPDS VSRGPLSWTH   200
VHPKGPKSLL SLELKDDRPA RDMWVMETGL LLPRATAQDA GKYYCHRGNL   250
TMSFHLEITA RPVLWHWLLR TGGWKVSAVT LAYLIFCLCS LVGILHLQRA   300
LVLRRKRKRM TDPTRRFFKV TPPPGSGPQN QYGNVLSLPT PTSGLGRAQR   350
WAAGLGGTAP SYGNPSSDVQ ADGALGSRSP PGVGPEEEEG EGYEEPDSEE   400
DSEFYENDSN LGQDQLSQDG SGYENPEDEP LGPEDEDSFS NAESYENEDE   450
ELTQPVARTM DFLSPHGSAW DPSREATSLG SQSYEDMRGI LYAAPQLHSI   500
RGQPGPNHEE DADSYENMDN PDGPDPAWGG GGRMGTWSTR              540
References
1 Zhou, L.-J. et al. (1992) Immunogenetics 35, 102-111.
2 Tedder, T.F. et al. (1994) Immunol. Today 15, 437-442.
3 Tedder, T.F. and Isaacs, C.M. (1989) J. Immunol. 143, 712-717.
4 Zhou, L.-J. et al. (1991) J. Immunol. 147, 1424-1432.
5 Fearon, D.T. and Carter, R.H. (1995) Annu. Rev. Immunol. 13, 127-149.
6 Doody, G.M. et al. (1996) Curr. Opin. Immunol. 8, 378-382.
7 DeFranco, A.L. (1996) Curr. Biol. 6, 548-550.
CD20
B1, Bp35, Ly-44 (mouse)
Molecular weights
Polypeptide         33 078
SDS-PAGE reduced    33-37 kDa
Carbohydrate
N-linked sites      nil
O-linked            nil
Human gene location and size
11q13; 16 kb 1
Tissue distribution CD20 is expressed only on B lineage cells but is absent from plasma cells (reviewed in ref. 2).
Structure CD20 is a member of the CD20/FcεRIβ superfamily of leucocyte surface antigens which also includes the β subunit of the high-affinity receptor for IgE (FcεRIβ) and HTm4. These molecules are predicted to have four transmembrane regions, cytoplasmic N- and C-termini, and short extracellular loops 2,3. The CD20/FcεRIβ superfamily shares no sequence similarity with another superfamily of four transmembrane molecules, the TM4SF. The gene for CD20 maps to the same region of the genome as FcεRIβ and HTm4 4. The cytoplasmic regions of CD20 are serine/threonine rich and contain multiple phosphorylation consensus sequences. Differential phosphorylation is responsible for the three forms of CD20 (33, 35 and 37 kDa), with activated B cells showing a relative increase in the phosphorylated 35 and 37 kDa forms 2.
Ligands and associated molecules CD20 can exist in a multimolecular complex that includes the Src family tyrosine kinases Lyn, Fyn and Lck. This association may not be direct, since it is unaffected by deletion of a large proportion of the CD20 cytoplasmic regions 5. This is consistent with flow cytometric energy transfer analyses which show that CD20 can exist in a complex with MHC Class I and II, and the TM4SF molecules CD53, CD81 and CD82 6. No extracellular ligand for CD20 has been identified.
Function Indirect evidence suggests that CD20 functions as a B cell Ca2+ channel subunit, since the expression of CD20 in disparate cell types generates a qualitatively similar channel activity to that found endogenously in B cells 7. An ion channel function is consistent with reports that CD20 regulates cell cycle progression 8 and exists on the cell surface as a homo-oligomer 7.
Database accession numbers
Species         Human; Mouse
PIR             A27400; A30558
SWISSPROT       P11836; P19437
EMBL/GENBANK    X12530; M62541
REFERENCE       4; 9
Amino acid sequence of human CD20
MTTPRNSVNG TFPAEPMKGP IAMQSGPKPL FRRMSSLVGP TQSFFMRESK   50
TLGAVQIMNG LFHIALGGLL MIPAGIYAPI CVTVWYPLWG GIMYIISGSL   100
LAATEKNSRK CLVKGKMIMN SLSLFAAISG MILSIMDILN IKISHFLKME   150
SLNFIRAHTP YINIYNCEPA NPSEKNSPST QYCYSIQSLF LGILSVMLIF   200
AFFQELVIAG IVENEWKRTC SRPKSNIVLL SAEEKKEQTI EIKEEVVGLT   250
ETSSQPKNEE DIEIIPIQEE EEEETETNFP EPPQDQESSP IENDSSP      297
References
1 Tedder, T.F. et al. (1989) J. Immunol. 142, 2560-2568.
2 Tedder, T.F. and Engel, P. (1994) Immunol. Today 15, 450-454.
3 Adra, C.N. et al. (1994) Proc. Natl Acad. Sci. USA 91, 10178-10182.
4 Tedder, T.F. et al. (1988) J. Immunol. 141, 4388-4394.
5 Deans, J.P. et al. (1995) J. Biol. Chem. 270, 22632-22638.
6 Szöllösi, J. et al. (1996) J. Immunol. 157, 2939-2946.
7 Bubien, J.K. et al. (1993) J. Cell Biol. 121, 1121-1132.
8 Kanzaki, M. et al. (1995) J. Biol. Chem. 270, 13099-13104.
9 Tedder, T.F. et al. (1988) Proc. Natl Acad. Sci. USA 85, 208-212.
CD21
CR2, EBV-receptor, C3d-receptor
Other names
Complement receptor type 2 (CR2)
C3d receptor
Epstein-Barr virus (EBV) receptor
Molecular weights
Polypeptide, 16 CCP form       117 126
Polypeptide, 15 CCP form       110 929
SDS-PAGE reduced               145 kDa
Carbohydrate
N-linked sites, 16 CCP form    12
N-linked sites, 15 CCP form    11
O-linked                       unknown
Human gene location and size
1q32; ~30 kb 1,2
[Figure: domain and exon-boundary diagram of the CCP domains]
Tissue distribution CD21 is expressed on mature B cells, follicular dendritic cells, and pharyngeal and cervical epithelial cells 3. It is also expressed on fetal astrocytes 4.
Structure The extracellular region consists of 15 or 16 complement control protein (CCP) domains, organized into four groups with a high degree of sequence identity between them 5,6. The 11th CCP domain is absent in the 15 CCP domain isoform 3,5,6. The CD21 gene is a member of the regulation of complement activation (RCA) gene cluster that encodes a family of C3/C4 binding proteins (see CD35 and ref. 1). The cytoplasmic domain contains potential protein kinase C and tyrosine kinase phosphorylation sites 3,5,6.
Ligands and associated molecules CD21 is a receptor for the C3 activation fragments iC3b and C3d 3. It is also the receptor used by the Epstein-Barr virus to infect B lymphocytes 3,7. CD21 has been shown to bind CD23 8 and interferon α 9. CD21 is one subunit in a multimeric complex on B cells which includes CD19 and CD81 10. CD21 is also associated with CD35 on B cells 11.
Function When covalent antigen-complement complexes bind to the B cell antigen receptor (BCR, surface Ig), the simultaneous interaction of CD21 with the complement component C3d enhances signalling through the BCR. The CD21 signal, which is transduced through the associated molecules CD19 and CD81, very effectively lowers the activation threshold of the BCR 10. Thus C3d may be considered as a molecular adjuvant 12 that directs the acquired immune response towards antigens recognized by the innate immune response 13. Mice made defective in CD21 have an impaired immune response to T-dependent antigens 14,15. The CD21/CD23 interaction may have a regulatory role in IgE production 8.

Database accession numbers
         PIR      SWISSPROT   EMBL/GENBANK     REFERENCE
Human    PL0009   P20023      M26004/M26016    2,5,6
Mouse    A43526   P19070      M35684           16,17
Amino acid sequence of human CD21
MGAAGLLGVF LALVAPGVLG                                      -1
ISCGSPPPIL NGRISYYSTP IAVGTVIRYS CSGTFRLIGE KSLLCITKDK     50
VDGTWDKPAP KCEYFNKYSS CPEPIVPGGY KIRGSTPYRH GDSVTFACKT    100
NFSMNGNKSV WCQANNMWGP TRLPTCVSVF PLECPALPMI HNGHHTSENV    150
GSIAPGLSVT YSCESGYLLV GEKIINCLSS GKWSAVPPTC EEARCKSLGR    200
FPNGKVKEPP ILRVGVTANF FCDEGYRLQG PPSSRCVIAG QGVAWTKMPV    250
CEEIFCPSPP PILNGRHIGN SLANVSYGSI VTYTCDPDPE EGVNFILIGE    300
STLRCTVDSQ KTGTWSGPAP RCELSTSAVQ CPHPQILRGR MVSGQKDRYT    350
YNDTVIFACM FGFTLKGSKQ IRCNAQGTWE PSAPVCEKEC QAPPNILNGQ    400
KEDRHMVRFD PGTSIKYSCN PGYVLVGEES IQCTSEGVWT PPVPQCKVAA    450
CEATGRQLLT KPQHQFVRPD VNSSCGEGYK LSGSVYQECQ GTIPWFMEIR    500
LCKEITCPPP PVIYNGAHTG SSLEDFPYGT TVTYTCNPGP ERGVEFSLIG    550
ESTIRCTSND QERGTWSGPA PLCKLSLLAV QCSHVHIANG YKISGKEAPY    600
FYNDTVTFKC YSGFTLKGSS QIRCKRDNTW DPEIPVCEKG CQPPPGLHHG    650
RHTGGNTVFF VSGMTVDYTC DPGYLLVGNK SIHCMPSGNW SPSAPRCEET    700
CQHVRQSLQE LPAGSRVELV NTSCQDGYQL TGHAYQMCQD AENGIWFKKI    750
PLCKVIHCHP PPVIVNGKHT GMMAENFLYG NEVSYECDQG FYLLGEKNCS    800
AEVILKAWIL ERAFPQCLRS LCPNPEVKHG YKLNKTHSAY SHNDIVYVDC    850
NPGFIMNGSR VIRCHTDNTW VPGVPTCIKK AFIGCPPPPK TPNGNHTGGN    900
IARFSPGMSI LYSCDQGYLV VGEPLLLCTH EGTWSQPAPH CKEVNCSSPA    950
DMDGIQKGLE PRKMYQYGAV VTLECEDGYM LEGSPQSQCQ SDHQWNPPLA   1000
VCRSRSLAPV LCGIAAGLIL LTFLIVITLY VISKHRERNY YTDTSQKEAF   1050
HLEAREVYSV DPYNPAS                                       1067

The 11th CCP domain (shown in bold in the printed sequence) is found only in the 16 CCP form.
References
1 Hourcade, D. et al. (1992) Genomics 12, 289-300.
2 Fujisaku, A. et al. (1990) J. Biol. Chem. 264, 2118-2125.
3 Ahearn, J.M. and Fearon, D.T. (1989) Adv. Immunol. 46, 183-219.
4 Gasque, P. et al. (1996) J. Immunol. 156, 2247-2255.
5 Weis, J.J. et al. (1989) J. Exp. Med. 167, 1047-1066.
6 Moore, M.D. et al. (1987) Proc. Natl Acad. Sci. USA 84, 9194-9198.
7 Tanner, J. et al. (1987) Cell 50, 203-213.
8 Aubry, J.P. et al. (1992) Nature 358, 505-507.
9 Delcayre, A.X. et al. (1991) EMBO J. 10, 919-926.
10 Doody, G.M. et al. (1996) Curr. Opin. Immunol. 8, 378-382.
11 Tuveson, D.A. et al. (1991) J. Exp. Med. 173, 1083-1089.
12 Dempsey, P.W. et al. (1996) Science 271, 348-350.
13 Fearon, D.T. and Locksley, R.M. (1996) Science 272, 50-54.
14 Molina, H. et al. (1996) Proc. Natl Acad. Sci. USA 93, 3357-3361.
15 Ahearn, J.M. et al. (1996) Immunity 4, 251-262.
16 Fingeroth, J.D. et al. (1989) Proc. Natl Acad. Sci. USA 86, 242-246.
17 Molina, H. et al. (1990) J. Immunol. 145, 2974-2983.
BL-CAM, Leu-14, Lyb-8
Molecular weights
Polypeptide            α/β   70 991/93 241
SDS-PAGE unreduced     α/β   120/130 kDa
         reduced       α/β   130/140 kDa

Carbohydrate
N-linked sites         α/β   10/11
O-linked               nil

Human gene location
19q13.1; 22 kb 1

Domains
[Figure: domain organization and exon boundaries of CD22]
Tissue distribution CD22 is detected in the cytoplasm early in B cell development (late pro-B cell stage), appears on the cell surface simultaneously with surface IgD, and is found on most mature B cells, where expression is closely correlated with surface IgD 2. Expression is lost with terminal differentiation of B cells and is absent on plasma cells. Activation of B cells via surface Ig increases CD22 expression.
Structure CD22 is a member of a structurally related group of IgSF domain-containing sialic acid binding proteins called the sialoadhesin family, which includes sialoadhesin, CD33, and myelin-associated glycoprotein (MAG) 3. Members of this family share ~35% identity between their 2-4 membrane-distal IgSF domains. Like other members of the sialoadhesin family, CD22 is predicted to have an unusual disulphide bond between β strands B and E in domain 1 and a disulphide bond between domains 1 and 2 3. The predominant form of CD22 in humans (CD22β) 4 and the only identified form in the mouse 5 contains seven IgSF domains in the extracellular region. A human cDNA clone has been identified which encodes a variant (CD22α) lacking IgSF domains 3 and 4 and with a truncated cytoplasmic domain 6. At least three CD22 alleles have been identified in the mouse 7,8. The exon structure of the CD22 gene indicates that CD22α represents an alternatively spliced transcript of the CD22 gene 1. CD22β is the predominant form detected by immunoprecipitation experiments but a smaller protein has been detected which may correspond to CD22α 9. The cytoplasmic region contains six tyrosines, four of which are in SH2-binding YxxL motifs.
Ligands and associated molecules CD22 binds to the sialoglycoconjugate NeuAcα2→6Galβ1→4GlcNAc, which is widely present on N-linked carbohydrates 10. The binding site lies on the GFCC'C'' β sheet of the membrane-distal IgSF domain and includes an arginine (residue 101) conserved in all sialoadhesin family proteins 8. CD22 forms a loose complex with the B cell antigen receptor (BCR) 2. The cytoplasmic domain is tyrosine phosphorylated upon ligation of the BCR and associates, via SH2 domains, with the tyrosine phosphatase SHP-1, the tyrosine kinase Syk, and phospholipase C-γ1 11,12. The tyrosine kinase Lck and phosphatidylinositol 3-kinase have also been reported to bind to the cytoplasmic domain 13.
Function CD22 down-modulates the B cell activation threshold, presumably through its association with SHP-1 and other signalling molecules 2,11. Mice deficient in CD22 show exaggerated antibody responses to antigen and have raised levels of autoantibodies 14. CD22 can also mediate cell adhesion through its interaction with cell surface molecules bearing the appropriate sialoglycoconjugates 2, but only when the cells expressing CD22 do not themselves carry these sialoglycoconjugates 10. Although the significance of sialic acid binding by CD22 is not known, ligation-induced redistribution of CD22 on B cells decreases the BCR activation threshold, providing a plausible link between the adhesion and signalling functions 11.
Database accession numbers
                               PIR      SWISSPROT        EMBL/GENBANK     REFERENCE
Human CD22β                    JH0371   Q01665           X59350           4
Human CD22α (short)            A35648   P20273           X52785           6
Mouse CD22β (DBA/2J, BALB/c)   A46512   P35329, P35329   L16928, L02844   7, 5
Amino acid sequence of human CD22β
MHLLGPWLLL LVLEYLAFS                                       -1
DSSKWVFEHP ETLYAWEGAC VWIPCTYRAL DGDLESFILF HNPEYNKNTS     50
KFDGTRLYES TKDGKVPSEQ KRVQFLGDKN KNCTLSIHPV HLNDSGQLGL    100
RMESKTEKWM ERIHLNVSER PFPPHIQLPP EIQESQEVTL TCLLNFSCYG    150
YPIQLQWLLE GVPMRQAAVT STSLTIKSVF TRSELKFSPQ WSHHGKIVTC    200
QLQDADGKFL SNDTVQLNVK HTPKLEIKVT PSDAIVREGD SVTMTCEVSS    250
[residues 251-400 are not legible in this copy]
KKVTTVIQNP MPIREGDTVT LSCNYNSSNP SVTRYEWKPH GAWEEPSLGV    450
LKIQNVGWDN TTIACARCNS WCSWASPVAL NVQYAPRDVR VRKIKPLSEI    500
HSGNSVSLQC DFSSSHPKEV QFFWEKNGRL LGKESQLNFD SISPEDAGSY    550
SCWVNNSIGQ TASKAWTLEV LYAPRRLRVS MSPGDQVMEG KSATLTCESD    600
ANPPVSHYTW FDWNNQSLPH HSQKLRLEPV KVQHSGAYWC QGTNSVGKGR    650
SPLSTLTVYY SPETIGRRVA VGLGSCLAIL ILAICGLKLQ RRWKRTQSQQ    700
GLQENSSGQS FFVRNKKVRR APLSEGPHSL GCYNPMMEDG ISYTTLRFPE    750
MNIPRTGDAE SSEMQRPPRT CDDTVTYSAL HKRQVGDYEN VIPDFPEDEG    800
[the remaining C-terminal residues are not legible in this copy]

The amino acid sequence of human CD22α is as above with dotted underlined areas (encoded by exons 6, 7 and 15) deleted and the cytoplasmic domain terminating with the sequence TMRT$ FQ I F QKblRGFITQS.
References
1 Wilson, G.L. et al. (1993) J. Immunol. 150, 5013-5024.
2 Law, C.L. et al. (1994) Immunol. Today 15, 442-449.
3 Crocker, P.R. et al. (1996) Biochem. Soc. Trans. 24, 150-156.
4 Wilson, G.L. et al. (1991) J. Exp. Med. 173, 137-146.
5 Torres, R.M. et al. (1992) J. Immunol. 149, 2641-2649.
6 Stamenkovic, I. and Seed, B. (1990) Nature 345, 74-77.
7 Law, C.L. et al. (1993) J. Immunol. 151, 175-187.
8 van der Merwe, P.A. et al. (1996) J. Biol. Chem. 271, 9273-9280.
9 Schwartz-Albiez, R. et al. (1991) Int. Immunol. 3, 623-33.
10 Powell, L.D. and Varki, A. (1995) J. Biol. Chem. 270, 14243-14246.
11 Doody, G.M. et al. (1996) Curr. Opin. Immunol. 8, 378-382.
12 Law, C.L. et al. (1996) J. Exp. Med. 183, 547-560.
13 Tuscano, J.M. et al. (1996) Eur. J. Immunol. 26, 1246-1252.
14 O'Keefe, T.L. et al. (1996) Science 274, 798-801.
FcεRII, BLAST-2

Molecular weights
Polypeptide (FcεRIIa)   36 468
SDS-PAGE reduced        45 kDa
         non-reduced    45 kDa

Carbohydrate
N-linked sites          1
O-linked                probable +

Human gene location and size
19p13.3; 13 kb 1
Tissue distribution CD23 is expressed on B cells and monocytes, and more weakly on a variety of other haematopoietic cells including T cells, follicular dendritic cells, eosinophils, NK cells, Langerhans cells and platelets. On B cells, CD23 expression is restricted to mIgM+mIgD+ cells and is lost upon differentiation into plasma cells 2. CD23 on B cells is upregulated following B cell activation, CD40 ligation, or in response to IL-4 or IL-13 3.
Structure CD23 is a type II membrane protein. The extracellular region contains a C-terminal C-type lectin domain and three membrane-proximal repeats of 21 amino acids. These repeats form an α-helical coiled-coil stalk that results in trimer formation 4. Two alternatively spliced forms called FcεRIIa and FcεRIIb, differing in the first nine amino acids of the N-terminal cytoplasmic region, are expressed on different cell types. FcεRIIa is restricted to resting B cells, whereas FcεRIIb is expressed by other cell types and is induced upon B cell activation 2. Proteolytic cleavage of the membrane-bound form generates a soluble product of 37 kDa, which can be further degraded into 33 kDa, 29 kDa, 25 kDa and 16 kDa fragments that retain their lectin head groups 4.
Ligands and associated molecules The Fc region of IgE, CD21, and the integrin α chains CD11b and CD11c are ligands for CD23. In each case the C-type lectin domain of CD23 is responsible for ligand binding. IgE binding to CD23 is a protein-protein interaction that involves the third constant domain of the IgE heavy chain 4. CD23 interacts with two sites on CD21. One site (CCP domains 5-8) is a lectin-carbohydrate type of interaction whereas the other (CCP domains 1-2) is a protein-protein interaction (reviewed in ref. 3). CD23 binds to the integrins CD11b/CD18 and CD11c/CD18, but not CD11a/CD18. This interaction is at least partly dependent on lectin-carbohydrate binding (reviewed in ref. 5). CD23 has been shown to associate non-covalently with MHC Class II in B cells, by co-immunoprecipitation analyses in weak detergent 2.
Function CD23 is involved in the regulation of IgE synthesis. Following binding of IgE and IgE-containing immune complexes, CD23 exerts a negative feedback signal that reduces IgE synthesis 3,6,7. Consistent with this function, mice rendered deficient for CD23 have one major phenotype, namely high serum IgE levels and increased specific IgE responses to T cell-dependent antigens 6. Transgenic mice that overexpress CD23 show impaired IgE responses 6. The in vivo role of soluble CD23 (sCD23) is not clear, but it is proposed that sCD23 can function as a cytokine that enhances IgE synthesis, probably through binding to surface membrane IgE and CD21 3. The role of CD23 as a cell-cell adhesion molecule is also not clear; the CD23 interaction with CD21 may be important in enhancing IgE synthesis, in B cell homotypic adhesion and in the rescue of germinal centre B cells from apoptosis 3. The ligand pairing of CD23 with the integrins CD11b/CD18 and CD11c/CD18 is thought to play a role in monocyte activation 5,8. The two forms of CD23 have been implicated in different signalling pathways and specific functions. The Tyr (amino acid 6) of FcεRIIa is involved in endocytosis and the Asp-Pro motif (amino acids 2-3) of FcεRIIb is associated with phagocytosis 6. CD23 is cleaved from the cell surface by Der p I, the group I protease allergen of the house dust mite Dermatophagoides pteronyssinus. Loss of surface CD23 and increase of sCD23 may combine to enhance IgE synthesis, providing a mechanism by which the dust mite induces an atopic condition in some individuals 9.

Database accession numbers
         PIR      SWISSPROT   EMBL/GENBANK   REFERENCE
Human    A26067   P06734      M15059         10
Mouse    A43518   P20693      M99371         11
Amino acid sequence of human CD23
FcεRIIa
MEEGQYSEIE ELPRRRCCRR GTQIVLLGLV TAALWAGLLT LLLLWHWDTT    50
QSLKQLEERA ARNVSQVSKN LESHHGDQMA QKSQSTQISQ ELEELRAEQQ   100
RLKSQDLELS WNLNGLQADL SSFKSQELNE RNEASDLLER LREEVTKLRM   150
ELQVSSGFVC NTCPEKWINF QRKCYYFGKG TKQWVHARYA CDDMEGQLVS   200
IHSPEEQDFL TKHASHTGSW IGLRNLDLKG EFIWVDGSHV DYSNWAPGEP   250
TSRSQGEDCV MMRGSGRWND AFCDRKLGAW VCDRLATCTP PASEGSAESM   300
GPDSRPDPDG RLPTPSAPLH S                                  321

FcεRIIb
MNPPSQEIEE                                                10

The sequence differences between FcεRIIa and FcεRIIb reside within the first nine amino acids. The sequence downstream of this region is identical in both forms.
References
1 Suter, U. et al. (1987) Nucleic Acids Res. 15, 7295-7308.
2 Sarfati, M. et al. (1995) Leucocyte Typing V, 530-533.
3 Bonnefoy, J.-Y. et al. (1995) Curr. Opin. Immunol. 7, 355-359.
4 Sutton, B.J. and Gould, H.J. (1993) Nature 366, 421-428.
5 Bonnefoy, J.-Y. et al. (1996) Immunol. Today 17, 418-420.
6 Lamers, M.C. and Yu, P. (1995) Immunol. Rev. 148, 71-95.
7 Mudde, G.C. et al. (1995) Immunol. Today 16, 380-383.
8 Dugas, B. et al. (1995) Immunol. Today 16, 574-580.
9 Hewitt, C.R.A. et al. (1995) J. Exp. Med. 182, 1537-1544.
10 Ikuta, K. et al. (1987) Proc. Natl Acad. Sci. USA 84, 819-823.
11 Bettler, B. et al. (1989) Proc. Natl Acad. Sci. USA 86, 7566-7570.
Heat stable antigen (HSA), M1/69-J11d (mouse)

Molecular weights
Polypeptide          3 128
SDS-PAGE reduced     35-45 kDa

Carbohydrate
N-linked sites       2
O-linked             probable +

Human gene location
6q21
Tissue distribution Human CD24 is expressed on B cells, but decreases on activation and is lost at the plasma cell stage. The antigen is also present on granulocytes, a small number of thymocytes (2%) and normal epithelium 1-5. Mouse CD24 is present at all stages of B cell development and on most thymocytes 6,7. The absence of expression from mature T cells is closely associated with their maturation from CD4+CD8+CD24+ thymocytes to either CD4+CD8-CD24- or CD4-CD8+CD24- T cells. The antigen is also expressed on mouse monocytes, granulocytes, Langerhans cells and erythrocytes 6,8. Rat CD24 expression has been detected, using in situ hybridization, in the developing central nervous system and in the epithelium of developing non-neural tissues 9.
Structure CD24 is a GPI-linked sialoglycoprotein. The mature protein is predicted to be only 33 amino acids long and has a high content (48%) of Ser and Thr residues that may be the site of O-linked glycosylation 4. Different CD24 mAbs recognize epitopes that are dependent on sialic acid 10. The precursor forms of mouse and rat CD24 show 62.5% and 65% amino acid identity to the human sequence. However, the mature form of the molecule shows only 33% (mouse/human) and 42% (rat/human) sequence identity 4,9,11. It is possible that the rodent sequences are not CD24 homologues but members of a closely related family.
Ligands and associated molecules CD24 has been reported to be a P-selectin ligand (see CD62P) 12.
Function CD24 may play a role in regulation of B cell proliferation and differentiation. CD24 mAbs have been shown to inhibit human B cell differentiation into antibody-secreting cells 13 and synergize with phorbol esters in triggering B cell proliferation 14. Crosslinking of CD24 induces an increase in intracellular calcium levels in B cells and the production of hydrogen peroxide in granulocytes 5. In the mouse, CD24 can provide costimulatory activity for CD4+ T cell activation and is involved in the aggregation of LPS-activated splenic B cells 15,16.
Comments Multiple CD24 genes have been identified and mapped in both human and mouse 17,18. In addition to the human CD24 gene on chromosome 6q21, the human CD24 cDNA hybridized to sequences on chromosome 15q21-22 and Yq11 but these genes have not been shown to be functional. Mouse CD24 genes have been mapped to chromosomes 10 (Cd24a), 8 (Cd24b) and 14 (Cd24c) 19. The Cd24a gene has a single intron and encodes the CD24 mRNA. Both the Cd24b and Cd24c genes lack the intron, are not expressed in adult mouse tissues and may have arisen by retropositioning.

Database accession numbers
         PIR   SWISSPROT   EMBL/GENBANK     REFERENCE
Human          P25063      M58664, L33930   4
Mouse          P24807      M58661           11
Rat            Q07490      Z11663           9
Amino acid sequence of human CD24
MGRAMVARLGLGLLLLALLLPTQIYS                -1
SETTTGTSSN SSQSTSNSGL APNPTNATTK AAG      33
GALQSTASLF VVSLSLLHLY S                  +21
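The composition figures quoted in this entry (a 33-residue mature peptide, about 48% Ser/Thr, two N-linked sites) can be checked against the sequence itself. The following is a minimal sketch, not part of the original entry. It assumes the mature peptide reads as transcribed in the code (the residues that are hard to read in print are taken as E2, L20 and K30), and it uses the common N-X-S/T sequon rule, with X any residue except Pro, as a stand-in for "N-linked sites". The function names are illustrative only.

```python
import re
from typing import List

# Mature human CD24 peptide (33 residues) as transcribed from this entry.
# Assumption: ambiguous printed residues are read as E2, L20 and K30.
MATURE_CD24 = "SETTTGTSSNSSQSTSNSGLAPNPTNATTKAAG"


def ser_thr_fraction(seq: str) -> float:
    """Fraction of residues that are Ser or Thr."""
    return sum(seq.count(aa) for aa in "ST") / len(seq)


def n_glyc_sequons(seq: str) -> List[int]:
    """1-based positions of Asn in N-X-S/T sequons, X being any residue but Pro."""
    # The lookahead does not consume residues, so overlapping sequons
    # would all be reported.
    return [m.start() + 1 for m in re.finditer(r"N(?=[^P][ST])", seq)]


if __name__ == "__main__":
    print(len(MATURE_CD24))
    print(round(100 * ser_thr_fraction(MATURE_CD24), 1))
    print(n_glyc_sequons(MATURE_CD24))
```

Under these assumptions the Ser/Thr content comes to 16 of 33 residues, in line with the 48% quoted under Structure, and the Asn-Pro-Thr stretch is excluded by the Pro rule, leaving two candidate sequons.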
References
1 Kemshead, J.T. et al. (1982) Hybridoma 1, 109-123.
2 Hsu, S-M. and Jaffe, E.S. (1984) Am. J. Pathol. 114, 387-395.
3 Jackson, D. et al. (1992) Cancer Res. 52, 5264-5270.
4 Kay, R. et al. (1991) J. Immunol. 147, 1412-1416.
5 Fischer, G.F. et al. (1990) J. Immunol. 144, 638-641.
6 Takei, F. et al. (1981) Immunology 42, 371-378.
7 Crispe, I.N. and Bevan, M.J. (1987) J. Immunol. 138, 2013-2018.
8 Enk, A.H. and Katz, S.I. (1994) J. Immunol. 152, 3264-3270.
9 Shirasawa, T. et al. (1993) Dev. Dyn. 198, 1-13.
10 Larkin, M. et al. (1991) Clin. Exp. Immunol. 85, 536-541.
11 Kay, R. et al. (1990) J. Immunol. 145, 1952-1959.
12 Aigner, S. et al. (1995) Int. Immunol. 7, 1557-1565.
13 de Rie, M.A. et al. (1987) Leucocyte Typing III, 402-405.
14 Rabinovitch, P.S. et al. (1987) Leucocyte Typing III, 435-439.
15 Liu, Y. et al. (1992) J. Exp. Med. 175, 437-445.
16 Kadmon, G. et al. (1992) J. Cell Biol. 118, 1245-1258.
17 Hough, M.R. et al. (1994) Genomics 22, 154-161.
18 Wenger, R.H. et al. (1991) Eur. J. Immunol. 21, 1039-1046.
19 Wenger, R.H. et al. (1993) J. Biol. Chem. 268, 23345-23352.
Dipeptidyl peptidase IV (EC 3.4.14.5)

Other names
Tp103
Adenosine deaminase binding protein
Thymocyte-activating molecule (THAM) (mouse)

Molecular weights
Polypeptide          88 319
SDS-PAGE reduced     110 kDa
         unreduced   110, 140 kDa

Carbohydrate
N-linked sites       9
O-linked             unknown

Human gene location and size
2q24.3; 70 kb 1
Tissue distribution CD26 is expressed by a variety of haematopoietic and non-haematopoietic cell types. On leucocytes, CD26 is expressed primarily on mature thymocytes in the medulla. Expression is weak on mature T cells and is restricted to memory T cells, although CD26 is upregulated on T cell activation. On non-haematopoietic cells, CD26 is found on epithelial cells of the intestine, kidney proximal tubule, bile duct and prostate gland (reviewed in refs 2 and 3).
Structure CD26 is a type II integral membrane dipeptidylpeptidase with a short N-terminal cytoplasmic tail, a transmembrane region and a large C-terminal extracellular region that contains nine N-glycosylation sites. CD26 is a member of the prolyl oligopeptidase family. The C-terminal region contains the putative catalytic site, consisting of a Gly-Trp-Ser-Tyr-Gly motif (amino acids 628-632), Asp708 and His740, in the reverse order of the serine protease family. CD26 is expressed as a non-covalently linked homodimer 2,3.
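The residue numbers given for the putative catalytic site can be cross-checked against the human CD26 sequence listed in this entry. The sketch below is illustrative and not part of the original text. It assumes that the C-terminal segment transcribed in the code begins at residue 601, as implied by the position markers of the sequence listing, and the helper names are hypothetical.

```python
# C-terminal segment of human CD26 (residues 601-766) as printed in this
# entry, kept in blocks of ten so it can be compared with the listing.
# Assumption: the first residue of the segment is number 601.
SEGMENT_START = 601
SEGMENT = "".join("""
FEVEDQIEAA RQFSKMGFVD NKRIAIWGWS YGGYVTSMVL GSGSGVFKCG
IAVAPVSRWE YYDSVYTERY MGLPTPEDNL DHYRNSTVMS RAENFKQVEY
LLIHGTADDN VHFQQSAQIS KALVDVGVDF QAMWYTDEDH GIASSTAHQH
IYTHMSHFIK QCFSLP
""".split())


def residue(position: int) -> str:
    """One-letter code at a 1-based position of the full-length protein."""
    return SEGMENT[position - SEGMENT_START]


def motif_start(motif: str) -> int:
    """1-based start of the first occurrence of motif, or -1 if absent."""
    index = SEGMENT.find(motif)
    return -1 if index < 0 else index + SEGMENT_START


if __name__ == "__main__":
    print(motif_start("GWSYG"), residue(630), residue(708), residue(740))
```

With this numbering the motif starts at residue 628, so its serine is residue 630, and the Ser, Asp, His order along the chain is the reverse of the His, Asp, Ser order of the classical serine proteases.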
Ligands and associated molecules CD26 associates non-covalently with adenosine deaminase (ADA), the deficiency of which results in severe combined immunodeficiency in humans. This interaction is mediated by the extracellular region of CD26 and is not dependent on catalytic activity (reviewed in ref. 2). CD26 co-immunoprecipitates and co-modulates with CD45 on T cells 2. A ligand for CD26 is the extracellular matrix protein collagen. This interaction does not require CD26 catalytic activity and mAb blocking studies indicate that the binding site is within residues 236-491 4.
Function CD26 is proposed to perform three distinct functions, as a membrane bound protease, a T cell co-stimulatory molecule and a cell adhesion molecule 2. CD26 has a unique specificity amongst cell surface serine proteases: dipeptides are cleaved from the N-terminus of polypeptides if proline is at the penultimate position. This enzymatic activity is responsible for the intestinal digestion and renal transport of proline-containing polypeptides. Indeed, the Fischer 344 rat strain lacks functional CD26, resulting in impaired renal absorption of proline-containing peptides 2. CD26 also functions as a T cell co-stimulatory molecule. Expression of the T cell receptor (TCR) complex is required for CD26 signalling, in which the TCR ζ chain is necessary but not sufficient 5. The mechanism for CD26 signalling is not known, but may be related to the association of CD26 with CD45 and ADA 2. In particular, ADA binding to CD26 on T cells is proposed to reduce the local concentration of adenosine, a nucleoside which inhibits T cell proliferation 6. It is not clear whether catalytic activity is required for CD26 signalling 2. However, the cell adhesion role of CD26, in binding to the extracellular matrix via collagen, does not depend on catalytic activity 4.
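The substrate rule stated above (a dipeptide is released from the N-terminus when proline is the second residue) is simple enough to write down as a predicate. This is a minimal sketch that encodes only the rule as worded in this entry; it is not a model of the enzyme. The function name and the example peptides are made up, and the requirement that at least one residue remains after cleavage is an added assumption.

```python
from typing import Optional, Tuple


def dpp4_cleave(peptide: str) -> Optional[Tuple[str, str]]:
    """Apply the N-terminal Xaa-Pro rule to a one-letter peptide sequence.

    Returns (released dipeptide, remaining peptide) when proline is the
    second residue from the N-terminus, otherwise None.
    Assumption: a substrate needs at least three residues, so that
    something is left after the dipeptide is removed.
    """
    if len(peptide) < 3 or peptide[1] != "P":
        return None
    return peptide[:2], peptide[2:]


if __name__ == "__main__":
    # Hypothetical peptides, chosen only to exercise the rule.
    print(dpp4_cleave("APGSK"))
    print(dpp4_cleave("GAPSK"))
```

The second call returns None because the proline sits at the third position, not the penultimate N-terminal position the rule requires.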
Comments CD26 is associated with human immunodeficiency virus (HIV) disease progression, which is a feature of many other T cell molecules. There is some correlation between CD26 expression and HIV entry, replication and cytopathicity. In addition, CD26 is downregulated as the disease progresses. However, CD26 is clearly not a co-receptor for HIV, as was originally reported (reviewed in ref. 7). Dipeptidyl peptidase IV activity has been measured in serum, but this does not appear to be the result of CD26 cleavage from the cell surface. Instead, a novel 175 kDa T cell antigen DPPT-L, related to CD26, is thought to be responsible 8.

Database accession numbers
         PIR      SWISSPROT   EMBL/GENBANK   REFERENCE
Human    S24313   P27487      X60708         9
Rat      A33315   P14740      J04591         10
Mouse    S23752   P28843      X58384         11
Amino acid sequence of human CD26
MKTPWKILLG LLGAAALVTI ITVPVVLLNK GTDDATADSR KTYTLTDYLK    50
NTYRLKLYSL RWISDHEYLY KQENNILVFN AEYGNSSVFL ENSTFDEFGH   100
SINDYSISPD GQFILLEYNY VKQWRHSYTA SYDIYDLNKR QLITEERIPN   150
NTQWVTWSPV GHKLAYVWNN DIYVKIEPNL PSYRITWTGK EDIIYNGITD   200
WVYEEEVFSA YSALWWSPNG TFLAYAQFND TEVPLIEYSF YSDESLQYPK   250
TVRVPYPKAG AVNPTVKFFV VNTDSLSSVT NATSIQITAP ASMLIGDHYL   300
CDVTWATQER ISLQWLRRIQ NYSVMDICDY DESSGRWNCL VARQHIEMST   350
TGWVGRFRPS EPHFTLDGNS FYKIISNEEG YRHICYFQID KKDCTFITKG   400
TWEVIGIEAL TSDYLYYISN EYKGMPGGRN LYKIQLIDYT KVTCLSCELN   450
PERCQYYSVS FSKEAKYYQL RCSGPGLPLY TLHSSVNDKG LRVLEDNSAL   500
DKMLQNVQMP SKKLDFIILN ETKFWYQMIL PPHFDKSKKY PLLLDVYAGP   550
CSQKADTVFR LNWATYLAST ENIIVASFDG RGSGYQGDKI MHAINRRLGT   600
FEVEDQIEAA RQFSKMGFVD NKRIAIWGWS YGGYVTSMVL GSGSGVFKCG   650
IAVAPVSRWE YYDSVYTERY MGLPTPEDNL DHYRNSTVMS RAENFKQVEY   700
LLIHGTADDN VHFQQSAQIS KALVDVGVDF QAMWYTDEDH GIASSTAHQH   750
IYTHMSHFIK QCFSLP                                        766
References
1 Abbott, C.A. et al. (1994) Immunogenetics 40, 331-338.
2 Fleischer, B. (1994) Immunol. Today 15, 180-184.
3 Shipp, M.A. and Look, A.T. (1993) Blood 82, 1052-1070.
4 Loster, K. et al. (1995) Biochem. Biophys. Res. Commun. 217, 341-348.
5 Mittrucker, H.-W. et al. (1995) Eur. J. Immunol. 25, 295-297.
6 Dong, R.-P. et al. (1996) J. Immunol. 156, 1349-1355.
7 Dalgleish, A. (1995) Nature Med. 1, 881-882.
8 Duke-Cohan, J.S. et al. (1996) J. Immunol. 156, 1714-1721.
9 Misumi, Y. et al. (1992) Biochim. Biophys. Acta 1131, 333-336.
10 Ogata, S. et al. (1989) J. Biol. Chem. 264, 3596-3601.
11 Marguet, D. et al. (1992) J. Biol. Chem. 267, 2200-2208.
Molecular weights
Polypeptide          26 898
SDS-PAGE reduced     50-55 kDa
         unreduced   120 kDa

Carbohydrate
N-linked sites       1
O-linked             +++

Human gene location and size
12p13 1; 7 kb 1

Domains
[Figure: domain organization and exon boundaries of CD27]
Tissue distribution CD27 protein is present on T cells of both CD4+ and CD8+ subsets and on medullary thymocytes, some B cells and NK cells 2-4. Expression on T cells is upregulated on activation and is higher on CD45RA+ than CD45RO+ cells 2,4.
Structure CD27 is a member of the TNFR superfamily, with two full cysteine-rich repeats and one half repeat in the extracellular domain of the protein 2,5,6. The 70 amino acid membrane-proximal region contains sites for O-linked glycosylation 7. Extensive O-glycosylation has been established by studies on biosynthesis and effect of N- and O-glycanases 7. The protein is found on the surface of T cells as a disulfide-linked homodimer 6. Soluble CD27 is released probably by proteolytic cleavage on activation 4.
Ligands and associated molecules CD27 binds the ligand CD27L, also called CD70, a type II membrane protein which is a member of the TNF superfamily 2-4.
Function Cells expressing CD70 can interact via CD27 to co-stimulate T cell proliferation, generate cytotoxic T cells and enhance cytokine production 2,3. CD70 binding to CD27 on B cells co-stimulates B cell proliferation and immunoglobulin production 3. CD27 is phosphorylated on serine residues and hyperphosphorylated with T cell activation 8. No tyrosine phosphorylation is observed.
Database accession numbers
         PIR      SWISSPROT   EMBL/GENBANK   REFERENCE
Human    A46517   P26842      M63928         6
Mouse    A49053   P41272      L24495         9
Amino acid sequence of human CD27
MARPHPWWLC VLGTLVGLSA                                     -1
TPAPKSCPER HYWAQGKLCC QMCEPGTFLV KDCDQHRKAA QCDPCIPGVS    50
FSPDHHTRPH CESCRHCNSG LLVRNCTITA NAECACRNGW QCRDKECTEC   100
DPLPNPSLTA RSSQALSPHP QPTHLPYVSE MLEASTAGHM QTLADFRQLP   150
ARTLSTHWPP QRSLCSSDFI RILVIFSGMF LVFTLAGALF LHQRRKYRSN   200
KGESPVEPAE PCRYSCPREE EGSTIPIQED YRKPEPACSP              240

References
1 Loenen, W.A.M. et al. (1992) J. Immunol. 149, 3937-3943.
2 Gruss, H.-J. and Dower, S.K. (1995) Blood 85, 3378-3404.
3 Kobata, T. et al. (1995) Proc. Natl Acad. Sci. USA 92, 11249-11253.
4 Hintzen, R.Q. et al. (1994) Immunol. Today 15, 307-311.
5 Armitage, R.J. (1994) Curr. Opin. Immunol. 6, 407-413.
6 Camerini, D. et al. (1991) J. Immunol. 147, 3165-3169.
7 Loenen, W.A.M. et al. (1992) Eur. J. Immunol. 22, 447-456.
8 De Jong, R. et al. (1991) J. Immunol. 146, 2488-2494.
9 Gravestein, L.A. et al. (1993) Eur. J. Immunol. 23, 943-950.
Tp44
Molecular weights
Polypeptide          23 085
SDS-PAGE reduced     44 kDa
         unreduced   90 kDa

Carbohydrate
N-linked sites       5
O-linked             unknown

Human gene location and size
2q33; 36 kb 1,2

Domain
[Figure: domain organization and exon boundaries of CD28]
Tissue distribution CD28 is constitutively expressed on most T lineage cells and plasma cells 3,4. Mature thymocytes have higher levels of CD28 than the immature cells and among peripheral T cells, all CD4+ cells and ~50% of human CD8+ cells are positive. In general, activation of T cells leads to enhanced CD28 expression but ligation of CD28 leads to its transient downregulation.
Structure CD28 is a disulfide-linked homodimer with a single IgSF domain in its extracellular portion 3. It is structurally similar to CD152 (CTLA-4, 31% identity), binds the same ligands (see below), and the two genes are less than 150 kb apart 5, suggesting that they share a common ancestor in evolution.
Ligands and associated molecules Like CD152, CD28 binds both CD80 and CD86 using a highly conserved motif (MYPPPY) in the CDR3-like loop 6. CD28 binds CD80 with a low affinity (Kd 4 μM) and dissociates very rapidly (koff ≥ 1.6 s-1) 7. Binding to CD86 may be even weaker 8. The cytoplasmic domain interacts with phosphatidylinositol 3-kinase (PI 3-kinase), the complex between GRB-2 and the guanine nucleotide exchange protein SOS (GRB-2/SOS), and the tyrosine kinase ITK 9,10. SH2 domains in PI 3-kinase and GRB-2/SOS mediate binding to the CD28 motif YMNMT, after it has been phosphorylated by Lck and Fyn 9,10.
Function Studies in vitro suggested that ligation of CD28 on T cells by CD80 and CD86 on antigen presenting cells provides a co-stimulatory signal required for T cell activation 3. However, mice lacking CD28 11 are able to mount effective T cell responses and are mainly defective in T cell-dependent antibody responses, suggesting that CD28 is mainly important for T cell-B cell interactions and humoral immunity 4. Ligation of CD28 activates several signal-transduction pathways, including PI 3-kinase 9, and inhibits degradation of cytokine mRNA 12.

Database accession numbers
         PIR      SWISSPROT   EMBL/GENBANK   REFERENCE
Human    A39983   P10747      J02988         13
Mouse    A43523   P31041      M34563         14
Rat      S24413   P31042      X55288         15
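The binding constants quoted for CD28 and CD80 in this entry (Kd of about 4 μM, koff of at least 1.6 s-1) can be turned into more intuitive quantities. The sketch below is illustrative only and assumes simple reversible 1:1 binding, so that Kd = koff/kon and the complex decays with first-order kinetics. The function and constant names are not from the source.

```python
import math

KD_MOLAR = 4e-6     # Kd of about 4 uM, as quoted in this entry
KOFF_PER_S = 1.6    # lower bound on koff (per second), as quoted in this entry


def half_life_s(koff_per_s: float) -> float:
    """Half-life of a complex that dissociates with first-order rate koff."""
    return math.log(2) / koff_per_s


def kon_per_molar_s(kd_molar: float, koff_per_s: float) -> float:
    """Association rate constant implied by Kd = koff / kon (1:1 binding)."""
    return koff_per_s / kd_molar


if __name__ == "__main__":
    print(half_life_s(KOFF_PER_S))
    print(kon_per_molar_s(KD_MOLAR, KOFF_PER_S))
```

Because the quoted koff is a lower bound, the half-life computed here (roughly 0.4 s) is an upper bound and the implied kon (about 4 x 10^5 M-1 s-1) a lower bound.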
Amino acid sequence of human CD28
MLRLLLALNL FPSIQVTG                                       -1
NKILVKQSPM LVAYDNAVNL SCKYSYNLFS REFRASLHKG LDSAVEVCVV    50
YGNYSQQLQV YSKTGFNCDG KLGNESVTFY LQNLYVNQTD IYFCKIEVMY   100
PPPYLDNEKS NGTIIHVKGK HLCPSPLFPG PSKPFWVLVV VGGVLACYSL   150
LVTVAFIIFW VRSKRSRLLH SDYMNMTPRR PGPTRKHYQP YAPPRDFAAY   200
RS                                                       202
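The two motifs named in this entry, MYPPPY in the ligand-binding loop and YMNMT in the cytoplasmic domain, can be located in the mature sequence with a plain substring search. This is a minimal sketch, not part of the original entry. It assumes the 202-residue mature sequence as transcribed in the code, with numbering starting at the first residue after the signal sequence, and the helper name is hypothetical.

```python
# Mature human CD28 (202 residues) as listed in this entry, in blocks of ten.
# Assumption: residue 1 is the first residue after the signal sequence.
MATURE_CD28 = "".join("""
NKILVKQSPM LVAYDNAVNL SCKYSYNLFS REFRASLHKG LDSAVEVCVV
YGNYSQQLQV YSKTGFNCDG KLGNESVTFY LQNLYVNQTD IYFCKIEVMY
PPPYLDNEKS NGTIIHVKGK HLCPSPLFPG PSKPFWVLVV VGGVLACYSL
LVTVAFIIFW VRSKRSRLLH SDYMNMTPRR PGPTRKHYQP YAPPRDFAAY
RS
""".split())


def motif_span(seq, motif):
    """1-based (start, end) of the first occurrence of motif, or None."""
    index = seq.find(motif)
    if index < 0:
        return None
    return index + 1, index + len(motif)


if __name__ == "__main__":
    print(motif_span(MATURE_CD28, "MYPPPY"))
    print(motif_span(MATURE_CD28, "YMNMT"))
```

Note that MYPPPY straddles the boundary between the second and third rows of the listing, which is why the rows are joined before searching.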
References
1 Lafage-Pochitaloff, M. et al. (1990) Immunogenetics 31, 198-201.
2 Lee, K.P. et al. (1990) J. Immunol. 145, 344-352.
3 Linsley, P.S. and Ledbetter, J.A. (1993) Annu. Rev. Immunol. 11, 191-211.
4 Lenschow, D.J. et al. (1996) Annu. Rev. Immunol. 14, 233-258.
5 Buonavista, N. et al. (1992) Genomics 13, 856-861.
6 Peach, R.J. et al. (1994) J. Exp. Med. 180, 2049-2058.
7 van der Merwe, P.A. et al. (1997) J. Exp. Med. 185, 394-403.
8 Greene, J.L. et al. (1996) J. Biol. Chem. 271, 26762-26771.
9 June, C.H. et al. (1994) Immunol. Today 15, 321-331.
10 Raab, M. et al. (1995) Proc. Natl Acad. Sci. USA 92, 8891-8895.
11 Shahinian, A. et al. (1993) Science 261, 609-612.
12 June, C.H. et al. (1990) Immunol. Today 11, 211-216.
13 Aruffo, A. and Seed, B. (1987) Proc. Natl Acad. Sci. USA 84, 8573-8577.
14 Gross, J.A. et al. (1990) J. Immunol. 144, 3201-3210.
15 Clark, G.J. and Dallman, M.J. (1992) Immunogenetics 35, 54-57.
CD29   Integrin β1 subunit

Molecular weights
Polypeptides   A isoform   86 242
               B isoform   85 273
               C isoform   89 446
               D isoform   86 710
SDS-PAGE (A isoform)
               reduced     130 kDa
               unreduced   115 kDa

Carbohydrate
N-linked sites   12
O-linked sites   unknown

Human gene location
10p11.2

[Figure: CD49e/CD29 heterodimer]
Tissue distribution CD29 is expressed on most cells. It is expressed on all leucocytes, although only at low levels on granulocytes 1,2. On T cells, CD29 is expressed at higher levels on memory than naive cells 1,2.
Structure CD29 forms heterodimers with many integrin α subunits including the CD49a-f (α1-α6) and CD51 (αV) antigens and, in non-lymphoid tissues, α7-α9 3-6. The CD49a-f/CD29 heterodimers are also termed the very late antigens (VLA-1 to VLA-6) because two of them (VLA-1 and -2) appear on lymphocytes several weeks after stimulation. Four CD29 isoforms (A-D) with different cytoplasmic domains are generated by alternative splicing 7-10. The B isoform (β1 3'v) is found in the placenta and is expressed on human umbilical vein endothelial cells (HUVECs) as well as lymphoma, neuroblastoma and hepatoma cell lines 8. The C isoform (β1S) is expressed on platelets and a number of haematopoietic cell lines but not detected on peripheral blood lymphocytes or HUVECs 9. The D isoform is exclusively expressed in skeletal and cardiac muscle 10.
Ligands and associated molecules Integrins which include CD29 bind to several cell surface and extracellular matrix molecules (see CD49a-f and CD51).
Function Integrin heterodimers containing CD29 mediate cell-cell and cell-matrix adhesion (see CD49a-f and CD51). The adhesive properties of CD29 heterodimers on T cells can be regulated by cell activation 11, possibly through interactions between the cytoplasmic domain of CD29 and the cytoskeleton 2,12. The different cytoplasmic domains in the A-D isoforms may allow interactions with different intracellular elements 7-10.
Database accession numbers
                PIR              SWISSPROT   EMBL/GENBANK      REFERENCE
Human A form    B27079           P05556      X07979            7
Human B form                                 U33879            8
Human C form                                 U33882            9
Human D form                                 U33880            10
Mouse           PL0104, S01659   P09055      X15202, Y00769    13, 14
Amino acid sequence of human CD29
MNLQPIFWIG LISSVCCVFA                                    -1
QTDENRCLKA NAKSCGECIQ AGPNCGWCTN STFLQEGMPT SARCDDLEAL   50
KKKGCPPDDI ENPRGSKDIK KNKNVTNRSK GTAEKLKPED IHQIQPQQLV   100
LRLRSGEPQT FTLKFKRAED YPIDLYYLMD LSYSMKDDLE NVKSLGTDLM   150
NEMRRITSDF RIGFGSFVEK TVMPYISTTP AKLRNPCTSE QNCTTPFSYK   200
NVLSLTNKGE VFNELVGKQR ISGNLDSPEG GFDAIMQVAV CGSLIGWRNV   250
TRLLVFSTDA GFHFAGDGKL GGIVLPNDGQ CHLENNMYTM SHYYDYPSIA   300
HLVQKLSENN IQTIFAVTEE FQPVYKELKN LIPKSAVGTL SANSSNVIQL   350
IIDAYNSLSS EVILENGKLS EGVTISYKSY CKNGVNGTGE NGRKCSNISI   400
GDEVQFEISI TSNKCPKKDS DSFKIRPLGF TEEVEVILQY ICECECQSEG   450
IPESPKCHEG NGTFECGACR CNEGRVGRHC ECSTDEVNSE DMDAYCRKEN   500
SSEICSNNGE CVCGQCVCRK RDNTNEIYSG KFCECDNFNC DRSNGLICGG   550
NGVCKCRVCE CNPNYTGSAC DCSLDTSTCE ASNGQICNGR GICECGVCKC   600
TDPKFQGQTC EMCQTCLGVC AEHKECVQCR AFNKGEKKDT CTQECSYFNI   650
TKVESRDKLP QPVQPDPVSH CKEKDVDDCW FYFTYSVNGN NEVMVHVVEN   700
PECPTGPDII PIVAGVVAGI VLIGLALLLI WKLLMIIHDR REFAKFEKEK   750
MNAKWDT                                                  757

The A-D isoforms terminate as follows:
A: GENPIYKSAV TTVVNPKYEG K                                778
B: VSYKTSKKQS GL                                          769
C: SLSVAQPGVQ WCDISSLQPL TSRQQFSCLS LPSTWDYRVK ILFIRVP   804
D: QENPIYKSPI NNFKNPNYGR KAGL                             781

References
1 Pigott, R. and Power, C. (1993) The Adhesion Molecule FactsBook. Academic Press, London.
2 Hemler, M.E. (1990) Annu. Rev. Immunol. 8, 365-400.
3 Song, W.K. et al. (1992) J. Cell Biol. 117, 643-657.
4 Ziober, B.L. et al. (1993) J. Biol. Chem. 268, 26773-26783.
5 Bossy, B. et al. (1991) EMBO J. 10, 2375-2385.
6 Palmer, E.L. et al. (1993) J. Cell Biol. 123, 1289-1297.
7 Argraves, W.S. et al. (1987) J. Cell Biol. 105, 1183-1190.
8 Balzac, F. et al. (1993) J. Cell Biol. 121, 171-178.
9 Languino, R.L. and Ruoslahti, E. (1992) J. Biol. Chem. 267, 7116-7120.
10 Belkin, A.M. et al. (1996) J. Cell Biol. 132, 211-216.
11 Shimizu, Y. et al. (1990) Nature 345, 250-253.
12 Schwartz, M.A. et al. (1995) Annu. Rev. Cell Dev. Biol. 11, 549-599.
13 Holers, V.M. et al. (1989) J. Exp. Med. 169, 1589-1605.
14 Tominaga, S.I. (1988) FEBS Lett. 238, 315-319.
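The isoform lengths quoted for CD29 can be cross-checked from the shared sequence and the isoform-specific C-terminal tails given in this entry. The sketch below assumes, as the entry's numbering implies, that mature-protein residues 1-757 are common to all four isoforms; the names `COMMON_LENGTH`, `TAILS` and `isoform_lengths` are illustrative only and do not come from the original text.

```python
# Number of mature-protein residues shared by all four CD29 isoforms
# (assumption: the common sequence ends at residue 757, as numbered in the entry).
COMMON_LENGTH = 757

# Isoform-specific C-terminal tails, grouped in blocks of ten as printed.
TAILS = {
    "A": "GENPIYKSAV TTVVNPKYEG K",
    "B": "VSYKTSKKQS GL",
    "C": "SLSVAQPGVQ WCDISSLQPL TSRQQFSCLS LPSTWDYRVK ILFIRVP",
    "D": "QENPIYKSPI NNFKNPNYGR KAGL",
}


def isoform_lengths(common_length, tails):
    """Return the total mature length of each isoform (shared part plus tail)."""
    lengths = {}
    for name, tail in tails.items():
        residues = tail.replace(" ", "")  # drop the block separators
        lengths[name] = common_length + len(residues)
    return lengths


lengths = isoform_lengths(COMMON_LENGTH, TAILS)
for name in sorted(lengths):
    print(f"CD29 {name} isoform: {lengths[name]} residues")
```

The computed values can be compared directly with the final residue numbers printed for the A, B, C and D isoforms.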
CD30
Ki-1, Ber-H2 antigen

Molecular weights
Polypeptide         61 893
SDS-PAGE
  reduced           105-120 kDa
  unreduced         105-120 kDa

Carbohydrate
  N-linked sites    2
  O-linked          ++++

Human gene location
1p36 1

[Figure: domain organization of CD30]
Tissue distribution
CD30 antigen is not found on resting lymphocytes or monocytes, but is expressed on mitogen-activated B and T cells 1. In normal tissue the antigen is found on large lymphoid cells in sections of lymph node, tonsil, and thymus 1. CD30 is present on Reed-Sternberg cells of Hodgkin's lymphoma and many other malignant cell lines 1.

Structure
CD30 is a member of the TNFR superfamily with five clearly identifiable Cys-rich repeats 2-4. The five repeats are interrupted after repeat 3 by a hinge sequence of about 60 amino acids that may have derived from the central region of another TNFR repeat. This is argued because there are two Cys residues at the end of this region which are surrounded by sequence patterns typical of TNFR repeats. Repeats 2 and 5 show particular sequence similarities, suggesting that CD30 may have evolved by gene duplication of a precursor structure with three repeats 2,3. Biochemical analysis of CD30 revealed the presence of O-linked sugars accounting for 4 kDa on SDS-PAGE 5. It is likely that both the central hinge region and the membrane-proximal region are sites for O-glycosylation by virtue of their high content of Ser, Thr, and Pro residues. In cell lines, CD30 is phosphorylated on serine and tyrosine residues 5.

Ligands and associated molecules
CD30 binds to CD153, a member of the TNF superfamily.

Function
The CD153-CD30 interaction co-stimulates T cell proliferation 1,3 and upregulates expression of adhesion molecules and cytokine release 1. A role for the CD153-CD30 interaction in deletion of thymocytes is suggested from studies with CD30-deficient mice 6. CD30+ T cell clones produce TH2-type cytokines. A role for the CD153-CD30 interaction in TH2-type autoimmune disease has been suggested 1,7.
Comments
Truncated CD30 protein is released from the cell surface and found as a soluble protein in the serum of some patients with adult T cell leukaemia or other CD30+ lymphomas 1. Levels of soluble CD30 correlated with disease 1.

Database accession numbers
         PIR      SWISSPROT   EMBL/GENBANK   REFERENCE
Human    A42086   P28908      M83554         8
Mouse                         U25416         2
Amino acid sequence of human CD30
MRVLLAALGL LFLGALRA                                      -1
FPQDRPFEDT CHGNPSHYYD KAVRRCCYRC PMGLFPTQQC PQRPTDCRKQ   50
CEPDYYLDEA DRCTACVTCS RDDLVEKTPC AWNSSRVCEC RPGMFCSTSA   100
VNSCARCFFH SVCPAGMIVK FPGTAQKNTV CEPASPGVSP ACASPENCKE   150
PSSGTIPQAK PTPVSPATSS ASTMPVRGGT RLAQEAASKL TRAPDSPSSV   200
GRPSSDPGLS PTQPCPEGSG DCRKQCEPDY YLDEAGRCTA CVSCSRDDLV   250
EKTPCAWNSS RTCECRPGMI CATSATNSCA RCVPYPICAA ETVTKPQDMA   300
EKDTTFEAPP LGTQPDCNPT PENGEAPAST SPTQSLLVDS QASKTLPIPT   350
SAPVALSSTG KPVLDAGPVL FWVILVLVVV VGSSAFLLCH RRACRKRIRQ   400
KLHLCYPVQT SQPKLELVDS RPRRSSTQLR SGASVTEPVA EERGLMSQPL   450
METCHSVGAA YLESLPLQDA SPAGGPSSPR DLPEPRVSTE HTNNKIEKIY   500
IMKADTVIVG TVKAELPEGR GLAGPAEPEL EEELEADHTP HYPEQETEPP   550
LGSCSDVMLS VEEEGKEDPL PTAASGK                            577

References
1 Gruss, H.-J. and Dower, S.K. (1995) Blood 85, 3378-3404.
2 van Kooten, C. and Banchereau, J. (1996) Adv. Immunol. 61, 1-77.
3 Smith, C.A. et al. (1993) Cell 73, 1349-1360.
4 Armitage, R.J. (1994) Curr. Opin. Immunol. 6, 407-413.
5 Nawrocki, J.F. et al. (1988) J. Immunol. 141, 672-680.
6 Amakawa, R. et al. (1996) Cell 84, 551-562.
7 Del Prete, G. et al. (1995) Immunol. Today 16, 76-80.
8 Durkop, H. et al. (1992) Cell 68, 421-427.
CD31
Platelet endothelial cell adhesion molecule 1 (PECAM-1)

Molecular weights
Polypeptide         79 579
SDS-PAGE
  reduced           130-140 kDa
  unreduced         130-140 kDa

Carbohydrate
  N-linked sites    9
  O-linked          unknown

Human gene location and size
17q23-ter; 65 kb 1

[Figure: domain diagram of CD31 showing six C2-set IgSF domains]
Tissue distribution
CD31 is present on virtually all monocytes, platelets and granulocytes. Approximately 50% of resting PBL are CD31+ 2. CD31 expression changes on maturation of CD4+ T cells 3. CD31 is highly expressed on endothelial cells and concentrated at the junctions between them 2.

Structure
CD31 contains six IgSF C2-set domains with sequence similarity to the carcinoembryonic antigen (CD66), NCAM (CD56) and CD32 4-6. It is possible that there is a disulfide bond between domains 4 and 5.

Ligands and associated molecules
CD31 interacts homotypically in cell adhesion assays 7. In addition it interacts heterotypically with the integrin αvβ3 8,9 and glycosaminoglycans 10. Several domains seem to be involved in the homotypic interaction, whereas domain 2 seems to be the main domain interacting with the integrin 8,9. Alternative splicing gives rise to variants with different cytoplasmic regions, and these give rise to specific binding characteristics 11.

Function
MAbs and recombinant CD31 proteins can inhibit neutrophil migration through blood vessels by blocking at the stage of interaction with the basement membrane 12.

Comments
A Leu to Val polymorphism in CD31 (residue 98 in the mature protein below) is a minor transplantation antigen; matching for it provides a significantly lower risk of graft-versus-host disease in bone marrow transplants 13.

Database accession numbers
         PIR      SWISSPROT   EMBL/GENBANK     REFERENCE
Human    A40096   P16284      M28526, M37780   4-6
Mouse             Q08481      L06039           14

Amino acid sequence of human CD31
MQPRWAQGAT MWLGVLLTLL LCSSLEG                            -1
QENSFTINSV DMKSLPDWTV QNGKNLTLQC FADVSTTSHV KPQHQMLFYK   50
DDVLFYNISS MKSTESYFIP EVRIYDSGTY KCTVIVNNKE KTTAEYQLLV   100
EGVPSPRVTL DKKEAIQGGI VRVNCSVPEE KAPIHFTIEK LELNEKMVKL   150
KREKNSRDQN FVILEFPVEE QDRVLSFRCQ ARIISGIHMQ TSESTKSELV   200
TVTESFSTPK FHISPTGMIM EGAQLHIKCT IQVTHLAQEF PEIIIQKDKA   250
IVAHNRHGNK AVYSVMAMVE HSGNYTCKVE SSRISKVSSI VVNITELFSK   300
PELESSFTHL DQGERLNLSC SIPGAPPANF TIQKEDTIVS QTQDFTKIAS   350
KSDSGTYICT AGIDKVVKKS NTVQIVVCEM LSQPRISYDA QFEVIKGQTI   400
EVRCESISGT LPISYQLLKT SKVLENSTKN SNDPAVFKDN PTEDVEYQCV   450
ADNCHSHAKM LSEVLRVKVI APVDEVQISI LSSKVVESGE DIVLQCAVNE   500
GSGPITYKFY REKEGKPFYQ MTSNATQAEW TKQKASKEQE GEYYCTAFNR   550
ANHASSVPRS KILTVRVILA PWKKGLIAVV IIGVIIALLI IAAKCYFLRK   600
AKAKQMPVEM SRPAVPLLNS NNEKMSDPNM EANSHYGHND DVRNHAMKPI   650
NDNKEPLNSD VQYTEVQVSS AESHKDLGKK DTETVYSEVR KAVPDAVESR   700
YSRTEGSLDG T                                             711

References
1 Kirschbaum, N. et al. (1994) Blood 84, 4028-4037.
2 DeLisser, H.M. et al. (1994) Immunol. Today 15, 490-495.
3 Demeure, C.E. et al. (1996) Immunology 88, 110-115.
4 Newman, P.J. et al. (1990) Science 247, 1219-1222.
5 Simmons, D.L. et al. (1990) J. Exp. Med. 171, 2147-2152.
6 Stockinger, H. et al. (1990) J. Immunol. 145, 3889-3897.
7 Fawcett, J. et al. (1995) J. Cell Biol. 128, 1229-1241.
8 Piali, L. et al. (1995) J. Cell Biol. 130, 451-460.
9 Buckley, C.D. et al. (1996) J. Cell Sci. 109, 437-445.
10 DeLisser, H.M. et al. (1993) J. Biol. Chem. 268, 16037-16046.
11 Yan, H.C. et al. (1995) J. Biol. Chem. 270, 23672-23680.
12 Wakelin, M.W. et al. (1996) J. Exp. Med. 184, 229-239.
13 Behar, E. et al. (1996) New Engl. J. Med. 334, 286-291.
14 Xie, Y. and Muller, W.A. (1993) Proc. Natl Acad. Sci. USA 90, 5569-5573.
CD32
FcγRII, Fc receptor for aggregated IgG

Molecular weights
Polypeptides
  A form            31 049
  B1 form           29 206
  B2 form           27 107
  B3 form           29 206
  C form            30 757
SDS-PAGE
  reduced           40 kDa

Carbohydrate
  N-linked sites    2
  O-linked sites    nil

Human gene location and size
There are three genes in 1q23-24 in the order A-C-B, in which the C gene is likely to have arisen from an unequal crossover between A and B; each gene is ~15-19 kb. The CD32 genes are intercalated with those of CD16 in a complex locus 1-3.

[Figure: domain diagram of CD32 with exon boundaries, showing the A, C, B1, B2 and B3 isoforms]

Tissue distribution
CD32 isoforms are expressed on a range of leucocytes including monocytes, macrophages, Langerhans cells, granulocytes, B cells, and platelets, as well as on endothelial cells of the placenta 3,4. All isoforms are expressed on monocytes, the B isoforms are present on B lymphocytes, and the A and C isoforms are found on neutrophils.
Structure
CD32 contains two extracellular C2-set IgSF domains. There are six isoforms of CD32 derived from three genes. The C isoform is a hybrid between the A and B isoforms; it is identical to the B isoforms at the N-terminal (extracellular domain) but identical to the A isoform at the C-terminal (transmembrane and cytoplasmic segments). Three gene products are derived from the B gene. An exon is omitted in each of the B2 and B3 transcripts, leading to the B2 isoform with 19 residues deleted from the cytoplasmic domain and the B3 isoform having a shortened leader sequence 2-5. A soluble variant of the A isoform, generated by alternative splicing of the mRNA, has also been described (sequence not shown) 3,6.

Ligands and associated molecules
CD32 is a low-affinity Fcγ receptor and only binds polymeric or aggregated IgG 3. The isotype preference for CD32 is IgG3 > IgG1 > IgG2 = IgG4 3,4. The A isoform has been shown to associate with CD11b/CD18 but not CD11c/CD18 7. The cytoplasmic portion of the B1 isoform associates with the tyrosine phosphatase SHP-1 and the inositol-5'-phosphatase (SHIP) 14-16 via a phosphorylated IxYxxL motif 11-16.

Function
Occupation of CD32 can trigger IgG-mediated phagocytosis and an oxidative burst in neutrophils and monocytes, possibly in coordination with the ligation of other receptors such as CD26 and CD11b/CD18 4,8-10. Co-ligation of membrane Ig with the B1 isoform of CD32 inhibits signalling through membrane Ig 11-16. As a result there is a dampening of the B cell response to antigen for which an IgG already exists. Co-ligation of CD32 B1 with membrane Ig leads to phosphorylation of Tyr247 in the IxYxxL motif, which then binds SHP-1 and SHIP, leading to the inhibition of inositol phosphate production and Ca2+ entry into the cell 11-14. CD32 expression on placental epithelia may indicate a role in transport of IgG 17.
Database accession numbers
Entries: Human A, Human B1, Human B2, Human B3, Human C, Mouse b1, Mouse b2
PIR: JL0118, S02297, JL0119, A43543, S06946, S29361, B40071, A93384
SWISSPROT: P12318, P31994, P31995, P08101, P08102
EMBL/GENBANK: M31932, Y00644, M31935, M31934, X17653, X52473, M31933, X17652, M16367, M17515, M14216, X04648
REFERENCE: 5; 18; 5; 5; 17; 19; 5; 17; 20; 21; 20,22; 23
Amino acid sequences of human CD32 A
M AMETQMSQNV
CPRNLWLLQP LTVLLLLASA
DSQAA
-1
C MGILSFLPVL ATESDWADCK B1 . . . . . . . . . . . . . . . . . B2 B3
SPQPWGHMLL WTAVLFLAPV
AGTPA
-i -I -i -I
A
....... -
APPKAVLKLE
PPWINVLQED
SVTLTCQGAR
SPESDSIQWF HNGNLIPTHT
50
C APPKAVLKLE B1 B2 ...... B3
PQWINVLQED
SVTLTCRGTH
SPESDSIQWF HNGNLIPTHT
50 50 50 50
A
QPSYRFKANN
NDSGEYTCQT
GQTSLSDPVH
LTVLSEWLVL
QTPHLEFQEG
i00
C QPSYRFKANN B1
NDSGEYTCQT
GQTSLSDPVH
LTVLSEWLVL
QTPHLEFQEG
i00 i00 i00 i00
B2
. . . . . . . . . . . . . . . . . . . . . . . . . . . .
A
ETIMLRCHSW
KDKPLVKVTF
FQNGKSQKFS RLDPTFSIPQ
ANHSHSGDYH
150
C ETIVLRCHSW B1 B2 B3
KDKPLVKVTF
FQNGKSKKFS RSDPNFSIPQ
ANHSHSGDYH
150 150 150 150
A
CTGNIGYTLF
SSKPVTITVQ VPSMGSSSPM
GIIVAVVIAT
AVAAIVAAVV
200
C CTGNIGYTLY B1 B2 B3
SSKPVTITVQ AP...SSSPM ... ... ...
GIIVAVVTGI
AVAAIVAAVV
197 197 197 197
A
ALIYCRKKRI
S
211
C ALIYCRKKRI B1 B2 B3
S
208 208 208 208
B3
A C
................... ...................
A NSTDPVKAAQ
FEPPGRQMIA
IRKRQLEETN -P ....
242 239
B 1 ALPGYPECRE MGETLPEKPA B2 . . . . . . . . . . . . . . . . . . . B3
NPTNPDEADK
VGAENTITYS LLMHPDALEE
258 239 258
A
NDYETADGGY
DDKNIYLTLP
PNDHVNSNN
C
.....
281 278
B1 P D D Q N R I B2 B3
MTLNPRAPTD
265 246 265
The C isoform is a hybrid of the A and B isoforms, and is listed between them. The B and C sequences are grouped until C residue 208, and are marked with an asterisk when different from the A isoform. A and C are grouped from C residue 209. Residues identical to the top sequence within the group are marked with dashes. The dotted stretches denote gaps. These result from exon omissions except for three additional residues (173-175) in isoform A. The IxYxxL motif is in bold.
References
1 Qiu, W.Q. et al. (1990) Science 248, 732-735.
2 Warmerdam, P.A.M. et al. (1993) J. Biol. Chem. 268, 7346-7349.
3 van de Winkel, J.G.J. and Capel, P.J.A. (1993) Immunol. Today 14, 215-221.
4 Ravetch, J.V. and Kinet, J.P. (1991) Annu. Rev. Immunol. 9, 457-492.
5 Brooks, D.G. et al. (1989) J. Exp. Med. 170, 1369-1385.
6 Warmerdam, P.A.M. et al. (1990) J. Exp. Med. 172, 19-25.
7 Annenkov, A. (1996) Eur. J. Immunol. 26, 207-212.
8 Huizinga, T.W.J. et al. (1989) J. Immunol. 142, 2365-2369.
9 Brown, E.J. (1991) Curr. Opin. Immunol. 3, 76-82.
10 Zhou, M.J. and Brown, E.J. (1994) J. Cell Biol. 125, 1407-1416.
11 Muta, T. et al. (1994) Nature 368, 70-73.
12 Choquet, D. et al. (1993) J. Cell Biol. 121, 355-363.
13 D'Ambrosio, D. et al. (1995) Science 268, 293-297.
14 Bijsterbosch, M.K. and Klaus, G.G. (1985) J. Exp. Med. 162, 1825-1836.
15 Doody, G.M. et al. (1996) Curr. Opin. Immunol. 8, 378-382.
16 Scharenberg, A.M. and Kinet, J.-P. (1996) Cell 87, 961-964.
17 Stuart, S.G. et al. (1989) EMBO J. 8, 3657-3666.
18 Stuart, S.G. (1987) J. Exp. Med. 166, 1668-1684.
19 Engelhart, R. et al. (1990) Eur. J. Immunol. 20, 1367-1377.
20 Ravetch, J.V. et al. (1986) Science 234, 718-725.
21 Hogarth, P.M. et al. (1987) Immunogenetics 26, 161-168.
22 Hogarth, P.M. et al. (1991) J. Immunol. 146, 369-376.
23 Lewis, V.A. et al. (1986) Nature 324, 372-375.
CD33

Molecular weights
Polypeptide         37 908
SDS-PAGE
  reduced           67 kDa

Carbohydrate
  N-linked sites    5
  O-linked          nil

Human gene location and size
19q13.3; <35 kb 1

[Figure: domain diagram of CD33]
Tissue distribution
CD33 is absent from pluripotential stem cells but appears on myelomonocytic precursors after CD34 2. It then continues to be expressed in both the myeloid and monocyte lineages, although it is absent on granulocytes 3. CD33 is an important marker for distinguishing myeloid from lymphoid leukaemias 3.

Structure
CD33 is the smallest member of a structurally related group of IgSF domain-containing sialic acid-binding proteins called the sialoadhesin family, which includes sialoadhesin, CD22, and myelin-associated glycoprotein (MAG) 4. The genes for CD22, CD33 and MAG lie in the region 19q13.1-3 1,4. Like other members of the sialoadhesin family, CD33 is predicted to have an unusual disulfide bond between β strands B and E in domain 1 and a disulfide bond between domains 1 and 2 4. Two CD33 cDNA clones have been isolated in the mouse, encoding CD33 isoforms with different cytoplasmic domains 5.

Ligands and associated molecules
Like sialoadhesin, CD33 binds to the sialoglycoconjugates NeuAcα2→3Galβ1→3(4)GlcNAc and NeuAcα2→3Galβ1→3GalNAc on glycoproteins 3.

Function
May mediate cell-cell adhesion. Cells expressing CD33 require desialylation before they can bind cells bearing the appropriate sialoglycoconjugate 3, suggesting that inhibitory cis-interactions may regulate or block any adhesion function 3.
Database accession numbers
         PIR      SWISSPROT   EMBL/GENBANK    REFERENCE
Human    A30521   P20138      M23197          6
Mouse                         S71345/S71403   5
Amino acid sequence of human CD33
MPLLLLLPLL WAGALAM                                       -1
DPNFWLQVQE SVTVQEGLCV LVPCTFFHPI PYYDKNSPVH GYWFREGAII   50
SGDSPVATNK LDQEVQEETQ GRFRLLGDPS RNNCSLSIVD ARRRDNGSYF   100
FRMERGSTKY SYKSPQLSVH VTDLTHRPKI LIPGTLEPGH SKNLTCSVSW   150
ACEQGTPPIF SWLSAAPTSL GPRTTHSSVL IITPRPQDHG TNLTCQVKFA   200
GAGVTTERTI QLNVTYVPQN PTTGIFPGDG SGKQETRAGV VHGAIGGAGV   250
TALLALCLCL IFFIVKTHRR KAARTAVGRN DTHPTTGSAS PKHQKKSKLH   300
GPTETSSCSG AAPTVEMDEE LHYASLNFHG MNPSKDTSTE YSEVRTQ      347

References
1 Peiper, S.C. et al. (1988) Blood 72, 314-321.
2 Andrews, R.G. et al. (1989) J. Exp. Med. 169, 1721-1731.
3 Freeman, S.D. et al. (1995) Blood 85, 2005-2012.
4 Crocker, P.R. et al. (1996) Biochem. Soc. Trans. 24, 150-156.
5 Tchilian, E.Z. et al. (1994) Blood 83, 3188-3198.
6 Simmons, D. and Seed, B. (1988) J. Immunol. 141, 2797-2800.
CD34
Sgp90

Molecular weights
Polypeptide         37 319
SDS-PAGE
  reduced           90-120 kDa
  unreduced         90-120 kDa

Carbohydrate
  N-linked sites    9
  O-linked          +++

Human gene location and size
1q32; ~26 kb 1

[Figure: domain diagram of CD34 with exon boundaries]
Tissue distribution
CD34 is expressed on a small subpopulation (1-4%) of bone marrow cells which includes haematopoietic stem cells (HSC). It is also present on bone marrow stromal cells and on most endothelial cells 1,2. Although a marker for primitive HSC in humans, CD34 has been reported to be absent from these cells in the mouse 3,4.

Structure
CD34 is a highly glycosylated type I transmembrane glycoprotein. The first 130 amino acids are predicted to be heavily O-glycosylated and have a mucin-like structure 2. This is followed by a sequence of about 100 amino acids that can be expected to have a globular structure, based on the presence of six Cys residues 2. The cytoplasmic domain contains two sites for protein kinase C phosphorylation (and is phosphorylated following PKC activation 5) and one site for tyrosine phosphorylation 2. A splice variant of CD34 encodes a truncated protein lacking most of the cytoplasmic domain 6,7.

Ligands and associated molecules
CD62L (L-selectin) binds to sialoglycoconjugates present on a subpopulation of CD34 glycoforms; this binding requires sulfation and probably fucosylation of CD34 8,9. CD62E (E-selectin) can also bind CD34 10.
Function
The ability of CD34 to bind the selectins CD62L and CD62E and its expression on endothelium suggest a role in leucocyte-endothelial interactions 11. In support of this, CD34 is the major CD62L ligand in tonsillar high-endothelial venules and can mediate attachment and rolling of leucocytes in vitro 10. Curiously, CD34-deficient mice exhibit no detectable abnormality in neutrophil or lymphocyte trafficking but show decreased eosinophil accumulation in the lung following inhalation of allergen 12. Only minor haematopoietic defects have been reported in CD34-deficient mice 12,13.

Database accession numbers
         PIR      SWISSPROT   EMBL/GENBANK   REFERENCE
Human    A38078   P28906      M81104         2
Mouse                         S69293         14
Amino acid sequence of human CD34
MPRGWTALCL LSLLPSGFM                                     -1
SLDNNGTATP ELPTQGTFSN VSTNVSYQET TTPSTLGSTS LHPVSQHGNE   50
ATTNITETTV KFTSTSVITS VYGNTNSSVQ SQTSVISTVF TTPANVSTPE   100
TTLKPSLSPG NVSDLSTTST SLATSPTKPY TSSSPILSDI KAEIKCSGIR   150
EVKLTQGICL EQNKTSSCAE FKKDRGEGLA RVLCGEEQAD ADAGAQVCSL   200
LLAQSEVRPQ CLLLVLANRT EISSKLQLMK KHQSDLKKLG ILDFTEQDVA   250
SHQSYSQKTL IALVTSGALL AVLGITGYFL MNRRSWSPTG ERLGEDPYYT   300
ENGGGQGYSS GPGTSPEAQG KASVNRGAQK NGTGQATSRN GHSARQHVVA   350
DTEL                                                     354

References
1 Satterthwaite, A.B. et al. (1992) Genomics 12, 788-794.
2 Simmons, D.L. et al. (1992) J. Immunol. 148, 267-271.
3 Andrews, R.G. et al. (1990) J. Exp. Med. 172, 355-358.
4 Osawa, M. et al. (1996) Science 273, 242-245.
5 Fackler, M.J. et al. (1990) J. Biol. Chem. 265, 11056-11061.
6 Nakamura, Y. et al. (1993) Exp. Hematol. 21, 236-242.
7 Fackler, M.J. et al. (1995) Blood 85, 3040-3047.
8 Baumhueter, S. et al. (1993) Science 262, 436-438.
9 Rosen, S.D. and Bertozzi, C.R. (1996) Curr. Biol. 6, 261-264.
10 Puri, K.D. et al. (1995) J. Cell. Biol. 131, 261-270.
11 Lasky, L.A. (1995) Annu. Rev. Biochem. 64, 113-139.
12 Suzuki, A. et al. (1996) Blood 87, 3550-3562.
13 Cheng, J. et al. (1996) Blood 87, 479-490.
14 Brown, J. et al. (1991) Int. Immunol. 3, 175-184.
CD35
Complement receptor type 1, CR1

Molecular weights
Polypeptide (A or F allotype)   219 638
SDS-PAGE
  reduced           250 kDa
  unreduced         190 kDa

Carbohydrate
  N-linked sites    20
  O-linked sites    unknown

Human gene location and size
1q32; 140 kb 1

[Figure: domain diagram of the CD35 A allotype, showing the CCP domains of LHR-A, LHR-B, LHR-C and LHR-D]
Tissue distribution
CD35 is found on erythrocytes, B cells, a subset of T cells, monocytes, neutrophils, eosinophils, follicular dendritic cells and kidney glomerular podocytes 1,2.

Structure
Four CD35 allotypes (A (or F), B (or S), C and D) have been identified in humans. The extracellular domain of the most common (82%) A allotype is composed of 30 complement control protein (CCP) domains. Sequence comparisons show that its first 28 CCP domains are arranged in four long homologous repeats (LHR-A, -B, -C, and -D), each of which contains seven CCP domains. The B allotype (18%) has five LHRs. The C and D allotypes are rare (<0.01%) and from protein size estimates are presumed to have three and six LHRs, respectively 1-5.

Ligands and associated molecules
CD35 binds the complement components C3b and C4b.

Function
CD35 mediates neutrophil and monocyte phagocytosis of particles coated with C3b and/or C4b. This interaction also enhances phagocytosis mediated by the Fc receptors. Expression of CD35 is important for clearance of complement-containing immune complexes by the liver and spleen 1,2. CD35 is a complex receptor with one binding site for C4b in LHR-A and two binding sites for C3b, one in LHR-B and one in LHR-C. In each case, the binding site is located in the first four CCP domains of the LHR 6,7. CD35 is also a regulator of both the classical and alternative complement pathways. It accelerates the dissociation of the C3-convertase complexes (also mediated by the decay accelerating factor CD55), and acts as a cofactor for the factor I-mediated cleavage of C3b and C4b (also mediated by CD46) 1,2.

Comments
The CD35 gene is located within the regulators of complement activation (RCA) gene complex on chromosome 1q32, which also contains the genes for CD21, CD46, CD55, factor H, and the two subunits of the C4 binding protein, all of which contain CCP domains 8. This region appears to have evolved rapidly in mammals because in many cases homologues of these genes have not been identified in other species. Furthermore, those homologues identified often differ in structure, expression, and function. For example, the most abundant form of CD35 on chimpanzee erythrocytes has only one LHR (equivalent to LHR-A) and binds both C3b and C4b 9-11. Mice lack a structural equivalent of human CD35 but instead have a single gene that codes for both CR1 and CR2 12-14. Mouse CR2 has 15 CCP domains and CR1 contains an additional six CCP domains at its N-terminus. These additional CCP domains confer C3b binding activity. A related mouse gene, Crry, encodes a membrane protein with five CCP domains. The sequences of the first four CCP domains of Crry are most similar to the first four CCP domains of human CD35. However, Crry can mediate both decay accelerating activity (like CD55) and cofactor activity (like CD46) for both the classical and alternative complement activation pathways 15. The CR1/Crry regulatory proteins of the rat are different yet again 16-18.

Database accession numbers
                PIR      SWISSPROT   EMBL/GENBANK     REFERENCE
Human           S03843   P17927      Y00816, X05309   3,4
Mouse (Crry)    A28507               M34164-74        12
Mouse (CR1)     A43519               M36470           13
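The CCP domain arithmetic in the CD35 Structure section (30 domains in the A allotype, 28 of them arranged as four LHRs of seven) can be written out as a small calculation. This is only a sketch: it assumes that the two CCP domains lying outside the LHRs are retained in every allotype, which the entry does not state, and the LHR numbers for the C and D allotypes are the presumed values quoted from protein size estimates. All names below are illustrative.

```python
# Each long homologous repeat (LHR) contains seven CCP domains.
CCPS_PER_LHR = 7

# The A allotype has 30 CCP domains, of which 28 sit in four LHRs;
# the remainder lie outside the LHRs.
A_ALLOTYPE_TOTAL_CCPS = 30
A_ALLOTYPE_LHRS = 4
NON_LHR_CCPS = A_ALLOTYPE_TOTAL_CCPS - A_ALLOTYPE_LHRS * CCPS_PER_LHR

# Number of LHRs per allotype (C and D are presumed values).
LHRS_PER_ALLOTYPE = {"A": 4, "B": 5, "C": 3, "D": 6}


def ccp_domain_count(n_lhr, non_lhr_ccps=NON_LHR_CCPS):
    """Predicted CCP domain total, ASSUMING the non-LHR domains are kept."""
    return n_lhr * CCPS_PER_LHR + non_lhr_ccps


predicted = {
    allotype: ccp_domain_count(n_lhr)
    for allotype, n_lhr in LHRS_PER_ALLOTYPE.items()
}

for allotype in sorted(predicted):
    print(f"{allotype} allotype: {predicted[allotype]} CCP domains (predicted)")
```

Only the A allotype figure is confirmed by the entry; the other totals are predictions that follow from the stated assumption.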
Amino acid sequence of human CD35
MGASSPRSPE PVGPPAPGLP FCCGGSLLAV VVLLALPVAW G            -1
QCNAPEWLPF ARPTNLTDEF EFPIGTYLNY ECRPGYSGRP FSIICLKNSV   50
WTGAKDRCRR KSCRNPPDPV NGMVHVIKGI QFGSQIKYSC TKGYRLIGSS   100
SATCIISGDT VIWDNETPIC DRIPCGLPPT ITNGDFISTN RENFHYGSVV   150
TYRCNPGSGG RKVFELVGEP SIYCTSNDDQ VGIWSGPAPQ CIIPNKCTPP   200
NVENGILVSD NRSLFSLNEV VEFRCQPGFV MKGPRRVKCQ ALNKWEPELP   250
SCSRVCQPPP DVLHAERTQR DKDNFSPGQE VFYSCEPGYD LRGAASMRCT   300
PQGDWSPAAP TCEVKSCDDF MGQLLNGRVL FPVNLQLGAK VDFVCDEGFQ   350
LKGSSASYCV LAGMESLWNS SVPVCEQIFC PSPPVIPNGR HTGKPLEVFP   400
FGKAVNYTCD PHPDRGTSFD LIGESTIRCT SDPQGNGVWS SPAPRCGILG   450
HCQAPDHFLF AKLKTQTNAS DFPIGTSLKY ECRPEYYGRP FSITCLDNLV   500
WSSPKDVCKR KSCKTPPDPV NGMVHVITDI QVGSRINYSC TTGHRLIGHS   550
SAECILSGNA AHWSTKPPIC QRIPCGLPPT IANGDFISTN RENFHYGSVV   600
TYRCNPGSGG RKVFELVGEP SIYCTSNDDQ VGIWSGPAPQ CIIPNKCTPP   650
NVENGILVSD NRSLFSLNEV VEFRCQPGFV MKGPRRVKCQ ALNKWEPELP   700
SCSRVCQPPP DVLHAERTQR DKDNFSPGQE VFYSCEPGYD LRGAASMRCT   750
PQGDWSPAAP TCEVKSCDDF MGQLLNGRVL FPVNLQLGAK VDFVCDEGFQ   800
LKGSSASYCV LAGMESLWNS SVPVCEQIFC PSPPVIPNGR HTGKPLEVFP   850
FGKAVNYTCD PHPDRGTSFD LIGESTIRCT SDPQGNGVWS SPAPRCGILG   900
HCQAPDHFLF AKLKTQTNAS DFPIGTSLKY ECRPEYYGRP FSITCLDNLV   950
WSSPKDVCKR KSCKTPPDPV NGMVHVITDI QVGSRINYSC TTGHRLIGHS   1000
SAECILSGNT AHWSTKPPIC QRIPCGLPPT IANGDFISTN RENFHYGSVV   1050
TYRCNLGSRG RKVFELVGEP SIYCTSNDDQ VGIWSGPAPQ CIIPNKCTPP   1100
NVENGILVSD NRSLFSLNEV VEFRCQPGFV MKGPRRVKCQ ALNKWEPELP   1150
SCSRVCQPPP EILHGEHTPS HQDNFSPGQE VFYSCEPGYD LRGAASLHCT   1200
PQGDWSPEAP RCAVKSCDDF LGQLPHGRVL FPLNLQLGAK VSFVCDEGFR   1250
LKGSSVSHCV LVGMRSLWNN SVPVCEHIFC PNPPAILNGR HTGTPSGDIP   1300
YGKEISYTCD PHPDRGMTFN LIGESTIRCT SDPHGNGVWS SPAPRCELSV   1350
RAGHCKTPEQ FPFASPTIPI NDFEFPVGTS LNYECRPGYF GKMFSISCLE   1400
NLVWSSVEDN CRRKSCGPPP EPFNGMVHIN TDTQFGSTVN YSCNEGFRLI   1450
GSPSTTCLVS GNNVTWDKKA PICEIISCEP PPTISNGDFY SNNRTSFHNG   1500
TVVTYQCHTG PDGEQLFELV GERSIYCTSK DDQVGVWSSP PPRCISTNKC   1550
TAPEVENAIR VPGNRSFFSL TEIIRFRCQP GFVMVGSHTV QCQTNGRWGP   1600
KLPHCSRVCQ PPPEILHGEH TLSHQDNFSP GQEVFYSCEP SYDLRGAASL   1650
HCTPQGDWSP EAPRCTVKSC DDFLGQLPHG RVLLPLNLQL GAKVSFVCDE   1700
GFRLKGRSAS HCVLAGMKAL WNSSVPVCEQ IFCPNPPAIL NGRHTGTPFG   1750
DIPYGKEISY ACDTHPDRGM TFNLIGESSI RCTSDPQGNG VWSSPAPRCE   1800
LSVPAACPHP PKIQNGHYIG GHVSLYLPGM TISYTCDPGY LLVGKGFIFC   1850
TDQGIWSQLD HYCKEVNCSF PLFMNGISKE LEMKKVYHYG DYVTLKCEDG   1900
YTLEGSPWSQ CQADDRWDPP LAKCTSRAHD ALIVGTLSGT IFFILLIIFL   1950
SWIILKHRKG NNAHENPKEV AIHLHSQGGS SVHPRTLQTN EENSRVLP     1998
References
1 Fearon, D.T. and Ahearn, J.M. (1989) Curr. Topics Microbiol. Immunol. 155, 83-98.
2 Ahearn, J.M. and Fearon, D.T. (1989) Adv. Immunol. 46, 183-219.
3 Klickstein, L.B. et al. (1987) J. Exp. Med. 165, 1095-1112.
4 Klickstein, L.B. et al. (1988) J. Exp. Med. 168, 1699-1717.
5 Wong, W.W. et al. (1989) J. Exp. Med. 169, 847-863.
6 Krych, M. et al. (1991) Proc. Natl Acad. Sci. USA 88, 4353-4357.
7 Kalli, D.R. et al. (1991) J. Exp. Med. 174, 1451-1460.
8 Hourcade, D. et al. (1992) Genomics 12, 289-300.
9 Birmingham, D.J. et al. (1994) J. Immunol. 153, 691-700.
10 Nickells, M.W. et al. (1995) J. Immunol. 154, 2829-2837.
11 Subramanian, V.B. et al. (1996) J. Immunol. 157, 1242-1247.
12 Paul, M.S. et al. (1990) J. Immunol. 144, 1988-1996.
13 Kurtz, C.B. et al. (1990) J. Immunol. 144, 3581-3591.
14 Molina, H. et al. (1992) J. Exp. Med. 175, 121-129.
15 Kim, Y.U. et al. (1995) J. Exp. Med. 151-159.
16 Funabashi, K. et al. (1994) Immunology 81, 444-451.
17 Takizawa, H. et al. (1994) J. Immunol. 152, 3032-3038.
18 Quigg, R.I. and Holers, V.M. (1995) J. Immunol. 155, 1481-1488.
CD36
Platelet glycoprotein IV, FAT (rat)

Molecular weights
Polypeptide         53 053
SDS-PAGE
  reduced           88 kDa
  unreduced         88 kDa

Carbohydrate
  N-linked sites    10
  O-linked          +

Human gene location and size
7q11.2; >32 kb 1

[Figure: topology diagram of CD36 with NH2 and COOH termini marked]
Tissue distribution
CD36 expression is restricted to platelets, monocytes, macrophages, erythrocyte precursors, adipocytes, activated keratinocytes, and some endothelial and epithelial cells (reviewed in ref. 2). In the mouse, CD36 is expressed in B cells 3.
Structure
CD36 is thought to have two transmembrane regions, short N- and C-terminal cytoplasmic tails, and a heavily glycosylated extracellular region 2,4. This model is controversial, since an extracellular location for the N-terminus has also been proposed 5. However, recent data have shown both the N- and C-termini to be palmitoylated, at cysteine residues 3, 7, 464 and 466, and the N-terminus to be membrane anchored 4. CD36 is a member of a superfamily that includes the receptor for high-density lipoprotein SR-BI (also known as CLA-1), the lysosomal protein LIMP II, and the Drosophila epithelial molecule Emp (see ref. 4). CD36 mRNA exhibits considerable size heterogeneity as a result of alternative splicing of 5'-untranslated and 3'-untranslated exons. In addition, skipping of coding exons 4 and 5 in an erythroleukaemia cell line generates a CD36 isoform of 57 kDa that lacks amino acids 41-143 in the extracellular region 6. CD36 has been proposed as a candidate gene through which the transcription factor Oct-2 could affect B cell differentiation in the mouse 3.
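As a small worked example of the exon-skipping statement above, the polypeptide length of the isoform lacking amino acids 41-143 can be derived from the 472-residue sequence length given for human CD36 in this entry. The sketch assumes the deletion is a single contiguous, inclusive residue range with nothing inserted in its place; the function and constant names are illustrative.

```python
# Length of the full human CD36 polypeptide as numbered in this entry.
FULL_LENGTH = 472

# Residue range reported as missing from the exon 4/5-skipped isoform (inclusive).
DELETED_FIRST = 41
DELETED_LAST = 143


def residues_removed(first, last):
    """Number of residues in an inclusive range first..last."""
    if last < first:
        raise ValueError("last residue must not precede first residue")
    return last - first + 1


def length_after_deletion(full_length, first, last):
    """Polypeptide length after removing one contiguous inclusive range."""
    return full_length - residues_removed(first, last)


removed = residues_removed(DELETED_FIRST, DELETED_LAST)
short_isoform_length = length_after_deletion(FULL_LENGTH, DELETED_FIRST, DELETED_LAST)
print(f"Residues removed: {removed}")
print(f"Predicted isoform length: {short_isoform_length}")
```

The inclusive count matters here: residues 41 through 143 span 103 positions, not 102.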
Ligands and associated molecules
A number of different ligands have been identified for CD36: the extracellular matrix components collagen 2 and thrombospondin 7; oxidized low-density lipoprotein 8; fatty acids 9; anionic phospholipids 10; and Plasmodium falciparum erythrocyte membrane protein 1 (PfEMP1) 11. CD36 associates non-covalently with the Src family protein tyrosine kinases Fyn, Lyn and Yes 2.

Function
CD36 is a multifunctional glycoprotein that has roles as a cell adhesion molecule, a scavenger receptor and a signal transducer, and in the pathogenesis of malaria. CD36 on macrophages functions in the phagocytic clearance of apoptotic cells, which is thought to involve an interaction of CD36 with thrombospondin 7. An interaction between CD36 and anionic phospholipids in the outer leaflet of the plasma membrane may contribute to the recognition of apoptotic cells by macrophages 10. CD36 is also a macrophage scavenger receptor for oxidized low-density lipoprotein, and is therefore predicted to play a part in atherogenesis 8. CD36 has a role in platelet function as a receptor for collagen 2. CD36 appears to transduce signals through its associated Src family protein tyrosine kinases. Following ligand binding, a signal transduction cascade is initiated which can activate platelets and lead to an oxidative burst in platelets and macrophages 2. On adipocytes, CD36 is involved in the binding and transport into the cell of long-chain fatty acids 9. CD36 has a direct role in Plasmodium falciparum malaria pathogenesis as an endothelial cell receptor for PfEMP1, a large antigenically variant malarial protein expressed on the surface of parasitized erythrocytes 11.
Database accession numbers
        PIR     SWISSPROT  EMBL/GENBANK  REFERENCE
Human   A30989  P16671     M24795        12
Rat             Q07969     L19658        13
Mouse           Q08857     L23108        14
Amino acid sequence of human CD36
MGCDRNCGLI AGAVIGAVLA VFGGILMPVG DLLIQKTIKK QVVLEEGTIA 50
FKNWVKTGTE VYRQFWIFDV QNPQEVMMNS SNIQVKQRGP YTYRVRFLAK 100
ENVTQDAEDN TVSFLQPNGA IFEPSLSVGT EADNFTVLNL AVAAASHIYQ 150
NQFVQMILNS LINKSKSSMF QVRTLRELLW GYRDPFLSLV PYPVTTTVGL 200
FYPYNNTADG VYKVFNGKDN ISKVAIIDTY KGKRNLSYWE SHCDMINGTD 250
AASFPPFVEK SQVLQFFSSD ICRSIYAVFE SDVNLKGIPV YRFVLPSKAF 300
ASPVENPDNY CFCTEKIISK NCTSYGVLDI SKCKEGRPVY ISLPHFLYAS 350
PDVSEPIDGL NPNEEEHRTY LDIEPITGFT LQFAKRLQVN LLVKPSEKIQ 400
VLKNLKRNYI VPILWLNETG TIGDEKANMF RSQVTGKINL LGLIEMILLS 450
VGVVMFVAFM ISYCACRSKT IK 472
References
1 Armesilla, A.L. and Vega, M.A. (1994) J. Biol. Chem. 269, 18985-18991.
2 Greenwalt, D.E. et al. (1992) Blood 80, 1105-1115.
3 König, H. et al. (1995) Genes Dev. 9, 1598-1607.
4 Tao, N. et al. (1996) J. Biol. Chem. 271, 22315-22320.
5 Pearce, S.F.A. et al. (1994) Blood 84, 384-389.
6 Tang, Y. et al. (1994) J. Biol. Chem. 269, 6011-6015.
7 Navazo, M.D.P. et al. (1996) J. Biol. Chem. 271, 15381-15385.
8 Nozaki, S. et al. (1995) J. Clin. Invest. 96, 1859-1865.
9 Ibrahimi, A. et al. (1996) Proc. Natl Acad. Sci. USA 93, 2646-2651.
10 Rigotti, A. et al. (1995) J. Biol. Chem. 270, 16221-16224.
11 Baruch, D.I. et al. (1996) Proc. Natl Acad. Sci. USA 93, 3497-3502.
12 Oquendo, P. et al. (1989) Cell 58, 95-101.
13 Abumrad, N.A. et al. (1993) J. Biol. Chem. 268, 17665-17668.
14 Endemann, G. et al. (1993) J. Biol. Chem. 268, 11811-11816.
CD37
Molecular weights
Polypeptide 31 703
SDS-PAGE reduced 40-52 kDa
Carbohydrate
N-linked sites 3
O-linked nil
Human gene location 19p13-q13.4
Tissue distribution CD37 was originally defined as an antigen of mature B lymphocytes. It is highly expressed on mature B cells, but not on pre-B cells or plasma cells. There is low expression on T cells, neutrophils, monocytes and some myelomonocytic leukaemia cells. CD37 is not expressed by NK cells, platelets or erythrocytes. Immunoelectron microscopy has shown that a large proportion of CD37 protein is present in intracellular vesicles 1.
Structure CD37 is a member of the TM4 superfamily and is predicted to have four transmembrane regions, short cytoplasmic N- and C-termini, and two extracellular regions (reviewed in ref. 2). CD37 is heavily glycosylated at three N-linked sites but there is no O-linked carbohydrate 1. Gene mapping data suggest that CD37 and CD53 have evolved by gene duplication and divergence from a common ancestral gene 3.
Ligands and associated molecules CD37 on B cells associates non-covalently with MHC Class II, CD19, CD21 and the TM4SF molecules CD53, CD81 and CD82 4. No extracellular ligand has been identified for CD37.
Function Unknown.
Database accession numbers
        PIR     SWISSPROT  EMBL/GENBANK   REFERENCE
Human   A47629  P11049     X14046         5,6
Rat     B47629  P31053     X53517         5,6
Mouse                      U18367-U18372  7
Amino acid sequence of human CD37
MSAQESCLSL IKYFLFVFNL FFFVLGSLIF CFGIWILIDK TSFVSFVGLA 50
FVPLQIWSKV LAISGIFTMG IALLGCVGAL KELRCLLGLY FGMLLLLFAT 100
QITLGILIST QRAQLERSLR DVVEKTIQKY GTNPEETAAE ESWDYVQFQL 150
RCCGWHYPQD WFQVLILRGN GSEAHRVPCS CYNLSATNDS TILDKVILPQ 200
LSRLGHLARS RHSADICAVP AESHIYREGC AQGLQKWLHN NLISIVGICL 250
GVGLLELGFM TLSIFLCRNL DHVYNRLARY R 281
References
1 Schwartz-Albiez, R. et al. (1988) J. Immunol. 140, 905-914.
2 Wright, M.D. and Tomlinson, M.G. (1994) Immunol. Today 15, 588-594.
3 Wright, M.D. et al. (1993) Int. Immunol. 5, 209-216.
4 Angelisova, P. et al. (1994) Immunogenetics 39, 249-256.
5 Classon, B.J. et al. (1989) J. Exp. Med. 169, 1497-1502.
6 Classon, B.J. et al. (1990) J. Exp. Med. 172, 1007.
7 Tomlinson, M.G. and Wright, M.D. (1996) Mol. Immunol. 33, 867-872.
CD38
T10
Other names
ADP-ribosyl cyclase
Cyclic ADP-ribose hydrolase
Molecular weights
Polypeptide 34 301
SDS-PAGE reduced 46 kDa
SDS-PAGE unreduced 46 kDa
Carbohydrate
N-linked sites 4
O-linked unknown
Human gene location 4p15
Tissue distribution CD38 is found on immature cells of the B and T cell lineages, but not on most mature resting peripheral lymphocytes. It is also present on thymocytes, pre-B cells, germinal centre B cells, mitogen-activated T cells, Ig-secreting plasma cells, monocytes, NK cells, erythroid and myeloid progenitors in the bone marrow and brain cells 1-4. In the mouse, CD38 has been used to subdivide TcR+CD4-CD8- thymocytes, with CD38+ cells biased towards Vβ8.2 expression and capable of producing large amounts of IL-4 following stimulation 5.
Structure CD38 is a type II membrane glycoprotein, with the transmembrane sequence near the N-terminus 1.
Ligands and associated molecules The Moon-1 mAb blocks CD38-mediated binding of several cell lines to human vein endothelial cells. Moon-1 recognizes a molecule (120 kDa) expressed by endothelium, monocytes, platelets, NK cells, T cells and B cells 6.
Function CD38 is a bifunctional enzyme that can synthesize cyclic ADP-ribose (cADPR) from nicotinamide adenine dinucleotide (NAD+) as well as hydrolyse cADPR to ADP-ribose 4. As such, CD38 might mediate ADP-ribosylation of an as yet uncharacterized physiologic target molecule 7. Antibodies to human and mouse CD38 have a wide range of biological effects, including the induction of B and T cell proliferation, protection of B cells from apoptosis, inhibition of B lymphopoiesis and enhancement of macrophage APC function 4. Signalling through CD38 results in Tyr phosphorylation and activation of the protein kinase Syk, the c-cbl proto-oncogene product and Bruton's tyrosine kinase 8-10.
Database accession numbers
        PIR     SWISSPROT  EMBL/GENBANK  REFERENCE
Human   A43521  P28907     M34461        1
Mouse   I49586             L11332        11
Rat     JC2410             D30795        12
Amino acid sequence of human CD38
MANCEFSPVS GDKPCCRLSR RAQLCLGVSI LVLILVVVLA VVVPRWRQTW 50
SGPGTTKRFP ETVLARCVKY TEIHPEMRHV DCQSVWDAFK GAFISKHPCN 100
ITEEDYQPLM KLGTQTVPCN KILLWSRIKD LAHQFTQVQR DMFTLEDTLL 150
GYLADDLTWC GEFNTSKINY QSCPDWRKDC SNNPVSVFWK TVSRRFAEAA 200
CDVVHVMLNG SRSKIFDKNS TFGSVEVHNL QPEKVQTLEA WVIHGGREDS 250
RDLCQDPTIK ELESIISKRN IQFSCKNIYR PDKFLQCVKN PEDSSCTSEI 300
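The "Polypeptide" figure given under Molecular weights in each entry is presumably the mass calculated from the amino acid sequence, before any glycosylation. This section does not say how the figure was derived, so the following is only a minimal sketch of one common approach. It sums standard average residue masses and adds one water for the free termini. The function name polypeptide_mass is an illustrative choice, and the sketch takes no position on whether a signal sequence or initiating Met should be included.

```python
# Average residue masses (Da) of the 20 standard amino acids, i.e. the mass of
# each residue inside a peptide chain (free amino acid minus one water).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528  # added once for the free N- and C-termini


def polypeptide_mass(sequence: str) -> float:
    """Return the average mass (Da) of an unmodified polypeptide.

    Whitespace is ignored, so the ten-residue blocks used in the sequence
    listings can be pasted in directly.
    """
    residues = "".join(sequence.split()).upper()
    if not residues:
        raise ValueError("empty sequence")
    unknown = sorted(set(residues) - set(RESIDUE_MASS))
    if unknown:
        raise ValueError(f"unrecognised residue codes: {unknown}")
    return sum(RESIDUE_MASS[r] for r in residues) + WATER


if __name__ == "__main__":
    # First two blocks of the human CD38 listing, used only as sample input.
    print(round(polypeptide_mass("MANCEFSPVS GDKPCCRLSR"), 1))
```

Any difference between a value computed this way and the tabulated figure may reflect which residues were counted (for example, removal of a leader sequence) rather than an error in either.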
References
1 Jackson, D.G. and Bell, J.I. (1990) J. Immunol. 144, 2811-2815.
2 Alessio, M. et al. (1990) J. Immunol. 145, 878-884.
3 Mizuguchi, M. et al. (1995) Brain Res. 697, 235-240.
4 Lund, F. et al. (1995) Immunol. Today 16, 469-473.
5 Bean, A.G. et al. (1995) Int. Immunol. 7, 213-221.
6 Deaglio, S. et al. (1996) J. Immunol. 156, 727-734.
7 Grimaldi, J.C. et al. (1995) J. Immunol. 155, 811-817.
8 Silvennoinen, O. et al. (1996) J. Immunol. 156, 100-107.
9 Kontani, K. et al. (1996) J. Biol. Chem. 271, 1534-1537.
10 Kikuchi, Y. et al. (1995) Proc. Natl Acad. Sci. USA 92, 11814-11818.
11 Harada, N. et al. (1993) J. Immunol. 151, 3111-3118.
12 Li, Q. et al. (1994) Biochem. Biophys. Res. Commun. 202, 629-636.
CD39
Molecular weights
Polypeptide 57 965
SDS-PAGE reduced 78 kDa
Carbohydrate
N-linked sites 4
O-linked sites unknown
Human gene location 10q23.1-24.1
Tissue distribution CD39 was originally identified on Epstein-Barr virus-transformed B cells 1 and subsequently shown to be present on activated B and NK cells, and subsets of activated T cells. It is not expressed by resting lymphoid cells. CD39 expression in lymphoid tissues is limited to the mantle zone and paracortical lymphocytes, macrophages, dendritic cells and Langerhans cells and is absent from germinal centres 2,3.
Structure The derived amino acid sequence contains three hydrophobic segments. However, the N-terminal segment may be an unusual signal sequence and the central segment may not be membrane-spanning since it is short and contains a Gln and an Asp residue 4. There is experimental evidence that the C-terminal segment traverses the membrane and is orientated as shown in the figure. However, the topology of the remainder of the molecule is not known and the figure shown is a model proposed in ref. 4.
Function CD39 can mediate B cell homotypic adhesion 2. The primary sequence of CD39 contains four segments that are characteristic of ecto-apyrase from plants and yeast 5, and transfection of CD39 cDNA into COS cells induces ecto-apyrase activity 6. ATP, which leaks from damaged cells (e.g. lysed target cells), could be toxic to the lymphocytes. Molecules such as CD39 may protect activated lymphocytes through hydrolysis of extracellular ATP 6.
Database accession numbers
        PIR     SWISSPROT  EMBL/GENBANK  REFERENCE
Human   I56242  P49961     S73813        4
Amino acid sequence of human CD39
MEDTKESNVK TFCSKNILAI LGFSSIIAVI ALLAVGLTQN KALPENVKYG 50
IVLDAGSSHT SLYIYKWPAE KENDTGVVHQ VEECRVKGPG ISKFVQKVNE 100
IGIYLTDCME RAREVIPRSQ HQETPVYLGA TAGMRLLRME SEELADRVLD 150
VVERSLSNYP FDFQGARIIT GQEEGAYGWI TINYLLGKFS QKTRWFSIVP 200
YETNNQETFG ALDLGGASTQ VTFVPQNQTI ESPDNALQFR LYGKDYNVYT 250
HSFLCYGKDQ ALWQKLAKDI QVASNEILRD PCFHPGYKKV VNVSDLYKTP 300
CTKRFEMTLP FQQFEIQGIG NYQQCHQSIL ELFNTSYCPY SQCAFNGIFL 350
PPLQGDFGAF SAFYFVMKFL NLTSEKVSQE KVTEMMKKFC AQPWEEIKTS 400
YAGVKEKYLS EYCFSGTYIL SLLLQGYHFT ADSWEHIHFI GKIQGSDAGW 450
TLGYMLNLTN MIPAEQPLST PLSHSTYVFL MVLFSLVLFT VAIIGLLIFH 500
KPSYFWKDMV 510
References
1 Rowe, M. et al. (1982) Int. J. Cancer 29, 373-381.
2 Kansas, G.S. et al. (1991) J. Immunol. 146, 2235-2244.
3 Goutefangeas, C. et al. (1992) Eur. J. Immunol. 22, 2681-2685.
4 Maliszewski, C.R. et al. (1994) J. Immunol. 153, 3574-3583.
5 Handa, M. and Guidotti, G. (1996) Biochem. Biophys. Res. Commun. 218, 916-923.
6 Wang, T.F. and Guidotti, G. (1996) J. Biol. Chem. 271, 9898-9901.
CD40
Molecular weights
Polypeptide 26 989
SDS-PAGE reduced 48 kDa
SDS-PAGE unreduced 48 kDa
Carbohydrate
N-linked sites 2
O-linked nil
Human gene location 20q12-q13.2 1,2
Domains [Figure not reproduced: domain organization of CD40 with exon boundaries]
Tissue distribution CD40 is found on all mature B lymphocytes but is absent from plasma cells 1,2. It is also present on some epithelium including thymus and on some endothelial cells 1,2. It is expressed on lymphoid interdigitating and follicular dendritic cells and activated monocytes 1,2.
Structure CD40 antigen is a member of the TNFR superfamily and contains four cysteine-rich repeats 1.
Ligands and associated molecules CD40 binds to CD154, a type II membrane protein of the TNF superfamily. The cytoplasmic domain of CD40 binds to CRAF-1, a member of the TRAF (TNFR-associated factor) family. Binding is dependent on Thr234, which is an essential residue for signal transduction via CD40 1. A novel 23 kDa protein co-precipitates specifically with CD40 3.
Function CD154 binding to CD40 on B cells is required for secondary immune responses and germinal centre formation 1,2,4. Mutations in CD154 which abolish binding to CD40 cause the immunodeficiency disease hyper-IgM syndrome, which is characterized by lack of isotype switching in Ig production and lack of germinal centres 1,2,4. There is evidence for a role for the CD154-CD40 interaction in negative selection and peripheral tolerance 1,2,4. Mice deficient in CD40 or CD154 have increased susceptibility to parasite infection 5.
Database accession numbers
        PIR     SWISSPROT  EMBL/GENBANK  REFERENCE
Human   S04460  P25942     X60592        6
Mouse   A46476  P27512     M83312        7
Amino acid sequence of human CD40
MVRLPLQCVL WGCLLTAVHP -1
EPPTACREKQ YLINSQCCSL CQPGQKLVSD CTEFTETECL PCGESEFLDT 50
WNRETHCHQH KYCDPNLGLR VQQKGTSETD TICTCEEGWH CTSEACESCV 100
LHRSCSPGFG VKQIATGVSD TICEPCPVGF FSNVSSAFEK CHPWTSCETK 150
DLVVQQAGTN KTDVVCGPQD RLRALVVIPI IFGILFAILL VLVFIKKVAK 200
KPTNKAPHPK QEPQEINFPD DLPGSNTAAP VQETLHGCQP VTQED 245
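The "N-linked sites" count in each entry refers to potential glycosylation sites, which are conventionally recognized as the sequon Asn-X-Ser/Thr with X being any residue except Pro. The sketch below scans the mature CD40 sequence of this entry (residues 1-245, numbered after the leader that ends at the -1 marker) for that pattern. The function name find_sequons is an illustrative choice, and a match shows only that a site is possible, not that it is used.

```python
import re

# Mature human CD40 sequence as listed in this entry (residues 1-245),
# written in the same ten-residue blocks and joined into one string.
CD40_MATURE = "".join("""
EPPTACREKQ YLINSQCCSL CQPGQKLVSD CTEFTETECL PCGESEFLDT
WNRETHCHQH KYCDPNLGLR VQQKGTSETD TICTCEEGWH CTSEACESCV
LHRSCSPGFG VKQIATGVSD TICEPCPVGF FSNVSSAFEK CHPWTSCETK
DLVVQQAGTN KTDVVCGPQD RLRALVVIPI IFGILFAILL VLVFIKKVAK
KPTNKAPHPK QEPQEINFPD DLPGSNTAAP VQETLHGCQP VTQED
""".split())

# Asn, then any residue except Pro, then Ser or Thr. The lookahead keeps
# the match one residue wide so that overlapping sequons are all reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(sequence: str) -> list[int]:
    """Return 1-based positions of Asn residues in N-X-S/T sequons."""
    return [match.start() + 1 for match in SEQUON.finditer(sequence)]


if __name__ == "__main__":
    # Reports Asn133 (NVS) and Asn160 (NKT), matching the two sites tabulated.
    print(find_sequons(CD40_MATURE))
```

A sequence-only scan of this kind cannot distinguish extracellular from cytoplasmic positions, so for membrane proteins the hits still need to be checked against the predicted topology.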
References
1 van Kooten, C. and Banchereau, J. (1996) Adv. Immunol. 61, 1-77.
2 Gruss, H.-J. and Dower, S.K. (1995) Blood 85, 3378-3404.
3 Morio, T. et al. (1995) Proc. Natl Acad. Sci. USA 92, 11633-11636.
4 Foy, T.M. et al. (1996) Annu. Rev. Immunol. 14, 591-617.
5 Noelle, R.J. (1996) Immunity 4, 415-419.
6 Stamenkovic, I. et al. (1989) EMBO J. 8, 1403-1410.
7 Torres, R.M. and Clark, E.A. (1992) J. Immunol. 148, 620-626.
CD41
GPIIb of the GPIIb/IIIa complex, integrin αIIb subunit
Molecular weights
Polypeptide 110 005
SDS-PAGE reduced 125 + 22 kDa
SDS-PAGE unreduced 140 kDa
Carbohydrate
N-linked sites 5
O-linked sites unknown
Human gene location and size 17q21.32; 17 kb 1
Tissue distribution CD41 is expressed on platelets and megakaryocytes 1,2.
Structure CD41 is the integrin αIIb subunit that is expressed as a heterodimer non-covalently associated with the integrin β3 subunit CD61 1,2. The association between CD41 and CD61 is calcium dependent 3,4. CD41 is post-translationally cleaved into large (125 kDa) N-terminal and small (22 kDa) fragments which are disulfide-linked 1,2. The transmembrane sequence is in the smaller fragment. The N-terminal sequences of both fragments have been determined 5,6. Whereas the N-terminus of the small fragment is usually at residue Gln891, a fragment with the N-terminus at Leu902 has also been detected. Alternatively spliced mRNA which lacks sequence from an exon encoding the 34 residues Arg917-Gln950 has been reported in megakaryocytes 7. The significance of this alternatively spliced mRNA is unclear since this deletion interferes with surface expression of the protein 8. Intramolecular disulfide bonds are formed between neighbouring cysteines along the primary sequence, beginning at Cys56 9. Electron microscopy of purified CD41/CD61 has provided a structural model for all integrins 10.
Ligands and associated proteins The ligands for CD41/CD61 include fibrinogen, von Willebrand factor, fibronectin, vitronectin, and thrombospondin 11,12. Binding to these ligands depends on the activation state of the platelets. Unstimulated platelets only bind immobilized fibrinogen. Binding to soluble fibrinogen and other ligands requires stimulation of platelets by agonists such as thrombin, ADP and collagen 11,13. Binding usually involves an RGD motif on ligands 11-13, but CD41/CD61 also binds to the HHLGGAKQAGDV sequence in the γ chain of fibrinogen 14. CD41/CD61 binding to multiple sites on fibrinogen is important for platelet adhesion 15.
Function CD41/CD61 is the major integrin on platelets and is important for platelet adhesion and aggregation 12. Three missense mutations (G242D, R327H and G418D, numbered as in the mature protein) have been identified which result in the disease Glanzmann thrombasthenia 16. Glanzmann thrombasthenia can also be caused by defects in the integrin β3 subunit (see CD61). Mutations of CD41 at Ile843 give rise to the HPA-3 alloantigenic polymorphism 17. The HPA-3B allele (Ser843) forms an additional O-glycosylation site 17, and is associated with neonatal alloimmune thrombocytopenia 18.
Database accession numbers Human
PIR A29522
SWISSPROT P08514
EMBL/GENBANK J02764
REFERENCE 19
Amino acid sequence of human CD41
MARALCPLQA LWLLEWVLLL LGPCAAPPAW A -1
LNLDPVQLTF YAGPNGSQFG FSLDFHKDSH GRVAIVVGAP RTLGPSQEET 50
GGVFLCPWRA EGGQCPSLLF DLRDETRNVG SQTLQTFKAR QGLGASVVSW 100
SDVIVACAPW QHWNVLEKTE EAEKTPVGSC FLAQPESGRR AEYSPCRGNT 150
LSRIYVENDF SWDKRYCEAG FSSVVTQAGE LVLGAPGGYY FLGLLAQAPV 200
ADIFSSYRPG ILLWHVSSQS LSFDSSNPEY FDGYWGYSVA VGEFDGDLNT 250
TEYVVGAPTW SWTLGAVEIL DSYYQRLHRL RAEQMASYFG HSVAVTDVNG 300
DGRHDLLVGA PLYMESRADR KLAEVGRVYL FLQPRGPHAL GAPSLLLTGT 350
QLYGRFGSAI APLGDLDRDG YNDIAVAAPY GGPSGRGQVL VFLGQSEGLR 400
SRPSQVLDSP FPTGSAFGFS LRGAVDIDDN GYPDLIVGAY GANQVAVYRA 450
QPVVKASVQL LVQDSLNPAV KSCVLPQTKT PVSCFNIQMC VGATGHNIPQ 500
KLSLNAELQL DRQKPRQGRR VLLLGSQQAG TTLNLDLGGK HSPICHTTMA 550
FLRDEADFRD KLSPIVLSLN VSLPPTEAGM APAVVLHGDT HVQEQTRIVL 600
DSGEDDVCVP QLQLTASVTG SPLLVGADNV LELQMDAANE GEGAYEAELA 650
VHLPQGAHYM RALSNVEGFE RLICNQKKEN ETRVVLCELG NPMKKNAQIG 700
IAMLVSVGNL EEAGESVSFQ LQIRSKNSQN PNSKIVLLDV PVRAEAQVEL 750
RGNSFPASLV VAAEEGEREQ NSLDSWGPKV EHTYELHNNG PGTVNGLHLS 800
IHLPGQSQPS DLLYILDIQP QGGLQCFPQP PVNPLKVDWG LPIPSPSPIH 850
PAHHKRDRRQ IFLPEPEQPS RLQDPVLVSC DSAPCTVVQC DLQEMARGQR 900
AMVTVLAFLW LPSLYQRPLD QFVLQSHAWF NVSSLPYAVP PLSLPRGEAQ 950
VWTQLLRALE ERAIPIWWVL VGVLGGLLLL TILVLAMWKV GFFKRNRPPL 1000
EEDDEEGE 1008
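The Glanzmann thrombasthenia mutations listed under Function are numbered from the first residue of the mature protein, whereas sequence databases usually number from the initiating Met. The sketch below converts between the two conventions by adding the length of the leader sequence. The leader length of 31 is an assumption read from the layout of the listing above (the residues printed before the -1 marker), and the function name is illustrative.

```python
import re

# One-letter substitution notation: wild-type residue, position, new residue.
SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")

# Assumed leader length for CD41: the 31 residues that precede the -1
# marker in the sequence listing of this entry.
CD41_LEADER_LENGTH = 31


def mature_to_precursor(substitution: str, leader_length: int) -> str:
    """Renumber a substitution from mature-protein to precursor numbering."""
    match = SUBSTITUTION.match(substitution)
    if match is None:
        raise ValueError(f"not a single-residue substitution: {substitution!r}")
    if leader_length < 0:
        raise ValueError("leader_length must not be negative")
    wild_type, position, variant = match.groups()
    return f"{wild_type}{int(position) + leader_length}{variant}"


if __name__ == "__main__":
    for name in ("G242D", "R327H", "G418D"):
        print(name, "->", mature_to_precursor(name, CD41_LEADER_LENGTH))
```

Before relying on a converted position, it is worth checking that the residue at that position in the sequence being used matches the wild-type letter, since numbering conventions differ between sources.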
References
1 Kieffer, N. and Phillips, D.R. (1990) Annu. Rev. Cell Biol. 6, 334-357.
2 Pigott, R. and Power, C. (1993) The Adhesion Molecule FactsBook. Academic Press, London.
3 Jennings, L.K. and Phillips, D.R. (1982) J. Biol. Chem. 257, 10458-10466.
4 Weisel, J.W. et al. (1992) J. Biol. Chem. 267, 16637-16643.
5 Charo, I.F. et al. (1986) Proc. Natl Acad. Sci. USA 83, 8351-8355.
6 Loftus, J.C. et al. (1988) J. Biol. Chem. 263, 11025-11028.
7 Bray, P.F. et al. (1990) J. Biol. Chem. 265, 9587-9590.
8 Kolodziej, M.A. et al. (1992) Blood 78, 2344-2353.
9 Calvette, J.J. et al. (1989) Biochem. J. 261, 561-568.
10 Carrell, N.A. et al. (1985) J. Biol. Chem. 260, 1743-1749.
11 Phillips, D.R. et al. (1991) Cell 65, 359-362.
12 Du, X. et al. (1991) Cell 65, 409-416.
13 Ginsberg, M.H. et al. (1993) Thromb. Haemost. 70, 87-93.
14 D'Souza, S.E. et al. (1991) Nature 350, 66-68.
15 Savage, B. et al. (1995) J. Biol. Chem. 270, 28812-28817.
16 Bray, P.F. (1994) Thromb. Haemost. 72, 492-502.
17 Calvette, J.J. and Muniz-Diaz, E. (1993) FEBS Lett. 328, 30-34.
18 Lymen, S. et al. (1990) Blood 75, 2343-2348.
19 Poncz, M. et al. (1987) J. Biol. Chem. 262, 8476-8482.
CD42a, b
GPIX (CD42a), GPIb (CD42b)
Molecular weights
Polypeptides
CD42a 17 259
CD42bα 67 193
CD42bβ 19 320
SDS-PAGE unreduced
CD42a 22 kDa
CD42b 160-170 kDa
SDS-PAGE reduced
CD42a 17-22 kDa
CD42bα 135-145 kDa
CD42bβ 22-25 kDa
Carbohydrate
N-linked sites CD42a 1; CD42bα 4; CD42bβ 1
O-linked CD42bα + abundant
Human gene location and size CD42bα: 17pter-p12; ~3.4 kb 1
Domains [Figure not reproduced: domain organization of CD42a, CD42bα and CD42bβ showing the LRR, transmembrane and cytoplasmic regions]
Tissue distribution Restricted to platelets and megakaryocytes. Red blood cells, granulocytes, T cells and thymocytes do not express CD42a, b. The CD42a, b complex is a major component of the platelet surface (25 000 copies per platelet) 2.
Structure Each polypeptide starts with a region of about 30 amino acids that shares sequence similarities among the chains. This is followed by a region of Leu-rich repeats (LRR) 3. There is one in each of the CD42a and CD42bβ chains and seven in the CD42bα chain. Following the LRRs is another region of about 60 amino acids in each chain that shows sequence similarities and is likely to have a globular structure as deduced from the presence of four conserved Cys residues. For CD42a and CD42bβ the transmembrane sequence then follows, whereas in CD42bα a region of about 150 amino acids containing a high level of Ser, Thr and Pro residues precedes the transmembrane sequence. This is likely to be heavily O-glycosylated and can be deduced to be about 35 nm in length as revealed by electron micrographs 4. The diagram shown is based on the electron micrographs of the CD42 complex, which show two globular regions 9 nm and 16 nm in diameter separated by a narrow stalk 35 nm in length 4. The small globular region is presumed to consist of the region of CD42bα that shows sequence similarity to the CD42a and CD42bβ chains. CD42bα and CD42bβ are disulfide linked 2,3. CD42a, CD42bα and CD42bβ together with GPV form a multimeric complex at the cell surface 2,3. CD42a and GPV, another LRR-containing polypeptide, associate with the complex non-covalently 2,3. The protein sequence of CD42bα is encoded by a single exon 1.
Ligands and associated molecules CD42bα binds to von Willebrand factor 3. The binding site on CD42bα lies in a hinge region between the LRR motifs and the stalk 3. Snake venom proteins bind to CD42b and inhibit binding of von Willebrand factor 5. CD42b is also a receptor for thrombin, possibly only when the CD42-GPV complex is highly multimeric as there are only 50 sites per platelet 6.
Function Binding of CD42b to von Willebrand factor bound to exposed subendothelial surfaces is essential for platelet adhesion at sites of injury 3. A bleeding disorder, Bernard-Soulier syndrome, is characterized by defects in CD42a,b expression 3.
Database accession numbers
         PIR     SWISSPROT  EMBL/GENBANK  REFERENCE
CD42a    A33731  P14770     X52997        7
CD42bα   A27075  P07359     J02940        8
CD42bβ   B26864  P13224     J03259        9
Amino acid sequence of human CD42a
MPAWGALFLL WATAEA -1
TKDCPSPCTC RALETMGLWV DCRGHGLTAL PALPARTRHL LLANNSLQSV 50
PPGAFDHLPQ LQTLDVTQNP WHCDCSLTYL RLWLEDRTPE ALLQVRCASP 100
SLAAHGPLRL TGYQLGSCGW QLQASWVRPG VLWDVALVAV AALGLALLAG 150
LLCATTEALD 160
Amino acid sequence of human CD42b α chain
MPLLLLLLLL PSPLHP -1
HPICEVSKVA SHLEVNCDKR NLTALPPDLP KDTTILHLSE NLLYTFSLAT 50
LMPYTRLTQL NLDRCELTKL QVDGTLPVLG TLDLSHNQLQ SLPLLGQTLP 100
ALTVLDVSFN RLTSLPLGAL RGLGELQELY LKGNELKTLP PGLLTPTPKL 150
EKLSLANNNL TELPAGLLNG LENLDTLLLQ ENSLYTIPKG FFGSHLLPFA 200
FLHGNPWLCN CEILYFRRWL QDNAENVYVW KQGVDVKAMT SNVASVQCDN 250
SDKFPVYKYP GKGCPTLGDE GDTDLYDYYP EEDTEGDKVR ATRTVVKFPT 300
KAHTTPWGLF YSWSTASLDS QMPSSLHPTQ ESTKEQTTFP PRWTPNFTLH 350
MESITFSKTP KSTTEPTPSP TTSEPVPEPA PNMTTLEPTP SPTTPEPTSE 400
PAPSPTTPEP TPIPTIATSP TILVSATSLI TPKSTFLTTT KPVSLLESTK 450
KTIPELDQPP KLRGVLQGHL ESSRNDPFLH PDFCCLLPLG FYVLGLFWLL 500
FASVVLILLL SWVGHVKPQA LDSGQGAALT TATQTTHLEL QRGRQVTVPR 550
AWLLFLRGSL PTFRSSLFLW VRPNGRVGPL VAGRRPSALS QGRGQDLLST 600
VSIRYSGHSL 610
Amino acid sequence of human CD42b β chain
MGSGPRGALS LLLLLLAPPS RPAAG -1
CPAPCSCAGT LVDCGRRGLT WASLPTAFPV DTTELVLTGN NLTALPPGLL 50
DALPALRTAH LGANPWRCDC RLVPLRAWLA GRPERAPYRD LRCVAPPALR 100
GRLLPYLAED ELRAACAPGP LCWGALAAQL ALLGLGLLHA LLLVLLLCRL 150
RRLRARARAR AAARLSLTDP LVAERAGTDE S 181
References
1 Wenger, R.H. et al. (1988) Biochem. Biophys. Res. Commun. 156, 389-395.
2 Lopez, J.A. et al. (1996) J. Biol. Chem. 269, 23716-23721.
3 Roth, G.J. (1992) Immunol. Today 13, 100-105.
4 Fox, J.E.B. et al. (1988) J. Biol. Chem. 263, 4882-4890.
5 Kawasaki, T. et al. (1996) J. Biol. Chem. 271, 10635-10639.
6 Greco, N.J. et al. (1996) Biochemistry 35, 906-914.
7 Hickey, M.J. et al. (1989) Proc. Natl Acad. Sci. USA 86, 6773-6777.
8 Lopez, J.A. et al. (1987) Proc. Natl Acad. Sci. USA 84, 5615-5619.
9 Lopez, J.A. et al. (1988) Proc. Natl Acad. Sci. USA 85, 2135-2139.
CD43
Leukosialin, sialophorin, Ly-48 (mouse), W3/13 (rat)
Molecular weights
Polypeptide 38 801
SDS-PAGE reduced
neutrophils and platelets 115-135 kDa
T cells and thymocytes 95-115 kDa
Carbohydrate
N-linked sites 1
O-linked + abundant
Human gene location and size 16p11.2; 4.6 kb 1
Tissue distribution CD43 is the major sialoglycoprotein on thymocytes, T cells and neutrophils 2,3. It is also present on activated (but not resting) B cells, plasma cells, NK cells, granulocytes, monocytes, macrophages, platelets, and bone marrow haematopoietic stem cells 1. Activation of neutrophils leads to downregulation of CD43 as a result of proteolytic cleavage 4. A soluble form of CD43, called galactoglycoprotein, is present in human serum 5.
Structure The CD43 extracellular domain of 239 amino acids is mucin-like with a high content of Ser and Thr residues that carry 70-85 O-linked oligosaccharides 6. It has an extended structure, approximately 45 nm in length, and contains four repeats of an 18 amino acid sequence 6,7. The molecular weight of the molecule and its antigenicity varies, depending on the nature of the O-glycans 7. The cytoplasmic domain is highly conserved between species and is constitutively phosphorylated on Ser residues, probably by protein kinase C 8. The coding sequence of CD43 is contained within a single exon 1.
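The figures quoted for the CD43 extracellular domain (239 residues, 70-85 O-linked oligosaccharides, approximately 45 nm in length) can be turned into per-residue values to show how extended and how densely glycosylated the domain is. The sketch below does only that arithmetic. The function names are illustrative, and the results are rough averages that assume the length and the glycans are spread evenly along the domain.

```python
# Figures quoted in the Structure paragraph for the CD43 extracellular domain.
EXTRACELLULAR_RESIDUES = 239
LENGTH_NM = 45.0
O_GLYCANS_MIN, O_GLYCANS_MAX = 70, 85


def length_per_residue(length_nm: float, residues: int) -> float:
    """Average contour length contributed by each residue, in nm."""
    if residues <= 0:
        raise ValueError("residues must be positive")
    return length_nm / residues


def residues_per_glycan(residues: int, glycans: int) -> float:
    """Average spacing between O-linked oligosaccharides, in residues."""
    if glycans <= 0:
        raise ValueError("glycans must be positive")
    return residues / glycans


if __name__ == "__main__":
    rise = length_per_residue(LENGTH_NM, EXTRACELLULAR_RESIDUES)
    sparse = residues_per_glycan(EXTRACELLULAR_RESIDUES, O_GLYCANS_MIN)
    dense = residues_per_glycan(EXTRACELLULAR_RESIDUES, O_GLYCANS_MAX)
    print(f"{rise:.2f} nm per residue")                      # about 0.19 nm
    print(f"one O-glycan every {dense:.1f}-{sparse:.1f} residues")  # about 2.8-3.4
```

On these averages roughly one residue in three carries an O-linked oligosaccharide, which is in keeping with the description of the domain as mucin-like.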
Ligands and associated molecules The extracellular domain has been reported to interact with CD54, but this is controversial 9. An interaction with albumin has also been reported 3. The membrane-proximal portion of the cytoplasmic domain mediates an association with the cytoskeleton 10.
Function CD43, which is both abundant and very large, appears to function as an anti-adhesion molecule, inhibiting T cell interactions, including T cell killing, and increasing the threshold for T cell activation 9. Antibodies to CD43 have a costimulatory effect on T cell activation 11 and can induce cell clustering 12, but the physiological significance of these effects is not clear. Cells infected with HIV express CD43 with altered glycosylation, and autoantibodies to this form of CD43 are detectable in all HIV-infected individuals 13.
Database accession numbers
        PIR     SWISSPROT  EMBL/GENBANK  REFERENCE
Human   A39822  P16150     J04168        6
Rat     S00842  P13838     Y00090        14
Mouse   A43545  P15702     X17018        15
Amino acid sequence of human CD43
MATLLLLLGV LVVSP -1
DALGSTTAVQ TPTSGEPLVS TSEPLSSKMY TTSITSDPKA DSTGDQTSAL 50
PPSTSINEGS PLWTSIGAST GSPLPEPTTY QEVSIKMSSV PQETPHATSH 100
PAVPITANSL GSHTVTGGTI TTNSPETSSR TSGAPVTTAA SSLETSRGTS 150
GPPLTMATVS LETSKGTSGP PVTMATDSLE TSTGTTGPPV TMTTGSLEPS 200
SGASGPQVSS VKLSTMMSPT TSTNASTVPF RNPDENSRGM LPVAVLVALL 250
AVIVLVALLL LWRRRQKRRT GALVLSRGGK RNGVVDAWAG PAQVPEEGAV 300
TVTVGGSGGD KGSGFPDGEG SSRRPTLTTF FGRRKSRQGS LAMEELKSGS 350
GPSLKGEEEP LVASEDGAVD APAPDEPEGG DGAAP 385
References
1 Shelley, C.S. et al. (1990) Biochem. J. 270, 569-576.
2 Brown, W.R.A. et al. (1981) Nature 289, 456-460.
3 Nathan, C. et al. (1993) J. Cell Biol. 122, 243-256.
4 Rieu, P. et al. (1992) Eur. J. Immunol. 22, 3021-3026.
5 Schmid, K. et al. (1992) Proc. Natl Acad. Sci. USA 89, 663-667.
6 Pallant, A. et al. (1989) Proc. Natl Acad. Sci. USA 86, 1328-1332.
7 Cyster, J.G. et al. (1991) EMBO J. 10, 893-902.
8 Piller, V. et al. (1989) J. Biol. Chem. 264, 18824-18831.
9 Manjunath, N. et al. (1995) Nature 377, 535-538.
10 Yonemura, S. et al. (1993) J. Cell Biol. 120, 437-449.
11 Sperling, A.I. et al. (1995) J. Exp. Med. 182, 139-146.
12 Cyster, J.G. and Williams, A.F. (1992) Eur. J. Immunol. 22, 2565-2572.
13 Giordanengo, V. et al. (1995) Blood 86, 2302-2311.
14 Killeen, N. et al. (1987) EMBO J. 6, 4029-4034.
15 Cyster, J. et al. (1990) Eur. J. Immunol. 20, 875-881.
CD44
Phagocytic glycoprotein 1 (Pgp-1)
Other names
Lymphocyte homing receptor
Hermes antigen
In(Lu)-related p80
Extracellular matrix receptor type III (ECMRIII)
Hutch-1
Ly-24 (mouse)
p85
HCAM
Molecular weights
Polypeptide
CD44H 37 237
CD44v variable
SDS gel reduced
CD44H without glycosaminoglycans 80-95 kDa
Other CD44 variants 110-250 kDa
Carbohydrate
N-linked sites CD44H 7
O-linked CD44H ++; CD44v +++/++++
Glycosaminoglycans 1
Chondroitin sulfate (CS) (CD44H and CD44v)
Heparan sulfate (CD44v only)
Human gene location and size 11pter-p13 2; 50 kb 3
Domains [Figure not reproduced: domain organization of CD44H with exon boundaries, showing the point in the mucin-like region where variable exons v2-v10 are inserted]
Tissue distribution The standard form of CD44 without variable exons (CD44H or CD44s) is very widely expressed on haematopoietic and non-haematopoietic cells, being present on epithelial, endothelial, mesothelial and mesenchymal cells, and in the nervous system 4. CD44H is the major isoform expressed on lymphoid, myeloid and erythroid cells 4,5. Expression is increased on activated and memory/effector T cells 4. CD44 isoforms encoded by additional variable exons (CD44v) are widely expressed on epithelial cells but are only present at low levels on leucocytes, with the exception of a subpopulation of bone marrow plasma cells 6. Activation of lymphocytes and monocytes leads to a transient increase in the expression of CD44v isoforms 6. Soluble forms of CD44 can be detected in body fluids 4. They are generated by proteolytic shedding and perhaps also by alternative splicing to exons containing stop codons 7.
Structure Alternative splicing of the CD44 gene can generate a large number of CD44 isoforms, all of which are heavily glycosylated 4. The standard isoform (CD44H) comprises a membrane-distal link module 8, a membrane-proximal mucin-like region, a transmembrane segment, and a 70 residue cytoplasmic domain. In humans, all other CD44 isoforms (CD44v) are generated by the insertion of various combinations of at least nine exon units (v2-v10) after Thr202 in the mucin-like region, adding up to 380 residues to the extracellular portion 9. These variable exons also encode mucin-like sequences (~30% Ser/Thr residues) 9. In addition to carrying O- and N-linked oligosaccharides, proteoglycan isoforms of CD44 have been identified with the covalently linked glycosaminoglycans (GAGs) chondroitin sulfate (CS) and heparan sulfate (HS). One or more of the partial consensus sequences for GAG-modification (SG) in CD44H are modified by CS on lymphocytes 1,4. However, a full consensus sequence (SGxG) encoded by the v3 exon appears to be the primary site for HS modification in CD44v isoforms 1. The cytoplasmic domain is phosphorylated on serine/threonine residues 4. Two alleles of CD44 differing at a single residue constitute the India (In) blood group antigens In a (Arg26) and In b (Pro26) 10.
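The partial (SG) and full (SGxG) consensus sequences for GAG modification described above are short enough to locate by simple pattern matching. The sketch below reports the 1-based position of every occurrence, including overlapping ones. The function name find_sites is illustrative, and the test fragment is a 20-residue stretch taken from the variable-exon sequence listed in this entry. Finding a motif says nothing about whether that site is actually modified.

```python
import re


def find_sites(sequence: str, motif: str) -> list[int]:
    """Return 1-based start positions of a motif, allowing overlaps.

    The motif is a regular expression in which '.' stands for any residue,
    so the full GAG consensus SGxG is written 'SG.G'.
    """
    residues = "".join(sequence.split()).upper()
    # A lookahead matches zero characters, so successive matches can overlap.
    pattern = re.compile(f"(?={motif})")
    return [match.start() + 1 for match in pattern.finditer(residues)]


if __name__ == "__main__":
    # Two adjacent blocks from the variable-exon listing of this entry.
    fragment = "EDERDRHLSF SGSGIDDDED"
    print(find_sites(fragment, "SG.G"))  # full consensus: [11]
    print(find_sites(fragment, "SG"))    # partial consensus: [11, 13]
```

The same function can be applied to the CD44H listing with the motif "SG" to enumerate the partial consensus sites referred to in the notes to that sequence.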
Ligands and associated molecules The membrane-distal link module of CD44 binds the extracellular matrix (ECM) GAG hyaluronan (HA). The ability of CD44 to bind HA is highly regulated 4. For example, leucocytes expressing CD44 do not bind HA constitutively but are able to bind following activation 4,11. Molecular mechanisms implicated in regulating binding include clustering, altered glycosylation, differential use of exons, and interactions with the cytoskeleton 4. CD44 has also been reported to bind the ECM proteins collagen, fibronectin and laminin; these interactions are probably mediated by CS carried by CD44 4. Similarly, CD44 molecules carrying HS can bind growth factors such as basic fibroblast growth factor 12. HS may also mediate reported interactions between CD44 and chemokines such as macrophage inflammatory protein 1β. CD44 is a receptor for the chemotactic cytokine osteopontin 13. The cytoplasmic domain interacts with the actin cytoskeleton through associations with ankyrin 14 and members of the ERM (ezrin, radixin and moesin) family 15. CD44 also associates with (and may activate) the cytoplasmic tyrosine kinase Lck 16.
Function CD44 contributes to the adhesion of leucocytes to endothelial cells 4,17, stromal cells and ECM 4,18. These interactions appear to be mediated by CD44 binding to HA associated with cells and ECM 17,18, although alternative ligands may exist 4. CD44 appears to mediate extravasation of activated and memory/effector lymphocytes (rather than naive lymphocytes) into sites of inflammation (rather than to primary or secondary lymphoid tissues) 4,17,19. Expression of the v6 exon in the rat confers metastatic potential on tumour cell lines 4. The same exon is expressed in activated lymphocytes in vivo and antibodies specific for this exon inhibit the immune response in vivo a. Expression of the v3 exon in human B lymphoma cells increases their metastatic potential 20. This effect does not appear to result from enhanced HA adhesion 20 but does require an intact HS GAG consensus sequence (SGxG), consistent with a role for growth factor or chemokine binding to HS (D. Jackson, personal communication). CD44 may transduce signals to cells following interactions with ECM 4 or with soluble ligands such as osteopontin 13.
Database accession numbers
Entries: Human CD44H, Human CD44 gene, Mouse CD44H, Rat CD44H
PIR A34424
SWISSPROT P15379 P26051
EMBL/GENBANK M33827 M69215 L05407-L05424 M27129 M61875
REFERENCE 21 3 22 23
Amino acid sequence of human CD44H MDKFWWHAAW QIDLNITCRF SIGFETCRYG ASAPPEEDCT SNPTDDDVSS ATRDQDTFHP IILASLLALA EASKSQEMVH .
.
GLCLVPLSLA AGVFHVEKNG FIEGHVVIPR SVTDLPNAFD GSSSERSSTS SGGSHTTHGS LILAVCIAVN LVNKESSETP .
.
.
.
.
.
.
.
.
.
RYSISRTEAA IHPNSICAAN GPITITIVNR GGYIFYTFST ESDGHSHGSQ SRRRCGQKKK DQFMTADETR
DLCKAFNSTL NTGVYILTYN DGTRYVQKGE VHPIPDEDSP EGGANTTSGP LVINSGNGAV NLQNVDMKIG
PTMAQMEKAL TSQYDTYCFN YRTNPEDIYP WITDSTDRIP IRTPQIPEWL EDRKPSGLNG V
-1 50 100 150 200 250 300 341
Notes 1 CD44v isoforms are generated by insertion of additional sequences after Thr202 (bold and underlined). 2 Partial consensus GAG linkage sites (SG) are underlined with dotted lines.
Sequence encoded by variable exons TLMSTSATAT SAGWEPNEEN DWTQWNPSHS HEEEETPHST TTGTAAASAH_ MDSSHSTTLQ DKDHPTTSTL FIPVTSAKTG
ETATKRQETW EDERDRHLSF NPEVLLQTTT ST~QATPSST TSHPMQGRTT PTANPNTGLV TSSNRNDVTG SFGVTAVTVG
DWFSWLFLPS SGSGIDDDED RMT'DVDRNGT TEETATQKEQ PSPEDSSWTD EDLDRTGPLS GRRDPNHSEG DSNSNVNRSL
ESKNHLHTTT FISSTISTTP TAYEGNWNPE WFGNRWHEGY FFNPISHPMG MTT~QSNSQS STTLLEGYTS S
QMAGTSSNTI RAFDHTKQNQ AHPPLIHHEH RQTPREDSHS RGHQAGRRMD FSTSHEGLEE HYPHTKESRT
50 100 150 200 250 300 350 381
Notes 1 Sequence encoded by exons v2-v10 is shown 9. In humans exon v1 contains an in-frame stop codon and is probably not used. The underlined residues span the exon splice junctions and can vary with different exon combinations. 2 A full consensus GAG linkage site (SGxG) is underlined with dotted lines.
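The partial (SG) and full (SGxG) GAG linkage consensus sites mentioned in the notes can be located in a sequence string with a simple pattern search. The following is a minimal sketch only: the function name `find_gag_sites` and the short test string are hypothetical illustrations, not the CD44 sequence, and "x" is assumed to mean any single residue.

```python
import re

def find_gag_sites(seq: str) -> dict:
    """Return 1-based start positions of partial (SG) and full (SGxG) sites."""
    # Partial consensus: a serine immediately followed by a glycine.
    partial = [m.start() + 1 for m in re.finditer(r"SG", seq)]
    # Full consensus: S, G, any one residue, G. The lookahead lets
    # overlapping candidates (e.g. SGSGxG) all be reported.
    full = [m.start() + 1 for m in re.finditer(r"(?=SG.G)", seq)]
    return {"partial": partial, "full": full}

# Hypothetical toy peptide used only to exercise the function.
toy_peptide = "AASGTTSGSGKK"
sites = find_gag_sites(toy_peptide)
print(sites)
```

Every full site is also reported as a partial site, which matches the way the notes treat SG as the shorter form of the same consensus.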
References
1 Jackson, D.G. et al. (1995) J. Cell Biol. 128, 673-685.
2 Forsberg, U.H. et al. (1989) Immunogenetics 29, 405-407.
3 Screaton, G.R. et al. (1992) Proc. Natl Acad. Sci. USA 89, 12160-12164.
4 Lesley, J. et al. (1993) Adv. Immunol. 54, 271-335.
5 Schlossman, S.F. et al. (eds) (1995) Leucocyte Typing V: white cell differentiation antigens. Oxford University Press, Oxford.
6 Mackay, C.R. et al. (1994) J. Cell Biol. 124, 71-82.
7 Yu, Q. and Toole, B.P. (1996) J. Biol. Chem. 271, 20603-20607.
8 Kohda, D. et al. (1996) Cell 86, 767-775.
9 Screaton, G.R. et al. (1993) J. Biol. Chem. 268, 12235-12238.
10 Telen, M.J. et al. (1996) J. Biol. Chem. 271, 7147-7153.
11 Lesley, J. et al. (1994) J. Exp. Med. 180, 383-387.
12 Bennett, K.L. et al. (1995) J. Cell Biol. 128, 687-698.
13 Weber, G.F. et al. (1996) Science 271, 509-512.
14 Lokeshwar, V.B. et al. (1994) J. Cell Biol. 126, 1099-1109.
15 Tsukita, S. et al. (1994) J. Cell Biol. 126, 391-401.
16 Taher, T.E.I. et al. (1996) J. Biol. Chem. 271, 2863-2867.
17 DeGrendele, H.C. et al. (1996) J. Exp. Med. 183, 1119-1130.
18 Clark, R.A. et al. (1996) J. Cell Biol. 134, 1075-1087.
19 Butcher, E.C. and Picker, L.J. (1996) Science 272, 60-66.
20 Bartolazzi, A. et al. (1995) J. Cell Sci. 108, 1723-1733.
21 Stamenkovic, I. et al. (1989) Cell 56, 1057-1062.
22 Nottenburg, C. et al. (1989) Proc. Natl Acad. Sci. USA 86, 8521-8525.
23 Gunthert, U. et al. (1991) Cell 65, 13-24.
Leucocyte common antigen (L-CA), B220, T200, Ly-5, EC 3.1.3.48
CD45
Molecular weights
Polypeptide 127 438 (0), 145 590 (ABC)
SDS-PAGE reduced 180-240 kDa
B cells 240 kDa
thymocytes 180 kDa
T cells multiple bands
Carbohydrate
N-linked sites 11-16
O-linked + abundant (especially on isoforms expressing A, B, C exon products)
Human gene location and size 1q31-q32; 120 kb 1
[Diagram: domains (S, A, B, C, F3, TM) and exon boundaries; 17 more exons]
Tissue distribution CD45 proteins are found on all cells of haematopoietic origin, except erythrocytes 2. Various isoforms of CD45 are generated by alternative splicing of three exons that can be inserted immediately after an N-terminal sequence of eight amino acids found on all isoforms. Of the eight possible combinations of exons, seven have been found at the mRNA level 2,3. The isoforms are expressed differentially on leucocytes and their expression can be followed with mAbs specific for protein encoded by the alternate exons or by an mAb that recognizes the junction of the eight amino acid N-terminal conserved sequence and the rest of the conserved part of the protein. These epitopes are termed CD45RA, CD45RB, CD45RC and CD45R0 respectively. Other mAbs react with the common part of the structure and recognize all CD45 isoforms. B cells express a single isoform, including the A, B and C exon-encoded sequences. Among peripheral CD4+ T cells various combinations including one or two of the alternate exons are found. This differential expression has been of use in defining subsets of CD4+ T cells, in which naive T cells express forms of CD45 including these exons, whereas activated cells and most memory cells express forms including these exons at low levels and label with mAbs specific for the CD45R0 epitope. Thymocytes express primarily the low Mr isoform of CD45.
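The count of eight possible isoforms follows from each of the three alternate exons being either present or absent. A minimal sketch of that enumeration is given below; the function name `cd45_exon_combinations` is a hypothetical label, "0" is used for the form lacking all three exons (as in the CD45R0 designation above), and the code makes no statement about which of the eight combinations has not been detected at the mRNA level.

```python
from itertools import combinations

# The three alternatively spliced exons, in their order in the protein.
ALTERNATE_EXONS = ("A", "B", "C")

def cd45_exon_combinations() -> list:
    """List every presence/absence combination of the A, B and C exons."""
    names = []
    # Walk from all three exons included down to none included.
    for n_included in range(len(ALTERNATE_EXONS), -1, -1):
        for combo in combinations(ALTERNATE_EXONS, n_included):
            # An empty combination is the form with no alternate exon.
            names.append("".join(combo) or "0")
    return names

isoforms = cd45_exon_combinations()
print(len(isoforms), isoforms)
```

The list runs from ABC, the single isoform described for B cells, to 0, the low Mr form.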
Structure The extracellular domain of CD45 is from 391 to 552 amino acids long with 11-16 N-linked carbohydrate attachment sites. Protein sequence coded by exons 3-8 is rich in Ser, Thr and Pro and has multiple O-linked carbohydrates 2,4. Binding of several mAbs recognizing the ABC segments is affected by glycosylation although in most cases the mAb can react with the unglycosylated protein backbone 5. The rest of the extracellular domain contains 16 cysteine residues which are largely conserved between mouse, rat and human 2. Three putative fibronectin type III domains have been identified; these are unusual because of their high Cys content 6,7. CD45 has been shown to be sulfated 8. The cytoplasmic domain is large (700 amino acids) and contains two PTPase (phosphotyrosine phosphatase) domains 9. The membrane-proximal domain has enzymatic activity 10 but is dependent on the second domain 11,12. The intracellular domain is phosphorylated on Ser in response to PKC activation 2.
Ligands and associated molecules CD45 has been reported to be associated at the cell surface with several cell surface antigens such as CD4, TCR, CD2, Thy-1 and CD26 12. CD22 can bind CD45 and other glycoproteins through sialic acid residues 13. The cytoplasmic part of CD45 is associated with fodrin 2. CD45 associates with a lymphocyte-specific protein termed lymphocyte phosphatase-associated phosphoprotein (LPAP) or CD45-associated protein (CD45-AP) through an interaction involving the transmembrane region of CD45 14,15. CD45 is also associated with a phosphorylated glycoprotein of Mr 116 kDa present in all haematopoietic cells 16.
Function Expression of CD45 is necessary for signalling through the T cell receptor 17. Gene deletion experiments indicate a role for CD45 in the selection of both T and B lymphocytes 18,19. The variants in the extracellular region correlate with functional subpopulations of lymphocytes and may affect the threshold of activation 20.
Database accession numbers
Human (ABC) Rat (ABC) Mouse (0)
PIR
SWISSPROT
EMBL/GENBANK
REFERENCE
Y00638 Y00065 M14342
3
A02247 A29381
P08575 P04157 P06800
21 22
Amino acid sequence of human CD45 MYLWLKLLAF QSPTPSPT GLTTAKMPSW QVSPDSLDNA GVSSVQTPHL DVPGERSTAS DAYLNASETT TKLFTAKLNV LILDVPPGVE GNMIFDNKEI PQIIFCRSEA DLQNLKPYTK MTSDNSMHVK QYSTDYTFKA LVVLYKIYDL KIADEGRPFL ELSEINGDAG ATVIVMVTRC QKLNIVNKKE FSGPIVVHCS MVQVEAQYIL EAEFQRLPSY KESEHDSDES DFWQMIFQRK KSSTYTLRVF KQKLPQKNSS VDIFQVVKAL KIEFDNEVDK NGPASPALNQ
GFAFLDTEVF VTG PLSSDPLPTH SAFNTT PTHADSQTPS TFPTDPVSPL TLSPSGSAVI NENVECGNNT KFQLHDCTQV KLENLEPEHE AHQGVITWNP YVLSLHAYII CRPPRDRNGP YFHNGDYPGE HKKRSCNLDE AEFQSIPRVF SNYINASYID EEGNRNKCAE KATGREVTHI AGVGRTGTYI IHQALVEYNQ RSWRTQHIGN SDDDSDSEEP VKVIVMLTEL ELRHSKRKDS EGNKHHKSTP RKARLGMVST VKQDANCVNP GS
TTAFSPASTF ERENDFSETT TSLSPDNTST AGTDTQTFSG TTTLSLAHHS STTTIATTPS CTNNEVHNLT EKADTTICLK YKCDSEILYN PQRSFHNFTL AKVQRNGSAA HERYHLEVEA PFILHHSTSY QQELVERDDE SKFPIKEARK GFKEPRKYIA YWPSMEEGTR QFTSWPDHGV GIDAMLEGLE FGETEVNLSE QEENKSKNRN SKYINASFIM KHGDQEICAQ RTVYQYQYTN LLIHCRDGSQ FEQYQFLYDV LGAPEKLPEA
SAANAKLNPT SAALPARTSN KPTCDEKYAN ECKNASVSIS WKNIETFTCD NHKFTNASKI CYIKETEKDC MCHFTTKSAP GNTLVRNESH NSKALIAFLA KQLMNVEPIH PFNQNKNRYV AQGPRDETVD AFGDVVVKIN PEDPHLLLKL AENKVDVYGY LHPYLHNMKK SNVIPYDYNR SYWKPEVMIA YWGEGKQTYG WSVEQLPAEP QTGIFCALLN IASTYPAQNG KEQAEGSEPT
PGSNAIS TTITANTS ITVDYLYNKE HNSCTAPDKT TQNITYRFQC IKTDFGSPGE LNLDKNLIKY PSQVWNMTVS KNCDFRVKDL FLIIVTSIAL ADILLETYKR DILPYDYNRV DFWRMIWEQK QHKRCPDYII RRRVNAFSNF VVKLRRQRCL RDPPSEPSPL VPLKHELEMS AQGPLKETIG DIEVDLKDTD KELISMIQVV LLESAETEEV QVKKNNHQED SGTEGPEHSV
-1 8 74A 121B 169C 219 269 319 369 419 469 519 569 619 669 719 769 819 869 919 969 1019 1069 1119 1169 1219 1269 1281
References
1 Fernandez-Luna, J. et al. (1991) Genomics 10, 754-764.
2 Thomas, M.L. (1989) Annu. Rev. Immunol. 7, 339-369.
3 Streuli, M. et al. (1987) J. Exp. Med. 166, 1548-1566.
4 Jackson, D.I. and Barclay, A.N. (1989) Immunogenetics 29, 281-287.
5 Cyster, J.G. et al. (1994) Int. Immunol. 6, 1875-1881.
6 Bork, P. and Doolittle, R.F. (1993) Protein Sci. 2, 1185-1187.
7 Okumura, M. et al. (1996) J. Immunol. 157, 1569-1575.
8 Giordanengo, V. et al. (1995) Eur. J. Immunol. 25, 274-278.
9 Tonks, N.K. et al. (1988) Biochemistry 27, 8695-8701.
10 Streuli, M. et al. (1990) EMBO J. 9, 2399-2407.
11 Johnson, P. et al. (1992) J. Biol. Chem. 267, 8035-8041.
12 Trowbridge, I.S. and Thomas, M.L. (1994) Annu. Rev. Immunol. 12, 85-116.
13 Sgroi, D. et al. (1993) J. Biol. Chem. 268, 7011-7018.
14 Bruyns, E. et al. (1995) J. Biol. Chem. 270, 31372-31376.
15 McFarland, E.D.C. and Thomas, M.L. (1995) J. Biol. Chem. 270, 28103-28107.
16 Arendt, C.W. and Ostergaard, H.L. (1995) J. Biol. Chem. 270, 2313-2319.
17 Pingel, J.T. and Thomas, M.L. (1989) Cell 58, 1055-1065.
18 Byth, K.F. et al. (1996) J. Exp. Med. 183, 1707-1718.
19 Cyster, J.G. et al. (1996) Nature 381, 325-328.
20 Leitenberg, D. et al. (1996) J. Exp. Med. 183, 249-259.
21 Barclay, A.N. et al. (1987) EMBO J. 6, 1259-1264.
22 Saga, Y. et al. (1986) Proc. Natl Acad. Sci. USA 83, 6940-6944.
CD46
Complement membrane cofactor protein (MCP)
Molecular weights
Polypeptides (multiple splicing variants) 37 018-40 387
SDS-PAGE reduced 51-68 kDa
unreduced 46-63 kDa
Carbohydrate
N-linked sites 3
O-linked sites (depending on splicing variant) +/++
Human gene location and size 1q32; >43 kb 1,2
[Diagram: domains (CCP, STP A, STP B, STP C, TM) and exon boundaries]
Tissue distribution CD46 is expressed on all peripheral blood cells and platelets but not erythrocytes 1,2. CD46 is also expressed on fibroblasts, endothelial and epithelial cells, and tissues of the reproductive system, including fallopian tube, uterine endometrium, placenta and sperm. CD46 is not expressed on unfertilized oocytes but appears at the 6-8 cell stage embryo 1-4.
Structure CD46 is a type I membrane protein with four complement control protein (CCP) domains, a Ser/Thr/Pro rich (STP) region, and a further 13 residues in the extracellular portion 5. The N-terminal protein sequence has been determined 6. There are six CD46 variants determined from cDNA clones, derived from the combination of three variants in the STP region and two variants in the cytoplasmic domain. The three variants in the STP region express both STP A and STP B exons, only STP B, or neither exon. The two cytoplasmic variants are the consequence of the presence or absence of the exon CY1 1,5. Thus the six isoforms are designated ABC1, ABC2, BC1, BC2, C1 and C2; A, B and C indicate the expression of the STP A, STP B and STP C exons, and 1 and 2 denote the cytoplasmic exons expressed. However, the two isoforms ABC1 and ABC2 have not been observed in peripheral blood cells or cell lines 5. Most cells of a given individual express the same ratio of the four remaining isoforms, although selective expression of certain isoforms has been noted in the brain, kidney and salivary gland 5,7.
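The isoform nomenclature is the product of the three STP variants and the two cytoplasmic variants. As a rough illustration, the naming can be generated as follows; the function and constant names are hypothetical, and the set of isoforms treated as unobserved simply restates the sentence above about ABC1 and ABC2.

```python
from itertools import product

# STP region variants: both STP A and STP B, STP B only, or neither
# (STP C appears in every designation).
STP_VARIANTS = ("ABC", "BC", "C")
# Cytoplasmic tail variants, denoted 1 and 2 in the isoform names.
CYTOPLASMIC_VARIANTS = ("1", "2")
# Isoforms stated in the text as not observed in blood cells or cell lines.
NOT_OBSERVED = {"ABC1", "ABC2"}

def cd46_isoform_names() -> list:
    """Combine each STP variant with each cytoplasmic tail variant."""
    return [stp + tail for stp, tail in product(STP_VARIANTS, CYTOPLASMIC_VARIANTS)]

all_isoforms = cd46_isoform_names()
observed = [name for name in all_isoforms if name not in NOT_OBSERVED]
print(all_isoforms)
print(observed)
```

The second list holds the four remaining isoforms whose ratio is described as constant across most cells of an individual.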
Ligands and associated molecules CD46 binds the complement components C3b and C4b 1,2, and is a receptor for the measles virus 8 and for Streptococcus pyogenes 9.
Function
CD46 is a member of the regulator of complement activation (RCA) family of proteins 2. It acts as a cofactor which binds to C3b and C4b, thereby permitting factor I, a serine protease of the complement system, to convert C3b and C4b into fragments that cannot support further complement activation 1,2. Of the variants that have different STP regions, the BC isoforms are found to be more efficient than the C isoforms in the regulation of C4b activity, although both sets of isoforms have similar regulatory activities on C3b 10. CD46 with cytoplasmic tail type 1 (CY1) has been shown to be processed four times faster into its mature forms 11. The complement regulatory proteins CD46, CD55 and CD59 are likely to be important in protecting the sperm and fetus from rejection by the maternal immune system 2,12. Transgenic pigs are being produced which express human complement regulatory proteins in order to prevent complement-mediated hyperacute rejection, a major problem in xenotransplantation of pig organs to humans 13.
Database accession numbers BC1 BC2 C2
Human
PIR I54479 S01896
SWISSPROT P15529
EMBL/GENBANK M58050 Y00651 S51940
REFERENCE 5,14 5,6,15 5,16
Amino acid sequence of human CD46
MEPPGRRECP FPSWRFPGLL LAAMVLLLYS FSDA -1
CEEPPTFEAM ELIGKPKPYY EIGERVDYKC KKGYFYIPPL ATHTICDRNH 50
TWLPVSDDAC YRETCPYIRD PLNGQAVPAN GTYEFGYQMH FICNEGYYLI 100
GEEILYCELK GSVAIWSGKP PICEKVLCTP PPKIKNGKHT FSEVEVFEYL 150
DAVTYSCDPA PGPDPFSLIG ESTIYCGDNS VWSRAAPECK VVKCRFPVVE 200
NGKQISGFGK KFYYKATVMF ECDKGFYLDG SDTIVCDSNS TWDPPVPKCL 250
ABC KVLPPSSTKP PALSHSVSTS STTKSPASSA SGPRPTYKPP VSNYPGYPKP 300
-BC K......... ......VSTS STTKSPASSA SGPRPTYKPP VSNYPGYPKP 285
--C K......... .......... .......... .GPRPTYKPP VSNYPGYPKP 270
All EEGILDSLDV WVIAVIVIAI VVGVAVICVV PYRYLQRRKK KG 342/327/312
CY1 TYLTDETHRE VKFTSL 358/343/328
CY2 KADGGAEYAT YQTKSTTPAE QRG 365/350/335
To represent the six isoforms, all possible sequences in the variable STP and cytoplasmic regions are shown, including the residue numbering. Residues missing due to exon deletion are marked with dots.
References
1 Liszewski, M.K. et al. (1991) Annu. Rev. Immunol. 9, 431-455.
2 Liszewski, M.K. et al. (1996) Adv. Immunol. 61, 201-283.
3 Fenichel, P. et al. (1995) Am. J. Reprod. Immunol. 33, 155-164.
4 Jensen, T.S. et al. (1995) Am. J. Reprod. Immunol. 34, 1-9.
5 Post, T.W. et al. (1991) J. Exp. Med. 174, 93-102.
6 Lublin, D.M. et al. (1988) J. Exp. Med. 168, 181-194.
7 Johnstone, R.W. et al. (1993) Mol. Immunol. 30, 1231-1241.
8 Manchester, M. et al. (1994) Proc. Natl Acad. Sci. USA 91, 2161-2165.
9 Okada, N. et al. (1995) Proc. Natl Acad. Sci. USA 92, 2489-2493.
10 Liszewski, M.K. and Atkinson, J.P. (1996) J. Immunol. 156, 4415-4421.
11 Liszewski, M.K. et al. (1994) J. Biol. Chem. 269, 10776-10779.
12 Rooney, I.A. et al. (1993) Immunol. Res. 12, 276-294.
13 Cozzi, E. and White, D.J.G. (1995) Nature Med. 1, 964-966.
14 Purcell, D.F. et al. (1991) Immunogenetics 33, 335-344.
15 Purcell, D.F. et al. (1990) Immunogenetics 31, 21-28.
16 Cervoni, F. et al. (1993) Mol. Reprod. Dev. 34, 107-113.
CD47
Integrin-associated protein (IAP), OV-3, A38L (poxvirus)
Molecular weights
Polypeptide 35 213
SDS-PAGE reduced 47-52 kDa
unreduced 43-50.5, 110 kDa
Carbohydrate
N-linked sites 6
O-linked unknown
Human gene location 3q13.1-q13.2
Tissue distribution CD47 has a broad tissue distribution with expression on virtually all haematopoietic cells, including thymocytes, T cells, B cells, monocytes, platelets and erythrocytes, as well as on epithelial cells, endothelial cells, fibroblasts, sperm, and tumour cell lines. CD47 is part of the Rh complex on erythrocytes and is not expressed on Rhnull erythrocytes 1.
Structure The CD47 structure is unusual in combining an IgSF domain with a multipass transmembrane mode of attachment to the cell membrane. CD47 is predicted to have five membrane-spanning regions, an N-terminal IgSF V-set domain and short cytoplasmic regions. Immunofluorescence microscopy has shown the C-terminal tail to be cytoplasmic 2. Four alternatively spliced forms of the C-terminus have been identified 3. These are generated by the variable usage of three short exons. Two predominant splice variants have been demonstrated in vivo and these have distinct tissue expression patterns. The form with the shorter cytoplasmic tail is expressed by bone marrow-derived cells, endothelial cells and fibroblasts, whereas the longer form is expressed mainly by neural tissue 3.
Ligands and associated molecules The CD47 molecule associates non-covalently with the CD61 (β3) integrins CD51/CD61 (αvβ3) and CD41/CD61 (αIIbβ3) 2. The IgSF domain of CD47 is thought to mediate these interactions 4. A ligand for CD47 is the C-terminal cell binding domain of thrombospondin, an extracellular adhesion molecule which regulates motility and proliferation in many cell types 5.
Function CD47 plays a role in the chemotactic and adhesive interactions of leucocytes with endothelial cells. The mechanism of action is not entirely understood, but is thought to involve the modulation of the function of integrins, with which CD47 is physically and functionally linked, and its binding to thrombospondin 5-7. The binding of thrombospondin to CD47 on endothelial cells in vitro induces a chemotactic response that is inhibited by a CD47 mAb 5. CD47 mAbs also inhibit neutrophil migration across endothelium and epithelium, without affecting CD11b/CD18 integrin-mediated neutrophil adhesion to epithelium 6,7. Human cells that lack CD47 are deficient in some CD51 integrin ligand binding functions 4. CD47 knockout mice show increased susceptibility to bacterial infections due to granulocyte defects in CD61 integrin function, activation of oxidative burst and phagocytosis 8.
Comments CD47 species homologues have been identified in the poxviruses vaccinia and variola major 9. The vaccinia virus A38L gene product is a 33 kDa membrane glycoprotein with 28% sequence identity to mammalian CD47. Proteolysis studies of A38L translated in vitro in microsomal membranes are consistent with the proposed membrane topology of human CD47. The A38L protein is expressed at a low level in infected cells in vivo but its function is not clear, since deletion of the A38L gene does not affect virus particle production or virulence 9.
Database accession numbers Human Mouse Vaccinia
PIR S29922
SWISSPROT Q08722 P24763
EMBL/GENBANK Z25521 Z25524 X57318
REFERENCE 2 2 10
Amino acid sequence of human CD47 MWPLVAALLL QLLFNKTKSV DGALNKSTVP LTREGETIIE SGGMDEKTIA ILILLHYYVF PLLISGLSIL MMNDE
GSACCGSA EFTFCNDTVV TDFSSAKIEV LKYRVVSWFS LLVAGLVITV STAIGLTSFV ALAQLLGLVY
IPCFVTNMEA SQLLKGDASL PNENILIVIF IVIVGAILFV IAILVIQVIA MKFVASNQKT
QNTTEVYVKW KMDKSDAVSH PIFAILLFWG PGEYSLKNAT YILAVVGLSL IQPPRKAVEE
KFKGRDIYTF TGNYTCEVTE QFGIKTLKYR GLGLIVTSTG CIAACIPMHG PLNAFKESKG
-1 50 100 150 200 250 300 305
References
1 Hadam, M.R. (1989) Leucocyte Typing IV, 658-660.
2 Lindberg, F.P. et al. (1993) J. Cell Biol. 123, 485-496.
3 Reinhold, M.I. et al. (1995) J. Cell Sci. 108, 3419-3425.
4 Lindberg, F.P. et al. (1996) J. Cell Biol. 134, 1313-1322.
5 Gao, A.-G. et al. (1996) J. Biol. Chem. 271, 21-24.
6 Cooper, D. et al. (1995) Proc. Natl Acad. Sci. USA 92, 3978-3982.
7 Parkos, C.A. et al. (1996) J. Cell Biol. 132, 437-450.
8 Lindberg, F.P. et al. (1996) Science 274, 795-798.
9 Parkinson, J.E. et al. (1995) Virology 214, 177-188.
10 Smith, G.L. et al. (1991) J. Gen. Virol. 72, 1349-1376.
CD48
Blast-1, HuLy-m3, BCM1, MEM-102, OX-45
Molecular weights
Polypeptide 22 344
SDS-PAGE reduced 40-47 kDa
Carbohydrate
N-linked sites 6
O-linked unknown
Human gene location and size 1q21-q23; >28 kb 1
[Diagram: domains and exon boundaries]
Tissue distribution Widely expressed on haematopoietic cells with the exception of granulocytes (but some on eosinophils), platelets and erythrocytes 2,3. Expression is increased following activation of T and B cells. Rat CD48 (OX-45) is more widely expressed, being present on erythrocytes, endothelium and some connective tissue 4.
Structure CD48 contains two IgSF domains and is a member of the CD2 family of molecules which includes CD58, Ly-9, 2B4 and CD150 5. Within this group CD48 is most similar to CD58, which is a ligand for human CD2 5. Like one splice variant of CD58, CD48 possesses a glycosyl-phosphatidylinositol (GPI) membrane anchor. The site of GPI attachment is shown as Ser194 on the basis of homology with the rat sequence 6. The CD48 gene lies close to the Ly-9 and 2B4 genes in the mouse 7-9 and within 410 kb of the putative human Ly-9 gene 9. These genes are likely to have arisen in a series of gene duplication events, and duplication of this entire chromosomal region is likely to have given rise to the region encoding CD2 and CD58 7,9.
Ligands and associated molecules Human CD48 has been reported to bind CD2 but with an affinity considerably (> 100-fold) lower than CD58, the major CD2 ligand 5. In mice and rats CD48 appears to be the major CD2 ligand 5. Rat CD48 binds CD2 with an affinity of Kd 60-90 µM and dissociates with a very fast dissociation rate constant (koff > 6 s-1). The CD2 binding site lies on the GFCC'C'' β sheet of the V-set domain and CD2 binds CD48 in a head-to-head orientation 5. Immunoprecipitation studies suggest associations with the cytoplasmic tyrosine kinases Lck and Fyn 10.
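The quoted Kd and koff values can be turned into rough bounds on the lifetime of the rat CD48-CD2 complex and on the association rate constant. The sketch below assumes simple 1:1 binding, so that Kd = koff/kon and the half-life of the complex is ln 2/koff; that model, the function names and the constants are illustrative assumptions, not values or methods taken from the cited work.

```python
import math

def dissociation_half_life(koff_per_s: float) -> float:
    """Half-life (s) of a complex decaying with first-order rate koff."""
    return math.log(2) / koff_per_s

def association_rate(kd_molar: float, koff_per_s: float) -> float:
    """kon (M^-1 s^-1) from Kd = koff / kon, assuming 1:1 binding."""
    return koff_per_s / kd_molar

KOFF_LOWER_BOUND = 6.0       # s^-1, text gives koff > 6 s^-1
KD_RANGE = (60e-6, 90e-6)    # M, text gives Kd 60-90 micromolar

# koff is a lower bound, so the half-life computed here is an upper bound.
half_life_upper = dissociation_half_life(KOFF_LOWER_BOUND)
# Using the weakest affinity (largest Kd) gives the most conservative kon bound.
kon_lower = association_rate(max(KD_RANGE), KOFF_LOWER_BOUND)

print(f"half-life below {half_life_upper:.3f} s")
print(f"kon above {kon_lower:.2e} M^-1 s^-1")
```

Under these assumptions the complex has a half-life of roughly a tenth of a second or less, which is consistent with the description of a very fast dissociation rate.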
Function In the mouse and the rat CD48 is the major CD2 ligand and so may contribute to T cell antigen recognition 5. CD48 antibodies suppress the immune response in mice but the precise role of the CD48-CD2 interaction is unclear since no abnormality has been detected in CD2-deficient mice 5.
Database accession numbers
        PIR      SWISSPROT   EMBL/GENBANK   REFERENCE
Human   A53244   P09326      M37766         11
Rat     S01299   P10252      X13016         6
Mouse   JL0143   P18181      X17501         7
Amino acid sequence of human CD48 MWSRGWDSCL QGHLVHMTVV FESKFKGRVR VLDPVPKPVI QNSVLETTLM FGVEWIASWL
ALELLLLPLS SGSNVTLNIS LDPQSGALYI KIEKIEDMDD PHNYSRCYTC VVTVPTILGL
LLVTSI ESLPENYKQL SKVQKEDNST NCYLKLSCVI QVSNSVSSKN LLT
TWFYTFDQKI YIMRVLKKTG PGESVNYTWY GTVCLSPPCT
VEWDSRKSKY NEQEWKIKLQ GDKRPFPKEL LARS
-1 50 100 150 194 +23
References
1 Fisher, R.C. and Thorley-Lawson, D.A. (1991) Mol. Cell. Biol. 11, 1614-1623.
2 Hadam, M.R. (1989) in Leucocyte Typing IV, Knapp, W. et al. (ed.). Oxford University Press, Oxford, pp. 661-667.
3 Vaughan, H.A. et al. (1983) Transplantation 36, 446-450.
4 Arvieux, J. et al. (1986) Mol. Immunol. 23, 983-990.
5 Davis, S.J. and van der Merwe, P.A. (1996) Immunol. Today 17, 177-187.
6 Killeen, N. et al. (1988) EMBO J. 7, 3087-3091.
7 Wong, Y.W. et al. (1990) J. Exp. Med. 171, 2115-2130.
8 Mathew, P.A. et al. (1993) J. Immunol. 151, 5328-5337.
9 Kingsmore, S.F. et al. (1995) Immunogenetics 42, 59-62.
10 Garnett, D. et al. (1993) Eur. J. Immunol. 171, 2115-2130.
11 Staunton, D.E. and Thorley-Lawson, D.A. (1987) EMBO J. 6, 3695-3701.
CD49a
Other names Integrin α1 subunit, VLA-1 α subunit
Molecular weights
Polypeptides 127 839
SDS-PAGE reduced 210 kDa
unreduced 200 kDa
Carbohydrate
N-linked sites 26
O-linked sites unknown
Human gene location 5
[Diagram: CD49a/CD29]
Tissue distribution CD49a is expressed on monocytes and at low levels on resting T cells 1,2. Its expression on T cells is increased upon prolonged stimulation in vitro 2,3. It is also found at increased levels on T lymphocytes in synovial joints of patients with rheumatoid arthritis; however, its expression is reduced following phytohaemagglutinin stimulation 4. CD49a expression on NK cells is upregulated by IL-2 5.
Structure CD49a is the α1 integrin subunit which forms a non-covalent heterodimer with CD29, the β1 integrin subunit. CD49a belongs to the integrin α subclass which contains an I-domain 6. The N-terminal sequence has been determined 7.
Ligands and associated molecules CD49a/CD29 binds laminin and collagen through different binding sites 2,8.
Function The integrin CD49a/CD29 on choriocarcinoma and melanoma cells binds collagen and the E1 fragment of laminin 8,9. Solubilized CD49a/CD29 complexes have been shown to bind collagen 6,8. However, collagen binding is not mediated by CD49a/CD29 on cultured T lymphocytes 10.
Database accession numbers Human Rat
PIR A35854
SWISSPROT P18614
EMBL/GENBANK X68742 X52140
REFERENCE 6 11
Amino acid sequence of human CD49a MVPRRPASLE FNVDVKNSMT VYKCPVGRGE ACGPLYAYRC GSNSIYPWDS STEEVLVAAK VTDGESHDNH SIASEPTEKH QTGFSAHYSQ NEPLASYLGY QTLSGEQIGS YALNQTRFEY KDLNLDGFND TLKFFGQSIH NKVNIQKKNC LRQISRSFFS FNLTDPENGP LLIVRSQNDK ESNHNITCKV PPETLSDNVV IGNEINIFYL NCRPHIFEDP ISQVNVSLIL IQISKDGLPG K
VTVACIWLLT FSGPVEDMFG SLPCVKLDLP GHLHYTTGIC VTAFLNDLLK KIVQRGGRQT RLKKVIQDCE FFNVSDELAL DWVMLGAVGA TVNSATASSG YFGSILTTTD QMSLEPIKQT IVIGAPLEDD GEMDLNGDGL HMEGKETVCI GTQERKVQRN VLDDSLPNSV FNVSLTVKNT GYPFLRRGEM NISIPVKYEV IRKSGSFPMP FSINSGKKMT WKPTFIKSYF RVPLWVILLS
VILGFCVS YTVQQYENEE VNTSIPNVTE SDVSPTFQVV RMDIGPKQTQ MTALGTDTAR DENIQRFSIA VTIVKTLGER YDWNGTVVMQ DVLYIAGQPR IDKDSNTDIL CCSSRQHNSC HGGAVYIYHG TDVTIGGLGG NATVCFEVKL ITVRKSECTK HEYIPFAKDC KDSAYNTRTI VTFKILFQFN GLQFYSSASE ELKLSISFPN TSTDHLKRGT SSLNLTIRGE AFAGLLLLML
GKWVLIGSPL VKENMTFGST NSIAPVQECS VGIVQYGENV KEAFTEARGA ILGSYNRGNL IFALEATADQ KASQIIIPRN YNHTGQVIIY LVGAPMYMGT TTENKNEPCG SGKTIRKEYA AALFWSRDVA KSKEDTIYEA HSFYMLDKHD GNKEKCISDL VHYSPNLVFS TSYLMENVTI YHISIAANET MTSNGYPVLY ILDCNTCKFA LRSENASLVL LILALWKIGF
VGQPKNRTGD LVTNPNGGFL TQLDIVIVLD THEFNLNKYS RRGVKKVMVI STEKFVEEIK SAASFEMEMS TTFNVESTKK RMEDGNIKIL EKEEQGKVYV ARFGTAIAAV QRIPSGGDGK VVKVTMNFEP DLQYRVTLDS FQDSVRITLD SLHVATTEKD GIEAIQKDSC YLSATSDSEE VPEVINSTED PTGLSSSENA TITCNLTSSD SSSNQKRELA FKRPLKKKME
-1 50 100 150 200 250 300 350 400 450 500 550 600 650 700 750 800 850 900 950 1000 1050 1100 1150 1151
References
1 Pigott, R. and Power, C. (1993) The Adhesion Molecule FactsBook. Academic Press, London.
2 Hemler, M.E. (1990) Annu. Rev. Immunol. 114, 365-400.
3 Hemler, M.E. et al. (1984) J. Immunol. 132, 3011-3018.
4 Hemler, M.E. et al. (1986) J. Clin. Invest. 78, 696-702.
5 Perez-Villar, J.J. et al. (1996) Eur. J. Immunol. 26, 2023-2029.
6 Briesewitz, R. et al. (1993) J. Biol. Chem. 268, 2989-2996.
7 Takada, Y. et al. (1987) Proc. Natl Acad. Sci. USA 84, 3239-3243.
8 Kramer, R.H. and Marks, N. (1989) J. Biol. Chem. 264, 4684-4688.
9 Hall, D.E. et al. (1990) J. Cell Biol. 110, 2175-2184.
10 Goldman, R. et al. (1992) Eur. J. Immunol. 22, 1109-1114.
11 Ignatius, M.J. et al. (1990) J. Cell Biol. 111, 709-720.
CD49b
Other names Integrin α2 subunit, VLA-2 α subunit, Ia subunit of platelet GP Ia-IIa
Molecular weights
Polypeptides 126 378
SDS-PAGE reduced 165 kDa
unreduced 160 kDa
Carbohydrate
N-linked sites 10
O-linked sites unknown
Human gene location 5q23-31
[Diagram: CD49b/CD29]
Tissue distribution CD49b is expressed on monocytes, platelets, B and T lymphocytes, and NK cells 1-4. The level of expression on T cells is elevated upon prolonged culture 3. Like CD49a/CD29, CD49b/CD29 is also found on lymphocytes in the synovial joints of rheumatoid patients. In contrast to CD49a/CD29, CD49b/CD29 is further upregulated on stimulation with phytohaemagglutinin 5.
Structure CD49b is the α2 integrin subunit which forms a non-covalently associated heterodimer with CD29 (β1 integrin subunit). CD49b belongs to the integrin subclass which contains an I-domain 6. The N-terminal sequence of CD49b has been determined 6. A polymorphism at residue 505 gives rise to the platelet alloantigenic Bra (Lys) and Brb (Glu) variants 7.
Ligands and associated molecules CD49b/CD29 binds collagen and laminin 2. Binding to both ligands can be inhibited by peptides containing the DGEA sequence 8. The I-domain of CD49b is involved in collagen binding 9.
Function CD49b/CD29 mediates the Mg2+-dependent adhesion of platelets to collagen 8,10. It is also a collagen receptor on leucocytes 11,12 and fibroblasts 13. CD49b/CD29 may be utilized by fibroblasts for the reorganization of the collagen matrix during wound healing 14,15. The upregulation of CD49b/CD29 may also be used by tumour cells to remodel the collagen matrix during tumour growth, invasion and metastasis 15-17. CD49b/CD29 can act as a laminin receptor in certain cell lines but not in platelets or fibroblasts 12. CD49b/CD29 integrin is a receptor for echovirus-1 18.
Database accession numbers Human Cattle
PIR A33998
SWISSPROT P17301
EMBL/GENBANK X17033 L25886
REFERENCE 6 9
Amino acid sequence of human CD49b MGPERTGAAP YNVGLPEAKI VYKCPVDLST TCGPLWAQQC DESNSIYPWD KTKEEMIVAT VVTDGESHDG KAIASIPTER QVGFSADYSS DRNHSSYLGY QAHRGDQIGS FTIKKGILGQ NQNSGAVYIY GDSITDVSIG CFSAKFRPTK VVNQAQSCPE VFSIPFHKDC KNKRESAYNT YPALKREQQV IPLLYDAEIH GSVPVSMATV QTSSSVSFKS WNGTFASSTF VPTGVIIGSI SS
LPLLLVLALS FSGPSSEQFG ATCEKLNLQT GNQYYTTGVC AVKNFLEKFV SQTSQYGGDL SMLKAVIDQC YFFNVSDEAA QNDILMLGAV SVAAISTGES YFGSVLCSVD HQFLEGPEGI NGHQGTIRTK AFGQVVQLWS QNNQVAIVYN HIIYIQEPSD GEDGLCISDL GIVVDFSENL TFTINFDFNL LTRSTNINFY IIHIPQYTKE ENFRHTKELN QTVQLTAAAE IAGILLLLAL
QGILNCCLA YAVQQFINPK STSIPNVTEM SDISPDFQLS QGLDIGPTKT TNTFGAIQYA NHDNILRFGI LLEKAGTLGE GAFGWSGTIV THFVAGAPRA VDKDTITDVL ENTRFGSAIA YSQKILGSDG QSIADVAIEA ITLDADGFSS VVNSLDLRVD VLDVRQIPAA FFASFSLPVD QNLQNQASLS EISSDGNVPS KNPLMYLTGV CRTASCSNVT INTYNPEIYV VAILWKLGFF
GNWLLVGSPW KTNMSLGLIL ASFSPATQPC QVGLIQYANN RKYAYSAASG AVLGYLNRNA QIFSIEGTVQ QKTSHGHLIF NYTGQIVLYS LVGAPMYMSD ALSDINMDGF AFRSHLQYFG SFTPEKITLV RVTSRGLFKE ISLENPGTSP QEQPFIVSNQ GTEVTCQVAA FQALSESQEE IVHSFEDVGP QTDKAGDISC CWLKDVHMKG IEDNTVTIPL KRKYEKMTKN
SGFPENRMGD TRNMGTGGFL PSLIDVVVVC PRVVFNLNTY GRRSATKVMV LDTKNLIKEI GGDNFQMEMS PKQAFDQILQ VNENGNITVI LKKEEGRVYL NDVIVGSPLE RSLDGYGDLN NKNAQIILKL NNERCLQKNM ALEAYSETAK NKRLTFSVTL SQKSVACDVG NKADNLVNLK KFIFSLKVTT NADINPLKIG EYFVNVTTRI MIMKPDEKAE PDEIDETTEL
-1 50 100 150 200 250 300 350 400 450 500 550 600 650 700 750 800 850 900 950 1000 1050 1100 1150 1152
References
1 Pigott, R. and Power, C. (1993) The Adhesion Molecule FactsBook. Academic Press, London.
2 Hemler, M.E. (1990) Annu. Rev. Immunol. 114, 365-400.
3 Hemler, M.E. et al. (1984) J. Immunol. 132, 3011-3018.
4 Perez-Villar, J.J. et al. (1996) Eur. J. Immunol. 26, 2023-2029.
5 Hemler, M.E. et al. (1986) J. Clin. Invest. 78, 696-702.
6 Takada, Y. and Hemler, M.E. (1989) J. Cell Biol. 109, 397-407.
7 Santoso, S. et al. (1993) J. Clin. Invest. 92, 2427-2432.
8 Staatz, W.D. et al. (1991) J. Biol. Chem. 266, 7363-7367.
9 Kamata, T. et al. (1994) J. Biol. Chem. 269, 9659-9663.
10 Staatz, W.D. et al. (1989) J. Cell Biol. 108, 1917-1924.
11 Goldman, R. et al. (1992) Eur. J. Immunol. 22, 1109-1114.
12 Elices, M.J. and Hemler, M.E. (1989) Proc. Natl Acad. Sci. USA 86, 9906-9910.
13 Wayner, E.A. and Carter, W.G. (1987) J. Cell Biol. 105, 1873-1884.
14 Schiro, J.A. et al. (1991) Cell 67, 403-410.
15 Klein, C.E. et al. (1991) J. Cell Biol. 115, 1427-1436.
16 Chan, B.M.C. et al. (1991) Science 251, 1600-1602.
17 Chen, F.A. et al. (1991) J. Exp. Med. 173, 1111-1119.
18 Bergelson, J.M. et al. (1992) Science 255, 1718-1720.
CD49c
Other names Integrin α3 subunit, VLA-3 α subunit
Molecular weights
Polypeptides 113 343
SDS-PAGE reduced 130 + 25 kDa
unreduced 150 kDa
Carbohydrate
N-linked sites 14
O-linked sites unknown
Human gene location 17q
[Diagram: CD49c/CD29]
Tissue distribution CD49c is expressed at low levels on monocytes, and B and T lymphocytes 1,2. It is expressed on most cultured adherent cell lines but not on lymphoid cell lines 3.
Structure CD49c is the α3 integrin subunit which forms non-covalently associated heterodimers with CD29 (β1 integrin subunit). It belongs to the integrin subclass which does not contain an I-domain and is cleaved into large N-terminal and small C-terminal chains which remain disulfide linked 2,4. The N-terminal sequence of CD49c has been determined 5.
Ligands and associated molecules The integrin CD49c/CD29 has been shown to bind many ligands, but its affinity for these ligands depends on the cell type on which it is expressed, divalent cation concentrations, and the presence of other integrin heterodimers on the cell 6,7. Thus, CD49c/CD29 binds fibronectin at the RGD sites but this binding is only detectable on cells that do not express the CD49e/CD29 integrin 6. CD49c/CD29 also binds collagen and laminin, but this binding is not inhibited by RGD-containing peptides 6,7. CD49c/CD29 has been shown to mediate the binding of cells to epiligrin, an extracellular matrix protein on epithelial basement membranes 8.
Function K562 cells (which do not express any endogenous CD49 antigens) failed to bind fibronectin, collagen or laminin when transfected with CD49c cDNA, but showed binding to epiligrin, suggesting a role for CD49c/CD29 in mediating cell adhesion to the epithelial basement membranes 8,9. CD49c/CD29 has been found at contact sites between cultured cells and so may have a role in cell-cell as well as cell-matrix adhesion 10.
Database accession numbers
          PIR      SWISSPROT   EMBL/GENBANK   REFERENCE
Human     A40021   P26006      M59911         4
Hamster   A35761   P17852      J05281         11
Amino acid sequence of human CD49c MGPGPSRAPR FNLDTRFLVV GYTNRTGAVY PAGRVLVCAH EMCNSNTDYL DLSEYSYKDP LLSQEAGGDL ERKEEVGGAI GFQDIAVGAP SLSGQMDVDE ALCTATSCVQ GSESAVFHGF RPRLGLRSLD SEQQQKLSRL LLLSSVRPPG QVQLQLSTSS SGMKTVEDVG PTEITVHGNG QGPPPVTLAA VWNSTFIEDY VEELPAEIEL QKAEMKSQPS
APAPRLMLCA KEAGNPGSLF LCPLTAHKDD RYTQVLWSGS ETGMCQLGTS EDQGNLYIGY RRRQVLEGSQ YVFMNQAGTS FEGLGKVYIY NFYPDLLVGS VELCFAYNQS FSMPEMRCQK AYPILNQAQA QYSRDVRKLL ACQANETIFC HQDNLWPMIL SPLKYEFQVG SWPCRPPGDL AKKAKSETVL RDFDRVRVNG WLVLVAVGAG ETERLTDD
LALMVAAGGC GYSVALHRQT CERMNITVKN EDQRRMVGKC GGFTQNTVYF TMQVGSFILH VGAYFGSAIA FPAHPSLLLH HSSSKGLLRQ LSDHIVLLRA AGNPNYRRNI LELLLMDNLR LENHTEVQFQ LSINVTNTRT ELGNPFKRNQ TLLVDYTLQT PMGEGLVGLG INPLNLTLSD TCATGRAHCV WATLFLRTSI LLLLGLIILL
VVSA ERQQRYLLLA DPGHHIIEDM YVRGNDLELD GAPGAYNWKG PKNITIVTGA LADLNNDGWQ GPSGSAFGLS PQQVIHGEKL RPVINIVHKT TLAYTLEADR DKLRPIIISM KECGPDNKCE SERSGEDAHE RMELLIAFEV SLSMVNHRLQ TLVLGLEWPY PGDRPSSPQR WLECPIPDAP PTINMENKTT LWKCGFFKRA
GAPRELAVPD WLGVTVASQG SSDDWQTYHN NSYMIQRKEW PRHRHMGAVF DLLVGAPYYF VASIGDINQD GLPGLATFGY LVPRPAVLDP DRRPPRLRFA NYSLPLRMPD SNLQMRAAFV ALLTLVVPPA IGVTLHTRDL SFFGGTVMGE EVSNGKWLLY RRRQLDPGGG VVTNVTVKAR WFSVDIDSEL RTRALYEAKR
-1 50 100 150 200 250 300 350 400 450 500 550 600 650 700 750 800 850 900 950 1000 1018
References
1 Pigott, R. and Power, C. (1993) The Adhesion Molecule FactsBook. Academic Press, London.
2 Hemler, M.E. (1990) Annu. Rev. Immunol. 114, 365-400.
3 Rettig, W.J. and Old, L.J. (1992) Annu. Rev. Immunol. 7, 481-511.
4 Takada, Y. et al. (1991) J. Cell Biol. 115, 257-266.
5 Takada, Y. et al. (1987) Proc. Natl Acad. Sci. USA 84, 3239-3243.
6 Elices, M.J. et al. (1991) J. Cell Biol. 112, 169-181.
7 Gehlsen, K.R. et al. (1989) J. Biol. Chem. 264, 19034-19038.
8 Carter, W.G. et al. (1991) Cell 65, 599-610.
9 Weitzman, J.B. et al. (1993) J. Biol. Chem. 268, 8651-8657.
10 Kaufmann, R. et al. (1989) J. Cell Biol. 109, 1807-1815.
11 Tsuji, T. et al. (1990) J. Biol. Chem. 265, 7016-7021.
CD49d

Other names
Integrin α4 subunit
VLA-4 α subunit

Molecular weights
Polypeptides 111 228
SDS-PAGE
reduced 150 kDa
unreduced 180 kDa
unreduced variant 150 kDa

Carbohydrate
N-linked sites 11
O-linked sites unknown

Human gene location 2q31-q32

[Figure: CD49d/CD29]
Tissue distribution
CD49d is the α4 integrin subunit which can combine with CD29 (the β1 integrin subunit), to form the integrin CD49d/CD29 (α4β1, VLA-4), or with the β7 integrin subunit, to form the integrin α4β7. CD49d/CD29 is expressed on most leucocytes, with the possible exception of neutrophils and platelets 1-3. α4β7 is expressed on most lymph node T and B cells, NK cells and eosinophils 3,4. CD49d/CD29 is also expressed in non-lymphoid tissues 5.
Structure
CD49d does not belong to either of the two main groups of integrin α subunits: it contains neither an I-domain, nor a cleavage site to yield the characteristic N-terminal 125 kDa and C-terminal 30 kDa disulfide-linked peptides 6. Instead, it contains a variably used cleavage site which yields two non-disulfide-linked fragments of 80 and 70 kDa 7. CD49d migrates at either 150 kDa or 180 kDa in non-reducing SDS-PAGE. The 180 kDa form requires unoxidized cysteines and divalent cations and is likely to be the functionally more important form of the molecule 8,9. The N-terminal sequence of CD49d has been determined 10.

Ligands and associated molecules
Both CD49d/CD29 (VLA-4) and the α4β7 integrin bind VCAM-1 as well as an alternatively spliced form of fibronectin which contains the peptide CS-1 2,11. The α4β7 integrin also binds the mucosal addressin MAdCAM-1 2,3,12.
Function
CD49d/CD29 is involved in the migration of leucocytes from blood to tissues at sites of inflammation. In addition, CD49d/CD29-mediated binding has been shown to provide a co-stimulatory signal to T cells for activation and proliferation 13. The α4β7 integrin is responsible for the homing of α4β7+ lymphocytes to the gut through recognition of MAdCAM-1 on mucosal high endothelial venules 3,12 (see the entry Integrin β7 subunit). Unlike the β2 integrins, which require selectin-mediated tethering and rolling of leucocytes before they can mediate arrest and firm adhesion to endothelium 14, CD49d/CD29 15 and α4β7 16 can mediate both the initial tethering/rolling and the subsequent firm adhesion.

Database accession numbers
          PIR      SWISSPROT   EMBL/GENBANK   REFERENCE
Human     S06046   P13612      X16983         5,17
Mouse     A41131   Q00651      X53176         18
Amino acid sequence of human CD49d MFPTESAWLG YNVDTESALL NPGAIYRCRI QPGENGSIVT YQDYVKKFGE KYKAFLDKQN FSIDEKELNI REEGRVFVYI EDVAIGAPQE ISGQIDADNN FDCVENGWPS YFSSNGTSDV GPHVISKRST SAKIGFLKPH YFIKILELEE SLSRAEEDLS PTSFVYGSND PQTDKLFNIL DKRLLYCIKA LKFEIRATGF SLLLGLIVLL
KRGANPGPEA YQGPHNTLFG GKNPGQTCEQ CGHRWKNIFY NFASCQAGIS QVKFGSYLGY LHEMKGKKLG NSGSGAVMNA DDLQGAIYIY GYVDVAVGAF VCIDLTLCFS ITGSIQVSSR EEFPPLQPIL ENKTYLAVGS KQINCEVTDN ITVHATCENE ENEPETCMVE DVQTTTGECH DPHCLNFLCN PEPNPRVIEL LISYVMWKAG
AVRETVMLLL YSVVLHSHGA LQLGSPNGEP IKNENKLPTG SFYTKDLIVM SVGAGHFRSQ SYFGASVCAV METNLVGSDK NGRADGISST RSDSAVLLRT YKGKEVPGYI EANCRTHQAF QQKKEKDIMK MKTLMLNVSL SGVVQLDCSI EEMDNLKHSR KMNLTFHVIN FENYQRVCAL FGKMESGKEA NKDENVAHVL FFKRQYKSIL
CLGVPTGRP NRWLLVGAPT CGKTCLEERD GCYGVPPDLR GAPGSSYWTG HTTEVVGGAP DLNADGFSDL YAARFGESIV FSQRIEGLQI RPVVIVDASL VLFYNMSLDV MRKDVRDILT KTINFARFCA FNAGDDAYET GYIYVDHLSR VTVAIPLKYE TGNSMAPNVS EQQKSAMQTL SVHIQLEGRP LEGLHHQRPK QEENRRDSWS
ANWLANASVI NQWLGVTLSR TELSKRIAPC SLFVYNITTN QHEQIGKAYI LVGAPMQSTI NLGDIDNDGF SKSLSMFGQS SHPESVNRTK NRKAESPPRF PIQIEAAYHL HENCSADLQV TLHVKLPVGL IDISFLLDVS VKLTVHGFVN VEIMVPNSFS KGIVRFLSKT SILEMDETSA RYFTIVIISS YINSKSNDD
-1 50 100 150 200 250 300 350 400 450 500 550 600 650 700 750 800 850 900 950 999
References
1 Pigott, R. and Power, C. (1993) The Adhesion Molecule FactsBook. Academic Press, London.
2 Lobb, R.R. and Hemler, M.E. (1994) J. Clin. Invest. 94, 1722-1728.
3 Rott, L.S. et al. (1996) J. Immunol. 156, 3727-3736.
4 Erle, D.J. et al. (1994) J. Immunol. 153, 517-528.
5 Rosen, G.D. et al. (1992) Cell 69, 1107-1119.
6 Takada, Y. et al. (1989) EMBO J. 8, 1361-1368.
7 Hemler, M.E. et al. (1987) J. Biol. Chem. 262, 11478-11485.
8 Parker, C.M. et al. (1990) J. Biol. Chem. 268, 7028-7035.
9 Pujades, C. et al. (1996) Biochem. J. 313, 899-908.
10 Takada, Y. et al. (1987) Proc. Natl Acad. Sci. USA 84, 3239-3243.
11 Kilger, G. et al. (1995) J. Biol. Chem. 270, 5979-5984.
12 Berlin, C. et al. (1993) Cell 74, 185-195.
13 Shimizu, Y. et al. (1990) J. Immunol. 145, 59-67.
14 Springer, T.A. (1994) Cell 76, 301-314.
15 Alon, R. et al. (1995) J. Cell Biol. 128, 1243-1253.
16 Berlin, C. et al. (1995) Cell 80, 413-422.
17 Rubio, M. et al. (1992) Eur. J. Immunol. 22, 1099-1102.
18 Neuhaus, H. et al. (1991) J. Cell Biol. 115, 1149-1158.
CD49e

Other names
Integrin α5 subunit
VLA-5 (fibronectin receptor) α subunit
Ic subunit of GPIc-IIa

Molecular weights
Polypeptides 110 012
SDS-PAGE
reduced 135 + 25 kDa
unreduced 155 kDa

Carbohydrate
N-linked sites 14
O-linked sites unknown

Human gene location 12q11-q13

[Figure: CD49e/CD29]
Tissue distribution
CD49e is expressed on thymocytes, T cells, monocytes and platelets 1,2. Expression is increased on activated and memory T cells 2,3. It is also expressed on very early B cells and on activated B cells 4.

Structure
CD49e associates with CD29 to form the CD49e/CD29 integrin (α5β1, VLA-5). CD49e does not have an I-domain 5,6. It is cleaved at one site to yield 135 kDa N-terminal and 25 kDa C-terminal disulfide-linked peptides. The N-terminal sequence of CD49e has been determined 7. Electron microscopy of purified CD49e/CD29 has provided the structural model for all integrins 8.

Ligands and associated molecules
CD49e/CD29 is the fibronectin receptor and binds to the RGD sequence of fibronectin 2,9. CD49e/CD29 has been shown to bind the neural adhesion molecule L1, which is also expressed on some leucocytes 10.

Function
CD49e/CD29-mediated binding to fibronectin provides a co-stimulatory signal to T cells 11. It also enhances Fcγ receptor- and complement receptor-mediated phagocytosis 12,13, and VLA-2-mediated binding of monocytes to collagen 14. CD49e/CD29 is involved in monocyte migration into extracellular tissues 15.
Database accession numbers
          PIR      SWISSPROT   EMBL/GENBANK   REFERENCE
Human     A27079   P08648      X06256         5,6
Mouse     PL0103   P11688      X15203         16
Amino acid sequence of human CD49e
MGSRTPESPL FNLDAEAPAV QGGAVYLCPW FGATVRAHGS CRSDFSWAAG AESYYPEYLI GVPKGNLTYG DLLVGAPLLM FGSSLTPLGD VLQPLWAASH IVSASASLTI FTVELQLDWQ RNESEFRDKL LLDCGEDNIC ELRVTAPPEA SLWGGLRFTV VTLNGVSKPE QGVLELSCPQ QKREAPSRSS AKTFLQREHQ EGSYGVPLWI KPPATSDA
HAVQLRWGPR LSGPPGSFFG GASPTQCTPI SILACAPLYS QGYCQGGFSA NLVQGQLQTR YVTILNGSDI DRTPDGRPQE LDQDGYNDVA TPDFFGSALR FPAMFNPEER KQKGGVRRAL SPIHIALNFS VPDLQLEVFG EYSGLVRHPG PHLRDTKKTI AVLFPVSDWH ALEGQQLLYV ASSGPQILKC PFSLQCEAVY IILAILFGLL
RRPPLVPLLL FSVEFYRPGT EFDSKGSRLL WRTEKEPLSD EFTKTGRVVL QASSIYDDSY RSLYNFSGEQ VGRVYVYLQH IGAPFGGETQ GGRDLDGNGY SCSLEGNPVA FLASRQATLT LDPQAPVDSH EQNHVYLGDK NFSSLSCDYF QFDFQILSKN PRDQPQKEED TRVTGLNCTT PEAECFRLRC KALKMPYRIL LLGLLIYILY
LLVPPPPRVG DGVSVLVGAP ESSLSSSEGE PVGTCYLSTD GGPGSYFWQG LGYSVAVGEF MASYFGYAVA PAGIEPTPTL QGVVFVFPGG PDLIVGSFGV CINLSFCLNA QTLLIQNGAR GLRPALHYQS NALNLTFHAQ AVNQSRLLVC LNNSQSDVVS LGPAVHHVYE NHPINPKGLE ELGPLHQQES PRQLPQKERQ KLGFFKRSLP
G KANTSQPGVL EPVEYKSLQW NFTRILEYAP QILSATQEQI SGDDTEDFVA ATDVNGDGLD TLTGHDEFGR PGGLGSKPSQ DKAVVYRGRP SGKHVADSIG EDCREMKIYL KSRIEDKAQI NVGEGGAYEA DLGNPMKAGA FRLSVEAQAQ LINQGPSSIS LDPEGSLHHQ QSLQLHFRVW VATAVQWTKA YGTAMEKAQL
-1 50 100 150 200 250 300 350 400 450 500 550 600 650 700 750 800 850 900 950 1000 1008

References
1 Pigott, R. and Power, C. (1993) The Adhesion Molecule FactsBook. Academic Press, London.
2 Hemler, M.E. (1990) Annu. Rev. Immunol. 114, 365-400.
3 Shimizu, Y. et al. (1990) Nature 345, 250-253.
4 Ballard, L.L. et al. (1991) Clin. Exp. Immunol. 84, 336-346.
5 Argraves, W.S. et al. (1986) J. Cell Biol. 105, 1183-1190.
6 Fitzgerald, L.A. et al. (1987) Biochemistry 26, 8158-8165.
7 Takada, Y. et al. (1987) Proc. Natl Acad. Sci. USA 84, 3239-3243.
8 Nermut, M.N. et al. (1988) EMBO J. 7, 4093-4099.
9 Hynes, R.O. (1992) Cell 69, 11-25.
10 Ruppert, M. et al. (1995) J. Cell Biol. 131, 1881-1891.
11 Shimizu, Y. et al. (1990) J. Immunol. 145, 59-67.
12 Wright, S.D. et al. (1984) J. Cell Biol. 99, 336-339.
13 Pommier, C.G. et al. (1983) J. Exp. Med. 157, 1844-1854.
14 Pacifici, R. et al. (1994) J. Immunol. 153, 2222-2233.
15 Weber, C. et al. (1996) J. Cell Biol. 134, 1063-1073.
16 Holers, V.M. et al. (1989) J. Exp. Med. 169, 1589-1605.
CD49f

Other names
Integrin α6 subunit
VLA-6 α subunit
Ic subunit of GPIc-IIa

Molecular weights
Polypeptides
A form 117 265
B form 119 869
SDS-PAGE
reduced 120 + 30 kDa
unreduced 140 kDa

Carbohydrate
N-linked sites 8
O-linked sites unknown

Human gene location 2

[Figure: CD49f/CD29]
Tissue distribution
CD49f is the α6 integrin subunit which can combine with CD29 (the β1 integrin subunit) to form the integrin VLA-6 (CD49f/CD29), which is expressed on thymocytes, T lymphocytes and monocytes 1,2. Increased expression is found on activated and memory T cells 2,3. CD49f also combines with CD104 to form the α6β4 integrin. Both CD49f/CD29 and CD49f/CD104 are widely expressed on epithelia in non-lymphoid tissues 4 (see CD104).
Structure
CD49f does not have an I-domain 5,6. It is cleaved at one site to yield 120 kDa N-terminal and 30 kDa C-terminal peptides which remain disulfide-linked. The N-terminal sequence of CD49f has been determined 7,8. Two alternatively spliced forms of CD49f cDNA have been described with different cytoplasmic domains 6,9. The A form alone is expressed in the lung, liver, spleen and cervix, whereas only the B form is seen in the brain, ovary and kidney. Both forms are detected in other tissues 9.
Ligands and associated molecules
CD49f/CD29 (α6β1 integrin) is the laminin receptor on platelets 10, monocytes 11 and T lymphocytes 3,12.

Function
CD49f/CD29-mediated T cell binding to laminin provides a co-stimulatory signal to T cells for activation and proliferation 12.
Database accession numbers
          PIR      SWISSPROT   EMBL/GENBANK   REFERENCE
Human     B36429   P23229      X53586         5
                               X59512         6
Amino acid sequence of human CD49f MAAAGQLCLL FNLDTREDNV LQRANRTGGL PGGKVVTCAH FCDGRLRGHE NTFFDMNIFE ITFVSGAPRA DLNKDGWQDI DSMFGIAVKN VLKGISPYFG ITVTPNRIDL AEKERRKSGL RDKLRPIPIT EGCGDDNVCN LEITVTNSPS LSCVANQNGS ETTSNQDNLA DEVGSLIEYE KGLEKVTCEP QTLNCSVNVN LMRAFIDVTA AGILMLALLV
YLSAGLLSRL IRKYGDPGSL YSCDITARGP RYEKRQHVNT KFGSCQQGVA DGPYEVGGET NHSGAVVLLK VIGAPQYFDR IGDINQDGYP YSIAGNMDLD RQKTACGAPS SSRVQFRNQG ASVEIQEPSS SNLKLEYKFC NPRNPTKDGD QADCELGNPF PITAKAKVVI FRVINLGKPL QKEINSLNLT CVNIRCPLRG AAENIRLPNA FILWKCGFFK
GAA FGFSLAMHWQ CTRIEFDNDA KQESRDIFGR ATFTKDFHYI EHDESLVPVP RDMKSAHLLP DGEVGGAVYV DIAVGAPYDD RNSYPDVAVG GICLQVKSCF SEPKYTQELT RRRVNSLPEV TREGNQDKFS DAHEAKLIAT KRNSNVTFYL ELLLSVSGVA TNLGTATLNI ESHNSRKKRE LDSKASLILR GTQVRVTVFP RNKKDHYDAT
LQPEDKRLLL DPTSESKEDQ CYVLSQNLRI VFGAPGTYNW ANSYLGFSLD EHIFDGEGLA YMNQQGRWNN LGKVFIYHGS SLSDSVTIFR EYTANPAGYN LKRQKQKVCM LPILNSDEPK YLPIQKGVPE FPDTLTYSAY VLSTTEVTFD KPSQVYFGGT QWPKEISNGK ITEKQIDDNR SRLWNSTFLE SKTVAQYSGV YHKAEIKAQP
VGAPRGEALP WMGVTVQSQG EDDMDGGDWS KGIVRVEQKN SGKGIVSKDE SSFGYDVAVV VKPIRLNGTK ANGINTKPTQ SRPVINIQKT PSISIVGTLE EETLWLQDNI TAHIDVHFLK LVLKDQKDIA RELRAFPEKQ TPYLDINLKL VVGEQAMKSE WLLYLVKVES KFSLFAERKY EYSKLNYLDI PWWIILVAIL SDKERLTSDA
-1 50 100 150 200 250 300 350 400 450 500 550 600 650 700 750 800 850 900 950 1000 1050
The above sequence is the A form. In the B form the sequence in bold is replaced with: CGFFK RSRYDDSVPR YHAVRIRKEE REIKDEKYID NLEKKQWITK WNRNESYS
1050 1068
References
1 Pigott, R. and Power, C. (1993) The Adhesion Molecule FactsBook. Academic Press, London.
2 Hemler, M.E. (1990) Annu. Rev. Immunol. 114, 365-400.
3 Shimizu, Y. et al. (1990) Nature 345, 250-253.
4 Natali, P.G. et al. (1992) J. Cell Sci. 103, 1243-1247.
5 Tamura, R.N. et al. (1990) J. Cell Biol. 111, 1593-1604.
6 Hogervorst, F. et al. (1991) Eur. J. Biochem. 199, 425-433.
7 Hemler, M.E. et al. (1989) J. Biol. Chem. 264, 6529-6535.
8 Kajiji, S. et al. (1989) EMBO J. 8, 673-680.
9 Tamura, R.N. et al. (1991) Proc. Natl Acad. Sci. USA 88, 10183-10187.
10 Sonnenberg, A. et al. (1988) Nature 336, 487-489.
11 Tobias, J.W. et al. (1987) Blood 69, 1265-1268.
12 Shimizu, Y. et al. (1990) J. Immunol. 145, 59-67.
CD50

Other names
ICAM-3, ICAM-R

Molecular weights
Polypeptide 56 255
SDS-PAGE
reduced 120-160 kDa (neutrophils)
        110-130 kDa (T lymphocytes)

Carbohydrate
N-linked sites 15
O-linked probably nil

Human gene location 19p13.3-13.2 1
Domains
[Figure: domain organization of CD50: signal sequence (S), five C2-set IgSF domains, transmembrane (TM) and cytoplasmic (CY) regions]
Tissue distribution
CD50 is constitutively expressed at high levels on leucocytes, including resident epidermal dendritic Langerhans cells 2. CD50 is generally not found on endothelia but has been detected on blood vessels within some lymphomas 3. CD50 is released from activated lymphocytes and neutrophils, probably by proteolytic cleavage, and soluble forms of CD50 are detectable in the blood 4,5.

Structure
CD50 is a heavily and variably N-glycosylated cell surface glycoprotein closely related to CD54 (ICAM-1) whose gene is also located in the region 19p13.3-13.2 7-9. Both proteins have five C2-set IgSF domains in their extracellular regions with 52% identity. The two membrane-distal IgSF domains of CD50 share ~37% identity with the extracellular portion of CD102 (ICAM-2). Electron microscopy studies suggest that CD50 is a straight rod approximately 15 nm in length 10. The CD50 cytoplasmic domain, which is poorly conserved in CD54 and CD102, is phosphorylated on tyrosine and serine residues following cell activation 11.
Ligands and associated molecules
CD50, like CD54 and CD102, is a ligand for the leucocyte integrin LFA-1 (CD11a/CD18) 7-9. Residues on the GFC β-sheet of the N-terminal domain interact with the CD11a I domain 10,12. Optimal LFA-1 binding to CD50 (as well as CD54 and CD102) requires activation of the cells expressing LFA-1 13. Unlike CD54, CD50 does not bind the integrins CD11b/CD18 (Mac-1) or CD11c/CD18 (p150,95). Crosslinked CD50 antibodies co-precipitate the tyrosine kinases Fyn and Lck 14.
Function
CD50 mediates adhesion between leucocytes. It is constitutively expressed on resting antigen-presenting cells, including dendritic cells, and appears to be important in the initial interactions between T cells and dendritic cells leading to T cell activation 2. Extensive crosslinking of CD50 on T cells with mAbs can mobilize intracellular Ca2+ 14, activate tyrosine phosphorylation, and stimulate T cell adhesion and proliferation 14,15, but the functional significance of these observations is unclear.
Database accession numbers
          PIR      SWISSPROT   EMBL/GENBANK   REFERENCE
Human     S28904   P32942      X69711         7-9
Amino acid sequence of human CD50 MATMVPSVLW QEFLLRVEPQ WAAFNLSNVT PVGQNFTLRC SRDDHGAPFS FLEVETSWPV ATARADQEGA STVTVSCMAG EVDGEFLHRN YPELRCLKEG EAGSSHFVPV LTSMQPTEAM
PRACWTLLVC NPVLSAGGSL GNSRILCSVY QVEGGSPRTS CRTELDMQPQ DCTLDGLFPA REIVCNVTLG ARVQVTLDGV SSVQLRVLYG SSREVPVGIP FVAVLLTLGV GEEPSRAE
CLLTPGVQG FVNCSTDCPS CNGSQITGSS LTVVLLRWEE GLGLFVNTSA SEAQVYLALG GERREARENL PAAAPGQPAQ PKIDRATCPQ FFVNVTHNGT VTIVLALMYV
SEKIALETSL NITVYGLPER ELSRQPAVEE PRQLRTFVLP DQMLNATVMN TVFSFLGPIV LQLNATESDD HLKWKDKTRH YQCQASSSRG FREHQRSGSY
SKELVASGMG VELAPLPPWQ PAEVTATVLA VTPPRLVAPR HGDTLTATAT NLSEPTAHEG GRSFFCSATL VLQCQARGNP KYTLVVVMDI HVREESTYLP
-1 50 100 150 200 250 300 350 400 450 500 518

References
1 Bossy, D. et al. (1994) Genomics 23, 712-713.
2 Starling, G.C. et al. (1995) Eur. J. Immunol. 25, 2528-2532.
3 Cordell, J.L. et al. (1994) J. Clin. Pathol. 47, 143-147.
4 Del Pozo, M.A. et al. (1994) Eur. J. Immunol. 24, 2586-2594.
5 Pino-Otin, M.R. et al. (1995) J. Immunol. 154, 3015-3024.
6 de Fougerolles, A.R. et al. (1995) Eur. J. Immunol. 25, 1008-1012.
7 de Fougerolles, A.R. et al. (1993) J. Exp. Med. 177, 1187-1192.
8 Fawcett, J. et al. (1992) Nature 360, 481-484.
9 Vazeux, R. et al. (1992) Nature 360, 485-488.
10 Sadhu, C. et al. (1994) Cell Adhesion Commun. 2, 429-440.
11 Skubitz, K.M. et al. (1995) J. Immunol. 154, 2888-2895.
12 Van Kooyk, Y. et al. (1996) J. Exp. Med. 183, 1247-1252.
13 de Fougerolles, A.R. et al. (1994) J. Exp. Med. 179, 619-629.
14 Juan, M. et al. (1994) J. Exp. Med. 179, 1747-1756.
15 Hernandez-Caselles, T. et al. (1993) Eur. J. Immunol. 23, 2799-2806.
CD51

Other names
α subunit of vitronectin receptor, integrin αV subunit

Molecular weights
Polypeptide 121 716
SDS-PAGE
reduced 125 + 24 kDa
unreduced 150 kDa

Carbohydrate
N-linked sites 13
O-linked sites unknown

Human gene location 2q31-q32

[Figure: CD51/CD61]
Tissue distribution
CD51 in combination with CD61 (αVβ3 integrin) is expressed on platelets but at a lower level than the integrin CD41/CD61 1,2. It is expressed on endothelial cells, certain activated leucocytes, NK cells, macrophages and neutrophils 3,4. It is also expressed on other tissues including smooth muscle cells and osteoclasts 3. CD51 is the most promiscuous integrin α subunit; it can form heterodimers with the β1 (CD29), β3 (CD61), β5, β6 and β8 integrin subunits in various tissues 5.

Structure
CD51 (integrin αV subunit) falls into the α subclass with no I-domain. It is processed into a large N-terminal chain and a small C-terminal chain which remain disulfide linked 6. The N-terminal sequences of both chains have been determined 7.

Ligands and associated molecules
CD51/CD61 is also known as the vitronectin receptor. It mediates binding of platelets to immobilized vitronectin without prior activation. Other ligands for CD51/CD61 include RGD-containing proteins such as fibrinogen, fibronectin, von Willebrand factor, laminin and thrombospondin 2,8 and the neural adhesion molecule L1, which is expressed on certain leucocytes 9. CD51/CD61 also mediates cell-cell adhesion via an interaction with CD31 10,11. It has been shown to associate on the cell surface with CD47 4,12,13.
Function
CD51/CD61 acts as an activation-independent receptor for platelet attachment and spreading on vitronectin and other RGD-containing proteins, including matrix components 2,8. It mediates leucocyte-endothelial cell adhesion via interaction with CD31 10,11. It has also been shown to initiate bone resorption by mediating the adhesion of osteoclasts to osteopontin 14, and it may play a role in angiogenesis 3. Antagonists of CD51/CD61 have been shown to inhibit tumour growth by disrupting angiogenesis 15.

Database accession numbers
          PIR      SWISSPROT   EMBL/GENBANK   REFERENCE
Human     A27421   P06756      M14648         6,7
Mouse              P43406      U14135         16

Amino acid sequence of human CD51
MAFPPRRRLR FNLDVDSPAE VEGGQVLKCD KQDKILACAP GQGFCQGGFS SIKYNNQLAT GMVYIYDGKN MDRGSDGKLQ DQDGFNDIAI PPSFGYSMKG PSILNQDNKT KLKQKGAIRR KLTPITIFME VCKPKLEVSV ADFIGVVRNN VHQQSEMDTS DHIFLPIPNW YKYNNNTLLY GERDHLITKR SLLWTETFMN VTWGIQPAPM QEREQLQPHE
LGPRGLPLLL YSGPEGSYFG WSSTRRCQPI LYHWRTEMKQ IDFTKADRVL RTAQAIFDDS MSSLYNFTGE EVGQVSVSLQ AAPYGGEDKK ATDIDKNGYP CSLPGTALKV ALFLYSRSPS YRLDYRTAAD DSDQKKIYIG EALARLSCAF VKFDLQIQSS EHKENPETEE ILHYDIDGPM DLALSEGDIH KENQNHSYSL PVPVWVIILA NGEGNSET
SGLLLPLCRA FAVDFFVPSA EFDATGNRDY EREPVGTCFL LGGPGSFYWQ YLGYSVAVGD QMAAYFGFSV RASGDFQTTK GIVYIFNGRS DLIVGAFGVD SCFNVRFCLK HSKNMTISRG TTGLQPILNQ DDNPLTLIVK KTENQTRQVV NLFDKVSPVV DVGPVVQHIY NCTSDMEINP TLGCGVAQCL KSSASFNVIE VLAGLLLLAV
SSRMFLLVGA AKDDPLEFKS QDGTKTVEYA GQLISDQVAE FNGDGIDDFV AATDINGDDY LNGFEVFARF TGLNAVPSQI RAILYRARPV ADGKGVLPRK GLMQCEELIA FTPANISRQA AQNQGEGAYE CDLGNPMKAG SHKVDLAVLA ELRNNGPSSF LRIKISSLQT KIVCQVGRLD FPYKNLPIED LVFVMYRMGF
PKANTTQPGI HQWFGASVRS PCRSQDIDAD IVSKYDPNVY SGVPRAARTL ADVFIGAPLF GSAIAPLGDL LEGQWAARSM ITVNAGLEVY LNFQVELLLD YLRDESEFRD HILLDCGEDN AELIVSIPLQ TQLLAGLRFS AVEIRGVSSP SKAMLHLQWP TEKNDTVAGQ RGKSAILYVK ITNSTLVTTN FKRVRPPQEE
-1 50 100 150 200 250 300 350 400 450 500 550 600 650 700 750 800 850 900 950 1000 1018

References
1 Pigott, R. and Power, C. (1993) The Adhesion Molecule FactsBook. Academic Press, London.
2 Keiffer, N. and Phillips, D.R. (1990) Annu. Rev. Cell Biol. 6, 329-357.
3 Varner, J.A. et al. (1995) Cell Adhes. Comm. 3, 367-374.
4 Hendey, B. et al. (1996) Blood 87, 2038-2048.
5 Busk, M. et al. (1992) J. Biol. Chem. 267, 5790-5796.
6 Suzuki, S. et al. (1987) J. Biol. Chem. 262, 14080-14085.
7 Suzuki, S. et al. (1986) Proc. Natl Acad. Sci. USA 83, 8614-8618.
8 Ginsberg, M.H. et al. (1993) Thromb. Haemost. 70, 87-93.
9 Ebeling, O. et al. (1996) Eur. J. Immunol. 26, 2508-2516.
10 Piali, L. et al. (1995) J. Cell Biol. 130, 451-460.
11 Buckley, C.D. et al. (1996) J. Cell Sci. 109, 437-455.
12 Lindberg, F.P. et al. (1996) J. Cell Biol. 134, 1313-1322.
13 Gao, A.G. et al. (1996) J. Cell Biol. 135, 533-544.
14 Ross, F.P. et al. (1993) J. Biol. Chem. 268, 9901-9907.
15 Varner, J.A. and Cheresh, D.A. (1996) Curr. Opin. Cell Biol. 8, 724-730.
16 Wada, J. et al. (1996) J. Cell Biol. 132, 1161-1176.
CD52

Other names
CAMPATH-1

Molecular weights
Polypeptide 1208
SDS-PAGE
reduced 21-28 kDa
unreduced 21-28 kDa

Carbohydrate
N-linked sites 1
O-linked nil

Tissue distribution
CD52 is expressed at high levels (5 × 10^5 molecules per cell) on lymphocytes and monocytes, but not on plasma cells, platelets or erythrocytes. In non-lymphoid tissues, CD52 is expressed by the epithelial cells of the epididymis and seminal vesicle. These cells appear to secrete the antigen into seminal plasma, where it can be taken up by sperm which are CD52 positive but do not express CD52 mRNA 1.

Structure
CD52 is an unusually short glycoprotein bearing a GPI anchor. The cDNA encodes a leader sequence of 24 amino acids followed by a 12 amino acid N-terminal sequence and a C-terminal GPI signal sequence 2. The major component of the molecule is a large sialylated, polylactosamine-containing core-fucosylated tetraantennary N-linked oligosaccharide 3. Two subclasses of CD52 have been identified, CD52-I and CD52-II, differing only in the PI moiety of the GPI anchor 3. The functional relevance of the two subclasses is not known. The N-terminus and GPI anchor attachment site have been determined biochemically 1,2. CD52 species homologues have been identified in the monkey 4, dog 5, rat 6 and mouse 7. Although these molecules share little amino acid sequence homology within the short peptide core, significant sequence identities are apparent in the leader and GPI signal sequences (see below). Where tissue expression data are available, the expression patterns of the CD52 species homologues are similar to that of the human 4-7. Moreover, the similarities in gene structure and promoter sequence of human CD52 and the predicted mouse homologue, B7(2), strongly support their classification as species homologues (M. Tone, personal communication).
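
The single N-linked site listed for CD52 can be checked against the human sequence with a short script. The following is a minimal sketch, not part of the original entry: it assumes the usual N-X-S/T sequon rule (X is any residue except proline), and the names LEADER, MATURE and n_linked_sequons are hypothetical. The two sequence strings are transcribed from the human row of the alignment in this entry.

```python
import re

# Human CD52, transcribed from the alignment in this entry (assumed split:
# 24-residue leader followed by the 12-residue mature peptide).
LEADER = "MKRFLFLLLTISLLVMVQIQTGLS"
MATURE = "GQNDTSQTSSPS"


def n_linked_sequons(seq):
    """Return 1-based positions of Asn in N-X-S/T motifs, with X not Pro."""
    # The lookahead keeps matches zero-width after N, so overlapping
    # motifs are all reported.
    return [m.start() + 1 for m in re.finditer(r"N(?=[^P][ST])", seq)]


if __name__ == "__main__":
    print(len(LEADER), len(MATURE))
    print(n_linked_sequons(MATURE))
```

A sequon is only a potential glycosylation site; the script checks the motif, not whether the site is occupied.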
Function
The function of CD52 is unknown. However, CD52 is an exceptionally good target for complement-mediated cell lysis and antibody-mediated cellular cytotoxicity 8. CD52 mAbs are very effective in vivo for lymphocyte depletion and a humanized CD52 mAb (CAMPATH-1H 9) is giving promising results in clinical studies as a treatment of malignant lymphomas and rheumatoid arthritis (e.g. see ref. 10). CD52 may be such a good target for serotherapy because the CAMPATH-1 epitope lies close to the plasma membrane, within the C-terminal tripeptide and the GPI anchor, thus facilitating complement lysis 8.

Database accession numbers (Human, Monkey, Dog, Rat, Mouse)
PIR            S18766 S27152 S40081
SWISSPROT      P31358 P32763
EMBL/GENBANK   X62466 X67495 S77412 X76697 M55561
REFERENCE      2 4 5 6 7
Amino acid sequences of CD52
Rat      MNTFL.LLLT ISLLVVVQIQ TGDL   -1
Mouse    MKSFL.LFLT IILLVVIQIQ TGSL   -1
Human    MKRFLFLLLT ISLLVMVQIQ TGLS   -1
Monkey   MKRFLFLLLT ISLLVMVQIQ TGVT   -1
Dog      MKGFLFLLLT ISLLVMIQIQ TGVL   -1

Rat      GQNSTAVTTP ANKAATTAAA TTKAAATTAT KTTTAVRKTP GKPPKAG   47
Mouse    GQ........ ....ATTAAS GTNKNSTSTK KT........ ..PLKSG   25
Human    GQNDTSQTSS PS........ .......... .......... .......   12
Monkey   GQNATSQ.SS PS........ .......... .......... .......   11
Dog      GNSTTPRMTT KKVKSATPA. .......... .......... .......   19

Rat      ASSITDVGAC TFLFF.ANTL MCLFYLS   73
Mouse    ASSIIDAGAC SFLFF.ANTL MCLFYLS   51
Human    ASSNI.SG.G IFLFFVANAI IHLFCFS   37
Monkey   ASSNL.SG.G GFLFFVANAI IHLFYFS   36
Dog      LSSL..GG.G SVLLFLANTL IQLFYLS   43
References
1 Taylor, V. and Hale, G. (1995) Leucocyte Typing V, 235-237.
2 Xia, M.-Q. et al. (1991) Eur. J. Immunol. 21, 1677-1684.
3 Treumann, A. et al. (1995) J. Biol. Chem. 270, 6088-6099.
4 Perry, A.C.F. et al. (1992) Biochim. Biophys. Acta 1171, 122-124.
5 Ellerbrock, K. et al. (1994) Int. J. Androl. 17, 314-323.
6 Kirchhoff, C. (1994) Biol. Reprod. 50, 896-902.
7 Kubota, H. et al. (1990) J. Immunol. 145, 3924-3931.
8 Xia, M.-Q. et al. (1993) Mol. Immunol. 30, 1089-1096.
9 Riechmann, L. et al. (1988) Nature 332, 323-327.
10 Osterborg, A. et al. (1996) Br. J. Haematol. 93, 151-153.
CD53

Other names
OX-44 (rat)

Molecular weights
Polypeptide 24 341
SDS-PAGE
reduced 35-42 kDa
unreduced 35-42 kDa

Carbohydrate
N-linked sites 2
O-linked nil

Human gene location 1p21-p13.3; >26 kb 1
Tissue distribution
CD53 is expressed by T and B cells, monocytes, macrophages, granulocytes, dendritic cells, osteoblasts and osteoclasts. It is absent from platelets and erythrocytes (reviewed in ref. 2). In the rat and mouse, CD53 is expressed on only about 12% of thymocytes and is largely absent from the CD4+CD8+ population 3. CD53 expression on thymocytes is induced by T cell receptor engagement during repertoire selection 3.
Structure CD53 is a member of the TM4 superfamily and is predicted to have four transmembrane regions, short cytoplasmic N- and C-termini, and two extracellular regions (reviewed in ref. 2). Epitope mapping of anti-rat CD53 mAbs has proved the major hydrophilic region to be extracellular, which is consistent with the proposed topology of TM4SF molecules 2. Gene mapping data suggest that CD37 and CD53 have evolved by gene duplication and divergence from a common ancestral gene 4.
Ligands and associated molecules
Co-immunoprecipitation and flow cytometric energy transfer studies have shown that CD53 associates non-covalently with a number of other molecules: CD2 in rat NK and T cells 2; the integrin CD29/CD49d (VLA-4) in a human T cell line 5; MHC Class I and II, CD19, CD20, CD21, CD37, CD81 and CD82 in human B cells 6; and a protein tyrosine phosphatase in rat lymph node cells 7. No extracellular ligand has been identified for CD53.
Function
In vitro experiments using mAbs suggest that CD53 can transduce signals in human B cells, monocytes and granulocytes, and in rat macrophages, NK and T cells 2.
Database accession numbers
          PIR      SWISSPROT   EMBL/GENBANK   REFERENCE
Human     A37243   P19397      M37033         8,9
Rat       A39574   P24485      M57276         10
Mouse                          X97227         4
Amino acid sequence of human CD53
MGMSSLKLLK YVLFFFNLLF WICGCCILGF GIYLLIHNNF GVLFHNLPSL   50
TLGNVFVIVG SIIMVVAFLG CMGSIKENKC LLMSFFILLL IILLAEVTLA   100
ILLFVYEQKL NEYVAKGLTD SIHRYHSDNS TKAAWDSIQS FLQCCGINGT   150
SDWTSGPPAS CPSDRKVEGC YAKARLWFHS NFLYIGIITI CVCVIEVLGM   200
SFALTLNCQI DKTSQTIGL                                     219
References
1 Korinek, V. and Horejsi, V. (1993) Immunogenetics 38, 272-279.
2 Wright, M.D. and Tomlinson, M.G. (1994) Immunol. Today 15, 588-594.
3 Tomlinson, M.G. et al. (1995) Eur. J. Immunol. 25, 2201-2206.
4 Wright, M.D. et al. (1993) Int. Immunol. 5, 209-216.
5 Mannion, B.A. et al. (1996) J. Immunol. 157, 2039-2047.
6 Szöllősi, J. et al. (1996) J. Immunol. 157, 2939-2946.
7 Carmo, A.M. and Wright, M.D. (1995) Eur. J. Immunol. 25, 2090-2095.
8 Amiot, M. (1990) J. Immunol. 145, 4322-4325.
9 Angelisova, P. et al. (1990) Immunogenetics 32, 281-285.
10 Bellacosa, A. et al. (1990) Mol. Cell Biol. 11, 2864-2872.
CD54

Other names
ICAM-1

Molecular weights
Polypeptide 55 402
SDS-PAGE
reduced 85-110 kDa

Carbohydrate
N-linked sites 8
O-linked unknown

Human gene location and size 19p13.3-13.2 1; 12 kb 2
Domains
[Figure: domain organization and exon boundaries of CD54: five C2-set IgSF domains, transmembrane (TM) and cytoplasmic (CY) regions]
Tissue distribution
CD54 has a wide tissue distribution, being expressed on both haematopoietic and non-haematopoietic cells 3. Expression on resting leucocytes is low but is upregulated on activation 4,5. Similarly, expression on endothelium and other non-haematopoietic cells is strongly upregulated by inflammatory mediators 4,5. A soluble form of ICAM-1 is detectable in the blood 6.

Structure
CD54 is closely related to CD50 (ICAM-3) whose gene is also located in the region 19p13.3-13.2. Both proteins have five C2-set IgSF domains in their extracellular regions with 52% identity. The two membrane-distal IgSF domains of CD54 also share ~34% identity with the extracellular portion of CD102 (ICAM-2). Electron microscopy studies suggest that CD54 is a bent rod, 18.7 nm in length 7. Cell surface CD54 may exist as a non-covalently linked dimer 8. A number of alternatively spliced forms of CD54 are expressed in the mouse 9. These isoforms, which have a more restricted pattern of expression, lack combinations of domains 2-4, but most still bind LFA-1.
Ligands and associated molecules
CD54, like CD50 and CD102, is a ligand for the leucocyte integrin CD11a/CD18 (LFA-1) 7. CD54 also binds the related integrins CD11b/CD18 (Mac-1) 10 and CD11c/CD18 (p150,95) 11 and has been reported to bind hyaluronan 12 and fibrinogen 13. CD54 is a receptor for the major group of rhinoviruses 7 and is one of the receptors on endothelium (others include CD36 and thrombospondin) for Plasmodium falciparum-infected erythrocytes 14. The LFA-1 binding site lies on the GFC β sheet of the membrane-distal domain (domain 1) 7. Rhinoviruses bind an overlapping site on domain 1 (and possibly 2) 7 whereas Mac-1 binds to domain 3 10.
Function Endothelial CD54 contributes to the extravasation of leucocytes from blood vessels, particularly in areas of inflammation 15. CD54 on antigen-presenting cells contributes to antigen-specific T cell activation, presumably by enhancing interactions between T cells and antigen-presenting cells 15.
Database accession numbers
         PIR      SWISSPROT   EMBL/GENBANK   REFERENCE
Human    A29849   P05362      X06990         16
Mouse    A45815   P13597      X52264         17
Rat                           D00913         18
Amino acid sequence of human CD54
MAPSSPRPAL PALLVLLGAL FPGPG                               -1
NAQTSVSPSK VILPRGGSVL VTCSTSCDQP KLLGIETPLP KKELLLPGNN    50
RKVYELSNVQ EDSQPMCYSN CPDGQSTAKT FLTVYWTPER VELAPLPSWQ   100
PVGKNLTLRC QVEGGAPRAN LTVVLLRGEK ELKREPAVGE PAEVTTTVLV   150
RRDHHGANFS CRTELDLRPQ GLELFENTSA PYQLQTFVLP ATPPQLVSPR   200
VLEVDTQGTV VCSLDGLFPV SEAQVHLALG DQRLNPTVTY GNDSFSAKAS   250
VSVTAEDEGT QRLTCAVILG NQSQETLQTV TIYSFPAPNV ILTKPEVSEG   300
TEVTVKCEAH PRAKVTLNGV PAQPLGPRAQ LLLKATPEDN GRSFSCSATL   350
EVAGQLIHKN QTRELRVLYG PRLDERDCPG NWTWPENSQQ TPMCQAWGNP   400
LPELKCLKDG TFPLPIGESV TVTRDLEGTY LCRARSTQGE VTREVTVNVL   450
SPRYEIVIIT VVAAAVIMGT AGLSTYLYNR QRKIKKYRLQ QAQKGTPMKP   500
NTQATPP                                                  507
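The count of N-linked sites given for CD54 can be compared with the sequence itself. The sketch below is only an illustration, not a method described in this entry: it assumes the usual consensus sequon Asn-X-Ser/Thr with X not Pro, and it numbers residues from the first residue of the mature protein, as the sequence listing does. The names `MATURE_CD54`, `SEQUON` and `sequon_positions` are made up for this example.

```python
import re

# Mature human CD54 as listed in this entry (the leader row ending at -1 is omitted).
MATURE_CD54 = "".join("""
NAQTSVSPSK VILPRGGSVL VTCSTSCDQP KLLGIETPLP KKELLLPGNN
RKVYELSNVQ EDSQPMCYSN CPDGQSTAKT FLTVYWTPER VELAPLPSWQ
PVGKNLTLRC QVEGGAPRAN LTVVLLRGEK ELKREPAVGE PAEVTTTVLV
RRDHHGANFS CRTELDLRPQ GLELFENTSA PYQLQTFVLP ATPPQLVSPR
VLEVDTQGTV VCSLDGLFPV SEAQVHLALG DQRLNPTVTY GNDSFSAKAS
VSVTAEDEGT QRLTCAVILG NQSQETLQTV TIYSFPAPNV ILTKPEVSEG
TEVTVKCEAH PRAKVTLNGV PAQPLGPRAQ LLLKATPEDN GRSFSCSATL
EVAGQLIHKN QTRELRVLYG PRLDERDCPG NWTWPENSQQ TPMCQAWGNP
LPELKCLKDG TFPLPIGESV TVTRDLEGTY LCRARSTQGE VTREVTVNVL
SPRYEIVIIT VVAAAVIMGT AGLSTYLYNR QRKIKKYRLQ QAQKGTPMKP
NTQATPP
""".split())

# Assumed consensus: Asn, then any residue except Pro, then Ser or Thr.
# The lookahead keeps overlapping candidates from hiding each other.
SEQUON = re.compile(r"N(?=[^P][ST])")


def sequon_positions(seq):
    """Return 1-based positions of Asn residues that start a potential sequon."""
    return [m.start() + 1 for m in SEQUON.finditer(seq)]


positions = sequon_positions(MATURE_CD54)
print(len(positions), positions)
```

A sequon match marks only a potential site. Whether each one is actually glycosylated cannot be read from the sequence.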
References
1 Trask, B. et al. (1993) Genomics 15, 133-145.
2 Voraberger, G. et al. (1991) J. Immunol. 147, 2777-2786.
3 Smith, M.E.F. and Thomas, J.A. (1990) J. Clin. Pathol. 43, 893-900.
4 Springer, T.A. (1990) Nature 346, 425-434.
5 Dustin, M.L. and Springer, T.A. (1991) Annu. Rev. Immunol. 9, 27-66.
6 Gearing, A.J.H. and Newman, W. (1993) Immunol. Today 14, 506-512.
7 Staunton, D.E. et al. (1990) Cell 61, 243-254.
8 Miller, J. et al. (1995) J. Exp. Med. 182, 1231-1241.
9 King, P.D. et al. (1995) J. Immunol. 154, 6080-6093.
10 Diamond, M.S. et al. (1991) Cell 65, 961-971.
11 de Fougerolles, A.R. et al. (1995) Eur. J. Immunol. 25, 1008-1012.
12 McCourt, P.A.G. et al. (1994) J. Biol. Chem. 269, 30081-30084.
13 Languino, L.R. et al. (1993) Cell 73, 1423-1434.
14 Berendt, A.R. et al. (1992) Cell 68, 71-81.
15 Xu, H. et al. (1994) J. Exp. Med. 180, 95-109.
16 Simmons, D. et al. (1988) Nature 331, 624-627.
17 Ballantyne, C.M. et al. (1989) Nucleic Acids Res. 17, 5853.
18 Kita, Y. et al. (1992) Biochim. Biophys. Acta 1131, 108-110.
CD55
Complement decay accelerating factor (DAF)
Molecular weights
Polypeptide 34 964
SDS-PAGE
reduced 60-70 kDa
unreduced 64-73 kDa
Carbohydrate
N-linked sites 1
O-linked sites abundant
Human gene location and size
1q32; ~40 kb 1,2
Domains
[Figure: domain diagram of CD55 with exon boundaries]
Tissue distribution CD55 is expressed on all cells in contact with serum, including all haematopoietic cells and the vascular endothelium. It is widely expressed on epithelia in the gastrointestinal and genitourinary tracts and the central nervous system. A soluble form of CD55 is present in plasma and body fluids 2.
Structure The N-terminal region of the extracellular portion of CD55 consists of four complement control protein (CCP) domains attached to a Ser- and Thr-rich segment that is heavily O-glycosylated 1-4. It has been proposed that this segment serves as a spacer to project the functional CCP domains from the cell surface 2. CD55 expressed on a Chinese hamster ovary cell line defective in O-glycosylation is rapidly degraded, suggesting that the carbohydrate structures serve to protect CD55 from proteolysis 5. The N-terminus of CD55 has been determined by protein sequencing 6. The glycoprotein is GPI-anchored at Ser319 7. A minor form of CD55 mRNA arises from alternate splicing of an additional exon, but a corresponding protein has not been identified 1-3. A covalently linked homodimeric form has been described 8.
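Residue numbers such as Ser319 refer to the mature protein, with the leader sequence counted backwards from -1. The sketch below shows one way to convert between this numbering and a 1-based index into the full precursor. It assumes a 34-residue leader, which is the length of the row ending at -1 in the CD55 sequence listing. The function names are hypothetical.

```python
SIGNAL_LENGTH = 34  # assumed: residues in the leader row (ending at -1) of the CD55 listing


def mature_to_precursor(pos, signal_length=SIGNAL_LENGTH):
    """Map mature numbering (..., -2, -1, 1, 2, ...; there is no 0) to a 1-based precursor index."""
    if pos == 0:
        raise ValueError("this numbering scheme has no residue 0")
    if pos > 0:
        return pos + signal_length
    # Leader residues: -1 is the last leader residue, -signal_length the first.
    index = pos + signal_length + 1
    if index < 1:
        raise ValueError("position lies before the start of the leader")
    return index


def precursor_to_mature(index, signal_length=SIGNAL_LENGTH):
    """Inverse mapping: 1-based precursor index back to mature numbering."""
    if index < 1:
        raise ValueError("precursor index must be 1 or greater")
    if index > signal_length:
        return index - signal_length
    return index - signal_length - 1


print(mature_to_precursor(319))   # GPI-attachment serine in precursor numbering
print(precursor_to_mature(34))    # last leader residue
```

The same conversion applies to any entry once its leader length is read from the sequence listing. Databases that number from the initiator methionine report positions in the precursor frame.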
Ligands and associated molecules CD55 interacts with complement components (see below) and CD97, a seven-transmembrane domain protein which also contains three extracellular EGF domains 9.
Function CD55 is a member of the regulator of complement activation (RCA) family of proteins 2 (see CD35). It accelerates the dissociation of the components of the C3-convertases, namely C2a from C4b in the C4b2a complex (the C3-convertase of the classical pathway), and factor Bb from the C3bBb complex (the C3-convertase of the alternative pathway). CD55 expression increases upon T cell activation and antibodies to CD55 are mitogenic in the presence of phorbol esters 10. CD55 and the two other complement regulatory proteins, CD46 and CD59, have received much attention for their role in reproduction and xenotransplantation (see CD46).
Comments The mouse has two genes for CD55: one codes for a GPI-anchored protein and the other for a transmembrane protein 11. Analysis of mRNA expression showed that the GPI-anchored form is detected in most tissues, whereas the transmembrane form is highly expressed in the testes, and is also detected in bone marrow, lymph nodes, lung and liver. CD55 expressed on mouse erythrocytes is resistant to phosphatidylinositol-specific phospholipase C treatment whereas CD55 on the surface of spleen cells and PBMCs is removed by phospholipase C, suggesting that the two forms of CD55 are expressed on different cell types 12. This is another example of the lack of conservation between species of genes in the RCA complex (see CD35).
Database accession numbers
                 PIR      SWISSPROT   EMBL/GENBANK   REFERENCE
Human            A26359   P08174      M35156         3,4
Human Alt-form            P09679      M30142         1,3
Mouse GPI-form                        L41366         11
Mouse TM-form                         L41365         11
Amino acid sequence of human CD55
MTVARPSVPA ALPLLGELPR LLLLVLLCLP AVWG                     -1
DCGLPPDVPN AQPALEGRTS FPEDTVITYK CEESFVKIPG EKDSVICLKG    50
SQWSDIEEFC NRSCEVPTRL NSASLKQPYI TQNYFPVGTV VEYECRPGYR   100
REPSLSPKLT CLQNLKWSTA VEFCKKKSCP NPGEIRNGQI DVPGGILFGA   150
TISFSCNTGY KLFGSTSSFC LISGSSVQWS DPLPECREIY CPAPPQIDNG   200
IIQGERDHYG YRQSVTYACN KGFTMIGEHS IYCTVNNDEG EWSGPPPECR   250
GKSLTSKVPP TVQKPTTVNV PTTEVSPTSQ KTTTKTTTPN AQATRSTPVS   300
RTTKHFHETT PNKGSGTTS                                     319
GTTRLLSGHT CFTLTGLLGT LVTMGLLT                           +28
References
1 Post, T.W. et al. (1990) J. Immunol. 144, 740-744.
2 Liszewski, M.K. et al. (1996) Adv. Immunol. 61, 201-283.
3 Caras, I.W. et al. (1987) Nature 325, 545-549.
4 Medof, M.E. et al. (1987) Proc. Natl. Acad. Sci. USA 84, 2007-2011.
5 Reddy, P. et al. (1989) J. Biol. Chem. 264, 17329-17336.
6 Davitz, M.A. et al. (1987) J. Immunol. Methods 97, 71-76.
7 Moran, P. et al. (1991) J. Biol. Chem. 266, 1250-1257.
8 Nickells, M.W. et al. (1994) J. Immunol. 152, 676-685.
9 Hamann, J. et al. (1996) J. Exp. Med. 184, 1185-1189.
10 Davis, L.S. et al. (1988) J. Immunol. 141, 2246-2252.
11 Spicer, A.P. et al. (1995) J. Immunol. 155, 3079-3091.
12 Kameyoshi, Y. et al. (1989) Immunology 68, 439-444.
CD56
Neural cell adhesion molecule (NCAM) isoform
Other names
NKH-1 antigen
Leu-19 antigen
Fasciclin II (Drosophila)
apCAM (Mollusc)
Molecular weights
SDS-PAGE
reduced or unreduced
180 kDa (long cytoplasmic domain form)
140 kDa (short cytoplasmic domain form)
120 kDa (GPI-linked form)
Carbohydrate
N-linked sites 6
O-linked unknown
Human gene location
11q23.1
Domains
[Figure: domain diagram of CD56]
Tissue distribution On human haematopoietic cells CD56 is restricted to NK cells and a subpopulation of T lymphocytes. CD56 is not expressed on mouse or rat haematopoietic cells. Amongst both human and murine non-haematopoietic tissues, CD56 is expressed in adult neural tissue and muscle, and in many embryonic tissues. A number of tumour cell types are positive for CD56, including some myeloid leukaemias, myelomas, neuroblastomas, Wilms' tumours and small cell lung carcinomas (reviewed in refs 1 and 2).
Structure CD56 is an isoform of the neural cell adhesion molecule (NCAM) that is virtually identical to the 140 kDa brain isoform. Various NCAMs, including soluble and GPI-anchored isoforms, arise by alternative splicing from a single gene and are post-translationally modified by glycosylation, acylation, sulfation and phosphorylation. The extracellular domain of NCAM consists of five C2-set IgSF domains and two fibronectin type III domains. Electron microscopy of the purified protein suggests that these domains are tandemly arranged to form a flexible rod-like structure projecting from the cell surface (reviewed in ref. 1). The three-dimensional structure of the N-terminal IgSF domain has a high similarity to domain 1 of VCAM-1 3.
Ligands and associated molecules CD56 has a clear function in homotypic binding to CD56 molecules on other cells. All five IgSF domains appear to be involved, and these pair up in an antiparallel alignment such that domain 1 binds to domain 5, domain 2 to domain 4, and domain 3 to itself 4. NCAM on neuronal cells can bind to chondroitin sulfate proteoglycans of the cell matrix. The result of this interaction is the inhibition of neurite outgrowth 5. NCAM cis-interactions with other molecules have not been defined, although indirect evidence suggests a potential association of NCAM with the related neural cell adhesion molecule L1, and with the fibroblast growth factor receptor through which NCAM may transduce signals (reviewed in refs 6 and 7).
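The antiparallel pairing described above follows a simple rule: with five IgSF domains numbered from the N-terminus, domain d on one molecule faces domain 6 - d on the other. The short sketch below simply enumerates that rule. It illustrates the stated alignment only and is not a structural model; the names are made up for this example.

```python
N_IGSF_DOMAINS = 5  # IgSF domains in the NCAM extracellular region, as stated in the text


def antiparallel_partner(domain, n_domains=N_IGSF_DOMAINS):
    """Return the domain on the opposing molecule that faces `domain` in an antiparallel alignment."""
    if not 1 <= domain <= n_domains:
        raise ValueError(f"domain must be between 1 and {n_domains}")
    return n_domains + 1 - domain


# Collect each unordered pair once; the middle domain pairs with itself.
pairs = sorted({tuple(sorted((d, antiparallel_partner(d)))) for d in range(1, N_IGSF_DOMAINS + 1)})
print(pairs)
```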
Function The significance of CD56 expression on NK cells is not clear and remains controversial 2. However, the role of NCAM as a homotypic cell adhesion molecule on neuronal cells has been well studied (reviewed in refs 7 and 8). In vitro data suggest that NCAM functions in the control of neuronal development by regulating cell migration, neurite outgrowth, selective fasciculation and axon sorting, target recognition, and synaptic plasticity 8. NCAM knockout mice appear grossly normal, but do have subtle defects in neuronal guidance and connectivity. Moreover, these animals exhibit deficient spatial learning, which implicates NCAM in synaptic plasticity 8. NCAM gene targeting to produce mice that express only a secreted NCAM results in a dominant embryonic lethality. This suggests a role for NCAM in heterophilic interactions, in addition to its relatively well characterized function in homotypic adhesion 9. The capacity of NCAM for cell adhesion can be down-modulated by post-translational attachment of a large polysialic acid (PSA) moiety to the fifth IgSF domain. PSA expression is regulated by neuronal activity at the synapse and is required for the induction of synaptic plasticity by NCAM 10.
Comment NCAM is highly conserved between species, with the murine homologue sharing 24% and 26% amino acid identity with the insect and mollusc homologues, respectively 11. The Drosophila homologue (fasciclin II) functions in selective fasciculation and axon sorting 8, and the mollusc Aplysia homologue (apCAM) has a role in synaptic plasticity 11.
Database accession numbers for some NCAM isoforms
             PIR      SWISSPROT   EMBL/GENBANK   REFERENCE
Human        A26883   P13592      X16841         12
Rat          S00846   P13596      X06564         13
Mouse        A29673   P13595      Y00051         14
Chicken      A25435   P13590      M15861         15
Xenopus      S09600   P16170      M25696         16
Drosophila   B41054   P34082      M77166         17
Mollusc      C42632               M89648         11
Amino acid sequence of NK cell CD56 (ref. 18)
MLQTKDLIWT LFFLGTAVS                                      -1
LQVDIVPSQG EISVGESKFF LCQVAGDAKD KDISWFSPNG EKLTPNQQRI    50
SVVWNDDSSS TLTIYNANID DAGIYKCVVT GEDGSESEAT VNVKIFQKLM   100
FKNAPTPQEF REGEDAVIVC DVVSSLPPTI IWKHKGRDVI LKKDVRFIVL   150
SNNYLQIRGI KKTDEGTYRC EGRILARGEI NFKDIQVIVN VPPTIRARQN   200
IVNATANLGQ SVTLVCDAER FPEPTMSWTK DGEQIEQEED DEKYIFSDDS   250
SQLTIKKVDK NDEAEYICIA ENKAGEQDAT IHLKVFAKPK ITYVENQTAM   300
ELEEQVTLTC EASGDPIPSI TWRTSTRNIS SEEKASWTRP EKQETLDGHM   350
VVRSHARVSS LTLKSIQYTD AGEYICTASN TIGQDSQSMY LEVQYAPKLQ   400
GPVAVYTWEG NQVNITCEVF AYPSATISWF RDGQLLPSSN YSNIKIYNTP   450
SASYLEVTPD SENDFGNYNC TAVNRIGQES FEFILVQADT PSSPSIDQVE   500
PYSSTAQVQF DEPEATGGVP ILKYKAEWRA VGEEVWHSKW YDAKEASMEG   550
IVTIVGLKPE TTYAVRLAAL NGKGLGEISA ASEFKTQPVQ GEPSAPKLEG   600
QMGEDGNSIK VNLIKQDDGG SPIRHYLVRY RALSSEWKPE IRLPSGSDHV   650
MLKSLDWNAE YEVYVVAENQ QGKSKAAHFV FRTSAQPTAI PANGSPTSGL   700
STGAIVGILI VIFVLLLVVV DITCYFLNKC GLFMCIAVNL CGKAGPGAKG   750
KDMEEGKAAF SKDESKEPIV EVRTEEERTP NHDGGKHTEP NETTPLTEPE   800
KGPVEAKPEC QETETKPAPA EVKTVPNDAT QTKENESKA               839
References
1 Goridis, C. and Brunet, J.-F. (1992) Semin. Cell Biol. 3, 189-197.
2 Lanier, L.L. and Hemperly, J.J. (1995) Leucocyte Typing V, 1398-1400.
3 Thomsen, N.K. et al. (1996) Nat. Struct. Biol. 3, 581-585.
4 Ranheim, T.S. et al. (1996) Proc. Natl Acad. Sci. USA 93, 4071-4075.
5 Friedlander, D.R. et al. (1994) J. Cell. Biol. 125, 669-680.
6 Feizi, T. (1994) Trends Biochem. Sci. 19, 233-234.
7 Baldwin, T.J. et al. (1996) J. Cell. Biochem. 61, 502-513.
8 Goodman, C.S. (1996) Annu. Rev. Neurosci. 19, 341-377.
9 Rabinowitz, J.E. et al. (1996) Proc. Natl Acad. Sci. USA 93, 6421-6424.
10 Muller, D. et al. (1996) Neuron 17, 413-422.
11 Mayford, M. et al. (1992) Science 256, 638-644.
12 Barton, C.H. et al. (1988) Development 104, 165-173.
13 Small, S.J. et al. (1987) J. Cell Biol. 105, 2335-2345.
14 Santoni, M.-J. et al. (1987) Nucleic Acids Res. 15, 8621-8641.
15 Cunningham, B.A. et al. (1987) Science 236, 799-806.
16 Krieg, P.A. et al. (1989) Gene 17, 10321-10335.
17 Grenningloh, G. et al. (1991) Cell 67, 45-57.
18 Hemperly, J.J. et al. (1990) J. Mol. Neurosci. 2, 71-78.
CD57
HNK-1, Leu-7 antigen
Tissue distribution CD57 is present on a subset of NK cells and of T lymphocytes 1. Red blood cells, granulocytes, monocytes and platelets do not express CD57 1. The antigen is expressed on receptors in the nervous system 2,3. In chicken, CD57 has been identified on integrins 2-4.
Structure CD57 is an oligosaccharide antigenic determinant present on a variety of polypeptides, lipids and chondroitin sulfate proteoglycans depending on the tissue examined 2,3. The polypeptides associated with the antigen on leucocytes have not been characterized in detail. The structure of CD57 as carried by a glycolipid is a 3'-sulfated glucuronic acid and the COOH of the glucuronic acid is important for the epitope 3. The antigen is conserved across species 2-4.
Ligands and associated molecules CD57 binds to L- and P- but not E-selectin in a Ca2+-dependent manner and is involved in homophilic interactions of P0 2-4. CD57 binds to the second globular domain of the E8 fragment of laminin 2-4.
Function The function of CD57 is unknown. Myelinating Schwann cells associating with motor but not sensory axons express CD57 on myelin-associated glycoprotein (MAG) 2-4.
References
1 Schubert, J. et al. (1989) In Leucocyte Typing IV (Knapp, W., ed.) Oxford University Press, Oxford, pp. 711-714.
2 Schachner, M. et al. (1995) Prog. Brain Res. 105, 183-188.
3 Jungalwala, F.B. (1994) Neurochem. Res. 19, 945-957.
4 Schachner, M. and Martini, R. (1995) Trends Neurosci. 18, 183-191.
CD58
LFA-3
Molecular weights
Polypeptide
Transmembrane form 25 339
SDS-PAGE
reduced 45-70 kDa
unreduced 45-70 kDa
Carbohydrate
N-linked sites 6
O-linked unknown
Human gene location
1p13.1 1,2
Domains
[Figure: domain diagram of CD58]
Tissue distribution CD58 is expressed on most haematopoietic cells (including erythrocytes) and on various non-haematopoietic cells, such as fibroblasts, endothelium and epithelia 3,4. It is expressed on about half of peripheral blood B and T cells and at high levels on monocytes 3. Expression is particularly high on memory T cells 3 and dendritic cells 5. In lymphoid tissues CD58 is expressed on all dendritic cells and macrophages, and on germinal centre B cells, medullary thymocytes and medullary (but not cortical) thymic epithelial cells 3. The sheep CD58 homologue binds human CD2 and mediates the phenomenon of sheep erythrocyte rosetting on human T cells 1.
Structure CD58 contains a membrane-distal V-set and a membrane-proximal C2-set IgSF domain, and is a member of the CD2 family of molecules, which includes CD48, Ly-9, 2B4, and CD150. Within this group CD58 is most similar to CD48, which is the major CD2 ligand in the mouse and rat 6. Alternative splicing gives rise to either transmembrane- or glycosyl-phosphatidylinositol-anchored forms 7. The N-terminus of the mature polypeptide has been established by protein sequencing 8.
Ligands and associated molecules The extracellular portion of CD58 binds to CD2 1,7 on the GFCC'C'' β sheet of the V-set domain 6. CD2 binds CD58 in solution with a very low affinity (Kd 9-22 µM) as a result of a very fast dissociation rate constant (koff ≥4 s-1) 6. Membrane-attached (GPI-anchored) CD58 binds half-maximally to cell surface CD2 at a surface density of ~20 molecules/µm2 and there is rapid exchange of bound and free CD58 at the adhesion interface 9.
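The solution binding figures quoted above can be turned into more intuitive quantities with two textbook relations for a simple 1:1 interaction: half-life = ln 2 / koff and kon = koff / Kd. The sketch below is only a worked calculation under that assumption. It treats 4 s-1 as a representative dissociation rate constant, and the function names are hypothetical.

```python
import math

KD_RANGE_M = (9e-6, 22e-6)  # solution Kd range quoted in the text, in mol/L
KOFF_PER_S = 4.0            # assumed representative dissociation rate constant, in 1/s


def half_life(koff):
    """Half-life in seconds of a 1:1 complex with dissociation rate constant koff."""
    return math.log(2) / koff


def association_rate(koff, kd):
    """Association rate constant (per M per s) implied by Kd = koff / kon."""
    return koff / kd


def fraction_bound(free_ligand_m, kd):
    """Equilibrium fraction of receptor occupied at a given free ligand concentration."""
    return free_ligand_m / (free_ligand_m + kd)


print(f"half-life at koff = {KOFF_PER_S} per s: {half_life(KOFF_PER_S):.2f} s")
for kd in KD_RANGE_M:
    print(f"Kd = {kd * 1e6:.0f} uM -> kon = {association_rate(KOFF_PER_S, kd):.2e} per M per s")
```

On these assumptions a single CD2/CD58 bond lasts a fraction of a second, which is consistent with the rapid exchange of bound and free CD58 described at the adhesion interface.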
Function CD58 expressed on antigen-presenting cells and target cells enhances T cell antigen recognition through binding to CD2 on T cells 1,6,10. This is partly a consequence of improved adhesion but signals transmitted through CD2 may also contribute 1,10,11. Structural studies suggest that the CD2-mediated adhesion may optimize the inter-membrane distance for antigen recognition by the T cell receptor 6.
Database accession numbers
              PIR      SWISSPROT   EMBL/GENBANK   REFERENCE
Human         A28564   P19256      Y00636         12
Human (GPI)   S01269               X06296         8
Amino acid sequence of human CD58
MLRLLLALNL FPSIQVTGNK ILVKQSPM                            -1
FSQQIYGVVY GNVTFHVPSN VPLKEVLWKK QKDKVAELEN SEFRAFSSFK    50
NRVYLDTVSG SLTIYNLTSS DEDEYEMESP NITDTMKFFL YVLESLPSPT   100
LTCALTNGSI EVQCMIPEHY NSHRGLIMYS WDCPMEQCKR NSTSIYFKME   150
NDLPQKIQCT LSNPLFNTTS SIILTTCIPS SGHSRHRYAL IPIPLAVITT   200
CIVLYMNGIL KCDRKPDRTN SN                                 222
GPI-anchored form (Ser180 is a potential site for GPI attachment):
SGHSRHRYAL IPIPLAVITT CIVLYMNVL                          209
References
1 Moingeon, P. et al. (1989) Immunol. Rev. 111, 111-144.
2 Mitchell, E.L.D. et al. (1995) Cytogenet. Cell Genet. 70, 183-185.
3 Sanders, M.E. et al. (1988) J. Immunol. 140, 1401-1407.
4 Smith, M.E.F. and Thomas, J.A. (1990) J. Clin. Pathol. 43, 893-900.
5 Freudenthal, P.S. and Steinman, R.M. (1990) Proc. Natl Acad. Sci. USA 87, 7698-7702.
6 Davis, S.J. and van der Merwe, P.A. (1996) Immunol. Today 17, 177-187.
7 Dustin, M.L. and Springer, T.A. (1991) Annu. Rev. Immunol. 9, 27-66.
8 Wallner, B.P. et al. (1987) J. Exp. Med. 166, 923-932.
9 Dustin, M.L. et al. (1996) J. Cell. Biol. 132, 465-474.
10 Bierer, B.E. and Burakoff, S.J. (1989) Immunol. Rev. 111, 267-294.
11 Beyers, A.D. et al. (1989) Immunol. Rev. 111, 59-77.
12 Seed, B. (1987) Nature 329, 840-842.
CD59
Complement protectin, MIRL, H19, MACIF, HRF20, P-18
Molecular weights
Polypeptide 8961
SDS-PAGE
reduced 19 kDa
non-reduced 19 kDa
Carbohydrate
N-linked sites 1
O-linked sites unknown
Human gene location and size
11p13; >27 kb 1,2
Domain
[Figure: domain diagram of CD59 with exon boundaries]
Tissue distribution CD59 is expressed on leucocytes, erythrocytes, platelets, a variety of endothelial and epithelial cells, placenta and spermatozoa. It is also found in a number of bodily fluids including blood plasma, saliva, amniotic fluid, seminal fluid and urine 2-4.
Structure CD59 contains a single Ly-6 domain and attaches to the cell surface via a GPI anchor 5,6. The N-terminal residue has been determined by protein sequencing 5, and the GPI attachment site has been shown to be Asn77 7. The disulfide bond pattern has been established to be Cys3-Cys26, Cys6-Cys13, Cys19-Cys39, Cys45-Cys63, and Cys64-Cys69 7-9. The three-dimensional structure has been determined by NMR 8,9 and shown to be similar to that of the snake venom neurotoxins.
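The residue numbers in this paragraph can be cross-checked against the mature sequence given for CD59 in this entry. The sketch below is a consistency check only, not a structural analysis. It confirms that every position named in the disulfide pattern is a cysteine, that no cysteine is left unpaired, and that position 77 is an asparagine. The variable and function names are made up for this example.

```python
# Mature human CD59 (residues 1-77) as listed in this entry.
MATURE_CD59 = "".join("""
LQCYNCPNPT ADCKTAVNCS SDFDACLITK AGLQVYNKCW KFEHCNFNDV
TTRLRENELT YYCCKKDLCN FNEQLEN
""".split())

# Disulfide pattern and GPI attachment site as stated in the text.
DISULFIDES = [(3, 26), (6, 13), (19, 39), (45, 63), (64, 69)]
GPI_SITE = 77


def residue(seq, pos):
    """Return the residue at a 1-based position."""
    return seq[pos - 1]


paired = sorted(p for bond in DISULFIDES for p in bond)
all_cys = [i + 1 for i, aa in enumerate(MATURE_CD59) if aa == "C"]

print("paired positions:   ", paired)
print("cysteines in listing:", all_cys)
print("residue at GPI site: ", residue(MATURE_CD59, GPI_SITE))
```

The same pattern works for any entry that lists both a sequence and numbered features. A mismatch usually points to an offset in the numbering rather than an error in the sequence.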
Ligands and associated molecules CD59 binds complement components C8 and C9 (see below). A proposed interaction with CD2 has not been confirmed 10.
Function CD59 restricts the cytolytic activity of homologous complement by binding to C8 and C9 and blocking assembly of the membrane attack complex 2,3,11. It does not block the lytic activity of perforin in cell-mediated cytotoxicity 12. It is unlikely that CD59 is synthesized by all cells on which it is expressed. CD59 can be transferred between cells via fluid phase vesicles (prostasomes) and other non-membranous complexes, which also explains its presence in many body fluids 4,13. Paroxysmal nocturnal hemoglobinuria (PNH) is likely to be the direct consequence of the absence of CD59 on the cell surface, although, in most cases, it results from defects in the synthesis of GPI anchors 14,15. CD59 and the two other complement regulatory proteins, CD46 and CD55, have received much attention for their role in reproduction and xenotransplantation (see CD46).
Database accession numbers
        PIR      SWISSPROT   EMBL/GENBANK   REFERENCE
Human   A46252   P13987      X16447         5
Rat     S53340   P27274      U48255         16
Amino acid sequence of human CD59
MGIQGGSVLF GLLLVLAVFC HSGHS                               -1
LQCYNCPNPT ADCKTAVNCS SDFDACLITK AGLQVYNKCW KFEHCNFNDV    50
TTRLRENELT YYCCKKDLCN FNEQLEN                             77
GGTSLSEKTV LLLVTPFLAA AWSLHP                             +26
References
1 Tone, M. et al. (1992) J. Mol. Biol. 227, 971-976.
2 Liszewski, M.K. et al. (1996) Adv. Immunol. 61, 201-283.
3 Davies, A. and Lachmann, P.J. (1993) Immunol. Res. 12, 258-275.
4 Rooney, I.A. et al. (1993) J. Exp. Med. 177, 1409-1420.
5 Davies, A. et al. (1989) J. Exp. Med. 170, 637-654.
6 Stefanova, I. et al. (1989) Mol. Immunol. 26, 153-161.
7 Sugita, Y. et al. (1993) J. Biochem. 114, 473-477.
8 Fletcher, C.M. et al. (1994) Structure 2, 185-199.
9 Kieffer, B. et al. (1994) Biochemistry 33, 4471-4482.
10 van der Merwe, P.A. et al. (1994) Biochemistry 33, 10149-10160.
11 Lachmann, P.J. (1991) Immunol. Today 12, 312-315.
12 Meri, S. et al. (1990) J. Exp. Med. 172, 367-370.
13 Rooney, I.A. et al. (1996) J. Clin. Invest. 97, 1675-1686.
14 Yamashina, M. et al. (1990) New Engl. J. Med. 323, 1184-1189.
15 Bessler, M. et al. (1994) EMBO J. 13, 110-117.
16 Rushmere, N.K. et al. (1994) Biochem. J. 304, 595-601.
CD60
UM4D4
Tissue distribution CD60 is expressed on 25-45% of T cells and on platelets 1. Smaller proportions of leucocytes from other lineages also express the antigen. Red blood cells do not express CD60.
Structure The minimal CD60 epitope consists of the trisaccharide NeuAcα2→8NeuAcα2→3Galβ1→4 2. Four mAbs to CD60 have been shown to recognize mainly acetylated forms of the ganglioside GD3. Antibody binding is reduced or abolished following O-deacetylation of the antigen 3.
Ligands and associated molecules The NeuAcα2→8NeuAcα2→3Galβ1→4 epitope is present on some gangliosides and is immunoprecipitated in association with polypeptides whose molecular weights vary with the tissue examined 2,4. It remains to be determined whether the oligosaccharide/glycolipid is covalently associated with these polypeptides.
Function CD60 antibodies deliver co-stimulatory signals in the presence of accessory cells or low doses of phorbol esters 5, and augment signalling via CD3 4. CD60+ T cells within the CD4 and CD8 subsets provide B cell help and secrete higher levels of IL-4, whereas CD60- CD8+ cells perform cytotoxic and suppressor functions and are a major source of IFNγ 4,6.
Comment The CD60 antigen is expressed at high levels on 75% of T cells from the synovial fluid of arthritic patients 7, and on 75% of T cells from cutaneous psoriatic lesions 8, and may have a role in the pathogenesis of these diseases.
References
1 Rieber, E.P. (1989) Leucocyte Typing IV, 361.
2 Kniep, B. et al. (1989) Leucocyte Typing IV, 362-364.
3 Kniep, B. et al. (1992) Biochem. Biophys. Res. Commun. 187, 1343-1349.
4 Rieber, E.P. et al. (1989) Leucocyte Typing IV, 366-368.
5 Higgs, J.B. et al. (1988) J. Immunol. 140, 3758-3765.
6 Rieber, E.P. and Rank, G. (1994) J. Exp. Med. 179, 1385-1390.
7 Fox, D.A. et al. (1990) J. Clin. Invest. 86, 1124-1136.
8 Baadsgaard, O. et al. (1990) J. Invest. Dermatol. 95, 275-282.
CD61
Integrin β3 subunit
Molecular weights
Polypeptide 84 390
variant 83 445
SDS-PAGE
reduced 105 kDa
unreduced 90 kDa
Carbohydrates
N-linked sites 6
O-linked sites unknown
Human gene location and size
17q21.3; ~65 kb 1,2
[Figure: diagram of the CD41/CD61 heterodimer]
Tissue distribution CD61 is expressed on platelets, megakaryocytes, monocytes, macrophages and endothelial cells 1,3.
Structure CD61 is the integrin β3 subunit. The N-terminus of CD61 has been determined 4. An alternative cytoplasmic domain of CD61 has been detected at the mRNA level in human placental tissue and two human cell lines 5. An intramolecular disulfide bond map has been proposed 6.
Ligands and associated molecules CD61 combines with CD41 to form the platelet glycoprotein IIb/IIIa (integrin αIIbβ3) and with CD51 to form the vitronectin receptor (integrin αVβ3). The cytoplasmic tail of CD61 has been shown to interact with the molecule β3-endonexin 7.
Function See CD41 and CD51 for the functions of the CD41/CD61 and CD51/CD61 complexes respectively 8. Variations in the CD61 sequence are associated with platelet-specific alloantigens: L33P (HPA-1A/B); R143Q (HPA-4A/B); P407A (MO+); R489Q (CA+) and R636C (SRa+) (see SWISSPROT entry). Five missense mutations of CD61 have been associated with Glanzmann thrombasthenia (residue locations are given according to the sequence of the mature protein as shown below): D119Y, R214Q and R214W lead to the loss of ligand binding activities of CD41/CD61 complexes 9-11; C374Y results in the diminished expression of CD41/CD61 on platelets although ligand binding specificities appear to remain intact 12; and S752P in the cytoplasmic domain renders the CD41/CD61 complex non-responsive to platelet activation 13. Other mutations, including deletions and erroneous splicing, have been described 14.
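The substitutions in this section use the compact form wild-type residue, position in the mature protein, variant residue (for example L33P). The sketch below shows one way to parse that notation and check the wild-type residue against a sequence. The class and function names are hypothetical. To keep the example short, the check uses only the first 40 residues of mature CD61, taken from the sequence listing in this entry.

```python
import re
from typing import NamedTuple

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
SUBSTITUTION = re.compile(rf"^([{AMINO_ACIDS}])(\d+)([{AMINO_ACIDS}])$")


class Substitution(NamedTuple):
    wild_type: str
    position: int  # 1-based, mature protein numbering
    variant: str


def parse_substitution(code):
    """Parse a code such as 'L33P' into its three parts."""
    match = SUBSTITUTION.match(code.strip())
    if match is None:
        raise ValueError(f"not a single-residue substitution: {code!r}")
    wild_type, position, variant = match.groups()
    return Substitution(wild_type, int(position), variant)


def matches_sequence(sub, mature_seq):
    """True if the sequence carries the stated wild-type residue at that position."""
    if not 1 <= sub.position <= len(mature_seq):
        raise ValueError("position lies outside the supplied sequence")
    return mature_seq[sub.position - 1] == sub.wild_type


# First 40 residues of mature CD61, from the sequence listing in this entry.
CD61_N_TERMINUS = "GPNICTTRGVSSCQQCLAVSPMCAWCSDEALPLGSPRCDL"

for code in ["L33P", "R143Q", "D119Y", "R214Q", "R214W", "C374Y", "S752P"]:
    print(code, parse_substitution(code))

print(matches_sequence(parse_substitution("L33P"), CD61_N_TERMINUS))
```

Checking the other positions requires the full 762-residue mature sequence. The helper reports positions outside the supplied fragment as errors instead of guessing.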
Database accession numbers
        PIR      SWISSPROT   EMBL/GENBANK   REFERENCE
Human   A26547   P05106      J02703         15
                             M20311         16
Amino acid sequence of human CD61
MRARPRPRPL WVTVLALGAL AGVGVG                              -1
GPNICTTRGV SSCQQCLAVS PMCAWCSDEA LPLGSPRCDL KENLLKDNCA    50
PESIEFPVSE ARVLEDRPLS DKGSGDSSQV TQVSPQRIAL RLRPDDSKNF   100
SIQVRQVEDY PVDIYYLMDL SYSMKDDLWS IQNLGTKLAT QMRKLTSNLR   150
IGFGAFVDKP VSPYMYISPP EALENPCYDM KTTCLPMFGY KHVLTLTDQV   200
TRFNEEVKKQ SVSRNRDAPE GGFDAIMQAT VCDEKIGWRN DASHLLVFTT   250
DAKTHIALDG RLAGIVQPND GQCHVGSDNH YSASTTMDYP SLGLMTEKLS   300
QKNINLIFAV TENVVNLYQN YSELIPGTTV GVLSMDSSNV LQLIVDAYGK   350
IRSKVELEVR DLPEELSLSF NATCLNNEVI PGLKSCMGLK IGDTVSFSIE   400
AKVRGCPQEK EKSFTIKPVG FKDSLIVQVT FDCDCACQAQ AEPNSHRCNN   450
GNGTFECGVC RCGPGWLGSQ CECSEEDYRP SQQDECSPRE GQPVCSQRGE   500
CLCGQCVCHS SDFGKITGKY CECDDFSCVR YKGEMCSGHG QCSCGDCLCD   550
SDWTGYYCNC TTRTDTCMSS NGLLCSGRGK CECGSCVCIQ PGSYGDTCEK   600
CPTCPDACTF KKECVECKKF DRGALHDENT CNRYCRDEIE SVKELKDTGK   650
DAVNCTYKNE DDCVVRFQYY EDSSGKSILY VVEEPECPKG PDILVVLLSV   700
MGAILLIGLA ALLIWKLLIT IHDRKEFAKF EEERARAKWD T            741
ANNPLYKEAT STFTNITYRG T                                  762
VRDGAGRFLK SLV                                           754 (variant)
References
1 Kieffer, N. and Phillips, D.R. (1990) Annu. Rev. Cell Biol. 6, 334-357.
2 Lanza, F. et al. (1990) J. Biol. Chem. 18098-18103.
3 Pigott, R. and Power, C. (1993) The Adhesion Molecule FactsBook. Academic Press, London.
4 Charo, I.F. et al. (1986) Proc. Natl Acad. Sci. USA 83, 8351-8355.
5 van Kuppevelt, T.H.M.S.M. et al. Proc. Natl Acad. Sci. USA 86, 5415-5418.
6 Calvete, J.J. et al. (1991) Biochem. J. 274, 63-71.
7 Shattil, S.J. et al. (1995) J. Cell Biol. 131, 807-816.
8 Ginsberg, M.H. et al. (1993) Thromb. Haemost. 70, 87-93.
9 Loftus, J.C. et al. (1990) Science 249, 915-918.
10 Bajt, M.L. et al. (1992) J. Biol. Chem. 267, 3789-3794.
11 Lanza, F. et al. (1992) J. Clin. Invest. 89, 1995-2004.
12 Grimaldi, C.M. et al. (1996) Blood 88, 1666-1675.
13 Chen, Y.P. et al. (1992) Proc. Natl Acad. Sci. USA 89, 10169-10173.
14 Bray, P.F. (1994) Thromb. Haemost. 72, 492-502.
15 Fitzgerald, L.A. et al. (1987) J. Biol. Chem. 262, 3936-3939.
16 Zimrin, A.B. et al. (1988) J. Clin. Invest. 81, 1470-1475.
CD62E
ELAM-1, E-selectin
Molecular weights
Polypeptide 64 467
SDS-PAGE
reduced 97 and 107-115 kDa
Carbohydrate
N-linked sites 11
O-linked unknown
Human gene location and size
1q23-25; ~13 kb 1,2
Domains
[Figure: domain diagram of CD62E with exon boundaries]
Tissue distribution CD62E is expressed on endothelial cells at sites of inflammation 3. It is not expressed on leucocytes. Various inflammatory mediators induce expression of CD62E on cultured endothelial cells 3. Soluble forms of CD62E are present in plasma 4.
Structure CD62E is a member of the selectin family of cell surface molecules (along with CD62L and CD62P) 3. The genes encoding these proteins lie within 300 kb of each other 2,5. CD62E consists of an N-terminal C-type lectin domain, an EGF-like domain, six complement control protein domains, an eight amino acid spacer, a transmembrane sequence and a cytoplasmic domain 3. The crystal structure of a fragment comprising the C-type lectin and EGF domains has been solved, revealing that these two domains are not intimately associated 7.
Ligands and associated molecules Like CD62L and CD62P, CD62E binds with low affinity to oligosaccharide sequences related to sialylated Lewis x (sLex, CD15s) through its C-type lectin domain 8. The carbohydrate binding site on this domain has been localized to a region surrounding the Ca2+ binding site 7. CD62E binds particularly well to 3-sialyl di-Lex (comprising sLex followed by an additional Lex unit), which is an unusual structure found only on N-glycans 9. One glycoprotein ligand for CD62E is a particular glycoform of the ubiquitous protein ESL-1 which is restricted to myeloid cells 10. Binding requires N-glycans on ESL-1 containing both sialic acid and fucose 10. CD62E also binds to the CD62P ligand CD162 (PSGL-1) 11. Activation of endothelial cells leads to association of CD62E with the actin cytoskeleton through its cytoplasmic portion 12.
Function Like other selectins, CD62E is an adhesion molecule that contributes to the initial tethering and rolling of leucocytes on endothelial surfaces, a prerequisite for leucocyte extravasation into tissues 3. CD62E (like CD62L) contributes to the later stages of leucocyte influx into inflamed tissues. Although no abnormality has been detected in CD62E-deficient mice, these mice are unusually susceptible to blocking of CD62P, suggesting overlapping but complementary roles for these molecules 3. In support of this, mice deficient in both CD62E and CD62P have severe defects in leucocyte recruitment to inflammatory sites and are susceptible to opportunistic bacterial infections 13.
Database accession numbers
        PIR      SWISSPROT   EMBL/GENBANK   REFERENCE
Human   A32606   P16581      M30640         6
Mouse            Q00690      M80778         14
Rat              P98105      L25527         unpublished
Amino acid sequence of human CD62E
MIASQFLSAL TLVLLIKESG A                                   -1
WSYNTSTEAM TYDEASAYCQ QRYTHLVAIQ NKEEIEYLNS ILSYSPSYYW    50
IGIRKVNNVW VWVGTQKPLT EEAKNWAPGE PNNRQKDEDC VEIYIKREKD   100
VGMWNDERCS KKKLALCYTA ACTNTSCSGH GECVETINNY TCKCDPGFSG   150
LKCEQIVNCT ALESPEHGSL VCSHPLGNFS YNSSCSISCD RGYLPSSMET   200
MQCMSSGEWS APIPACNVVE CDAVTNPANG FVECFQNPGS FPWNTTCTFD   250
CEEGFELMGA QSLQCTSSGN WDNEKPTCKA VTCRAVRQPQ NGSVRCSHSP   300
AGEFTFKSSC NFTCEEGFML QGPAQVECTT QGQWTQQIPV CEAFQCTALS   350
NPERGYMNCL PSASGSFRYG SSCEFSCEQG FVLKGSKRLQ CGPTGEWDNE   400
KPTCEAVRCD AVHQPPKGLV RCAHSPIGEF TYKSSCAFSC EEGFELHGST   450
QLECTSQGQW TEEVPSCQVV KCSSLAVPGK INMSCSGEPV FGTVCKFACP   500
EGWTLNGSAA RTCGATGHWS GLLPTCEAPT ESNIPLVAGL SAAGLSLLTL   550
APFLLWLRKC LRKAKKFVPA SSCQSLESDG SYQKPSYIL               589
References
1 Collins, T. et al. (1991) J. Biol. Chem. 266, 2466-2473.
2 Watson, M.L. et al. (1990) J. Exp. Med. 172, 263-272.
3 Tedder, T.F. et al. (1995) FASEB J. 9, 866-873.
4 Gearing, A.J.H. and Newman, W. (1993) Immunol. Today 14, 506-512.
5 Oakey, R.J. et al. (1992) Hum. Mol. Genet. 1, 613-620.
6 Bevilacqua, M.P. et al. (1989) Science 243, 1160-1165.
7 Graves, B.J. et al. (1994) Nature 367, 532-538.
8 Varki, A. (1994) Proc. Natl Acad. Sci. USA 91, 7390-7397.
9 Patel, T.P. et al. (1994) Biochemistry 33, 14815-14824.
10 Steegmaier, M. et al. (1995) Nature 373, 615-620.
11 McEver, R.P. et al. (1995) J. Biol. Chem. 270, 11025-11028.
12 Yoshida, M. et al. (1996) J. Cell Biol. 133, 445-455.
13 Frenette, P.S. et al. (1996) Cell 84, 563-574.
14 Weller, A. et al. (1992) J. Biol. Chem. 267, 15176-15183.
CD62L
LECAM-1, LAM-1, L-selectin

Other names
Lymph node homing receptor
MEL-14 antigen
Leu-8
TQ1

Molecular weights
Polypeptide 37 402
SDS-PAGE
reduced 74 kDa (lymphocytes), 95 kDa (neutrophils)
unreduced 65 kDa (lymphocytes)

Carbohydrate
N-linked sites 7
O-linked nil
Human gene location and size
1q23-25; >30 kb 1
Domains
[Figure: domain organization and exon boundaries]
Tissue distribution
Most haematopoietic cells express CD62L at some stage of differentiation 2. The majority of B and naive T cells express CD62L but only subpopulations of memory T and NK cells are CD62L positive. Subpopulations of immature and mature thymocytes express CD62L. Most monocytes, neutrophils and eosinophils express CD62L. CD62L is rapidly lost upon activation of lymphocytes and neutrophils as a result of proteolytic cleavage 2. A soluble form of CD62L is present at high levels in the blood 3.
Structure
CD62L is a member of the selectin family of cell surface molecules (which includes CD62E and CD62P) 2. Its gene lies between the CD62P and CD62E genes, and all three loci are within 300 kb of each other 4,5. It consists of an N-terminal lectin C-type domain, an EGFSF domain, two CCP domains, a 15 amino acid spacer containing the proteolytic cleavage site (between Lys283 and Ser284), a transmembrane sequence and a short cytoplasmic domain. The short cytoplasmic domain bears no sequence similarity to the cytoplasmic domains of CD62P or CD62E but is highly conserved across species. The N-terminus of mature murine CD62L has been determined by protein sequencing.
Ligands and associated molecules
Like CD62E and CD62P, CD62L binds with low affinity to anionic oligosaccharide sequences related to sialylated Lewis x (sLex, CD15s) through its C-type lectin domain 6. CD62P and CD62L also bind various structurally unrelated anionic carbohydrates such as heparan sulfate and sulfatides 6. CD62L binds particularly well to carbohydrates present on certain glycoforms of CD34, GlyCAM-1 and MAdCAM-1 2,7,8. The precise nature of these high-affinity carbohydrate ligands is not known but binding requires fucosylation, sialylation and sulfation, suggesting a role for structures related to sulfated sLex 8. CD62L also binds CD162 (PSGL-1) 9. The 11 C-terminal residues of the 17 amino acid cytoplasmic domain, which are essential for CD62L adhesion, mediate association with a complex of cytoskeletal proteins which includes α-actinin 10. These residues are not required for localization of CD62L to microvilli (see below).
Function
Like other selectins, CD62L mediates the initial tethering and rolling of leucocytes on endothelial surfaces, which is a prerequisite for leucocyte extravasation from the blood into tissues 2. More specifically, CD62L is important for the homing of naive lymphocytes via high endothelial venules to peripheral lymph nodes and Peyer's patches 11. CD62L also contributes, along with other selectins, to the recruitment of leucocytes from the blood to areas of inflammation 2. Unlike CD62P, CD62L contributes mainly to the later recruitment (after 1 h), presumably due to delay in the induction of suitable endothelial ligands 2. The localization of CD62L to the tips of surface microvilli is required for optimal CD62L-mediated attachment and rolling under flow 12. CD62L can also mediate neutrophil-neutrophil interactions through recognition of CD162 9. Ligation of CD62L can stimulate its proteolytic release from the cell, which may be important for high rolling velocities 13. Ligation of CD62L on T cells with soluble GlyCAM-1 stimulates adhesion through β2 (CD18) integrins 14.

Database accession numbers
         PIR      SWISSPROT   EMBL/GENBANK   REFERENCE
Human    A33912   P14151      M25280         1
Mouse    A32375   P18337      M36005         7
Rat      S23936   P30836      D10831         15
Amino acid sequence of human CD62L MIFPWKCQST WTYHYSEKPM IGIRKIGGIW AGKWNDDACH PQCQFVIQCE TTCGPFGNWS CSEGTELIGK AVMVTAFSGL
QRDLWNIFKL NWQRARRFCR TWVGTNKSLT KLKAALCYTA PLEAPELGTM SPEPTCQVIQ KKTICESSGI AFIIWLARRL
WGWTMLCCDF DNYTDLVAIQ EEAENWGDGE SCQPWSCSGH DCTHPLGNFS CEPLSAPDLG WSNPSPICQK KKGKKSKRSM
LAHHGTDC NKAEIEYLEK PNNKKNKEDC GECVEIINNY FSSQCAFSCS IMNCSHPLAS LDKSFSMIKE NDPY
TLPFSRSYYW VEIYIKRNKD TCNCDVGYYG EGTNLTGIEE FSFTSACTFI GDYNPLFIPV
-1 50 100 150 200 250 300 334
References
1 Ord, D.C. et al. (1990) J. Biol. Chem. 265, 7760-7767.
2 Tedder, T.F. et al. (1995) FASEB J. 9, 866-873.
3 Gearing, A.J.H. and Newman, W. (1993) Immunol. Today 14, 506-512.
4 Watson, M.L. et al. (1990) J. Exp. Med. 172, 263-272.
5 Oakey, R.J. et al. (1992) Hum. Mol. Genet. 1, 613-620.
6 Varki, A. (1994) Proc. Natl Acad. Sci. USA 91, 7390-7397.
7 Lasky, L.A. (1995) Annu. Rev. Biochem. 64, 113-139.
8 Rosen, S.D. and Bertozzi, C.R. (1996) Curr. Biol. 6, 261-264.
9 Walcheck, B. et al. (1996) J. Clin. Invest. 98, 1081-1087.
10 Pavalko, F.M. et al. (1995) J. Cell Biol. 129, 1155-1164.
11 Butcher, E.C. and Picker, L.J. (1996) Science 272, 60-66.
12 von Andrian, U.H. et al. (1995) Cell 82, 989-999.
13 Walcheck, B. et al. (1996) Nature 380, 720-723.
14 Hwang, S.T. et al. (1996) J. Exp. Med. 184, 1343-1348.
15 Watanabe, T. et al. (1992) Biochim. Biophys. Acta 1131, 321-324.
CD62P
P-selectin, PADGEM, GMP-140

Molecular weights
Polypeptide 86 244
SDS-PAGE
reduced 140 kDa
non-reduced 140 kDa

Carbohydrate
N-linked sites 12
O-linked unknown
Human gene location
1q21-q24 1; >50 kb

Domains
[Figure: domain organization and exon boundaries]
Tissue distribution
CD62P is present on megakaryocytes, activated platelets and activated endothelial cells. It is stored in secretory granules in platelets and endothelial cells and is rapidly translocated to the plasma membrane upon activation of these cells 3. Expression is transient as CD62P is rapidly internalized and then degraded in lysosomes 5. Soluble forms of CD62P originating from alternative splicing and possibly proteolytic cleavage are detectable in plasma at levels of 0.1-1 μg/mL 4.
Structure
CD62P is a member of the selectin family of cellular adhesion molecules, which includes CD62E (E-selectin) and CD62L (L-selectin) 3,5. The extracellular region contains an N-terminal C-type lectin domain, followed by an EGF domain and nine CCP domains 6. Alternative splicing results in variants lacking the seventh CCP domain or encoding a soluble, secreted form of CD62P 2,5. The extracellular portion forms an extended rod approximately 48 nm long 7. The short cytoplasmic domain, which bears no homology to the CD62E or CD62L cytoplasmic domains, is phosphorylated on Ser, Thr, Tyr and His residues following platelet activation 8. The N-terminus of CD62P has been determined by protein sequencing 6.
Ligands and associated molecules
Like CD62E and CD62L, CD62P binds with low affinity to anionic oligosaccharide sequences related to sialylated Lewis x (sLex, CD15s) through its C-type lectin domain 9. CD62P and CD62L also bind various structurally unrelated anionic carbohydrates such as heparan sulfate and sulfatides 9. The major CD62P ligand on neutrophils is the mucin-like cell surface glycoprotein CD162 (PSGL-1). Optimal binding to CD162 requires O-linked sLex-like structures as well as sulfated NH2-terminal tyrosines 10. The conserved EGF domain of P-selectin is required for optimal binding, and it has been suggested that it may contain a second binding site which interacts with the sulfotyrosine-containing region on CD162 10. Mouse CD62P has been reported to bind certain glycoforms of the mucin CD24 11.
Function
Endothelial CD62P mediates the rolling of neutrophils on activated endothelium, a prerequisite for recruitment of neutrophils into areas of inflammation. CD62P appears to be particularly important early on (within minutes) with other selectins contributing at later stages 12. Endothelial CD62P also mediates rolling of platelets and some T lymphocyte subsets. CD62P on adherent platelets promotes leucocyte accumulation in thrombi 12. Platelet CD62P can also contribute indirectly to T lymphocyte homing to high endothelial venules (HEVs); activated CD62P+ platelets attached to lymphocytes can mediate lymphocyte rolling through an interaction between platelet CD62P and endothelial peripheral node addressin 13. Mice deficient in CD62P show reduced neutrophil rolling and delayed recruitment into inflamed tissues 12. CD62P and CD62E have overlapping roles since mice deficient in both CD62P and CD62E show much more severe defects in leucocyte extravasation and are susceptible to opportunistic infections 14.

Database accession numbers
         PIR      SWISSPROT   EMBL/GENBANK   REFERENCE
Human    A30359   P16109      M25322         6
Mouse    A42755   Q01102      M87861         15
Rat               P98106      L23088         16
Amino acid sequence of human CD62P MANCQIAILY WTYHYSTKAY IGIRKNNKTW PGKWNDEHCL PECEYVRECG LECLASGIWT CEEGFALVGP TAFAYGSSCK
QRFQRVVFGI SWNISRKYCQ TWVGTKKALT KKKHALCYTA ELELPQHVLM NKPPQCLAAQ EVVQCTASGV FECQPGYRVR
SQLLCFSALI NRYTDLVAIQ NEAENWADNE SCQDMSCSKQ NCSHPLGNFS CPPLKIPERG WTAPAPVCKA GLDMLRCIDS
SELTNQKEVA NKNEIDYLNK PNNKRNNEDC GECLETIGNY FNSQCSFHCT NMICLHSAKA VQCQHLEAPS GHWSAPLPTC
A VLPYYSSYYW VEIYIKSPSA TCSCYPGFYG DGYQVNGPSK FQHQSSCSFS EGTMDCVHPL EAISCEPLES
-1 50 100 150 200 250 300 350
PVHGSMDCSP VCQALQCQDL CLATGNWNSV EGYSLSGPER FNVGSTCHFS CPALTTPGQG WTAVTPACRA GSAQTACQEN LALLRKRFRQ
SLRAFQYDTN PVPNEARVNC PPECQAIPCT LDCTRSGRWT CNNGFKLEGP TMYCRHHPGT VKCSELHVNK GHWSTTVPTC KDDGKCPLNP
CSFRCAEGFM SHPFGAFRYQ PLLSPQNGTM DSPPMCEAIK NNVECTTSGR FGFNTTCYFG PIAMNCSNLW QAGPLTIQEA HSHLGTYGVF
LRGADIVRCD SVCSFTCNEG TCVQPLGSSS CPELFAPEQG WSATPPTCKG CNAGFTLIGD GNFSYGSICS LTYFGGAVAS TNAAFDPSP
NLGQWTAPAP LLLVGASVLQ YKSTCQFICD SLDCSDTRGE IASLPTPGLQ STLSCRPSGQ FHCLEGQLLN TIGLIMGGTL
400 450 500 550 600 650 700 750 789
Two alternatively spliced variants are shown. One variant lacks exon 11 which encodes the seventh CCP domain (shown in italics). The other variant lacks exon 14 (shown in bold) and encodes a soluble form of CD62P.
References
1 Watson, M.L. et al. (1990) J. Exp. Med. 172, 263-272.
2 Johnston, G.I. et al. (1990) J. Biol. Chem. 265, 21381-21385.
3 McEver, R.P. et al. (1995) J. Biol. Chem. 270, 11025-11028.
4 Ishiwata, N. et al. (1994) J. Biol. Chem. 269, 23708-23715.
5 Lasky, L.A. (1995) Annu. Rev. Biochem. 64, 113-139.
6 Johnston, G.I. et al. (1989) Cell 56, 1033-1044.
7 Ushiyama, S. et al. (1993) J. Biol. Chem. 268, 15229-15237.
8 Crovello, C.S. et al. (1995) Cell 82, 279-286.
9 Varki, A. (1994) Proc. Natl Acad. Sci. USA 91, 7390-7397.
10 Rosen, S.D. and Bertozzi, C.R. (1996) Curr. Biol. 6, 261-264.
11 Sammar, M. et al. (1994) Int. Immunol. 6, 1027-1036.
12 Tedder, T.F. et al. (1995) FASEB J. 9, 866-873.
13 Diacovo, T.G. et al. (1996) Science 273, 252-255.
14 Frenette, P.S. et al. (1996) Cell 84, 563-574.
15 Weller, A. et al. (1992) J. Biol. Chem. 267, 15176-15183.
16 Auchampach, J.A. et al. (1994) Gene 145, 251-255.
CD63
ME491, MLA1, PTLGP40, granulophysin

Molecular weights
Polypeptide 25 505
SDS-PAGE
unreduced 53 kDa

Carbohydrate
N-linked sites 3
O-linked nil

Human gene location and size
12q12-q13; 4 kb 1
Tissue distribution
CD63 is widely distributed, both on the surface and in the interior of haematopoietic and non-haematopoietic cells. Expression is high on monocytes, macrophages and activated platelets, and low on lymphocytes and granulocytes 2.
Structure CD63 is a member of the TM4 superfamily and is predicted to have four transmembrane regions, short cytoplasmic N- and C-termini, and two extracellular regions (reviewed in ref. 3).
Ligands and associated molecules
CD63 associates non-covalently with the TM4SF molecules CD9 and CD81, and with the integrins CD29/CD49c (VLA-3), CD29/CD49d (VLA-4) and CD29/CD49f (VLA-6) 4,5. In human neutrophils, CD63 associates with the integrin CD11/CD18 and the Src family protein tyrosine kinases Lyn and Hck 6. In rat lymph node cells, CD63 is associated with a protein tyrosine phosphatase 7. No extracellular ligand has been identified for CD63.
Function CD63 may play a role as a tumour suppressor gene since expression of CD63 in human melanoma cells reduces tumour spread and metastasis 8.
Database accession numbers
         PIR      SWISSPROT   EMBL/GENBANK   REFERENCE
Human    S01418   P08962      X07982         9
Rat      S16776   P28648      X61654         10
Mouse             P41731      D16432         11
Amino acid sequence of human CD63 AVEGGMKCVK LLPVVIIAVG IAGYVFRDKV TDWEKIPSMS NVLVVAAAAL
FLLYVLLLAF VFLFLVAFVG MSEFNNNFRQ KNRVPDSCCI GIAFVEVLGI
CACAVGLIAV GVGAQLVLSQ CCGACKENYC LMITFAIFLS QMENYPKNNH TASILDRMQA NVTVGCGINF NEKAIHKEGC VFACCLVKSI RSGYEVM
TIIQGATPGS LIMLVEVAAA DFKCCGAANY VEKIGGWLRK
50 100 150 200 223
References
1 Hotta, H. et al. (1992) Biochem. Biophys. Res. Commun. 185, 436-442.
2 Azorsa, D.O. and Hildreth, J.E.K. (1995) Leucocyte Typing V, 1352-1353.
3 Wright, M.D. and Tomlinson, M.G. (1994) Immunol. Today 15, 588-594.
4 Berditchevski, F. et al. (1996) Mol. Biol. Cell 7, 193-207.
5 Mannion, B.A. et al. (1996) J. Immunol. 157, 2039-2047.
6 Skubitz, K.M. et al. (1996) J. Immunol. 157, 3617-3626.
7 Carmo, A.M. and Wright, M.D. (1995) Eur. J. Immunol. 25, 2090-2095.
8 Radford, K.J. et al. (1995) Int. J. Cancer 62, 631-635.
9 Hotta, H. et al. (1988) Cancer Res. 48, 2955-2962.
10 Nishikata, H. et al. (1992) J. Immunol. 149, 862-870.
11 Miyamoto, H. et al. (1994) Biochim. Biophys. Acta 1217, 312-316.
CD64
FcγRI, high-affinity Fcγ receptor

Molecular weights
Polypeptide
A form 40 833
B2 form 30 434
SDS-PAGE reduced
A form 72 kDa

Carbohydrate
N-linked sites 7
O-linked sites unknown

Human gene location and size
Three genes, A, B and C, are located on 1q21.1; ~9.4 kb each 1.
Domains
[Figure: domain organization and exon boundaries, including the B2 isoform]
Tissue distribution
CD64 is constitutively expressed on monocytes and macrophages; expression on monocytes can be strongly upregulated by treatment with interferon γ. Expression of CD64 can be induced by interferon γ on neutrophils and eosinophils 2-4. CD64 is also expressed on dendritic cells 5. B and C gene transcripts have been identified in monocytic cells. B2 transcripts are detected in all cells that express the A form of CD64, but a B2 protein has not been detected at the protein level 4,s.
Structure
The A gene encodes the high-affinity Fcγ receptor, a type I membrane protein with three IgSF extracellular domains 6. A cDNA clone has been identified which encodes a variant of the A form (A') with a different cytoplasmic domain 6, but this alternative sequence is not present in the gene sequence reported in ref. 1. The B and C genes are very similar to the A gene except both contain stop codons in the third IgSF domain, suggesting that they encode soluble proteins with two IgSF domains (see B1 and C sequences below) 1. B and C gene transcripts have been identified in monocytic cells. An alternatively spliced form of the B gene (B2) is similar to the A form but lacks the third IgSF domain 1,7. A third alternatively spliced form of the B gene has also been identified but with a cryptic leader sequence 7.

Ligand and associated molecules
CD64 is the high-affinity Fcγ receptor and binds monomeric (as well as aggregated) IgG 1-3. CD64 associates with the γ chain homodimer of Fc receptors 8,9, possibly through the interaction between the His in CD64 and the Asp in the γ chain in their respective transmembrane regions 10. The kinases Hck and Lyn can also be co-immunoprecipitated with CD64 from cell lysates 11.
Function
CD64 plays a role in antibody-dependent cytotoxicity, clearance of immune complexes, and phagocytosis of IgG-opsonized targets. It also mediates release of cytokines including IL-1, IL-6, and TNFα 2-4. The association of CD64 with the Fc receptor γ chain homodimer is required for its signal transduction activity. It has been shown that ligation of CD64 results in an increase in the Hck and Lyn kinase activities 11. Although a B2 form has not been identified at the protein level, transfection of the B2 cDNA into COS cells showed binding to aggregated IgG but not monomeric IgG 7. A nonsense mutation in the CD64 A gene has been identified 12. Because individuals homozygous for this mutation are apparently healthy, and the level of the B2 transcript is consistently high in their blood monocytes, it was speculated that the B2 form of CD64 may be functional 12.

Database accession numbers
            PIR      SWISSPROT   EMBL/GENBANK   REFERENCE
Human A     S03018   P12314      X14356         6
Human A'    S03019   P12315      X14355         6
Human A                          M91645         1
Human B                          M91646         1
Human C                          M91647         1
Mouse       A46480   P26151      M31314         13,14
Amino acid sequence of human CD64 A MWFLTTLLLWVPVDG A'
-1 -1
B2 B1 C
-i -I -i
A QVDTTKAVIT LQPPWVSVFQ EETVTLHCEV LHLPGSSSTQ WFLNGTATQT A' S B2 BI C
50 50 50 50 50
A S T P S Y R I T S A SVNDSGEYRC Q R G L S G R S D P I Q L E I H R G W L L L Q V S S R V F T A' B2 M BI M C -P
I00 i00 i00 i00 i00
A EGEPLALRCH A' -B2 B1 C
AWKDKLVYNV
LYYRNGKAFK
FFHWNSNLTI
LKTNISHNGT
A YHCSGMGKHR A' B2 BI C
YTSAGISV.TV KELFPAPVLN ASVTSPLLEG NLVTLSCETK . Q Y - - -. . . . . . . . . . . . . . . . . . . . . . . . . . . . . QYQYG- I W S P
150 150 150 150 150 200 201 172 201 195
A LLLQRPGLQL YFSFYMGSKT LRGRNTSSEY QILTARREDS GLYWCEAATE A' B2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B1
250 250 172 259
A DGNVLKRSPE LELQVLGLQL A' B2 .... . .... . . . . . . .
PTPVWFHVLF
YLAVGIMFLV
NTVLWVTIRK
300 3O0 206
A ELKRKKKWDL EISLDSGHEK KVTISLQEDR A' GQA LEAPTQGCA B2 N-
HLEEELKCQE
QKEEQLQEGV
350 329 256
A HRKEPQGAT B2
359 265
The sequence of the A form is from ref. 1, and that of the A' from ref. 6. Amino acids identical to the top sequence are indicated by dashes. Gaps in the alignment are indicated by dots.
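The dash-and-dot convention described above can be sketched in a few lines of Python. This is only an illustration of how an aligned row could be expanded back into a plain sequence: the function name `expand_row` and the example variant row are hypothetical and are not taken from the book; only the ten-residue reference fragment is copied from the printed A form.

```python
def expand_row(reference: str, row: str) -> str:
    """Expand one aligned row against the top (reference) row.

    '-' means "identical to the top sequence at this column";
    '.' means "gap in the alignment" (no residue at this column).
    Any other character is an explicit residue that differs from the top row.
    """
    if len(reference) != len(row):
        raise ValueError("aligned rows must have the same number of columns")
    residues = []
    for ref_char, ch in zip(reference, row):
        if ch == "-":
            ch = ref_char          # copy the residue from the top sequence
        if ch != ".":              # skip gaps (in this row or in the top row)
            residues.append(ch)
    return "".join(residues)


top = "QVDTTKAVIT"        # first ten residues of the A form as printed
variant = "--M--..---"    # hypothetical row: one substitution, a two-column gap
print(expand_row(top, variant))
```

The sketch assumes both rows have already been cut to the same number of columns; the extracted listing above would need its columns restored before it could be fed to such a function.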
References
1 Ernst, L.K. et al. (1992) J. Biol. Chem. 267, 15692-15700.
2 Unkeless, J.C. et al. (1988) Annu. Rev. Immunol. 6, 251-281.
3 Ravetch, J.V. and Kinet, J.P. (1991) Annu. Rev. Immunol. 9, 457-492.
4 van de Winkel, J.G.J. and Capel, P.J.A. (1994) Immunol. Today 14, 215-221.
5 Fanger, N.A. et al. (1996) J. Immunol. 157, 541-548.
6 Allen, J.M. and Seed, B. (1989) Science 243, 378-381.
7 Porges, A.J. et al. (1992) J. Clin. Invest. 90, 2102-2109.
8 Scholl, P.R. and Geha, R.S. (1993) Proc. Natl Acad. Sci. USA 90, 8847-8850.
9 Ernst, L.K. et al. (1993) Proc. Natl Acad. Sci. USA 90, 6023-6027.
10 Morton, H.C. et al. (1995) J. Biol. Chem. 270, 29781-29787.
11 Wang, A.V.T. et al. (1994) J. Exp. Med. 180, 1165-1170.
12 van de Winkel, J.G.J. et al. (1995) J. Immunol. 2896-2903.
13 Sears, D.W. et al. (1989) J. Immunol. 144, 371-378.
14 Osman, N. et al. (1992) J. Immunol. 148, 1570-1575.
Ceramide dodecasaccharide 4c
Tissue distribution
CD65 is restricted to the myeloid lineage, with expression on most granulocytes and a proportion of monocytic cells. T cells, B cells, thymocytes, platelets and non-haematopoietic cells do not express CD65 1,2.
Structure
CD65 mAbs were originally shown to bind to a ceramide dodecasaccharide 3. The antigen consists of the oligosaccharide sequence:

NeuAcα2→3Galβ1→4GlcNAcβ1→3Galβ1→4GlcNAc(Fucα1→3)β1→3Galβ→

Recently, CD65 mAbs have been shown to recognize distinct epitopes of this carbohydrate 2. The mAbs VIM-8 and VIM-11 recognize the following asialylated epitope that is expressed on granulocytes only:

Galβ1→4GlcNAcβ1→3Galβ1→4GlcNAc(Fucα1→3)→ (CD65 epitope)

In contrast, mAb VIM-2 binds to the following sialylated epitope, which is expressed on both granulocytes and monocytes:

NeuAcα2→3Galβ1→4GlcNAcβ1→3Galβ1→4GlcNAc(Fucα1→3)→ (CD65s epitope)
Function Unknown.
Comments CD65 was retained as a provisional CDw antigen at the Fifth Leucocyte Typing Workshop because of the heterogeneity of the CD65 mAbs in terms of cellular and antigenic reactivity 1. Since the CD65 mAbs have now been separated into two groups and their precise epitopes determined (see above), the VIM-2 mAb has been placed in a new cluster named CD65s 2.
References
1 Knapp, W. et al. (1995) Leucocyte Typing V, 876-882.
2 Kniep, B. et al. (1996) J. Biochem. 119, 456-462.
3 Macher, B.A. et al. (1988) J. Biol. Chem. 253, 10186-10191.
CD66
Carcinoembryonic antigen (CEA) family

Members
CD66a   BGP, biliary glycoprotein, NCA-160
CD66b   CGM6, W272, NCA-95 (previously CD67)
CD66c   NCA, NCA-90
CD66d   CGM1
CD66e   CEA

Molecular weights
Polypeptide
CD66a (BGPa)    53 796
CD66b           31 493
CD66c           31 247
CD66d (CGM1a)   23 391
SDS-PAGE reduced
CD66a   160-180 kDa
CD66b   95-100 kDa
CD66c   90-95 kDa
CD66d   ~30 kDa (predicted)

Carbohydrate
N-linked sites
CD66a   20
CD66b   11
CD66c   12
CD66d   2
O-linked unknown

Gene location
19q13.1-2

Domains
[Figure: domain organization and exon boundaries of CD66a]
The CEA family in humans and other mammals
The human CEA molecules are a family of closely related, IgSF domain-containing glycoproteins encoded by a dense cluster of at least 18 genes (within ~1.2 Mb) with 65-75% sequence identity. Based on sequence similarity and gene proximity this family can be divided into two subgroups, with 80-95% sequence identity within each subgroup 1. The CEA subgroup (>7 genes) encodes predominantly cell surface molecules, whereas the pregnancy-specific glycoprotein (PSG) subgroup (>11 genes) encodes secreted molecules 1,2. CEA and PSG subgroups have also been identified in the mouse and the rat. However, molecules within the subgroups show greater intraspecies (>80%) than interspecies (~60%) identity, making it impossible to identify species homologues. Indeed, homologues may not exist since sequence analysis suggests that there has been independent and parallel evolution of the CEA and PSG subgroups following the divergence of rodents and humans 1.
Tissue distribution
Products of four of the seven functional CEA family genes (CD66a-d) are known to be expressed by haematopoietic cells 1-3. CD66e (CEA) has not been detected on haematopoietic cells and so is not considered further here 3. Expression of these molecules on haematopoietic cells is generally restricted to the myeloid lineage, particularly in the case of CD66b 3. These molecules are present at low levels on resting mature granulocytes but expression increases rapidly following activation with inflammatory agonists 2, probably as a result of exocytosis from storage granules 2. CD66a, c and d are detected on some macrophages in tissue sections and CD66a has been reported on T cells and a subpopulation of activated NK cells 4. CD66a, c and d are also expressed on a variety of epithelia and CD66a has been detected on some endothelia 3,5. Soluble forms of CD66b and CD66c are present in plasma 6.
Structure
The extracellular portions of all CD66 molecules possess an N-terminal V-set IgSF domain which, like members of the CD2 family, lacks the canonical inter-β-sheet disulfide, followed by a variable number of C2-set IgSF domains 1. CD66a, b and c are heavily glycosylated, with more than 60% of the mass contributed by N-linked glycans, and they bear sialylated Lex (sLex, CD15s) structures 2. CD66b and CD66c have GPI lipid anchors, whereas CD66a and CD66d have conventional transmembrane anchors as well as cytoplasmic domains. The CD66a and CD66d cytoplasmic domains each contain two YxxL/M motifs. In CD66d they resemble conventional ITAM-like motifs (YxxLx7YxxM) whereas in CD66a they are spaced further apart (VxYxxLx21IxYxxV) and resemble motifs which bind tyrosine phosphatases such as SHP-1 and -2. Activation of neutrophils leads to phosphorylation of tyrosine residues in the CD66a cytoplasmic domain 7. Multiple splice variants have been identified for CD66a/BGP (at least 11) and CD66d/CGM1 (at least 3), the largest of which (termed BGPa and CGM1a) are shown below 1,2.
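As a rough illustration of the YxxL/M motif description in the Structure section, the following Python sketch scans a sequence for tyrosine-based motifs and reports the spacing between consecutive hits. The function names are hypothetical, and the 20-residue test fragment is an assumption: it was assembled by joining two adjacent ten-residue blocks (TAASIYEELL and KHDTNIYCRM) from the printed CD66d listing, on the assumption that they are consecutive in the protein.

```python
import re

# Y, any two residues, then L or M. The lookahead lets overlapping hits be found.
YXXLM = re.compile(r"(?=(Y..[LM]))")


def find_tyrosine_motifs(seq):
    """Return (1-based position of the Tyr, matched tetrapeptide) for each hit."""
    return [(m.start() + 1, m.group(1)) for m in YXXLM.finditer(seq)]


def spacers(hits):
    """Number of residues between the end of one motif and the next Tyr."""
    return [nxt[0] - (cur[0] + 4) for cur, nxt in zip(hits, hits[1:])]


# Assumed fragment of the CD66d cytoplasmic tail (see lead-in).
tail = "TAASIYEELLKHDTNIYCRM"
hits = find_tyrosine_motifs(tail)
print(hits)
print(spacers(hits))
```

On this fragment the two hits are separated by seven residues, which matches the ITAM-like spacing noted for CD66d; the same scan would not pick up the second CD66a motif, which ends in Val rather than Leu or Met.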
Ligands and associated molecules
CD66 molecules can mediate cell-cell adhesion by homotypic interactions and/or by heterotypic interactions with other CD66 molecules. CD66a, c and e exhibit both heterotypic and homotypic interactions whereas CD66b binds only heterotypically to CD66c and CD66e 8,9. The binding sites for these interactions lie within their N-terminal V-set IgSF domains 8,9. Unlike most IgSF adhesion molecules, homotypic adhesion involving CD66a is reportedly both cation and temperature dependent 8. Carbohydrate structures on CD66c mediate binding to E. coli type 1 fimbriae 10 and may be important in presenting sLex-related ligands to the endothelial cell adhesion molecule CD62E 2. CD66a associates with the cytoplasmic tyrosine kinases Src, Lyn and Hck 7,11. Phosphorylated YxxL motifs in the cytoplasmic domain of CD66a bind to the SH2 domain of Src and activate the enzyme. mAbs specific for the GPI-anchored molecules CD66b and CD66c coprecipitate Lyn and Hck 2.
Function
The ability of CD66 molecules to mediate cell-cell adhesion and their rapid upregulation following activation suggests that they may contribute to the interactions of activated granulocytes with each other or with endothelia or epithelia. However, direct functional evidence for such a role is lacking. Crosslinking of CD66 molecules with antibodies can stimulate integrin-mediated neutrophil adhesion to endothelial cells, suggesting a possible signalling role 12.

Database accession numbers
                       PIR      SWISSPROT   EMBL/GENBANK   REFERENCE
Human CD66a (BGPa)     A32164   P13688      X16354         1
Human CD66b            S13524   P31997      X52378         1
Human CD66c                     P40199      M29541         1
Human CD66d (CGM1a)    S33323   P40198      L00692         1
Amino acid sequence of CD66a (BGPa) MGHLSAPLHR QLTTESMPFN GTQQATPGPA TGQFHVYPEL LPVSPRLQLS YGPDTPTISP FIPNITVNNS TTVTGDKDSV PVKREDAGTY IGVVALVALI PNKMNEVTYS
VRVPWQGLLL VAEGKEVLLL NSGRETIYPN PKPSISSNNS NGNRTLTLLS SDTYYRPGAN GSYTCHANNS NLTCSTNDTG WCEVFNPISK AVALACFLHF TLNFEAQQPT
TASLLTFWNP VHNLPQQLFG ASLLIQNVTQ NPVEDKDAVA VTRNDTGPYE LSLSCYAASN VTGCNRTTVK ISIRWFFKNQ NQSDPIMLNV GKTGRASDQR QPTSASPSLT
PTTA YSWYKGERVD NDTGFYTLQV FTCEPETQDT CEIQNPVSAN PPAQYSWLIN TIIVTELSPV SLPSSERMKL NYNALPQENG DLTEHKPSVS ATEIIYSEVK
GNRQIVGYAI IKSDLVNEEA TYLWWINNQS RSDPVTLNVT GTFQQSTQEL VAKPQIKASK SQGNTTLSIN LSPGAIAGIV NHTQDHSNDP KQ
-1 50 100 150 200 250 300 350 400 450 492
CD66
Amino
acid sequence
MGPISAPSCR QLTIEAVPSN SNQQITPGPA TGQFSVHPET LPVSPRLQLS YGPDAPTISP FIPNITTKNS AVVQGSSPGL
9.
Amino
acid sequence
MGPPSAPPCR KLTIESTPFN GTQQATPGPA TGQLHVYPEL LPVSPRLQLS YGPDGPTISP FIPNITVNNS SAPVLSAVAT
Amino
WRIPWQGLLL AAEGKEVLLL YSNRETIYPN PKPSISSNNS NGNRTLTLLS SDTYYHAGVN GSYACHTTNS SARATVSIMI
LHVPWKEVLL VAEGKEVLLL YSGRETIYPN PKPSISSNNS NGNMTLTLLS SKANYRPGEN GSYMCQAHNS VGITIGVLAR
acid sequence
MGPPSASPHR KLTIESMPLS GTQQATPGAA TGQFHVYQEN QRDLKEQQPQ KHDTNIYCRM
ECIPWQGLLL VAEGKEVLLL YSGRETIYTN APGLPVGAVA ALAPGRGPSH DHKAEVAS
of C D 6 6 b TASLFTFWNP VHNLPQDPRG ASLLMRNVTR NPVEDKDAVA VTRNDVGPYE LNLSCHAASN ATGRNRTTVR GVLARVALI
PTTA YNWYKGETVD NDTGSYTLQV FTCEPETQNT CEIQNPASAN PPSQYSWSVN MITVSD
ANRRIIGYVI IKLNLMSEEV TYLWWVNGQS FSDPVTLNVL GTFQQYTQKL
PTTA YSWYKGERVD NDTGFYTLQV FTCEPEVQNT CEIQNPASAN PPAQYSWFIN MITVSG
GNSLIVGYVI IKSDLVNEEA TYLWWVNGQS RSDPVTLNVL GTFQQSTQEL
-1 50 100 150 200 250 286 +29
of C D 6 6 c TASLLTFWNP AHNLPQNRIG ASLLIQNVTQ NPVEDKDAVA VKRNDAGSYE LNLSCHAASN ATGLNRTTVT VALI
-1 50 100 150 200 250 286 +24
of C D 6 6 d ( C G M l a ) TASLLNFWNP VHNLPQHLFG ASLLIQNVTQ GIVTGVLVGV SSAFSMSPLS
PTTA YSWYKGERVD NDIGFYTLQV ALVAALVCFL SAQAPLPNPR
GNSLIVGYVI IKSDLVNEEA LLAKTGRTSI TAASIYEELL
-1 50 100 150 200 218
References
1 Thompson, J.A. et al. (1991) J. Clin. Lab. Anal. 5, 344-366.
2 Nagel, G. and Grunert, F. (1995) Tumor Biology 16, 17-22.
3 Schlossman, S.F. et al. (eds) (1995) Leucocyte Typing V: white cell differentiation antigens. Oxford University Press, Oxford.
4 Möller, M.J. et al. (1996) Int. J. Cancer 65, 740-745.
5 Prall, F. et al. (1996) J. Histochem. Cytochem. 44, 35-41.
6 Grunert, F. et al. (1995) Int. J. Cancer 63, 349-355.
7 Skubitz, K.M. et al. (1995) J. Immunol. 155, 5382-5390.
8 Teixeira, A.M. et al. (1994) Blood 84, 211-219.
9 Yamanaka, T. et al. (1996) Biochem. Biophys. Res. Commun. 219, 842-847.
10 Sauter, S.L. et al. (1993) J. Biol. Chem. 268, 15510-15516.
11 Brümmer, J. et al. (1995) Oncogene 11, 1649-1655.
12 Skubitz, K.M. et al. (1996) J. Leukocyte Biol. 60, 106-117.
CD68
Macrosialin (mouse)

Molecular weights
Polypeptide 35 367
SDS-PAGE
reduced 110 kDa

Carbohydrate
N-linked sites 9
O-linked ++

Human gene location and size
17p13; ~2 kb 1
Tissue distribution
Present in the lysosomes, endosomes and, to a lesser extent, on the surface of macrophages, monocytes, neutrophils, basophils and large lymphocytes 2,3. The antigen is also expressed in the cytoplasm of a number of cells in non-haematopoietic tissues, notably in the liver and in renal tubules and glomeruli 2. Not all anti-CD68 antibodies give the same staining patterns, possibly as a result of tissue-specific glycosylation 2.
Structure
CD68 belongs to a family of acidic, highly glycosylated lysosomal-associated membrane proteins (lamps) that includes lamp-1 (CD107a) and lamp-2 (CD107b) 4,5. The extracellular domains of lamps are partitioned into two related subdomains by a short linker rich in proline, threonine and serine residues 5. In CD68 there is only one domain that is related to the lamps, and this is adjacent to the membrane 4. The N-terminal region, which is very rich in Thr and Ser residues, shares no homology with the lamps 4 and is likely to be heavily O-glycosylated on the basis of comparison with the CD43 antigen. Expression cloning of the CD68 cDNA has produced two sequences that differ with respect to the length of the N-terminal domain and which might arise by alternative splicing 4. The cytoplasmic domain contains a tyrosine residue which might direct the protein to the lysosomes 4,5. Soluble forms of CD68 have been detected in serum and urine 2.
Ligands and associated molecules
Macrosialin from lysates of mouse peritoneal macrophages and a macrophage cell line has been shown to bind oxidized low-density lipoprotein (LDL) 6. Whether macrosialin participates in the uptake of oxidized LDL or binds it following its internalization remains to be determined.
Function
While the function of CD68 is unknown, lamps are the major components of lysosomal membranes and may protect the membranes from attack by acid hydrolases 5. It is not clear whether the surface expression of CD68 is functionally significant or due to leakage from the lysosomes, as could be the case for the lamps 5. Inflammatory agents upregulate macrosialin expression and this is accompanied by changes in the pattern of its glycosylation 7.
Comments
Expression cloning has identified macrosialin as the mouse homologue of CD68 4,8. Macrosialin is known to be O-glycosylated and its expression is restricted to macrophages and, at lower levels, dendritic cells (ref. 7 and references therein).
Database accession numbers
         PIR      SWISSPROT   EMBL/GENBANK   REFERENCE
Human    A48931   P34810      S57235         4
Mouse    S28587   P31996      X68273         8
Amino acid sequence of human CD68 MRLAVLFSGA NDCPHKKSAT HGPTTATHNP HPTSNSTATS LQAQIQIRVM HLSFGFMQDL GQSFSCSNSS LPLIIGLILL
LLGLLAAQGT LLPSFTVTPT TTTSHGNVTV PGFTSSAHPE YTTQGGGEAW QQKVVYLSYM IILSPAVHLD GLLALVLIAF
G VTESTGTTSH HPTSNSTATS PPPPSPSPSP GISVLNPNKT AVEYNVSFPH LLSLRLQAAQ CIIRRRPSAY
RTTKSHKTTT QGPSTATHSP TSKETIGDYT KVQGSCEGAH AAKWTFSAQN LPHTGVFGQS QAL
HRTTTTGTTS ATTSHGNATV WTNGSQPCVH PHLLLSFPYG ASLRDLQAPL FSCPSDRSIL
References
1 Greaves, D.R. et al. Manuscript in preparation.
2 Pulford, K.A.F. et al. (1990) Int. Immunol. 2, 973-980.
3 Stockinger, H. (1989) Leucocyte Typing IV, 841-843.
4 Holness, C.L. and Simmons, D.L. (1993) Blood 81, 1607-1613.
5 Fukuda, M. (1991) J. Biol. Chem. 266, 21327-21330.
6 Ramprasad, M.P. et al. (1995) Proc. Natl Acad. Sci. USA 92, 9580-9584.
7 Rabinowitz, S.S. and Gordon, S. (1991) J. Exp. Med. 174, 827-836.
8 Holness, C.L. et al. (1993) J. Biol. Chem. 268, 9661-9666.
-i 50 I00 150 200 250 300 333
CD69
AIM, EA 1, MLR-3, Leu-23
Molecular weights
Polypeptide      22 559
SDS-PAGE
reduced          ~28 and ~32 kDa
unreduced        55-65 kDa

Carbohydrate
N-linked sites   1
O-linked         unknown
Human gene location and size
12p12.3-p13.2; ~15 kb 1,2

[Diagram: domains and exon boundaries; labels CSE, MEC, END, SVG, NWF, DMNF]
Tissue distribution
CD69 is widely expressed on hematopoietic cells including lymphocytes, neutrophils, eosinophils, platelets and epidermal Langerhans cells. It is not expressed on resting lymphocytes but is rapidly (within 2 h) induced upon activation of B, T and NK cells. CD69 is expressed on CD4+ or CD8+ thymocytes, germinal centre T cells, and T cells in regions of inflammation, consistent with it being a marker of T cell activation 3-5.
Structure
The human and mouse genes for CD69 are encoded within the NK gene complex on chromosomes 12 and 6, respectively 1,4. Like other proteins encoded within the NK gene complex (CD94, CD161, NKG2, and Ly-49) 7, CD69 is a type II membrane glycoprotein expressed as a disulfide-linked homodimer 3-5. The extracellular region has a C-type lectin domain. The cytoplasmic domain contains phosphorylation sites for serine/threonine kinases and is constitutively phosphorylated 3-5.
Function
CD69 is not expressed on resting peripheral blood lymphocytes but is amongst the earliest antigens to appear upon activation of lymphocytes, appearing within 2 h after stimulation. Expression requires mRNA synthesis and is very transient, as the result of rapid degradation of the mRNA (which has a functional AU-rich motif in the 3' untranslated region) 8. Anti-CD69 mAbs can activate T and B cells, NK cells, platelets and neutrophils, suggesting that it may function as a signalling molecule 3-5.
Database accession numbers
         PIR      SWISSPROT   EMBL/GENBANK   REFERENCE
Human    JH0822   Q07108      L07555         3,4
Mouse             P37217      L23638         5
Amino acid sequence of human CD69
MSSENCFVAE NSSLHPESGQ ENDATSPHFS TRHEGSFQVP VLCAVMNVVF     50
ITILIIALIA LSVGQYNCPG QYTFSMPSDS HVSSCSEDWV GYQRKCYFIS    100
TVKRSWTSAQ NACSEHGATL AVIDSEKDMN FLKRYAGREE HWVGLKKEPG    150
HPWKWSNGKE FNNWFNVTGS DKCVFLKNTE VSSMECEKNL YWICNKPYK     199

References
1 Schnittger, S. et al. (1993) Eur. J. Immunol. 23, 2711-2713.
2 Santis, A.G. et al. (1994) Eur. J. Immunol. 24, 1692-1697.
3 Hamann, J. et al. (1993) J. Immunol. 150, 4920-4927.
4 López-Cabrera, M. et al. (1993) J. Exp. Med. 178, 537-547.
5 Ziegler, S.F. et al. (1993) Eur. J. Immunol. 23, 1643-1648.
6 Ziegler, S.F. et al. (1994) J. Immunol. 152, 1228-1236.
7 Gumperz, J.E. and Parham, P. (1995) Nature 378, 245-248.
8 Santis, A.G. et al. (1995) Eur. J. Immunol. 25, 2142-2146.
CD70

Other names
CD27L

Molecular weights
Polypeptide      21 147

Carbohydrate
N-linked sites   2
O-linked         unknown

Human gene location
19p13.3 1,2

[Diagram: domains; labels WQG, FFG]
Tissue distribution CD70 is expressed on most activated B cells and some activated T cells but is absent from resting lymphocytes, red blood cells, thymocytes, platelets, monocytes, neutrophils or dendritic cells 2.
Structure CD70 is a member of the TNF superfamily 1-4. Like other members of this superfamily, it is a type II membrane protein and is probably expressed as a trimer. The sequence similarity to TNF is in the C-terminal extracellular region 1-4.
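The two N-linked sites listed for CD70 can be cross-checked against the sequence listing. The sketch below is only an illustration: it assumes the usual N-X-S/T sequon rule (X is any residue except proline), which the entry does not state, and it uses the 193-residue sequence read across the rows of the listing. The function name `n_linked_sequons` is hypothetical.

```python
import re

# Human CD70 sequence, read row by row from the listing in this entry.
CD70 = (
    "MPEEGSGCSVRRRPYGCVLRAALVPLVAGLVICLVVCIQRFAQAQQQLPL"
    "ESLGWDVAELQLNHTGPQQDPRLYWQGGPALGRSFLHGPELDKGQLRIHR"
    "DGIYMVHIQVTLAICSSTTASRHHPTTLAVGICSPASRSISLLRLSFHQG"
    "CTIVSQRLTPLARGDTLCTNLTGTLLPSRNTDETFFGVQWVRP"
)

# Assumed sequon rule: Asn, then any residue except Pro, then Ser or Thr.
# A lookahead is used so that overlapping sequons would also be reported.
SEQUON = re.compile(r"N(?=[^P][ST])")

def n_linked_sequons(sequence):
    """Return 1-based positions of Asn residues that start an N-X-S/T sequon."""
    return [match.start() + 1 for match in SEQUON.finditer(sequence)]

positions = n_linked_sequons(CD70)
print(len(CD70), positions)
```

Under these assumptions the scan reports two sequons (NHT and NLT), both in the C-terminal extracellular region, which agrees with the count given for the entry. A sequon is only a potential site; whether it is glycosylated has to be shown experimentally.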
Ligands and associated molecules
CD70 binds CD27, a member of the TNFR superfamily 1.

Function
Cells expressing CD70 can co-stimulate T cell proliferation, and enhance the generation of cytotoxic T cells and cytokine production 1,5. CD70 binding to CD27 on B cells co-stimulates B cell proliferation and immunoglobulin production 6.
Database accession numbers
         PIR      SWISSPROT   EMBL/GENBANK   REFERENCE
Human    A40738   P32970      L08096         1
Amino acid sequence of human CD70
MPEEGSGCSV RRRPYGCVLR AALVPLVAGL VICLVVCIQR FAQAQQQLPL     50
ESLGWDVAEL QLNHTGPQQD PRLYWQGGPA LGRSFLHGPE LDKGQLRIHR    100
DGIYMVHIQV TLAICSSTTA SRHHPTTLAV GICSPASRSI SLLRLSFHQG    150
CTIVSQRLTP LARGDTLCTN LTGTLLPSRN TDETFFGVQW VRP           193

References
1 Goodwin, R.G. et al. (1993) Cell 73, 447-456.
2 Gruss, H.-J. and Dower, S.K. (1995) Blood 85, 3378-3404.
3 van Kooten, C. and Banchereau, J. (1996) Adv. Immunol. 61, 1-77.
4 Armitage, R.J. (1994) Curr. Opin. Immunol. 6, 407-413.
5 Hintzen, R.Q. et al. (1994) Immunol. Today 15, 307-311.
6 Kobata, T. et al. (1995) Proc. Natl Acad. Sci. USA 92, 11249-11253.
CD71

Other names
Transferrin receptor, T9

Molecular weights
Polypeptide      84 901
SDS-PAGE
reduced          90-95 kDa
unreduced        180-190 kDa

Carbohydrate
N-linked sites   3
O-linked         +

Human gene location and size
3q26.2-qter; 31 kb 1
Tissue distribution CD71 expression is very low on resting leucocytes but is upregulated upon activation, presumably reflecting the iron dependence of proliferation. In other tissues CD71 is expressed on most dividing cells and also on brain endothelium 1
Structure
CD71 is a type II membrane glycoprotein that exists as a homodimer with interchain disulfide bonds at Cys89 and Cys98. It is acylated at Cys62 and phosphorylated at Ser24 by protein kinase C (reviewed in ref. 1). The CD71 internalization signal is the tetrapeptide sequence YTRF (amino acids 20-23), which structurally appears to be a tight turn 2. CD71 can be cleaved by an unidentified protease, between Arg100 and Leu101, to yield a soluble truncated form. O-linked glycosylation at Thr104 decreases the susceptibility of the molecule to cleavage 3.
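The position quoted for the internalization signal can be checked against the sequence listing. The following is a minimal sketch: `N_TERMINUS` holds the first 30 residues of human CD71 read across the first row of the listing, and `motif_span` is a hypothetical helper name, not part of any library.

```python
# First 30 residues of human CD71, read across the first row of the listing.
N_TERMINUS = "MMDQARSAFSNLFGGEPLSYTRFSLARQVD"

def motif_span(sequence, motif):
    """Return the 1-based (start, end) of the first occurrence of motif, or None."""
    index = sequence.find(motif)
    if index < 0:
        return None
    return index + 1, index + len(motif)

span = motif_span(N_TERMINUS, "YTRF")
print(span)
```

The search places YTRF at residues 20-23, so the signal lies in the N-terminal cytoplasmic tail of this type II protein. The same helper also shows that Ser24, the residue named as the protein kinase C site, immediately follows the motif.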
Ligands and associated molecules
The ligand for CD71 is transferrin, the serum iron-transport protein (reviewed in ref. 1). CD71 associates non-covalently with the TCR ζ chain in T cells, where it may play a role in signal transduction 4. CD71 monomers have been reported to form a complex with the integrin CD29/CD49c (VLA-3) in some cell lines, via a disulfide bond interaction between CD71 and CD49c 5.
Function CD71 plays a critical role in cell proliferation by controlling the supply of iron, which is essential for many metabolic pathways, through the binding and endocytosis of transferrin, the major iron-carrying protein 1. The expression of CD71 is regulated at the post-transcriptional level through the control of mRNA stability and is closely linked to intracellular iron levels (reviewed in ref. 6). Upon iron deprivation, a cytoplasmic protein, known as
the iron-response element binding protein (IRE-BP), stabilizes CD71 mRNA by binding to specific sequences, known as iron-response elements (IREs), within the 3' untranslated region of the CD71 mRNA. When iron levels are high, the IRE-BP affinity for IREs is lower and CD71 mRNA more susceptible to degradation. Nitric oxide can affect CD71 expression independently of intracellular iron concentration by activating IRE-BP binding to IREs and thus stabilizing CD71 mRNA 6
Comment
Proteins that share sequence homology with CD71 have not previously been described. However, we have noted that a 55 amino acid sequence (residues Phe237-Ile290) shares 43.6% identity with a region of streptococcal C5a peptidase (residues Tyr371-Ile425) 7. The ALIGN score for this comparison is 9.33 standard deviations, making the similarity highly significant.

Database accession numbers
         PIR      SWISSPROT   EMBL/GENBANK   REFERENCE
Human    A03259   P02786      X01060         8
Rat      A34549               M58040         9
Mouse    S29548               X57349         10
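The identity figure quoted in the Comment above can be reproduced with simple arithmetic. The sketch below assumes that the 43.6% was calculated over 55 aligned positions; the aligned sequences themselves are not given in the entry, so the two short strings used to exercise `percent_identity` are made-up toy data, and the function name is hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percentage of aligned columns holding the same residue (gaps never match)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = sum(1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-")
    return 100.0 * matches / len(aligned_a)

# Toy alignment (not real CD71 or C5a peptidase sequence): 8 of 10 columns match.
toy_identity = percent_identity("ACDEFGHIKL", "ACDQFGHIRL")

# Assumption: 43.6% identity over 55 aligned positions.
aligned_length = 55
identical_positions = round(0.436 * aligned_length)
recomputed = round(100.0 * identical_positions / aligned_length, 1)
print(toy_identity, identical_positions, recomputed)
```

Under this assumption the quoted value corresponds to 24 identical positions out of 55. The ALIGN score is a separate statistic (how many standard deviations the alignment score lies above the mean for shuffled sequences) and cannot be recomputed from the figures given here.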
Amino acid sequence of human CD71
MMDQARSAFS NLFGGEPLSY TRFSLARQVD GDNSHVEMKL AVDEEENADN     50
NTKANVTKPK RCSGSICYGT IAVIVFFLIG FMIGYLGYCK GVEPKTECER    100
LAGTESPVRE EPGEDFPAAR RLYWDDLKRK LSEKLDSTDF TSTIKLLNEN    150
SYVPREAGSQ KDENLALYVE NQFREFKLSK VWRDQHFVKI QVKDSAQNSV    200
IIVDKNGRLV YLVENPGGYV AYSKAATVTG KLVHANFGTK KDFEDLYTPV    250
NGSIVIVRAG KITFAEKVAN AESLNAIGVL IYMDQTKFPI VNAELSFFGH    300
AHLGTGDPYT PGFPSFNHTQ FPPSRSSGLP NIPVQTISRA AAEKLFGNME    350
GDCPSDWKTD STCRMVTSES KNVKLTVSNV LKEIKILNIF GVIKGFVEPD    400
HYVVVGAQRD AWGPGAAKSG VGTALLLKLA QMFSDMVLKD GFQPSRSIIF    450
ASWSAGDFGS VGATEWLEGY LSSLHLKAFT YINLDKAVLG TSNFKVSASP    500
LLYTLIEKTM QNVKHPVTGQ FLYQDSNWAS KVEKLTLDNA AFPFLAYSGI    550
PAVSFCFCED TDYPYLGTTM DTYKELIERI PELNKVARAA AEVAGQFVIK    600
LTHDVELNLD YERYNSQLLS FVRDLNQYRA DIKEMGLSLQ WLYSARGDFF    650
RATSRLTTDF GNAEKTDRFV MKKLNDRVMR VEYHFLSPYV SPKESPFRHV    700
FWGSGSHTLP ALLENLKLRK QNNGAFNETL FRNQLALATW TIQGAANALS    750
GDVWDIDNEF                                                760

References
1 Testa, U. et al. (1993) Crit. Rev. Oncog. 4, 241-276.
2 Trowbridge, I.S. et al. (1993) Annu. Rev. Cell Biol. 9, 129-161.
3 Rutledge, E.A. et al. (1994) J. Biol. Chem. 269, 31864-31868.
4 Salmeron, A. et al. (1995) J. Immunol. 154, 1675-1683.
5 Coppolino, M. et al. (1995) Biochem. J. 306, 129-134.
6 Hentze, M.W. and Kühn, L.C. (1996) Proc. Natl Acad. Sci. USA 93, 8175-8182.
7 Chen, C.C. and Cleary, P.P. (1990) J. Biol. Chem. 265, 3161-3167.
8 Schneider, C. et al. (1984) Nature 311, 675-678.
9 Roberts, K.P. and Griswold, M.D. (1990) Mol. Cell. Endocrinol. 14, 531-542.
10 Trowbridge, I.S. et al. Unpublished.
CD72
Lyb-2, Ly-32.2, Ly-19.2 (mouse)
Molecular weights
Polypeptide      40 224
SDS-PAGE
reduced          39 and 43 (major) kDa
unreduced        86 (major) and 90 kDa

Carbohydrate
N-linked sites   1
O-linked         unknown

Gene location
9p 1

[Diagram: domain boundaries; labels CET, CL, ESC]
Tissue distribution
CD72 is expressed on all cells of the B cell lineage except plasma cells 2. Anti-CD72 antibodies also label human tissue macrophages weakly.

Structure
CD72 is a member of a group of related cell surface molecules which includes the asialoglycoprotein receptors, CD23, and the Kupffer cell receptor 3. These molecules are type II integral membrane glycoproteins with membrane-proximal stalk regions, which are predicted to form α-helical coiled coils 4, and membrane-distal C-type lectin (CL) domains 3. CD72 is expressed as a disulfide-linked homodimer 5. The mouse alloantigens Ly-32.2, Ly-19.2 and Lyb-2 have been shown to be identical to mouse CD72 6, for which multiple allelic 7 and splice 8 variants have been identified. The CD72 CL domain has a lower level of identity to other CL domains than is usually the case for this superfamily, and is missing key residues implicated in Ca2+ and carbohydrate binding 3, but the assignment of the CL domain is not in doubt.

Ligands and associated molecules
Purified CD5 has been reported to bind CD72 on cells 9 but this is controversial (see CD5). Lectin-like activity has not been demonstrated.

Function
Unknown. Exposure of B cells to CD72 antibodies activates a variety of signalling pathways and can induce MHC Class II expression and B cell proliferation 2. However, the significance of these observations is not clear.
Database accession numbers
         PIR      SWISSPROT   EMBL/GENBANK   REFERENCE
Human    A43532   P21854      M54992         5
Mouse    A32331   P21855      J04170         10
Amino acid sequence of human CD72
MAEAITYADL RFVKAPLKKS ISSRLGQDPG ADDDGEITYE NVQVPAVLGV     50
PSSLASSVLG DKAAVKSEQP TASWRAVTSP AVGRILPCRT TCLRYLLLGL    100
LLTCLLLGVT AICLGVRYLQ VSQQLQQTNR VLEVTNSSLR QQLRLKITQL    150
GQSAEDLQGS RRELAQSQEA LQVEQRAHQA AEGQLQACQA DRQKTKETLQ    200
SEEQQRRALE QKLSNMENRL KPFFTCGSAD TCCPSGWIMH QKSCFYISLT    250
SKNWQESQKQ CETLSSKLAT FSEIYPQSHS YYFLNSLLPN GGSGNSYWTG    300
LSSNKDWKLT DDTQRTRTYA QSSKCNKVHK TWSWWTLESE SCRSSLPYIC    350
EMTAFRFPD                                                 359

References
1 von Hoegen, I. et al. (1991) Eur. J. Immunol. 21, 1425-1431.
2 Gordon, J. (1994) Immunol. Today 15, 411-417.
3 Day, A.J. (1994) Biochem. Soc. Trans. 22, 83-88.
4 Beavil, A.J. et al. (1992) Proc. Natl Acad. Sci. USA 89, 753-757.
5 von Hoegen, I. et al. (1990) J. Immunol. 144, 4870-4877.
6 Robinson, W.H. et al. (1993) J. Immunol. 151, 4764-4772.
7 Robinson, W.H. et al. (1992) J. Immunol. 149, 880-886.
8 Ying, H. et al. (1995) J. Immunol. 154, 2743-2752.
9 Van de Velde, H. et al. (1991) Nature 351, 662-665.
10 Nakayama, E. et al. (1989) Proc. Natl Acad. Sci. USA 86, 1352-1356.
CD73

Other names
Ecto-5'-nucleotidase
L-VAP-2 (lymphocyte vascular adhesion protein 2)

Molecular weights
Polypeptide      60 824
SDS-PAGE
unreduced        70 kDa
reduced          70 kDa

Carbohydrate
N-linked sites   4
O-linked         unknown

Human gene location
6q14-q21

Tissue distribution
CD73 is expressed differentially on subsets of mature lymphocytes (10% of CD4+ cells, 50% of CD8+ cells and 70% of B cells) 1. CD73 is also expressed on certain endothelial and epithelial cells 1. Enzyme activity in peripheral T cells is 10-fold higher than in thymocytes, and in peripheral B cells enzyme activity is 5-fold higher than in umbilical cord B cells 1,2.

Structure
CD73 is GPI-linked. The site of GPI attachment has been confirmed by peptide analysis 3.
Function
CD73 catalyses the 5' dephosphorylation of purine and pyrimidine ribo- and deoxyribonucleoside monophosphates to nucleosides that can be taken up by transport systems 2,4. Co-stimulatory effects of CD73 mAb on human T lymphocytes do not require enzyme activity of CD73 5. Inhibition of ecto-5'-nucleotidase activity by biochemical inhibitors or a polyclonal antiserum suppressed the generation and cytotoxicity of alloreactive cytotoxic T lymphocytes 2. CD73 mediates binding of lymphocytes to cultured endothelial cells 6.

Comments
Ecto-5'-nucleotidase activity is abnormally low on lymphocytes of patients with various immunodeficiency diseases, including severe combined immunodeficiency, X-linked agammaglobulinaemia, common variable immunodeficiency, selective IgA deficiency, Wiskott-Aldrich syndrome and AIDS 1.

Database accession numbers
         PIR      SWISSPROT   EMBL/GENBANK   REFERENCE
Human    S11032   P21589      X55740         3
Mouse                         L12059         7
Rat                                          8
Amino acid sequence of human CD73
MCPRAARAPA TLLLALGAVL WPAAGA                               -1
WELTILHTND VHSRLEQTSE DSSKCVNASR CMGGVARLFT KVQQIRRAEP     50
NVLLLDAGDQ YQGTIWFTVY KGAEVAHFMN ALRYDAMALG NHEFDNGVEG    100
LIEPLLKEAK FPILSANIKA KGPLASQISG LYLPYKVLPV GDEVVGIVGY    150
TSKETPFLSN PGTNLVFEDE ITALQPEVDK LKTLNVNKII ALGHSGFEMD    200
KLIAQKVRGV DVVVGGHSNT FLYTGNPPSK EVPAGKYPFI VTSDDGRKVP    250
VVQAYAFGKY LGYLKIEFDE RGNVISSHGN PILLNSSIPE DPSIKADINK    300
WRIKLDNYST QELGKTIVYL DGSSQSCRFR ECNMGNLICD AMINNNLRHT    350
DEMFWNHVSM CILNGGGIRS PIDERNNGTI TWENLAAVLP FGGTFDLVQL    400
KGSTLKKAFE HSVHRYGQST GEFLQVGGIH VVYDLSRKPG DRVVKLDVLC    450
TKCRVPSYDP LKMDEVYKVI LPNFLANGGD GFQMIKDELL RHDSGDQDIN    500
VVSTYISKMK VIYPAVEGRI KFS                                 523
TGSHCHGSFS LIFLSLWAVI FVLYQ                               +25

References
1 Thomson, L.F. et al. (1990) Tissue Antigens 55, 9-19.
2 Shipp, M.A. and Look, A.T. (1993) Blood 82, 1052-1070.
3 Misumi, Y. et al. (1990) Eur. J. Biochem. 191, 563-569.
4 Zimmermann, H. (1992) Biochem. J. 285, 345-365.
5 Gutensohn, W. et al. (1995) Cell. Immunol. 161, 213-217.
6 Airas, L. et al. (1995) J. Exp. Med. 182, 1603-1608.
7 Resta, R. et al. (1993) Gene 133, 171-177.
8 Misumi, Y. et al. (1990) J. Biol. Chem. 265, 2178-2183.
CD74

Other names
MHC Class II-associated invariant chain (Ii or Iγ)

Molecular weights
Polypeptides     26 399 / 24 376 / 33 388 / 31 365
SDS-PAGE
reduced          33 / 35 / 41 / 43 kDa

Carbohydrate
N-linked sites   2 or 4
O-linked         +

Human gene location and size
5q32; 12 kb 1

Tissue distribution
CD74 is mostly expressed intracellularly in MHC Class II-positive cells. There is moderate surface expression on B cells and weak surface expression on monocytes. CD74 is also expressed by a subset of activated T cells and some epithelial cells.
Structure
CD74 is a type II integral membrane glycoprotein 2. It consists of four distinct forms which are generated by usage of two in-phase initiation codons (yielding products which differ in apparent Mr by 2 kDa) and by alternative splicing of exon 6b (yielding products which differ in apparent Mr by 8 kDa) 3,4. The latter exon encodes a sequence of 64 amino acids, containing six cysteines, which has sequence similarity to thyroglobulin type I repeats centred around the Cys-Trp-Cys-Val motif. The shorter form of CD74 is illustrated above. CD74 contains N-linked and O-linked carbohydrates and minor forms are acylated, sulfated, phosphorylated or contain glycosaminoglycans.
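The way two initiation codons and one alternatively spliced exon combine into four forms can be sketched as a small enumeration. This is an illustration under stated assumptions: the smallest SDS-PAGE band (33 kDa) is taken to be the form that starts at the downstream codon and lacks exon 6b, and the 2 kDa and 8 kDa shifts are treated as additive. The dictionary names are hypothetical.

```python
from itertools import product

# Apparent Mr (kDa) of the smallest reduced SDS-PAGE band, from the entry.
# Assumption: this band is the downstream-start form without exon 6b.
BASE_KDA = 33

# Shifts in apparent Mr described in the Structure paragraph.
START_CODON_SHIFT = {"downstream AUG": 0, "upstream AUG": 2}
EXON_6B_SHIFT = {"exon 6b skipped": 0, "exon 6b included": 8}

def apparent_forms():
    """Map each (start codon, splice form) pair to a predicted apparent Mr in kDa."""
    forms = {}
    for (start, d_start), (splice, d_exon) in product(
        START_CODON_SHIFT.items(), EXON_6B_SHIFT.items()
    ):
        forms[(start, splice)] = BASE_KDA + d_start + d_exon
    return forms

forms = apparent_forms()
bands = sorted(forms.values())
print(bands)

# Calculated polypeptide masses listed in the entry (Da) show the same 2 x 2 pattern.
start_difference = 26399 - 24376
exon_difference = 33388 - 26399
```

The four predicted bands match the 33 / 35 / 41 / 43 kDa pattern listed under SDS-PAGE. The calculated polypeptide masses differ by 2023 Da between start codons and by 6989 Da for exon 6b, so the 8 kDa apparent shift is somewhat larger than the added polypeptide alone; the two extra N-linked sites listed for the longer forms are one possible contribution, though the entry does not say so.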
Function
CD74 homotrimers associate in the endoplasmic reticulum (ER) with three MHC Class II αβ dimers and this prevents the binding of endogenous peptides to Class II molecules 5-8. The N-terminal cytoplasmic domain of CD74 contains targeting motifs which lead to its retention in the ER, or to targeting of Class II αβ dimers into the endosomal-lysosomal pathway via the Golgi 5,6. Subsequent proteolytic degradation of CD74 leaves a small fragment (Class II-associated invariant chain peptide (CLIP)) bound to Class II αβ dimers in the peptide-binding groove. The interaction of Class II αβ/CLIP complexes with HLA-DM (a Class II-related αβ dimer) in a specialized compartment releases the CLIP and allows the Class II molecules to bind peptides derived from exogenous proteins 8. CD74-deficient mice have reduced levels of MHC Class II at the cell surface and these molecules lack the compact conformation indicative of bound peptide 9. Consequently these mice have reduced numbers of peripheral CD4+ T cells and their antigen-presenting cells present protein antigens very poorly.
Comments
Expression of CD74 and the MHC Class II α and β chains is co-regulated and induced by IFNγ 10. Coordinate expression of these three genes may be directed by a segment of sequence homology at the 5' region of each gene 1.

Database accession numbers
         PIR      SWISSPROT   EMBL/GENBANK     REFERENCE
Human    A30060   P04233      X03339, X03340   1-3
Mouse    A02244   P04441      X00496           11
Rat      S02182   P10247      X14254           12
Amino acid sequence of human CD74
MHRRRSRSCR EDQKPV_M_DDQ RDLISNNEQL PMLGRRPGAP ESKCSRGALY     50
TGFSILVTLL LAGQATTAYF LYQQQGRLDK LTVTSQNLQL ENLRMKLPKP    100
PKPVSKMRMA TPLLMQALPM GALPQGPMQN ATKYGNMTED HVMHLLQNAD    150
PLKVYPPLKG SFPENLRHLK NTMETIDWKV FESWMHHWLL FEMSRHSLEQ    200
KPTDAPP_K_ES LELEDPSSGL GVTKQDLGPV PM                       232

Notes
1 The alternative N-terminal Met residue (Met17) is marked with underscores (_M_) above.
2 Usage of exon 6b generates the following in-frame sequence inserted after Lys208, which is marked with underscores (_K_) above:
VLTKCQEEVS HIPAVHPGSF RPKCDENGNY LPLQCYGSIG YCWCVFPNGT    258
EVPNTRSRGH HNCS                                           272
References
1 O'Sullivan, D.M. et al. (1986) Proc. Natl Acad. Sci. USA 83, 4484-4488.
2 Claesson, L. et al. (1983) Proc. Natl Acad. Sci. USA 80, 7395-7399.
3 Strubin, M. et al. (1986) EMBO J. 5, 3483-3488.
4 O'Sullivan, D.M. et al. (1987) J. Exp. Med. 166, 444-460.
5 Lotteau, V. et al. (1990) Nature 348, 600-605.
6 Lamb, C.A. et al. (1991) Proc. Natl Acad. Sci. USA 88, 5998-6002.
7 Marks, M.S. et al. (1990) J. Cell Biol. 111, 839-855.
8 Cresswell, P. (1996) Cell 84, 505-507.
9 Viville, S. et al. (1993) Cell 72, 635-648.
10 Paulnock-King, D. et al. (1985) J. Immunol. 135, 632-636.
11 Koch, N. et al. (1987) EMBO J. 6, 1677-1683.
12 McKnight, A.J. et al. (1989) Nucleic Acids Res. 17, 3983-3984.
CDw75

Tissue distribution
CDw75 is expressed on sIg+ mature B cells but not on plasma cells 1,2. Expression is particularly high in germinal centres. The level of expression increases on activation of B cells. It is also expressed weakly on a fraction of T cells, on erythrocyte precursors in the bone marrow and on a broad range of epithelial cell types.

Structure
CDw75 is an ill-defined carbohydrate structure present on N-glycans which contains α2,6-linked sialic acid 1,2. Some (but not all) cells lacking CDw75 express the antigen when transfected with the Golgi enzyme β-galactoside α2,6-sialyltransferase (EC 2.4.99.1) 1,2.

Ligands and associated molecules
No ligand has been identified. CDw75 is not a ligand for the lectin CD22, which binds the sialoglycoconjugate NeuAcα2→6Galβ1→4GlcNAc 3.
References
1 Bast, B.J.E.G. et al. (1992) J. Cell Biol. 116, 423-435.
2 Schlossman, S.F. et al. (1995) Leucocyte Typing V: white cell differentiation antigens. Oxford University Press, Oxford.
3 Engel, P. et al. (1993) J. Immunol. 150, 4719-4732.
CDw76

Molecular weights
SDS-PAGE
reduced          85, 67 kDa

Tissue distribution
CDw76 is a provisional CD cluster that represents two closely related but distinct carbohydrate epitopes, designated CD76.1 and CD76.2, recognized by the mAbs CRIS-4 and HD66, respectively 1. It is likely that CDw76 will be subdivided in future to distinguish the CD76.1 and CD76.2 epitopes 1. Both epitopes are strongly expressed on mature peripheral blood B cells and mantle zone B cells, but expression is weak or absent on germinal centre B cells and plasma cells. CD76.1 is expressed on resting T cells and some granulocytes but CD76.2 is absent on these cells. On activated T cells, expression of both CD76.1 and CD76.2 is upregulated. Amongst non-haematopoietic cells, CDw76 epitopes are expressed on various epithelial and some endothelial cells 1,2.

Structure
The CDw76 epitope is present on α2,6-sialylated polylactosamine structures and is thus not restricted to one particular glycoprotein or glycolipid. Both CDw76 mAbs have been shown to react with α2,6-sialylated type 2 chain oligosaccharide moieties of glycosphingolipids. The CD76.1 mAb detects sialylated glycosphingolipids with higher charge than does CD76.2. The CD76.1 and CD76.2 epitopes may be present on polylactosamine sequences of N-linked carbohydrate structures, but only CD76.1 may be present on O-linked structures 2. The CD76.2 mAb immunoprecipitates two uncharacterized glycoproteins of 85 kDa and 67 kDa 2.

Function
Unknown.
References
1 Engel, P. and Tedder, T.F. (1995) Leucocyte Typing V, 577-580.
2 Schwartz-Albiez, R. et al. (1995) Leucocyte Typing V, 580-586.
CD77

Other names
Globotriaosylceramide (Gb3)
Ceramide trihexoside
Pk blood group antigen
Burkitt's lymphoma-associated antigen (BLA)
Tissue distribution Expressed on a subset of germinal centre tonsillar B cells 1,2. CD77 is a useful marker for Burkitt's lymphoma 1,3 and it has been postulated that B cells expressing this antigen are the normal counterparts of Burkitt's lymphoma tumour cells. Lymphoblastoid cell lines, obtained by transformation of normal B cells with Epstein-Barr virus, do not express CD77 but the antigen is expressed on endothelium and some epithelial cells, as well as follicular dendritic cells and macrophages in lymphoid tissue 4.
Structure
CD77 mAbs react with Galα1→4Galβ1→4Glc-ceramide (Gb3), which belongs to the globo-series of neutral glycosphingolipids 5.

Ligands and associated molecules
Co-capping experiments suggest that CD77 is associated with CD19 on the surface of Daudi B cells, while CD77-deficient cells show concomitant decreased expression of CD19, suggesting that CD77 influences the surface expression of CD19 6.
Function The function of CD77 is unknown. However, CD77 is a receptor for lectins on the pili of a certain strain of Escherichia coli. The vero toxin of E. coli and the Shiga toxin of Shigella dysenteriae specifically bind to Gb3 7. Germinal centre B cells which express CD77 have been shown to be engaged in programmed cell death (apoptosis) 2, but it is not known whether the glycolipid is involved in this process.
References
1 Gregory, C.D. et al. (1987) J. Immunol. 139, 313-318.
2 Mangeney, M. et al. (1991) Eur. J. Immunol. 21, 1131-1140.
3 Wiels, J. et al. (1981) Proc. Natl Acad. Sci. USA 78, 6485-6488.
4 Möller, P. and Mielke, B. (1989) Leucocyte Typing IV, 175-177.
5 Nudelman, E. et al. (1983) Science 220, 509-511.
6 Maloney, M.D. and Lingwood, C.A. (1994) J. Exp. Med. 180, 191-201.
7 Karlsson, K.-A. (1989) Annu. Rev. Biochem. 58, 309-350.
CD79/BCR

Other names
CD79a (mb-1, Igα)
CD79b (B29, Igβ)
B cell antigen receptor (mIg) complex

Molecular weights
Polypeptide
CD79a               21 841
CD79b               23 083
SDS-PAGE
reduced
mIgM heavy chain    73 kDa
mIgD heavy chain    67 kDa
Ig light chain      28 kDa
CD79a               32-33 kDa
CD79b               37-39 kDa

Carbohydrates
N-linked sites
CD79a               6
CD79b               3
O-linked            unknown

Human gene location and size
Ig heavy chain: 14q32.33
Ig κ light chain: 2p12
Ig λ light chain: 22q11.1-q11.2
CD79a: 19q13.2; 4.8 kb 1
CD79b: 17q23; 3.9 kb 2

[Diagrams: BCR complex (IgM with CD79a and CD79b); domains and exon boundaries of CD79a and CD79b; labels RQP, EGL, FRKR, LGP, CPH, SAE, CYM, YFC, KGS, MGF, YEGL, DKDD]
The structure and function of immunoglobulins are well reviewed and will not be considered in detail here.
Tissue distribution
CD79a and CD79b are restricted to B lymphocytes, first appearing on the surface at the pre-B cell stage and remaining through all stages of B cell differentiation prior to plasma cells 3. The surrogate λ chain and a truncated heavy chain Dμ are expressed on pre-B cells 4,5.
Structure
CD79a and CD79b exist at the surface as a disulfide-linked heterodimer non-covalently associated with membrane Ig. It is proposed that each BCR complex contains two heterodimers of CD79a and CD79b 3. Both CD79a and CD79b are composed of single IgSF domains 3. All classes of heavy chains can associate with CD79a and CD79b 3. The disulfide bond between CD79a and CD79b is likely to be between Cys residues that are found in identical sequences of Ser-Cys-Gly-Thr in both molecules. In each case this sequence is predicted to be part of β strand G of the Ig fold, and thus this disulfide bond probably stabilizes a domain:domain interaction rather than being between two hinge regions, as for example in CD8. The cytoplasmic domains of CD79a and CD79b contain ITAM motifs. In pre-B cells, CD79a and CD79b form a complex with a "surrogate" light chain built up from VpreB (V-like) and λ5 (C-like), which associate non-covalently, and either a heavy chain or a truncated Dμ chain 4,5.
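The ITAM motifs mentioned above can be located in the cytoplasmic tails with a pattern search. The sketch below is illustrative only: the entry does not spell out the motif, so the pattern uses the commonly quoted ITAM consensus (Y-x-x-L/I, a spacer of 6 to 8 residues, then Y-x-x-L/I) as an assumption. The tail strings are the C-terminal residues read across the last rows of the CD79a and CD79b sequence listings, and `find_itam` is a hypothetical helper name.

```python
import re

# Assumed ITAM consensus (not stated in the entry):
# Y-x-x-(L/I), then 6 to 8 residues, then Y-x-x-(L/I).
ITAM = re.compile(r"Y..[LI].{6,8}Y..[LI]")

# C-terminal residues of each chain, read across the last rows of the listings.
CD79A_TAIL = "EDENLYEGLNLDDCSMYEDISRGLQGTYQDVGSLNIGDVQLEKP"
CD79B_TAIL = "LLLDKDDSKAGMEEDHTYEGLDIDQTATYEDIVTLRTGEVKWSVGEHPGQE"

def find_itam(sequence):
    """Return the first stretch matching the assumed ITAM consensus, or None."""
    match = ITAM.search(sequence)
    return match.group(0) if match else None

cd79a_itam = find_itam(CD79A_TAIL)
cd79b_itam = find_itam(CD79B_TAIL)
print(cd79a_itam, cd79b_itam)
```

Each tail yields a single match with a seven-residue spacer between the two tyrosine-containing halves. The two tyrosines in each match are candidates for the activation-induced tyrosine phosphorylation described under Function, although the entry does not name the residues.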
Ligands and associated molecules Membrane Ig binds antigen. Different pairs of BAP (BCR-associated proteins) molecules have been identified which specifically associate with IgM or IgD 6. As with CD3/TCR, phosphorylated ITAM motifs of CD79a and CD79b bind to SH2 domains of B cell intracellular signalling molecules 7,8
Function Antigen binding leads to internalization and presentation of antigen to T cells via MHC molecules, or can signal the B cell directly, in the case of multivalent antigens which crosslink several mIg molecules 9. Crosslinking of BCR leads to activation of B cells. This is dependent on CD79a and CD79b for signal transduction. Transmembrane signalling through surface Ig leads to rapid activation of a phospholipase C and tyrosine kinases 6-8. CD79b and CD79a are both phosphorylated on tyrosine as a result of B cell activation 6-8
Database accession numbers
               PIR      SWISSPROT   EMBL/GENBANK   REFERENCE
Human CD79a    A46477   P11912      L32754         1
Human CD79b    A46527   P40259      L27587         2
Mouse CD79a    A43540   P11911      X13450         10
Mouse CD79b    A31403   P15530      J03857         11
Amino acid sequence of human CD79a
MPGGPGVLQA LPATIFLLFL LSAVYLGPGC QA                        -1
LWMHKVPASL MVSLGEDAHF QCPHNSSNNA NVTWWRVLHG NYTWPPEFLG     50
PGEDPNGTLI IQNVNKSHGG IYVCRVQEGN ESYQQSCGTY LRVRQPPPRP    100
FLDMGEGTKN RIITAEGIIL LFCAVVPGTL LLFRKRWQNE KLGLDAGDEY    150
EDENLYEGLN LDDCSMYEDI SRGLQGTYQD VGSLNIGDVQ LEKP          194

Amino acid sequence of human CD79b
MARLALSPVP SHWMVALLLL LSAEPVPA                             -1
ARSEDRYRNP KGSACSRIWQ SPRFIARKRG FTVKMHCYMN SASGNVSWLW     50
KQEMDENPQQ LKLEKGRMEE SQNESLATLT IQGIRFEDNG IYFCQQKCNN    100
TSEVYQGCGT ELRVMGFSTL AQLKQRNTLK DGIIMIQTLL IILFIIVPIF    150
LLLDKDDSKA GMEEDHTYEG LDIDQTATYE DIVTLRTGEV KWSVGEHPGQ    200
E                                                         201

References
1 Hashimoto, S. et al. (1994) Immunogenetics 40, 287-295.
2 Hashimoto, S. et al. (1994) Immunogenetics 40, 145-149.
3 Reth, M. (1992) Annu. Rev. Immunol. 10, 97-121.
4 Melchers, F. et al. (1993) Immunol. Today 14, 60-68.
5 Horne, M.C. et al. (1996) Immunity 4, 145-158.
6 Reth, M. (1995) Immunol. Today 16, 310-313.
7 Cambier, J.C. et al. (1994) Annu. Rev. Immunol. 12, 457-486.
8 DeFranco, A.L. (1995) Curr. Opin. Cell Biol. 7, 163-175.
9 DeFranco, A.L. (1996) Curr. Biol. 6, 548-550.
10 Kashiwamura, S-I. et al. (1990) J. Immunol. 145, 337.
11 Hermanson, G.G. et al. (1988) J. Immunol. 85, 6890-6894.
CD80

Other names
B7, B7-1, B1

Molecular weights
Polypeptide      30 048
SDS-PAGE
reduced          60 kDa
unreduced        60 kDa

Carbohydrate
N-linked sites   8
O-linked         unknown

Human gene location and size
3q21; 32 kb 1,2

[Diagram: domains and exon boundaries; labels as extracted: CGH, YEC, SGV, CST, FMO, TTK, TCF, KAD]
Tissue distribution 3 CD80 is expressed at low levels on resting peripheral blood monocytes and dendritic cells. Expression is increased upon activation of B cells, T cells, and peripheral blood monocytes, and with culture of Langerhans cells. Signalling through the MHC Class II cytoplasmic domain may induce CD80 expression on B cells 4
Structure
CD80 is structurally related to CD86 (25% identity). The extracellular portion contains two IgSF domains which are highly glycosylated 5. In mice a splice variant has been identified which lacks the C2-set domain 6. The sequence of the transmembrane domain is unusual in containing three cysteine residues, two of which are also present in CD86. The short cytoplasmic domain, which bears no similarity to the CD86 cytoplasmic domain, has a preponderance (9/19) of arginine residues and contains a potential site for calmodulin-dependent phosphorylation (RRES).
Ligands and associated molecules
CD80 binds to CD28 and CD152 (CTLA-4) using the same site on the GFCC'C'' β-sheet of the V-set domain, though the C2-set domain may also contribute to binding 7-9. CD80 binds CD152 with a slightly higher affinity than CD28 (Kd 0.4 and 4 µM, respectively), with both interactions having fast dissociation rate constants (koff ≥0.4 and ≥1.6 s⁻¹, respectively) 10,11.
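The binding constants quoted above can be turned into more intuitive quantities. The sketch below assumes simple 1:1 binding, so that the half-life of a complex is ln 2 / koff and the association rate constant is kon = koff / Kd. Because the quoted koff values are lower bounds, the half-lives come out as upper bounds and the kon values as lower bounds. The function and dictionary names are hypothetical.

```python
import math

# Values quoted in the entry; koff values are lower bounds (s^-1), Kd in micromolar.
INTERACTIONS = {
    "CD80-CD152": {"kd_uM": 0.4, "koff_min": 0.4},
    "CD80-CD28": {"kd_uM": 4.0, "koff_min": 1.6},
}

def max_half_life_s(koff_min):
    """Upper bound on complex half-life (s), assuming first-order dissociation."""
    return math.log(2) / koff_min

def min_kon_per_M_s(koff_min, kd_uM):
    """Lower bound on kon (M^-1 s^-1), assuming 1:1 binding where Kd = koff / kon."""
    return koff_min / (kd_uM * 1e-6)

summary = {
    name: (max_half_life_s(v["koff_min"]), min_kon_per_M_s(v["koff_min"], v["kd_uM"]))
    for name, v in INTERACTIONS.items()
}
for name, (half_life, kon) in summary.items():
    print(f"{name}: half-life <= {half_life:.2f} s, kon >= {kon:.1e} M^-1 s^-1")
```

On these assumptions a CD80-CD152 complex lasts at most about 1.7 s and a CD80-CD28 complex at most about 0.4 s, which puts numbers on the "fast dissociation" described in the text.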
Function
The interaction of CD80 or CD86 with CD28 provides a potent co-stimulatory signal for T cells activated through the CD3 complex 3,12. CD80 is expressed later than CD86 and appears to be less important in the primary immune response 3. Mice deficient in CD80 13 are less severely affected than CD86-deficient mice and have nearly normal TH1 and TH2 responses 3. Early reports that CD80 and CD86 ligation promote TH1 and TH2 responses, respectively, remain controversial 3.

Database accession numbers
         PIR      SWISSPROT   EMBL/GENBANK   REFERENCE
Human    A45803   P33681      M27533         5
Mouse    S17291   Q00609      X60958         14
Rat                           U05593         15
Amino acid sequence of human CD80
MGHTRRQGTS PSKCPYLNFF QLLVLAGLSH FCSG                      -1
VIHVTKEVKE VATLSCGHNV SVEELAQTRI YWQKEKKMVL TMMSGDMNIW     50
PEYKNRTIFD ITNNLSIVIL ALRPSDEGTY ECVVLKYEKD AFKREHLAEV    100
TLSVKADFPT PSISDFEIPT SNIRRIICST SGGFPEPHLS WLKREHLAEV    150
INTTVSQDPE TELYAVSSKL DFNMTTNHSF MCLIKYGHLR VNQTFNWNTT    200
KQEHFPDNLL PSWAITLISV NGIFVICCLT YCFAPRCRER RRNERLRRES    250
VRPV                                                      254

References
1 Freeman, G.J. et al. (1992) Blood 79, 489-494.
2 Selvakumar, A. et al. (1992) Immunogenetics 36, 175-181.
3 Lenschow, D.J. et al. (1996) Annu. Rev. Immunol. 14, 233-258.
4 Nabavi, N. et al. (1992) Nature 360, 266-268.
5 Freeman, G.J. et al. (1989) J. Immunol. 143, 2714-2722.
6 Inobe, M. et al. (1994) Biochem. Biophys. Res. Commun. 200, 443-449.
7 Peach, R.J. et al. (1995) J. Biol. Chem. 270, 21181-21187.
8 Linsley, P.S. et al. (1991) J. Exp. Med. 174, 561-569.
9 Linsley, P.S. et al. (1991) J. Exp. Med. 173, 721-730.
10 Greene, J.L. et al. (1996) J. Biol. Chem. 271, 26762-26771.
11 van der Merwe, P.A. et al. (1997) J. Exp. Med. (in press).
12 Linsley, P.S. and Ledbetter, J.A. (1993) Annu. Rev. Immunol. 11, 191-211.
13 Freeman, G.J. et al. (1993) Science 262, 907-909.
14 Freeman, G.J. et al. (1991) J. Exp. Med. 174, 625-631.
15 Judge, T.A. et al. (1995) Int. Immunol. 7, 171-178.
CD81
TAPA-1
Molecular weights
Polypeptide      25 810
SDS-PAGE
reduced          26 kDa

Carbohydrate
N-linked sites   0
O-linked         nil

Human gene location
11p15.5
Tissue distribution
CD81 is expressed by most cell types 1 and, consistent with this, the CD81 gene has a housekeeping-type promoter 2. Amongst haematopoietic cells, CD81 is expressed by B and T cells, macrophages, dendritic cells, NK cells and eosinophils, but not by neutrophils, platelets and erythrocytes 3.
Structure CD81 is a member of the TM4 superfamily and is predicted to have four transmembrane regions, short cytoplasmic N- and C-termini, and two extracellular regions. Proteolysis studies of the CD81 protein, translated in vitro in microsomes, are consistent with the predicted topology of TM4SF molecules (reviewed in ref. 4).
Ligands and associated molecules
Co-immunoprecipitation and flow cytometric energy transfer studies have shown that CD81 associates non-covalently with a number of other molecules: CD19, CD21 and leu-13 5; MHC Class I and II, CD20, CD37, CD53 and CD82 in B cells 6; CD4, CD8 and CD82 in T cells 7; and CD9, CD63, and the integrins CD29/CD49c (VLA-3), CD29/CD49d (VLA-4) and CD29/CD49f (VLA-6) in several cell types 8,9. No extracellular ligand has been identified for CD81.

Function
Mouse CD81 appears to play a role in early T cell development 10. Engagement of CD81 with the mAb TAPA-1 induces B cell adhesion via the integrin CD29/CD49d (VLA-4) 11.
Database accession numbers
              Human   Mouse
PIR           A35649  A46472
SWISSPROT     P18582  P35762
EMBL/GENBANK  M33680  S45012
REFERENCE     1       2
Amino acid sequence of human CD81
MGVEGCTKCI KYLLFVFNFV FWLAGGVILG VALWLRHDPQ TTNLLYLELG   50
DKPAPNTFYV GIYILIAVGA VMMFVGFLGC YGAIQESQCL LGTFFTCLVI  100
LFACEVAAGI WGFVNKDQIA KDVKQFYDQA LQQAVVDDDA NNAKAVVKTF  150
HETLDCCGSS TLTALTTSVL KNNLCPSGSN IISNLFKEDC HQKIDDLFSG  200
KLYLIGIAAI VVAVIMIFEM ILSMVLCCGI RNSSVY                 236
References
1 Oren, R. et al. (1990) Mol. Cell. Biol. 10, 4007-4015.
2 Andria, M.L. et al. (1991) J. Immunol. 147, 1030-1036.
3 Tedder, T.F. et al. (1995) Leucocyte Typing V, 684-688.
4 Wright, M.D. and Tomlinson, M.G. (1994) Immunol. Today 15, 588-594.
5 Fearon, D.T. and Carter, R.H. (1995) Annu. Rev. Immunol. 13, 127-149.
6 Szöllösi, J. et al. (1996) J. Immunol. 157, 2939-2946.
7 Imai, T. et al. (1995) J. Immunol. 155, 1229-1239.
8 Berditchevski, F. et al. (1996) Mol. Biol. Cell 7, 193-207.
9 Mannion, B.A. et al. (1996) J. Immunol. 157, 2039-2047.
10 Boismenu, R. et al. (1996) Science 271, 198-200.
11 Behr, S. and Schriever, F. (1995) J. Exp. Med. 182, 1191-1199.
CD82
R2, C33, IA4, 4F9, KAI1

Molecular weights
Polypeptide 29 626
SDS-PAGE reduced 50-53 kDa

Carbohydrate
N-linked sites 3
O-linked nil

Human gene location 11p11.2
Tissue distribution CD82 is expressed by most cell types. Amongst haematopoietic cells, CD82 is expressed by B and T cells, NK cells, monocytes, granulocytes and platelets, but not by erythrocytes. CD82 is strongly upregulated on lymphocyte activation 1.
Structure CD82 is a member of the TM4 superfamily and is predicted to have four transmembrane regions, short cytoplasmic N- and C-termini, and two extracellular regions (reviewed in ref. 2).
Ligands and associated molecules Co-immunoprecipitation and flow cytometric energy transfer studies have shown that CD82 associates non-covalently with a number of other molecules: MHC Class I and II, CD20, CD37, CD53 and CD81 in B cells 3; CD4, CD8 and CD81 in T cells 4; and integrins including CD29/CD49d (VLA-4) in some cell lines 5. No extracellular ligand has been identified for CD82.
Function Expression of CD82 suppresses metastasis in tumour cells 6. In vitro experiments have shown that CD82 can transduce signals in B cells, T cells and monocytes 7,8.
Database accession numbers
              Human   Mouse
PIR           S16156
SWISSPROT     P27701  P40237
EMBL/GENBANK  X53795  D14883
REFERENCE     9       10
Amino acid sequence of human CD82
MGSACIKVTK YFLFLFNLIF FILGAVILGF GVWILADKSS FISVLQTSSS   50
SLRMGAYVFI GVGAVTMLMG FLGCIGAVNE VRCLLGLYFA FLLLILIAQV  100
TAGALFYFNM GKLKQEMGGI VTELIRDYNS SREDSLQDAW DYVQAQVKCC  150
GWVSFYNWTD NAELMNRPEV TYPCSCEVKG EEDNSLSVRK GFCEAPGNRT  200
QSGNHPEDWP VYQEGCMEKV QAWLQENLGI ILGVGVGVAI IELLGMVLSI  250
CLCRHVHSED YSKVPKY                                      267
References
1 Engel, P. et al. (1995) Leucocyte Typing V, 691-693.
2 Wright, M.D. and Tomlinson, M.G. (1994) Immunol. Today 15, 588-594.
3 Szöllösi, J. et al. (1996) J. Immunol. 157, 2939-2946.
4 Imai, T. et al. (1995) J. Immunol. 155, 1229-1239.
5 Mannion, B.A. et al. (1996) J. Immunol. 157, 2039-2047.
6 Dong, J.-T. et al. (1995) Science 268, 884-886.
7 Lebel-Binay, S. et al. (1995) J. Leukocyte Biol. 57, 956-963.
8 Lebel-Binay, S. et al. (1995) J. Immunol. 155, 101-110.
9 Gaugitsch, H.W. et al. (1991) Eur. J. Immunol. 21, 377-383.
10 Nagira, M. et al. (1994) Cell. Immunol. 157, 144-157.
CD83
HB15

Molecular weights
Polypeptide 23 041
SDS-PAGE reduced 45 kDa
SDS-PAGE unreduced 45 kDa

Carbohydrate
N-linked sites 3
O-linked unknown

Human gene location 6p23-p21.3

[Figure: domains and exon boundaries]
Tissue distribution CD83 is a marker for a subset of dendritic cells which include Langerhans cells in the skin, peripheral blood dendritic cells, and interdigitating reticulum cells within the T cell zones of lymphoid organs 1. Expression is not restricted to dendritic cells since CD83 is also expressed on some germinal centre B cells and weakly on activated peripheral lymphocytes 2.
Structure CD83 comprises a single V-set IgSF domain, a transmembrane region and a 39 amino acid C-terminal cytoplasmic tail 2. The gene for CD83 has an unusual structure compared with most other IgSF molecules, in that the intron/exon boundaries encompassing the IgSF domain are of phase 0 and phase 1 rather than both being of phase 1 (ref. 2). There is also an intron within the coding sequence for the IgSF domain, but this is relatively common.
Function Unknown.
Database accession numbers
              Human
PIR           S23066
SWISSPROT     Q01151
EMBL/GENBANK  Z11697
REFERENCE     2
Amino acid sequence of human CD83
MSRGLQLLLL SCAYSLAPA                                     -1
TPEVKVACSE DVDLPCTAPW DPQVPYTVSW VKLLEGGEER METPQEDHLR   50
GQHYHQKGQN GSFDAPNERP YSLKIRNTTS CNSGTYRCTL QDPDGQRNLS  100
GKVILRVTGC PAQRKEETFK KYRAEIVLLL ALVIFYLTLI IFTCKFARLQ  150
SIFPDFSKAG MERAFLPVTS PNKHLGLVTP HKTELV                 186
References
1 Zhou, L.-J. and Tedder, T.F. (1995) J. Immunol. 154, 3821-3835.
2 Zhou, L.-J. et al. (1992) J. Immunol. 149, 735-742.
CDw84
Tissue distribution CDw84 is highly expressed on macrophages and platelets. Expression is lower on T and B lymphocytes and absent from plasma cells 1.
Structure Unknown.
Function Unknown.
Reference 1 Engel, P. et al. (1995) Leucocyte Typing V, 699-700.
CD85

Molecular weights
SDS-PAGE reduced 83 kDa
SDS-PAGE unreduced 72 kDa

Tissue distribution CD85 is highly expressed on plasma cells and monocytes. Expression is lower on mature B cells and not detectable on early lineage B cells, T cells, NK cells and non-haematopoietic cells. On tumour cells, CD85 is a marker for all acute lymphoblastic leukaemia (ALL) cells and hairy cell leukaemias, most B cell chronic lymphocytic leukaemia (B-CLL) and B cell non-Hodgkin's lymphoma (B-NHL), and some acute myeloid leukaemias 1,2.
Structure Unknown.
Function Unknown.
Comments The molecular weight of the CD85 antigen on SDS-PAGE is controversial. The original value of 83 kDa under reducing conditions 2 was not confirmed by the Leucocyte Typing V Workshop, in which one laboratory reported a weight of 120 kDa and another laboratory reported weights of 90 kDa and 18 kDa 1. A possible susceptibility of the protein to proteolysis was suggested as an explanation for this discrepancy 1.
References
1 Engel, P. et al. (1995) Leucocyte Typing V, 701-702.
2 Pulford, K. et al. (1991) Clin. Exp. Immunol. 85, 429-435.
CD86
B7-2, B70
Molecular weights
Polypeptide 35 254
SDS-PAGE reduced 70 kDa
SDS-PAGE unreduced 70 kDa

Carbohydrate
N-linked sites 8
O-linked unknown

Human gene location and size 3q13-q23 1; >22 kb 2

[Figure: domains and exon boundaries]

Tissue distribution CD86 is expressed at high levels on resting peripheral blood monocytes and dendritic cells and at very low levels on resting B and T cells 3. Activation of B cells, T cells and monocytes, and culture of Langerhans cells, leads to increased expression of CD86. CD86 expression is induced more rapidly than CD80 expression (peaking at ~20 h versus ~60 h) and reaches higher levels. Expression is increased by IL-4 (B cells) and IFNγ (peripheral blood monocytes) and decreased by IL-10 (peripheral blood dendritic cells).
Structure CD86 is an IgSF molecule structurally related to CD80 (25% amino acid identity). The extracellular domain contains two IgSF domains and is highly glycosylated 4,5. The transmembrane domain contains two of the three cysteines seen in CD80. The cytoplasmic domain is completely unrelated to the CD80 cytoplasmic domain and contains three potential protein kinase C phosphorylation sites.
Ligands and associated molecules Like CD80, the extracellular portion of CD86 binds to CD28 and CD152 (CTLA-4). Like CD80, CD86 binds CD28 with a lower avidity than CD152 6, and binding involves residues in the V-domain 7. CD86 binds CD152 with a lower avidity than CD80 8.
Function The interactions of CD80 and CD86 with CD28 provide important costimulatory signals for T cells activated through the CD3/TCR. CD86 is expressed earlier than CD80 and appears to dominate in the primary immune response 3. Mice deficient in CD86 have more profound defects in T cell function (including T cell-dependent antibody responses) than CD80-deficient mice 3. Early reports that CD86 and CD80 ligation favour TH2 and TH1 responses, respectively, remain controversial 3.
Database accession numbers
              Human   Mouse
PIR           A48754
SWISSPROT     P42081  P42082
EMBL/GENBANK  U04343  L25606
REFERENCE     4,5     9
Amino acid sequence of human CD86
MDPQCTMGLS NILFVMAFLL SGA                                -1
APLKIQAYFN ETADLPCQFA NSQNQSLSEL VVFWQDQENL VLNEVYLGKE   50
KFDSVHSKYM GRTSFDSDSW TLRLHNLQIK DKGLYQCIIH HKKPTGMIRI  100
HQMNSELSVL ANFSQPEIVP ISNITENVYI NLTCSSIHGY PEPKKMSVLL  150
RTKNSTIEYD GIMQKSQDNV TELYDVSISL SVSFPDVTSN MTIFCILETD  200
KTRLLSSPFS IELEDPQPPP DHIPWITAVL PTVIICVMVF CLILWKWKKK  250
KRPRNSYKCG TNTMEREESE QTKKREKIHI PERSDEAQRV FKSSKTSSCD  300
KSDTCF                                                  306

References
1 Fernandez-Ruiz, E. et al. (1995) Eur. J. Immunol. 25, 1453-1456.
2 Jellis, C.L. et al. (1995) Immunogenetics 42, 85-89.
3 Lenschow, D.J. et al. (1996) Annu. Rev. Immunol. 14, 233-258.
4 Azuma, M. et al. (1993) Nature 366, 76-79.
5 Freeman, G.J. et al. (1993) Science 262, 909-911.
6 Linsley, P.S. et al. (1994) Immunity 1, 793-801.
7 Peach, R.J. et al. (1995) J. Biol. Chem. 270, 21181-21187.
8 Greene, J.L. et al. (1996) J. Biol. Chem. 271, 26762-26771.
9 Freeman, G.J. et al. (1993) J. Exp. Med. 178, 2185-2192.
CD87
Urokinase plasminogen activator receptor (uPAR), Mo3

Molecular weights
Polypeptide 31 460
SDS-PAGE reduced 55-60 kDa

Carbohydrate
N-linked sites 5
O-linked sites unknown

Human gene location and size 19q13; ~23 kb 1

[Figure: domains and exon boundaries (Ly-6 domains)]
Tissue distribution CD87 is expressed on monocytes and granulocytes, and the level of expression is generally upregulated following activation of the cells 2. It is also found on the activated subset of NK cells characterized as large granular lymphocytes 3. Only 0-5% of resting T lymphocytes from normal donors are CD87+, but expression can be upregulated following treatment with phorbol esters, mitogens, anti-CD3 antibodies and certain combinations of cytokines 3. Upregulation of CD87 is correlated with the expression of CD25 on T cells 4. T cells from patients with viral infections are found to express high levels of CD87 4.
Structure CD87 is a heavily glycosylated, GPI-linked protein 5,6. The extracellular region consists of three Ly-6 domains 5-8. A cDNA lacking exon 5 has been reported 1. The N-terminal sequence has been determined 6.
Ligands and associated molecules The ligand for CD87 is the urokinase plasminogen activator (uPA) 2,7. CD87 has been shown to associate in monocytes with CD11a/CD18, CD11b/CD18 and Src-related kinases 9. Its association with CD11b/CD18 has also been demonstrated on resting neutrophils 9,10.
Function CD87 is the receptor for uPA, a serine protease that converts plasminogen to plasmin. The major function of CD87 is to retain and concentrate uPA at the cell surface for the local conversion of plasminogen to plasmin. This process is important for the pericellular proteolysis during extravasation of leucocytes and cancer cells 4,11,12. The CD87-CD11b/CD18 complex on resting neutrophils dissociates when the cells become polarized following stimulation. CD87 accumulates at the lamellipodium and CD11b/CD18 at the uropod. This reversible interaction may be important in directing polarized uPA proteolytic activity during cell extravasation and migration 10,13.

Database accession numbers
              Human           Mouse
PIR           A39743          A55356
SWISSPROT     Q03405          P35456
EMBL/GENBANK  M83246, X51675  X62700, U12235
REFERENCE     5, 7            14, 15
Amino acid sequence of human CD87
MGHPPLLPLL LLLHTCVPAS WG                                 -1
LRCMQCKTNG DCRVEECALG QDLCRTTIVR LWEEGEELEL VEKSCTHSEK   50
TNRTLSYRTG LKITSLTEVV CGLDLCNQGN SGRAVTYSRS RYLECISCGS  100
SDMSCERGRH QSLQCRSPEE QCLDVVTHWI QEGEEGRPKD DRHLRGCGYL  150
PGCPGSNGFH NNDTFHFLKC CNTTKCNEGP ILELENLPQN GRQCYSCKGN  200
STHGCSSEET FLIDCRGPMN QCLVATGTHE PKNQSYMVRG CATASMCQHA  250
HLGDAFSMNH IDVSCCTKSG CNHPDLDVQY RSG                    283
AAPQPGPAHL SLTITLLMTA RLWGGTLLWT                        +30

References
1 Casey, J.R. et al. (1994) Blood 84, 1151-1156.
2 Miles, L.A. et al. (1987) Thromb. Haemost. 58, 936-942.
3 Nykjær, A. et al. (1992) FEBS Lett. 300, 13-17.
4 Nykjær, A. et al. (1994) J. Immunol. 152, 505-516.
5 Min, H.Y. et al. (1992) J. Immunol. 148, 3636-3642.
6 Behrendt, N. et al. (1990) J. Biol. Chem. 6453-6460.
7 Roldan, A.L. et al. (1990) EMBO J. 9, 467-474.
8 Ploug, M. et al. (1993) J. Biol. Chem. 268, 17539-17546.
9 Bohuslav, J. et al. (1995) J. Exp. Med. 181, 1381-1390.
10 Xue, W. et al. (1994) J. Immunol. 152, 4630-4640.
11 Ellis, V. et al. (1991) J. Biol. Chem. 266, 12752-12758.
12 Bianchi, E. et al. (1994) Cancer Res. 54, 861-866.
13 Kindzelskii, A.L. et al. (1996) J. Immunol. 156, 297-309.
14 Kristensen, P. et al. (1991) J. Cell Biol. 115, 1763-1771.
15 Suh, T.T. et al. (1994) J. Biol. Chem. 269, 25992-25998.
CD88
C5a receptor
Molecular weights
Polypeptide 39 321
SDS-PAGE reduced 40 kDa

Carbohydrate
N-linked sites 1
O-linked sites unknown

Human gene location and size 19q13.3-13.4; ~10 kb 1
Tissue distribution CD88 is expressed on cells of the myeloid lineage and smooth muscles 2,3. It is also expressed on hepatocytes and on many cell types in the lung, including bronchial and alveolar epithelial cells, alveolar macrophages and vascular endothelial cells 3. CD88 mRNA has also been detected in the heart, spleen, kidney and intestine, but the cell types expressing CD88 have not been identified 3.
Structure CD88 is a typical G protein-coupled receptor with seven transmembrane domains 4-6.
Ligands and associated molecules CD88 binds C5a, which is the N-terminal fragment (74 residues in the human) of the α chain of the complement component C5. This fragment is proteolytically cleaved from C5 during complement activation. CD88 is coupled to heterotrimeric G proteins inside the cell. In neutrophils, CD88 appears to couple exclusively to the G proteins Giα2 and Giα3, which are pertussis toxin sensitive 7.
Function The binding of C5a to CD88 on neutrophils leads to a number of effects including (with increasing C5a concentrations) cytoskeletal remodelling, shedding of selectins, upregulation of adhesion molecules, chemotaxis, granule exocytosis and activation of NADPH oxidase 7. A role in mucosal defense against infection is indicated by the failure of CD88 knockout mice to clear Pseudomonas aeruginosa introduced directly into the trachea 8. Curiously, neutrophil migration into the lung was elevated. In contrast, migration into the peritoneum in response to glycogen-induced inflammation was decreased 8.
Database accession numbers
              Human           Mouse
PIR           A37963
SWISSPROT     P21730          P30993
EMBL/GENBANK  X57250, M62505  S46665
REFERENCE     4, 5            9
Amino acid sequence of human CD88
MNSFNYTTPD YGHYDDKDTL DLNTPVDKTS NTLRVPDILA LVIFAVVFLV   50
GVLGNALVVW VTAFEAKRTI NAIWFLNLAV ADFLSCLALP ILFTSIVQHH  100
HWPFGGAACS ILPSLILLNM YASILLLATI SADRFLLVFK PIWCQNFRGA  150
GLAWIACAVA WGLALLLTIP SFLYRVVREE YFPPKVLCGV DYSHDKRRER  200
AVAIVRLVLG FLWPLLTLTI CYTFILLRTW SRRATRSTKT LKVVVAVVAS  250
FFIFWLPYQV TGIMMSFLEP SSPTFLLLNK LDSLCVSFAY INCCINPIIY  300
VVAGQGFQGR LRKSLPSLLR NVLTEESVVR ESKSFTRSTV DTMAQKTQAV  350

References
1 Gerard, N.P. et al. (1993) Biochemistry 32, 1243-1250.
2 Hugli, T.E. and Müller-Eberhard, H.J. (1978) Adv. Immunol. 26, 1-53.
3 Haviland, D.L. et al. (1995) J. Immunol. 154, 1861-1869.
4 Gerard, N.P. and Gerard, C. (1990) Nature 349, 614-617.
5 Boulay, F. et al. (1991) Biochemistry 30, 2993-2999.
6 Rollins, T.E. et al. (1991) Proc. Natl Acad. Sci. USA 88, 971-975.
7 Gerard, C. and Gerard, N.P. (1994) Annu. Rev. Immunol. 12, 758-808.
8 Höpken, U.E. et al. (1996) Nature 383, 86-89.
9 Gerard, C. et al. (1992) J. Immunol. 149, 2600-2606.
CD89
Fcα receptor, IgA receptor

Molecular weights
Polypeptide a1 29 938; a2 27 356; a3 19 367
SDS-PAGE reduced 50-75 kDa

Carbohydrate
N-linked sites a1 6; a2 5; a3 2
O-linked sites unknown

Human gene location and size 19q13.4; ~12 kb 1

[Figure: a1 isoform, domains and exon boundaries, with the a2 and a3 isoforms]
Tissue distribution CD89 is expressed on most phagocytic cells in blood and mucosal tissues. It is also found on subpopulations of T and B lymphocytes 2-4. The a1 and a2 products are differentially expressed: whereas the a1 form is found on monocytes and neutrophils, the a2 form is expressed by alveolar macrophages 5. Only the a1 and a2 forms have been detected at the protein level 5.
Structure CD89 is a type I membrane protein with two IgSF domains 2. However, at least six cDNA species have been identified. The a1, a2 and a3 clones encode transmembrane proteins and are generated by alternative splicing of exons 1,5. An a' form, also derived from alternative splicing, has been found to have a cryptic leader sequence 6. Two other forms have alternative sequences to the transmembrane segment and may code for soluble products. Genomic analysis suggests that CD89 is most closely related to the two Fcγ receptors CD16 and CD64 1. CD89 has an Arg within the transmembrane region; it is notable that there is a His residue at the equivalent position in CD64.
Ligands and associated molecules CD89 binds both IgA1 and IgA2 via their Fc regions 7,8. CD89 associates with the γ chain homodimer of Fc receptors 9. This interaction involves the transmembrane segment of CD89, as shown by mutation of Arg209, which presumably forms a salt bridge with the Asp residue in the transmembrane segment of the γ chain 10.
Function CD89 binds serum and secretory IgA1 and IgA2 with an affinity of about 5 x 10^7 M^-1 (see refs 3 and 4). Binding can trigger a number of cellular responses including phagocytosis, superoxide production, release of inflammatory mediators and antibody-dependent cellular cytotoxicity 7,8. These responses require the interaction of CD89 with the FcR γ chain homodimer, possibly through tyrosine phosphorylation 9,10. The differential expression of the a1 and a2 isoforms in blood leucocytes and alveolar macrophages may suggest different roles in blood and mucosal defense 5.
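As a rough worked conversion, the affinity quoted above can be restated as a dissociation constant. This assumes that the quoted value is an equilibrium association constant for a simple 1:1 interaction, which the entry does not state explicitly; the symbols K_a and K_d are introduced here for illustration only.

```latex
\[
K_d \;=\; \frac{1}{K_a}
     \;=\; \frac{1}{5 \times 10^{7}\ \mathrm{M^{-1}}}
     \;=\; 2 \times 10^{-8}\ \mathrm{M}
     \;=\; 20\ \mathrm{nM}
\]
```

On that assumption, half-maximal occupancy of CD89 would be expected at an IgA concentration of roughly 20 nM.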
Database accession numbers
              Human a1  Human a2  Human a3
PIR           JH0332
SWISSPROT     P24071
EMBL/GENBANK  X54150    U43774    U43677
REFERENCE     1,5       5         5
Amino acid sequence of human CD89 (a1 isoform)
MDPKQTTLLC LVLCLGQRIQ A                                  -1
QEGDFPMPFI SAKSSPVIPL DGSVKIQCQA IREAYLTQLM IIKNSTYREI   50
GRRLKFWNET DPEFVIDHMD ANKAGRYQCQ YRIGHYRFRY SDTLELVVTG  100
LYGKPFLSAD RGLVLMPGEN ISLTCSSAHI PFDRFSLAKE GELSLPQHQS  150
GEHPANFSLG PVDLNVSGIY RCYGWYNRSP YLWSFPSNAL ELVVTDISHQ  200
DYTTQNLIRM AVAGLVLVAL LAILVENWHS HTALNKEASA DVAEPSWSQQ  250
MCQPGLTFAR TPSVCK                                       266
The a2 and a3 isoforms differ from the a1 isoform by lacking residues 154-195 and residues 100-195, respectively, through alternative splicing.
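The relationship between the isoforms can be sketched in code. The following is a minimal illustration only: it takes the residue ranges quoted above at face value, assumes they refer to the mature-protein numbering (1-266) used for the a1 sequence, and uses a placeholder string rather than the real sequence. The function name `delete_residues` is hypothetical.

```python
def delete_residues(seq: str, start: int, end: int) -> str:
    """Return seq with residues start..end removed (1-based, inclusive)."""
    if not (1 <= start <= end <= len(seq)):
        raise ValueError("residue range lies outside the sequence")
    # Python slices are 0-based and end-exclusive, hence start - 1 and end.
    return seq[:start - 1] + seq[end:]


# Placeholder for the mature a1 isoform: only its length (266) matters here.
A1_MATURE = "X" * 266

# Ranges as quoted in the entry (assumed to be mature-protein numbering).
a2 = delete_residues(A1_MATURE, 154, 195)
a3 = delete_residues(A1_MATURE, 100, 195)

print(len(A1_MATURE), len(a2), len(a3))
```

Substituting the real mature a1 sequence for the placeholder would give the corresponding a2 and a3 sequences under the same assumptions.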
References
1 de Wit, T.P.M. et al. (1995) J. Immunol. 155, 1203-1209.
2 Maliszewski, D.R. et al. (1990) J. Exp. Med. 172, 1665-1672.
3 Monteiro, R.C. et al. (1990) J. Exp. Med. 171, 597-613.
4 Mazengera, R.L. et al. (1990) Biochem. J. 272, 159-165.
5 Patry, C. et al. (1996) J. Immunol. 156, 4442-4448.
6 Morton, H.C. et al. (1996) Immunogenetics 43, 246-247.
7 Mestecky, J. and McGhee, J.R. (1987) Adv. Immunol. 40, 153-245.
8 Kerr, M.A. (1990) Biochem. J. 271, 285-296.
9 Pfefferkorn, L.C. and Yeaman, G.R. (1994) J. Immunol. 153, 3228-3236.
10 Morton, H.C. et al. (1995) J. Biol. Chem. 270, 29781-29787.
CD90
Thy-1, theta

Molecular weights
Polypeptide 12 576
SDS-PAGE reduced 18 kDa
SDS-PAGE unreduced 18 kDa

Carbohydrate
N-linked sites 3
O-linked nil

Human gene location and size 11q23.3; ~8 kb 1

[Figure: domains and exon boundaries]
Tissue distribution CD90 expression varies considerably among different species 2,3. It is expressed on the prothymocyte subpopulation of human leucocytes, on all mouse and rat thymocytes, and on mouse but not rat T cells. Rat bone marrow cells also express CD90. In all three species the antigen is expressed abundantly in the brain and at varying levels in other nonlymphoid tissues 2,3.
Structure CD90 is a GPI-anchored molecule consisting of a single IgSF V-set domain 2,3. The N-terminus and site of addition of the GPI anchor of rat CD90 have been confirmed 3,4. Although the protein sequence of rat CD90 is invariant, the brain and thymocyte forms of the protein have tissue- and site-specific glycosylation patterns 5. The Thy-1 gene maps to a region that also encodes the CD3 antigens and CD56 (NCAM) 1. The mouse Thy-1 alloantigens Thy-1.1 and Thy-1.2 differ by a single amino acid at residue 89, namely Arg in Thy-1.1 and Gln in Thy-1.2 3.
Ligands and associated molecules CD90 associates non-covalently with the Src family tyrosine kinase Fyn 6.
Function The function of CD90 is unknown. CD90 mAbs can activate T cells in vitro, which is a feature of other GPI-anchored proteins. The tyrosine kinase Fyn, which associates with Thy-1, is selectively required for activation through Thy-1 6. Thy-1 mAbs can also induce bcl-2-resistant thymocyte apoptosis, and a ligand for Thy-1 on thymic epithelial cells has been proposed 7. Thy-1 expression on a neural cell line selectively inhibits neurite outgrowth on mature astrocytes in vitro, and a Thy-1 ligand on astrocytes is suggested 8. Thy-1 knockout mice have provided few clues concerning Thy-1 function. Such mice appear normal apart from having a slight defect in brain function: an impairment of long-term potentiation in the hippocampus is accompanied by normal spatial learning 9.

Database accession numbers
              Human   Rat     Mouse
PIR           A02106  A02107  A02108
SWISSPROT     P04216  P01830  P01831
EMBL/GENBANK  M11749  X03152  X03151
REFERENCE     10      11      12
Amino acid sequence of human CD90
MNLAISIALL LTVLQVSRG                                     -1
QKVTSLTACL VDQSLRLDCR HENTSSSPIQ YEFSLTRETK KHVLFGTVGV   50
PEHTYRSRTN FTSKYHMKVL YLSAFTSKDE GTYTCALHHS GHSPPISSQN  100
VTVLRDKLVK C                                            111
EGISLLAQNT SWLLLLLLSL SLLQATDFMS L                      +31
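The entry lists three N-linked sites for CD90. As a minimal sketch of how such sites can be located, the code below scans the mature human CD90 sequence (residues 1-111 as given in this entry, without the leader or the C-terminal GPI signal sequence) for the consensus sequon Asn-X-Ser/Thr, where X is any residue except Pro. The sequon rule only predicts potential sites; the function name `n_linked_sequons` is hypothetical.

```python
import re

# Mature human CD90 (residues 1-111) as listed in this entry.
MATURE_CD90 = (
    "QKVTSLTACLVDQSLRLDCRHENTSSSPIQYEFSLTRETKKHVLFGTVGV"
    "PEHTYRSRTNFTSKYHMKVLYLSAFTSKDEGTYTCALHHSGHSPPISSQN"
    "VTVLRDKLVKC"
)


def n_linked_sequons(seq: str) -> list[int]:
    """Return 1-based positions of Asn in N-X-S/T sequons (X is not Pro)."""
    # The lookahead leaves X and S/T unconsumed, so overlapping sequons are found.
    return [m.start() + 1 for m in re.finditer(r"N(?=[^P][ST])", seq)]


sites = n_linked_sequons(MATURE_CD90)
print(len(MATURE_CD90), sites)
```

For this sequence the scan finds potential sites at Asn23, Asn60 and Asn100, in agreement with the three N-linked sites listed for the molecule.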
References
1 Silver, J. (1989) In Cell Surface Antigen Thy-1 (Reif, A.E. and Schlesinger, M., eds). Marcel Dekker, New York, pp. 241-269.
2 Williams, A.F. (1989) In Cell Surface Antigen Thy-1 (Reif, A.E. and Schlesinger, M., eds). Marcel Dekker, New York, pp. 49-69.
3 Williams, A.F. and Gagnon, J. (1982) Science 216, 696-703.
4 Tse, A.G. et al. (1985) Science 230, 1003-1008.
5 Williams, A.F. et al. (1993) Glycobiology 3, 339-348.
6 Lancki, D.W. et al. (1995) J. Immunol. 154, 4363-4370.
7 Hueber, A.O. et al. (1994) J. Exp. Med. 179, 785-796.
8 Tiveron, M.C. et al. (1992) Nature 355, 745-748.
9 Nosten-Bertrand, M. et al. (1996) Nature 379, 826-829.
10 Seki, T. et al. (1985) Proc. Natl Acad. Sci. USA 82, 6657-6661.
11 Seki, T. et al. (1985) Fedn. Proc. 44, 2865-2869.
12 Seki, T. et al. (1985) Science 227, 649-651.
CD91
α2-Macroglobulin receptor/LDL receptor-related protein

Molecular weights
Polypeptide 502 671
SDS-PAGE reduced ~500 + 85 kDa
SDS-PAGE unreduced ~500 + 85 kDa

Carbohydrate
N-linked sites 53
O-linked sites unknown

Human gene location 12q13.1-q13.3

[Figure: domain organization of CD91, with repeat numbers shown beside the domains]
Tissue distribution CD91 is expressed on the phagocytes of liver, lung and lymphoid tissues, but not on antigen-presenting cells. It is also expressed on neurons and astrocytes in the central nervous system, epithelial cells of the gastrointestinal tract, smooth muscle cells, fibroblasts, Leydig cells in the testis, granulosa cells in the ovary and dendritic cells of the kidney 1.
Structure CD91 is a type I membrane protein. Its single chain precursor is cleaved to yield an N-terminal fragment of ~500 kDa and a C-terminal 85 kDa transmembrane fragment. The two fragments remain non-covalently associated with each other. The extracellular region is composed of 31 LDLR type A domains, 16 EGF type B1 domains, 40 LY domains containing the YWTD sequence and 6 EGF type B2 domains, arranged as shown 2,3. The N-termini of both fragments have been determined by protein sequencing 2. The cleavage site is shown indented in the sequence below. Highly homologous molecules are found in the mouse 4 (97% homology to human) and chicken 5 (83% homology to human). CD91 is a member of the LDLR supergene family; the other members are the LDLR, VLDLR and a glycoprotein of 330 kDa (gp330) 6. In the figure, the domains of CD91 are not drawn out in full. The numbers appearing on either side indicate the number of repeats of that domain or group of domains (containing five LY and one EGF domains) present at that position. The gap between the two LY domains near the membrane indicates the cleavage site between the N- and C-terminal fragments.
Ligands and associated molecules CD91 binds protease-inhibitor complexes such as protease/α2-macroglobulin complexes and complexes between the tissue-type and urinary-type plasminogen activators and plasminogen activator inhibitor type 1. It also binds a variety of other molecules including lipoproteins, toxins, viruses and lactoferrin 6. Several of its ligands do not cross-compete for binding, suggesting that there are multiple binding sites on CD91 7. CD91 associates at a ratio of 1:2 with receptor-associated protein (RAP), a soluble 39 kDa protein which is localized to the rough endoplasmic reticulum 2,4,7.
Function CD91 appears to be involved in the regulation of proteolytic activity and lipoprotein metabolism, amongst other things. The presence of the two NPXY motifs in its cytoplasmic segment suggests that it may bind to and remove its many ligands from the circulation by receptor-mediated endocytosis. RAP interferes with CD91 binding to its ligands, suggesting its role in preventing CD91 binding to intracellular proteins in the endoplasmic reticulum while en route to the cell surface 7.
Database accession numbers
              Human   Mouse   Chicken
PIR
SWISSPROT     Q07954          P98157
EMBL/GENBANK  X13916  X67469  X74904
REFERENCE     3       4       5
Amino acid sequence of human CD91
MLTPPLLLLL PLLSALVAA                                     -1
AIDAPKTCSP KQFACRDQIT CISKGWRCDG ERDCPDGSDE APEICPQSKA   50
QRCQPNEHNC LGTELCVPMS RLCNGVQDCM DGSDEGPHCR ELQGNCSRLG  100
CQHHCVPTLD GPTCYCNSSF QLQADGKTCK DFDECSVYGT CSQLCTNTDG  150
SFICGCVEGY LLQPDNRSCK AKNEPVDRPP VLLIANSQNI LATYLSGAQV  200
STITPTSTRQ TTAMDFSYAN ETVCWVHVGD SAAQTQLKCA RMPGLKGFVD  250
EHTINISLSL HHVEQMAIDW LTGNFYFVDD IDDRIFVCNR NGDTCVTLLD  300
LELYNPKGIA LDPAMGKVFF TDYGQIPKVE RCDMDGQNRT KLVDSKIVFP  350
HGITLDLVSR LVYWADAYLD YIEVVDYEGK GRQTIIQGIL IEHLYGLTVF  400
ENYLYATNSD NANAQQKTSV IRVNRFNSTE YQVVTRVDKG GALHIYHQRR  450
QPRVRSHACE NDQYGKPGGC SDICLLANSH KARTCRCRSG FSLGSDGKSC  500
KKPEHELFLV YGKGRPGIIR GMDMGAKVPD EHMIPIENLM NPRALDFHAE  550
TGFIYFADTT SYLIGRQKID GTERETILKD GIHNVEGVAV DWMGDNLYWT  600
DDGPKKTISV ARLEKAAQTR KTLIEGKMTH PRAIVVDPLN GWMYWTDWEE  650
DPKDSRRGRL ERAWMDGSHR DIFVTSKTVL WPNGLSLDIP AGRLYWVDAF  700
YDRIETILLN GTDRKIVYEG PELNHAFGLC HHGNYLFWTE YRSGSVYRLE  750
RGVGGAPPTV TLLRSERPPI FEIRMYDAQQ QQVGTNKCRV NNGGCSSLCL  800
ATPGSRQCAC AEDQVLDADG VTCLANPSYV PPPQCQPGEF ACANSRCIQE  850
RWKCDGDNDC CGNSEDESNA SCAYPTCFPL FKCNSGRCIP RLDGLCIPLR SKAWVCDGDN DDCGDGSDEG TCQIQSYCAK FIIFSNRHEI KIYRGKLLDN VAKLDGTLRT GAGRRTVHRE EVLRGHEFLS TQPFDLQVYH KLHKDNTTCY LDYDAREQRV RNLFWTSYDT DGDNISMANM NLDGSGLEVI SVVLRNSTTL SCMCTAGYSL GTSLAVGIDF IAVDWIAGNI GYLFWTEWGQ DARTDKIERI IKRGSKDNAT CLYRGRGQRA LNAPVQPFED DDGSRRITIV FERETVITMS TLIEKDIRTP VHPFGLAVYG VANDTNSCEL VNSSCRAQDE FRQCSNGRCV NSSRCNQFVD SWVCDGANDC DCEHGEDETH EGKTCGPSSF DREFMCQNRQ CLSSRQWECD EALLCNGQDD GFRLKDDGRT SCKAVTDEEP MIYWTDVTTQ DKGRDTIEVS IGRIGMDGSS GSNRHVVLSQ STLHRPMDLH NFYLGSDGRT PEFKCRPGQF CTNTNRCIPG VWVCDRDNDC CGDGSDEPKE
LDNSDEAPAL TCSARTCPPN TQFTCNNGRC EHWTCDGDND WRCDGDTDCM DCEDNSDEEN ELCDQCSLNN HLKCSQKCDQ RRIDLHKGDY GALTSFEVVI TLLAGDIEHP TGSGGWPNGL HPFAVTLYGG PSRQPMAPNP EFKKFLLYAR YWSDVRTQAI NKKQINVARL DGSNRTLLFS DAMRSQLGKA VMHMKVYDES RSGQQACEGV HAENDTIYWV YWTDQGFDVI YPRIERSRLD DLETGENREV DSVPLRTGIG CACAHGMLAE PEHMKNVIAL ENVGSVEGLA GDDHPRAFVL NGLAIDHRAE EHIFWTDWVR SPCRINNGGC FECANGECIN SNMLWCNGAD CEDASDEMNC GDYSDERDCP CNKFCSEAQF SCPGTHVCVP CIPKHFVCDH GENDCHDQSD CGDSSDERGC CADVDECSTT FLIFANRYYL GSMIRRMHLN KLNGAYRTVL RSVIVDTKIT DIPHIFALTL VFHALRQPDV CVSNCTASQF QCSTGICTNP IFRCNGQDNC VDGSDEPANC ECDERTCEPY
CHQHTCPSDR FKCENNRCIP QFSCASGRCI PISWTCDLDD ININWRCDND NDCGDNSDEA CGDYSDETHA NCTNQATRPP DSSDEKSCEG VTHVCDPSVK CESLACRPPS HPCANNTSVC GGCSHNCSVA PGEGIVCSCP NKFSVKCSCY EGWVLEPDGE S V L V P G L R N T IALDFHLSQS QYGLATPEGL AVDWIAGNIY R A I A L D P R D G ILFWTDWDAS TVDYLEKRIL WIDARSDAIY EVYWTDWRTN TLAKANKWTG CEANGGQGPC SHLCLINYNR QMEIRGVDLD APYYNYIISF KRAFINGTGV ETVVSADLPN DGSFKNAVVQ GLEQPHGLVV GQKGPVGLAI DFPESKLYWI TALAIMGDKLWWADQVSEKM IQLDHKGTNP CSVNNGDCSQ GSFLLYSVHE GIRGIPLDPN DMGLSTISRA KRDQTWREDV EVARLNGSFR YVVISQGLDK GTERVVLVNV SISWPNGISV VLSSNNMDMF SVSVFEDFIY VQLKDIKVFN RDRQKGTNVC DGASCREYAG YLLYSERTIL AFDYRAGTSP GTPNRIFFSD YHRGWDTLYW TSYTTSTITR DECQNLMFWT NWNEQHPSIM KLYFSDATLD KIERCEYDGS RAVQRANKHV GSNMKLLRVD QDLCLLTHQG HVNCSCRGGR FSLTCDGVPH CKDKSDEKPS DCGDGSDEIP CNKTACGVGE SATDCSSYFR LGVKGVLFQP GVKRPRCPLN YFACPSGRCI ECQNHRCISK QWLCDGSDDC ERWLCDGDKD CADGADESIA DRDCADGSDE SPECEYPTCG EAPKNPHCTS PEHKCNASSQ HINECLSRKL SGCSQDCEDL FPCSQRCINT HGSYKCLCVE RKLNLDGSNY TLLKQGLNNA GSNVQVLHRT GLSNPDGLAV VSSGLREPRA LVVDVQNGYL WPNGLTLDYV TERIYWADAR FEDYVYWTDW ETKSINRAHK PNHPCKVNNG GCSNLCLLSP VCKNDKCIPF WWKCDTEDDC AFICDGDNDC QDNSDEANCD GDGEDERDCP EVTCAPNQFQ TQMTCGVDEF RCKDSGRCIP QFRCKNNRCV PGRWQCDYDN
NRWLCDGDND DCGDRSDESA GCSHSCSSTQ GGCHTDEFQC FGCKDSARCI LPPDKLCDGN LGMELGPDNH SCRSLDPFKP ALYWTDVVED WVESNLDQIE LPRIEAASMS SARYDGSGHM HNVTVVQRTN TVSCACPHLM TVPDIDNVTV AHGLAVDWVS HPLRGKLYWT SSGNHTINRC GTCSKADGSG LCLPTSETTR DKSDALVPVS VTNGIGRVEG PRAITVHPEK DYQDGKLYWC WSDRTHANGS AVANGGCQQL KSIHLSDERN IHFGNIQQIN HTVDQTRPGA RAALSGANVL HRYVILKSEP IPQQPMGIIA ILQDDLTCRA YCNSRRCKKT FRCRDGTCIG CERTSLCYAP PMSWTCDKED GDGSDEAAHC AGCLYNSTCD PSEFRCANGR FLCSSGRCVA KIGFKCRCRP GYAPRGGDPH VALDFDYREQ DWVGGNLYWC YWTDWGDHSL EDYIEFASLD TTGTNKTLLI GGGHKCACPT GDHSDEPPDC IHVCLPSQFK CSITKRCIPR ARWKCDGEDD DCGDNSDEES
900 950 i000 1050 ii00 1150 1200 1250 1300 1350 1400 1450 1500 1550 1600 1650 1700 1750 1800 1850 1900 1950 2OO0 2050 2100 2150 2200 2250 2300 2350 2400 2450 250O 2550 2600 2650 2700 2750 2800 2850 2900 2950 3000 3050 3100 3150 3200 3250 3300 3350 3400 3450 3500 3550
CTPRPCSESE CKSGHCIPLR AWKCDGEDDC DNCGDGTDEE GSDEEDCSID PGCQDINECL LYIADDNEIR HTGTISYRSL
FSCANGRCIA WRCDADADCM GDNSDENPEE DCEPPTAHTT PKLTSCATNA RFGTCSQLCN SLFPGHPHSA PPAAPPTTSN
GNVYWTDSGR WGNHPKIETA GSIRLNGTDP HSPLVNLTGG TCPNGKRLDN CRCQPRYTGD QVCAGYCANN MAADGSRQCR VAPSCLTCVG HIASILIPLL NPTYKMYEGG SLASTDEKRE
DVIEVAQMKG AMDGTLRETL IVAADSKRGL LSHASDVVLY GTCVPVPSPT KCELDQCWEH STCTVNQGNQ CTAYFEGSRC HCSNGGSCTM LLLLLVLVAG EPDDVGGLLD LLGRGPEDEI
GRWKCDGDHD DGSDEEACGT CARFVCPPNR HCKDKKEFLC SICGDEARCV NTKGGHLCSC YEQAFQGDES RHRR QIDRGV ENRKTLISGM VQDNIQWPTG SHPFSIDVFE HQHKQPEVTN PPPDAPRPGT CRNGGTCAAS PQCRCLPGFL EVNKCSRCLE NSKMMPECQC VVFWYKRRVQ ADFALDPDKP GDPLA
CADGSDEKDC GVRTCPLDEF PFRCKNDRVC RNQRCLSSSL RTEKAAYCAC ARNFMKTHNT VRIDAMDVHV
TPRCDMDQFQ QCNNTLCKPL LWIGRQCDGT RCNMFDDCGD RSGFHTVPGQ CKAEGSEYQV KAGRVYWTNW
3600 3650 3700 3750 3800 3850 3900
THLNISGLKM IDEPHAIVVD LAVDYHNERL DYIYGVTYIN PCDRKKCEWL CNLQCFNGGS PSGMPTCRCP GDRCQYRQCS GACVVNKQSG PPHMTGPRCE GAKGFQHQRM TNFTNPVYAT
PRGIAIDWVA PLRGTMYWSD YWADAKLSVI NRVFKIHKFG CLLSPSGPVC CFLNARRQPK TGFTGPKCTQ GYCENFGTCQ DVTCNCTDGR EHVFSQQQPG TNGAMNVEIG LYMGGHGSRH
3950 4000 4050 4100 4150 4200 4250 4300 4350 4400 4450 4500 4525
References
1 Moestrup, S.K. et al. (1992) Cell Tissue Res. 269, 375-382.
2 Kristensen, T. et al. (1990) FEBS Lett. 276, 151-155.
3 Herz, J. et al. (1988) EMBO J. 7, 4119-4127.
4 van Leuven, F. et al. (1993) Biochim. Biophys. Acta 1173, 71-74.
5 Nimpf, P. et al. (1994) J. Biol. Chem. 269, 212-219.
6 Williams, S.E. et al. (1994) Ann. N.Y. Acad. Sci. 737, 1-13.
7 Williams, S.E. et al. (1992) J. Biol. Chem. 267, 9035-9040.
CDw92
Molecular weights
SDS-PAGE
reduced 70 kDa
unreduced 70 kDa
Tissue distribution CDw92 is expressed primarily by myeloid cells. Expression is stronger on monocytes than on granulocytes. Lymphocytes, endothelial cells, epithelial cells and fibroblasts are weakly positive for CDw92 1.
Structure Unknown.
Function Unknown.
Reference 1 Majdic, O. et al. (1995) Leucocyte Typing V, 984-985.
CD93
Molecular weights
SDS-PAGE
reduced 120 kDa
unreduced 110 kDa
Tissue distribution CD93 is expressed on granulocytes, monocytes and endothelial cells, but is absent from lymphocytes, red blood cells and platelets 1.
Structure The structure of CD93 is unknown. However, treatment of KG1a cells (CD93+) with O-sialoglycoprotease, which selectively cleaves O-sialylated peptides, reduces the binding of all four anti-CD93 mAbs by more than 90%. This suggests that CD93 is an O-sialoglycoprotein 1. The anti-CD93 mAbs do not depend on sialic acid for their reactivity, since neuraminidase treatment of KG1a cells does not abolish their binding.
Ligands and associated molecules Unknown.
Function Unknown.
Reference 1 Mai, I. et al. (1995) Leucocyte Typing V, 986-988.
CD94
Kp43
Molecular weights
Polypeptide 20 497
SDS-PAGE 1
non-reduced 70 kDa (CD94/NKG2 heterodimer)
reduced 30 kDa (plus ~40-43 kDa NKG2 subunit)
Carbohydrate
N-linked sites 2
O-linked none
Human gene location 12p12.3-p13.1
[Domain diagram]
Tissue distribution Expressed on most freshly isolated NK cells, but with wide variation in levels of expression 2,3. Also present on a subset of γδ and αβ T cells 4.
Structure CD94 is a type II transmembrane glycoprotein with a C-type lectin domain in its extracellular portion and a very short cytoplasmic domain. CD94 is structurally related to several other molecules with C-type lectin domains (Ly-49, CD161, NKG2 and CD69), which are encoded within the NK gene complex (human chromosome 12/mouse chromosome 6) and which all contribute to NK cell function 5. Indeed, CD94 is expressed as a disulfide-linked heterodimer with a 40-43 kDa NKG2 subunit 1,6. Based on transfection studies, CD94 may also be expressed as a disulfide-linked homodimer 7.
Ligands and associated molecules Functional studies suggest that CD94/NKG2 heterodimers may be involved in recognition of HLA-A, -B and -C ligands, but there is currently no evidence for direct binding 1,6.
Function CD94/NKG2 receptors play a role in recognition of MHC Class I molecules by NK cells and some cytotoxic T cells 1,6,8. The ligation of CD94 on NK cells inhibits killing of target cells which express suitable MHC Class I molecule ligands. However, CD94 ligation appears to activate some NK cells, suggesting that there may be stimulatory and inhibitory forms of CD94 8, probably due to the association of CD94 with different NKG2 subunits 6.
Database accession numbers
        PIR      SWISSPROT   EMBL/GENBANK   REFERENCE
Human                        U30610         7
Amino acid sequence of human CD94
MAVFKTTLWR LISGTLGIIC LSLMATLGIL LKNSFTKLSI EPAFTPGPNI  50
ELQKDSDCCS CQEKWVGYRC NCYFISSEQK TWNESRHLCA SQKSSLLQLQ  100
NTDELDFMSS SQQFYWIGLS YSEEHTAWLW ENGSALSQYL FPSFETFNTK  150
NCIAYNPNGN ALDESCEDKN RYICKQQLI  179
References
1 Phillips, J.H. et al. (1996) Immunity 5, 163-172.
2 Aramburu, J. et al. (1990) J. Immunol. 144, 3238-3247.
3 Moretta, A. et al. (1996) Annu. Rev. Immunol. 14, 619-648.
4 Mingari, M.C. et al. (1995) Int. Immunol. 7, 697-703.
5 Gumperz, J.E. and Parham, P. (1995) Nature 378, 245-248.
6 Lazetic, S. et al. (1996) J. Immunol. 157, 4741-4745.
7 Chang, C. et al. (1995) Eur. J. Immunol. 25, 2433-2437.
8 Pérez-Villar, J.J. et al. (1995) J. Immunol. 154, 5779-5788.
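The "N-linked sites" counts given under Carbohydrate in these entries can be cross-checked against the printed sequences. The sketch below is an illustration only, not the method used by the authors: it assumes the usual consensus motif for potential N-linked glycosylation (Asn-X-Ser/Thr, where X is any residue except Pro) and scans the whole sequence without regard to membrane topology. The function name `n_linked_sequons` is hypothetical.

```python
import re

# Assumed consensus motif for potential N-linked glycosylation:
# Asn-X-Ser/Thr with X != Pro. The lookahead lets overlapping motifs
# (e.g. "NNTS") be reported individually.
SEQUON = re.compile(r"N(?=[^P][ST])")


def n_linked_sequons(sequence: str) -> list[int]:
    """Return 1-based positions of Asn residues that start an N-X-S/T motif."""
    residues = "".join(sequence.split()).upper()  # drop spaces and line breaks
    return [match.start() + 1 for match in SEQUON.finditer(residues)]


# Human CD94 sequence as printed in this entry (179 residues).
CD94 = """
MAVFKTTLWR LISGTLGIIC LSLMATLGIL LKNSFTKLSI EPAFTPGPNI
ELQKDSDCCS CQEKWVGYRC NCYFISSEQK TWNESRHLCA SQKSSLLQLQ
NTDELDFMSS SQQFYWIGLS YSEEHTAWLW ENGSALSQYL FPSFETFNTK
NCIAYNPNGN ALDESCEDKN RYICKQQLI
"""

positions = n_linked_sequons(CD94)
print(len(positions), positions)
```

For the CD94 sequence as printed, the scan reports two motifs, both in the extracellular portion, in line with the two N-linked sites listed for this entry. For proteins with a signal sequence or with motifs in cytoplasmic regions, the raw count may exceed the number of usable sites, so the result is best read as an upper bound.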
CD95
Fas, Apo-1
Molecular weights
Polypeptide 36 023
SDS-PAGE
reduced 43 kDa
Carbohydrate
N-linked sites 2
O-linked unknown
Human gene location 10q24.1
[Domain diagram]
Tissue distribution CD95 is expressed by activated lymphocytes, monocytes, neutrophils, fibroblasts and cell lines 1,2. Mouse CD95 mRNA is also found in liver, heart, lung and ovary 1.
Structure Fas is a member of the TNFR superfamily and contains three cysteine-rich repeats 1,2. The cytoplasmic sequence contains a "death domain" motif which has similarity to the cytoplasmic domain of tumour necrosis factor receptor 1 (refs 2,3).
Ligands and associated molecules The extracellular region of CD95 binds to FasL. Intracellular molecules isolated by the yeast two-hybrid technique as interacting with the CD95 cytoplasmic domain contain a "death domain" and include MORT/FADD 2-4. FasL binding to CD95 recruits an ICE (interleukin-1β-converting enzyme)-related protease via its association with MORT 5. The C-terminal 15 amino acids of CD95 associate with a protein tyrosine phosphatase 2-4.
Function FasL binding to CD95 induces apoptosis in activated mature lymphocytes and thus has a role in maintaining peripheral tolerance, but does not appear critical in development 4,6. Autoimmune disease in the lpr mouse and in humans is associated with mutations in CD95 3,4,6. The mechanism of killing by CD95 is thought to involve ICE and ICE-like proteases 4. FasL on cytotoxic T cells can induce cytolysis of target cells expressing CD95 2,6,7.
Database accession numbers
        PIR      SWISSPROT   EMBL/GENBANK   REFERENCE
Human   A40036   P25445      M67454         8
Mouse   A46484   P25446      M83649         9
Amino acid sequence of human CD95
MLGIWTLLPL VLTSVA  -1
RLSSKSVNAQ VTDINSKGLE LRKTVTTVET QNLEGLHHDG QFCHKPCPPG  50
ERKARDCTVN GDEPDCVPCQ EGKEYTDKAH FSSKCRRCRL CDEGHGLEVE  100
INCTRTQNTK CRCKPNFFCN STVCEHCDPC TKCEHGIIKE CTLTSNTKCK  150
EEGSRSNLGW LCLLLLPIPL IVWVKRKEVQ KTCRKHRKEN QGSHESPTLN  200
PETVAINLSD VDLSKYITTI AGVMTLSQVK GFVRKNGVNE AKIDEIKNDN  250
VQDTAEQKVQ LLRNWHQLHG KKEAYDTLIK DLKKANLCTL AEKIQTIILK  300
DITSDSENSN FRNEIQSLV  319
References
1 Gruss, H.-J. and Dower, S.K. (1995) Blood 85, 3378-3404.
2 Nagata, S. and Golstein, P. (1995) Science 267, 1449-1456.
3 Cleveland, J.L. and Ihle, J.N. (1995) Cell 81, 479-482.
4 van Parijs, L. and Abbas, A.K. (1996) Curr. Opin. Immunol. 8, 355-361.
5 Fraser, A. and Evan, G. (1996) Cell 85, 781-784.
6 Lynch, D.H. et al. (1995) Immunol. Today 16, 569-574.
7 Takayama, H. et al. (1995) Adv. Immunol. 60, 289-321.
8 Itoh, N. et al. (1991) Cell 66, 233-243.
9 Watanabe-Fukunaga, R. et al. (1992) J. Immunol. 148, 1274-1279.
CD96
Tactile
Molecular weights
Polypeptide 61 258
SDS-PAGE
reduced 160 kDa
unreduced 160, 180, 240 kDa
Carbohydrate
N-linked sites 15
O-linked probable ++++
[Domain diagram]
Tissue distribution CD96 is expressed at low levels on peripheral T cells and NK cells; expression on both is increased on activation and reaches a maximum at 6-9 days 1,2. CD96 is not expressed on B cells or monocytes 1,2. CD96 is expressed on T cell lines 1,2.
Structure The extracellular domain is made up of three IgSF domains followed by a membrane-proximal serine/threonine/proline-rich stalk region which is probably O-glycosylated. The N-terminal IgSF domain contains three Cys residues in addition to the pair predicted to form the intersheet disulfide. The cytoplasmic domain has a basic/proline-rich region. Some CD96 may exist as a homodimer 1,2.
Function Unknown.
Database accession numbers
        PIR      SWISSPROT   EMBL/GENBANK   REFERENCE
Human   A46462   P40200      M88282         1
Amino acid sequence of human CD96
MEKKWKYCAV YYIIQIHFVK G  -1
VWEKTVNTEE NVYATLGSDV NLTCQTQTVG FFVQMQWSKV TNKIDLIAVY  50
HPQYGFYCAY GRPCESLVTF TETPENGSKW TLHLRNMSCS VSGRYECMLV  100
LYPEGIQTKI YNLLIQTHVT ADEWNSNHTI EIEINQTLEI PCFQNSSSKI  150
SSEFTYAWSV EDNGTQETLI SQNHLISNST LLKDRVKLGT DYRLHLSPVQ  200
IFDDGRKFSC HIRVGPNKIL RSSTTVKVFA KPEIPVIVEN NSTDVLVERR  250
FTCLLKNVFP KANITWFIDG SFLHDEKEGI YITNEERKGK DGFLELKSVL  300
TRVHSNKPAQ SDNLTIWCMA LSPVPGNKVW NISSEKITFL LGSEISSTDP  350
PLSVTESTLD TQPSPASSVS PARYPATSSV TLVDVSALRP NTTPQPSNSS  400
MTTRGFNYPW TSSGTDTKKS VSRIPSETYS SSPSGAGSTL HDNVFTSTAR  450
AFSEVPTTAN GSTKTNHVHI TGIVVNKPKD GMSWPVIVAA LLFCCMILFG  500
LGVRKWCQYQ KEIMERPPPF KPPPPPIKYT CIQEPNESDL PYHEMETL  548
References
1 Wang, P.L. et al. (1992) J. Immunol. 148, 2600-2608.
2 Wang, P.L. and Krensky, A.M. (1995) In Leucocyte Typing V (Schlossman, S.F. et al., ed.) Oxford University Press, Oxford, pp. 1149-1150.
CD97
Molecular weights
Polypeptide 79 665
SDS-PAGE
reduced 78-85 kDa
unreduced 78-85 kDa
Carbohydrate
N-linked sites 8
O-linked unknown
Human gene location and size 19p13.12-p13.2; 12 kb 1
Tissue distribution CD97 is constitutively expressed on granulocytes and monocytes, and at low levels on resting T cells and B cells 2,3. Expression is rapidly upregulated following activation of T cells and B cells 2,3.
Structure CD97 has seven potential membrane-spanning domains and three extracellular EGF domains 4. There is a single RGD sequence. CD97 shows sequence similarity to EMR1 and F4/80 and is a member of the EGF-TM7 family 5. The seven-span transmembrane region also shows approximately 25% sequence identity to members of the recently described glucagon/vasoactive intestinal peptide/calcitonin receptor family 6.
Ligands and associated molecules COS cells expressing CD97 adhere to lymphocytes and erythrocytes. This interaction is blocked by mAbs specific for CD55, a proposed cellular ligand for CD97 7. Immunoprecipitation studies have identified a 58 kDa protein non-covalently associated with CD97 on the cell surface 2.
Function Unknown.
Database accession numbers
        PIR      SWISSPROT   EMBL/GENBANK   REFERENCE
Human   S54875   P48960      X84700         4
Amino acid sequence of human CD97
MGGRVFLAFC VWLTLPGAET  -1
QDSRGCARWC PQNSSCVNAT ACRCNPGFSS FSEIITTPTE TCDDINECAT  50
PSKVSCGKFS DCWNTEGSYD CVCSPGYEPV SGAKTFKNES ENTCQDVDEC  100
SSGQHQCDSS TVCFNTVGSY SCRCRPGWKP RHGIPNNQKD TVCEDMTFST  150
WTPPPGVHSQ TLSRFFDKVQ DLGRDSKTSS AEVTIQNVIK LVDELMEAPG  200
DVEALAPPVR HLIATQLLSN LEDIMRILAK SLPKGPFTYI SPSNTELTLM  250
IQERGDKNVT MGQSSARMKL NWAVAAGAED PGPAVAGILS IQNMTTLLAN  300
ASLNLHSKKQ AELEEIYESS IRGVQLRRLS AVNSIFLSHN NTKELNSPIL  350
FAFSHLESSD GEAGRDPPAK DVMPGPRQEL LCAFWKSDSD RGGHWATEVC  400
QVLGSKNGST TCQCSHLSSF TILMAHYDVE DWKLTLITRV GLALSLFCLL  450
LCILTFLLVR PIQGSRTTIH LHLCICLFVG STIFLAGIEN EGGQVGLRCR  500
LVAGLLHYCF LAAFCWMSLE GLELYFLVVR VFQGQGLSTR WLCLIGYGVP  550
LLIVGVSAAI YSKGYGRPRY CWLDFEQGFL WSFLGPVTFI ILCNAVIFVT  600
TVWKLTQKFS EINPDMKKLK KARALTITAI AQLFLLGCTW VFGLFIFDDR  650
SLVLTYVFTI LNCLQGAFLY LLHCLLNKKV REEYRKWACL VAGGSKYSEF  700
TSTTSGTGHN QTRALRASES GI  722
References
1 Hamann, J. et al. (1996) Genomics 32, 144-147.
2 Eichler, W. et al. (1994) Scand. J. Immunol. 39, 111-115.
3 Pickl, W.F. et al. (1995) Leucocyte Typing V, 1151-1153.
4 Hamann, J. et al. (1995) J. Immunol. 155, 1942-1950.
5 McKnight, A.J. and Gordon, S. (1996) Immunol. Today 17, 283-287.
6 Segre, G.V. and Goldring, S.R. (1993) Trends Endocrinol. Metab. 4, 309-314.
7 Hamann, J. et al. (1996) J. Exp. Med. 184, 1185-1189.
Note added in proof
The isolation of further human CD97 cDNA clones indicates that there are five extracellular EGF domains (GENBANK accession number U76764), which may be alternatively spliced to generate the form of the molecule shown here (Gray, J.X. et al. (1996) J. Immunol. 157, 5438-5447). Further, the CD97 antigen is shown to be processed into two non-covalently associated subunits (CD97 α and β) prior to expression on the cell surface.
CD98
4F2, FRP-1, RL-388 (mouse)
Molecular weights
Polypeptides 57 039
SDS-PAGE
reduced 80 kDa, 40 kDa
unreduced 120 kDa
Carbohydrate
80 kDa chain: N-linked sites 4, O-linked unknown
40 kDa chain: non-glycosylated
Human gene location and size 11q; 8 kb 1
Tissue distribution CD98 is expressed at high levels on monocytes but at very low levels on peripheral blood T and B cells, splenocytes, NK cells and granulocytes 2. Con A- and alloantigen-activated lymphocytes and activated NK cells express high levels of this antigen 2. It is also expressed on some non-haematopoietic cells 2. All human cell lines studied express CD98 2. Studies in the mouse show that CD98 is expressed at the beginning of haematopoiesis 3.
Structure CD98 is a heterodimer consisting of a disulfide-linked glycosylated heavy chain (80-90 kDa) and a non-glycosylated light chain (40 kDa). The heavy chain is a type II integral membrane protein containing a 73 amino acid N-terminal cytoplasmic domain, a single transmembrane sequence and a large extracellular domain 4. The N-terminus has been determined by amino acid sequencing 5.
Ligands and associated molecules Unknown.
Function A role in cell growth and death for CD98 is suggested by studies with mAbs 3. CD98 mAb inhibits lectin-induced mitogenesis of human T cells by 50%, but does not inhibit the mixed lymphocyte reaction (MLR), antibody-dependent cellular cytotoxicity, T cell-mediated cytotoxicity or NK cell activity 2,3. CD98 mAb had inhibitory effects on early haematopoietic progenitors 3. CD98 mAbs induce cell fusion 5, homotypic aggregation and tyrosine phosphorylation 3. Experiments in Xenopus oocytes suggesting that CD98 is an amino acid transporter are controversial 3,7.
Database accession numbers
        PIR      SWISSPROT   EMBL/GENBANK   REFERENCE
Human   A28455   P08195      J02939         4
Mouse   S03600   P10852      X14309         6
Rat                          X89225         7
Amino acid sequence of human CD98 heavy chain
MKEVELNELE PEKQPMNAAS GAAMSLAGAE KNGLVKIKVA EDEAEAAAAA  50
KFTGLSKEEL LKVAGSPGWV RTRWALLLLF WLGWLGMLAG AVVIIVRAPR  100
CRELPAQKWW HTGALYRIGD LQAFQGHGAG NLAGLKGRLD YLSSLKVKGL  150
VLGPIHKNQK DDVAQTDLLQ IDPNFGSKED FDSLLQSAKK KSIRVILDLT  200
PNYRGENSWF STQVDTVATK VKDALEFWLQ AGVDGFQVRD IENLKDASSF  250
LAEWQNITKG FSEDRLLIAG TNSSDLQQIL SLLESNKDLL LTSSYLSDSG  300
STGEHTKSLV TQYLNATGNR WCSWSLSQAR LLTSFLPAQL LRLYQLMLFT  350
LPGTPVFSYG DEIGLDAAAL PGQPMEAPVM LWDESSFPDI PGAVSANMTV  400
KGQSEDPGSL LSLFRRLSDQ RSKERSLLHG DFHAFSAGPG LFSYIRHWDQ  450
NERFLVVLNF GDVGLSAGLQ ASDLPASASL PAKADLLLST QPGREEGSPL  500
ELERLKLEPH EGLLLRFPYA A  521
References
1 Gottesdiener, K.M. et al. (1988) Mol. Cell. Biol. 8, 3809-3819.
2 Haynes, B.F. et al. (1981) J. Immunol. 126, 1409-1414.
3 Warren, A.P. et al. (1996) Blood 87, 3676-3687.
4 Quackenbush, E. et al. (1987) Proc. Natl Acad. Sci. USA 84, 6526-6530.
5 Ohgimoto, S. et al. (1995) J. Immunol. 155, 3585-3592.
6 Parmacek, M.S. et al. (1989) Nucleic Acids Res. 17, 1915-1931.
7 Broer, S. et al. (1995) Biochem. J. 312, 863-870.
CD99
MIC2, E2, 12E7, HuLy-m6, FMC29
Molecular weights
Polypeptide 16 728
SDS-PAGE
reduced 32 kDa
unreduced 32 kDa
Carbohydrate
N-linked sites nil
O-linked + abundant
Human gene location and size
MIC2X: Xp22.32-pter
MIC2Y: Yp11.2-pter; 52 kb 1
[Domain diagram with exon boundaries]
Tissue distribution CD99 is expressed on all leucocyte lineages 2. Expression is highest on thymocytes and falls on maturation 2. Expression on CD4+ and CD8+ T cells is bimodal and, within the CD4+ population, expression is higher on the CD45RA- subset 2. NK cells and monocytes express high levels of CD99 2. Expression of CD99 and Xga appears to be linked 3. CD99 is highly expressed at the surface of Xg(a+) red blood cells and shows low expression on Xg(a-) red blood cells 3. CD99 expression is high on Ewing's tumours 2.
Structure The extracellular domain of CD99 contains five Gly-X-Y repeats, such as are found in collagen and collagen-like proteins 5, and also contains three repeats of 16 amino acids 4. CD99 has a high Pro content 4,5 and is highly glycosylated with O-linked sugars 6. The N-terminus has been established by protein sequencing 6. CD99 shares 48% amino acid identity (including conservative changes) with the Xga antigen, originally defined as a blood group polymorphism 3. The CD99 gene locus is in the pairing region of the human X and Y chromosomes and was the first pseudoautosomal gene to be described in man 1,3. CD99X escapes X inactivation 1,3. The CD99 gene is closely linked (within 10 kb) to the Xga gene 3.
Function CD99 on thymocytes and T cells is involved in rosette formation with sheep or human erythrocytes 5,7. CD99 mAbs induce homotypic adhesion of CD4+CD8+ thymocytes 7.
Database accession numbers
        PIR      SWISSPROT   EMBL/GENBANK   REFERENCE
Human   S06786   P14209      X16996         5
Amino acid sequence of human CD99
MARGAALALL LFGLLGVLVA AP  -1
DGGFDLSDAL PDNENKKPTA IPKKPSAGDD FDLGDAVVDG ENDDPRPPNP  50
PKPMPNPNPN HPSSSGSFSD ADLADGVSGG EGKGGSDGGG SHRKEGEEAD  100
APGVIPGIVG AVVVAVAGAI SSFIAYQKKK LCFKENAEQG EVDMESHRNA  150
NAEPAVQRTL LEK  163
References
1 Smith, M.J. et al. (1993) Hum. Mol. Genet. 2, 417-422.
2 Tippett, P. (1995) Immunol. Invest. 24, 173-186.
3 Ellis, N.A. et al. (1994) Nature Genet. 8, 285-289.
4 Sandrin, M.S. et al. (1992) Immunogenetics 35, 283-285.
5 Gelin, C. et al. (1989) EMBO J. 11, 3253-3259.
6 Aubrit, F. et al. (1989) Eur. J. Immunol. 19, 1431-1436.
7 Bernard, G. et al. (1995) J. Immunol. 154, 26-32.
CD100
Molecular weights
Polypeptide 93 996
SDS-PAGE
reduced 150 kDa
unreduced 300 kDa
Carbohydrate
N-linked sites 9
O-linked + probable
[Domain diagram: sema domain, IgSF domain]
Tissue distribution CD100 is expressed by B, T and NK cells and most myeloid cells, but is absent from bone marrow, erythrocytes, eosinophils and endothelial cells. The level of expression is low on resting cells but increases following activation 1,2. CD100 mRNA is expressed in most tissues 1.
Structure The CD100 molecule is a type I membrane glycoprotein that is a member of the semaphorin (sema) superfamily. CD100 consists of an N-terminal sema domain, followed by a C2-set IgSF domain, a 104 amino acid stalk, a hydrophobic transmembrane region and a 110 amino acid C-terminal cytoplasmic tail. The cytoplasmic region has potential sites for tyrosine and serine phosphorylation 1. CD100 exists at the cell surface as a disulfide-linked homodimer. The cysteine residues involved in dimerization are not known 2.
Function CD100 is thought to play a role in lymphocyte activation. Cells transfected with CD100 modify B cell signalling through CD40 by downregulating CD23 expression and by augmenting B cell aggregation and survival 1. Antibody crosslinking of CD100 on T cells increases both CD2- and CD3-induced T cell proliferation 3. The identification of CD100 as a semaphorin suggests a possible role in neuronal guidance 1.
Database accession numbers
        PIR      SWISSPROT   EMBL/GENBANK   REFERENCE
Human                        U60800         1
Amino acid sequence of human CD100
MRMCTPIRGL LMALAVMFGT A  -1
MAFAPIPRIT WEHREVHLVQ FHEPDIYNYS ALLLSEDKDT LYIGAREAVF  50
AVNALNISEK QHEVYWKVSE DKKAKCAEKG KSKQTECLNY IRVLQPLSAT  100
SLYVCGTNAF QPACDHLNLT SFKFLGKNED GKGRCPFDPA HSYTSVMVDG  150
ELYSGTSYNF LGSEPIISRN SSHSPLRTEY AIPWLNEPSF VFADVIRKSP  200
DSPDGEDDRV YFFFTEVSVE YEFVFRVLIP RIARVCKGDQ GGLRTLQKKW  250
TSFLKARLIC SRPDSGLVFN VLRDVFVLRS PGLKVPVFYA LFTPQLNNVG  300
LSAVCAYNLS TAEEVFSHGK YMQSTTVEQS HTKWVRYNGP VPKPRPGACI  350
DSEARAANYT SSLNLPDKTL QFVKDHPLMD DSVTPIDNRP RLIKKDVNYT  400
QIVVDRTQAL DGTVYDVMFV STDRGALHKA ISLEHAVHII EETQLFQDFE  450
PVQTLLLSSK KGNRFVYAGS NSGVVQAPLA FCGKHGTCED CVLARDPYCA  500
WSPPTATCVA LHQTESPSRG LIQEMSGDAS VCPDKSKGSY RQHFFKHGGT  550
AELKCSQKSN LARVFWKFQN GVLKAESPKY GLMGRKNLLI FNLSEGDSGV  600
YQCLSEERVK NKTVFQVVAK HVLEVKVVPK PVVAPTLSVV QTEGSRIATK  650
VLVASTQGSS PPTPAVQATS SGAITLPPKP APTGTSCEPK IVINTVPQLH  700
SEKTMYLKSS DNRLLMSLFL FFFVLFLCLF FYNCYKGYLP RQCLKFRSAL  750
LIGKKKPKSD FCDREQSLKE TLVEPGSFSQ QNGEHPKPAL DTGYETEQDT  800
ITSKVPTDRE DSQRIDDLSA RDKPFDVKCE LKFADSDADG DI  842
References
1 Hall, K.T. et al. (1996) Proc. Natl Acad. Sci. USA 93, 11780-11785.
2 Herold, C. et al. (1995) Leucocyte Typing V, 288-289.
3 Herold, C. et al. (1995) Int. Immunol. 7, 1-8.
CD101
V7, p126
Molecular weights
Polypeptide 115 042
SDS-PAGE
reduced 126 kDa
unreduced 200 kDa
Carbohydrate
N-linked sites 7
O-linked nil
Human gene location 1p13
[Domain diagram: seven V-set IgSF domains, transmembrane and cytoplasmic regions]
Tissue distribution CD101 is expressed on monocytes, granulocytes, mucosal T cells and activated peripheral blood T cells. Expression is weak on resting T, B and NK cells, absent from platelets and weak or absent from most haematopoietic cell lines 1-3. Northern blot analyses show CD101 mRNA to be present in PBLs, thymus, spleen, lung and small intestine, but not in other tissues 4.
Structure The extracellular portion of the molecule contains seven V-set IgSF domains. The C-terminal cytoplasmic region is relatively short 4. CD101 exists at the cell surface as a disulfide-linked homodimer. The cysteine residues involved are not known 1,3.
Function CD101 is thought to play a co-stimulatory role in TCR/CD3-mediated T cell activation 1-3. The anti-CD101 mAb V7.1 inhibits the proliferative response of CD4+ and CD8+ T cells in vitro to alloantigens and immobilized CD3 mAbs, but not to mitogenic lectins 2. Other CD101 mAbs are co-stimulatory for intraepithelial T lymphocytes when administered in vitro with suboptimal doses of CD3 mAb 3.
Database accession numbers
        PIR      SWISSPROT   EMBL/GENBANK   REFERENCE
Human                        Z33642         4
Amino acid sequence of human CD101
MAGISYVASF FLLLTKLSIG  -1
QREVTVQKGP LFRAEGYPVS IGCNVTGHQG PSEQHFQWSV YLPTNPTQEV  50
QIISTKDAAF SYAVYTQRVR GGDVYVERVQ GNSVLLHISK LQMKDAGEYE  100
CHTPNTDENY YGSYRAKTNL IVIPDTLSAT MSSQTLGKEE GEPLALTCEA  150
SKATAQHTHL SVTWYLTQDG GGSQATEIIS LSKDFILVPG PLYTERFAAS  200
DVQLNKLGPT TFRLSIERLQ SSDQGQLFCE ATEWIQDPDE TWMFITKKQT  250
DQTTLRIQPA VKDFQVNITA DSLFAEGKPL ELVCLVVSSG RDPQLQGIWF  300
FNGTEIAHID AGGVLGLKND YKERASQGEL QLSKLGPKAF SLKIFSLGPE  350
DEGAYRCVVA EVMKTRTGSW QVLQRKQSPD SHVHLRKPAA RSVVVSTKNK  400
QQVVWEGETL AFLCKAGGAE SPLSVSWWHI PRDQTQPEFV AGMGQDGIVQ  450
LGALLWGTSY HGNTRLEKMD WATFQLEITF TAITDSGTYE CRVSEKSRNQ  500
ARDLSWTQKI SVTVKSLESS LQVSLMSRQP QVMLTNTFDL SCVVRAGYSD  550
LKVPLTVTWQ FQPASSHIFH QLIRITHNGT IEWGNFLSRF QKKTKVSQSL  600
FRSQLLVHDA TEQETGVYQC EVEVYDRNSL YNNPPPRASA ISHPLRIAVT  650
LPESKLKVNS RSQGQELSIN SNTDIECSIL SRSNGNLQLA IIWYFSPVST  700
NASWLKILEM DQTNVIKTGD EFHTPQRKQK FHTEKVSQDL FQLHILNVED  750
SDRGKYHCAV EEWLLSTNGT WHKLGEKKSG LTELKLKPTG SKVRVSKVYW  800
TENVTEHREV AIRCSLESVG SSATLYSVMW YWNRENSGSK LLVHLQHDGL  850
LEYGEEGLRG HLHCYRSSST DFVLKLHQVE MEDAGMYWCR VAEWQLHGHP  900
SKWINKHPMS HSGWCSPCCL QSPRFLPGSA PRPPLLYFLF ICPFVLLLLL  950
LISLLCLYWK ARKLSTLRSN TRKEKALWVD LKEAGGVTTN RREDEEEDEG  1000
N  1001
References
1 Gouttefangeas, C. et al. (1994) Int. Immunol. 6, 423-430.
2 Rivas, A. et al. (1995) J. Immunol. 154, 4423-4433.
3 Russell, G.J. et al. (1996) J. Immunol. 157, 3366-3374.
4 Ruegg, C.L. et al. (1995) J. Immunol. 154, 4434-4443.
CD102
ICAM-2
Molecular weights
Polypeptide 28 393
SDS-PAGE
reduced 55-65 kDa
unreduced 54-68 kDa
Carbohydrate
N-linked sites 6
O-linked unknown
Human gene location 17q23-q25 1
[Domain diagram: two C2-set IgSF domains, transmembrane and cytoplasmic regions]
Tissue distribution CD102 is broadly expressed on leucocytes (with the exception of neutrophils) and is expressed constitutively and at high levels on vascular endothelium 2. Expression is more restricted than that of CD54 (ICAM-1) and, unlike CD54, there is little or no induction of CD102 on lymphocytes and endothelial cells by inflammatory mediators 2.
Structure CD102 is a highly glycosylated cell surface protein. The extracellular portion contains two IgSF C2-set domains and shares ~36% sequence identity with the two membrane-distal IgSF domains of CD54 (ICAM-1) and CD50 (ICAM-3).
Ligands and associated molecules CD102 is a ligand for the leucocyte integrin CD11a/CD18 (LFA-1) 3, like the related proteins CD54 (ICAM-1) and CD50 (ICAM-3). CD102 has also been reported to bind CD11b/CD18 (Mac-1) 4.
Function CD102 is the major LFA-1 ligand on endothelial cells that have not been activated by inflammatory stimuli, suggesting a possible role in lymphocyte recirculation 2. Recombinant CD102 can co-stimulate T cells in vitro, suggesting a role in T cell activation 5.
Database accession numbers
        PIR      SWISSPROT   EMBL/GENBANK   REFERENCE
Human   S03967   P13598      X15606         3
Mouse   A46510   P35330      X65493         6
Amino acid sequence of human CD102
MSSFGYRTLT VALFTLICCP G  -1
SDEKVFEVHV RPKKLAVEPK GSLEVNCSTT CNQPEVGGLE TSLNKILLDE  50
QAQWKHYLVS NISHDTVLQC HFTCSGKQES MNSNVSVYQP PRQVILTLQP  100
TLVAVGKSFT IECRVPTVEP LDSLTLFLFR GNETLHYETF GKAAPAPQEA  150
TATFNSTADR EDGHRNFSCL AVLDLMSRGG NIFHKHSAPK MLEIYEPVSD  200
SQMVIIVTVV SVLLSLFVTS VLLCFIFGQH LRQQRMGTYG VRAAWRRLPQ  250
AFRP  254
References
1 Sansom, D. et al. (1991) Genomics 11, 462-464.
2 de Fougerolles, A.R. et al. (1991) J. Exp. Med. 174, 253-267.
3 Staunton, D.E. et al. (1989) Nature 339, 61-64.
4 Xie, J. et al. (1995) J. Immunol. 155, 3619-3628.
5 Damle, N.K. et al. (1992) J. Immunol. 148, 665-671.
6 Xu, H. et al. (1992) J. Immunol. 149, 2650-2655.
CD103
Integrin αE subunit, HML-1 antigen
Molecular weights
Polypeptide 127 746
SDS-PAGE
reduced 150 + 25 kDa
unreduced 175 kDa
Carbohydrate
N-linked sites 11
O-linked sites unknown
[Diagram: CD103 (αE)/β7]
Tissue distribution αEβ7 is expressed primarily on intraepithelial lymphocytes and on 1-2% of peripheral blood lymphocytes 1. Its level of expression can be upregulated by lymphocyte mitogens 2. CD103 mRNA is found in the thymus and spleen as well as in tissues such as lung, small intestine and colon, which contain large populations of intraepithelial lymphocytes. It is also expressed in the prostate and ovary and in tissues such as the pancreas and testis, where β7 mRNA transcripts are not detectable 3.
Structure CD103 is the integrin αE subunit (E for epithelial-associated), which usually forms a heterodimer with the integrin β7 subunit. It belongs to the subgroup of integrin α subunits with an I-domain. Unlike the other members of this group, it is cleaved into two chains, an N-terminal chain of 25 kDa and a C-terminal chain of 150 kDa, which remain linked by disulfide bonds. An extra segment of 55 residues (X-domain) is found immediately N-terminal to the I-domain. This segment (shown in bold below) contains the intrachain cleavage site (shown on new line) and a continuous stretch of 18 charged residues, 16 of which are acidic. The N-terminal sequences of both cleaved chains were determined at the protein level 3.
Ligands and associated molecules The integrin αEβ7 binds E-cadherin 1,4.
Function αEβ7 binds to E-cadherin on epithelial cells. This interaction may be of importance in the homing and retention of αEβ7-expressing lymphocytes in the intestinal epithelium 4-6.
Database accession numbers
        PIR      SWISSPROT   EMBL/GENBANK   REFERENCE
Human            P38570      L25851         3
Amino acid sequence of human CD103
MWLFHTLLCI ASLALLAA  -1
FNVDVARPWL TPKGGAPFVL SSLLHQDPST NQTWLLVTSP RTKRTPGPLH  50
RCSLVQDEIL CHPVEHVPIQ GEAPGSDRCP EPPRCFDMHS SAGPAPHSLS  100
SELTGTCSLL GPDLRPQAQA NFFDLENLLD PDARVDTGDC YSNKEGGGED  150
DVNTARQRR
A LEKEEEEDKE EEEDEEEEEA GTEIAIILDG SGSIDPPDFQ  200
RAKDFISNMM RNFYEKCFEC NFALVQYGGV IQTEFDLRDS QDVMASLARV  250
QNITQVGSVT KTASAMQHVL DSIFTSSHGS RRKASKVMVV LTDGGIFEDP  300
LNLTTVINSP KMQGVERFAI GVGEEFKSAR TARELNLIAS DPDETHAFKV  350
TNYMALDGLL SKLRYNIISM EGTVGDALHY QLAQIGFSAQ ILDERQVLLG  400
AVGAFDWSGG ALLYDTRSRR GRFLNQTAAA AADAEAAQYS YLGYAVAVLH  450
KTCSLSYVAG APQYKHHGAV FELQKEGREA SFLPVLEGEQ MGSYFGSELC  500
PVDIDMDGST DFLLVAAPFY HVHGEEGRVY VYRLSEQDGS FSLARILSGH  550
PGFTNARFGF AMAAMGDLSQ DKLTDVAIGA PLEGFGADDG ASFGSVYIYN  600
GHWDGLSASP SQRIRASTVA PGLQYFGMSM AGGFDISGDG LADITVGTLG  650
QAVVFRSRPV VRLKVSMAFT PSALPIGFNG VVNVRLCFEI SSVTTASESG  700
LREALLNFTL DVDVGKQRRR LQCSDVRSCL GCLREWSSGS QLCEDLLLMP  750
TEGELCEEDC FSNASVKVSY QLQTPEGQTD HPQPILDRYT EPFAIFQLPY  800
EKACKNKLFC VAELQLATTV SQQELVVGLT KELTLNINLT NSGEDSYMTS  850
MALNYPRNLQ LKRMQKPPSP NIQCDDPQPV ASVLIMNCRI GHPVLKRSSA  900
HVSVVWQLEE NAFPNRTADI TVTVTNSNER RSLANETHTL QFRHGFVAVL  950
SKPSIMYVNT GQGLSHHKEF LFHVHGENLF GAEYQLQICV PTKLRGLQVA  1000
AVKKLTRTQA STVCTWSQER ACAYSSVQHV EEWHSVSCVI ASDKENVTVA  1050
AEISWDHSEE LLKDVTELQI LGEISFNKSL YEGLNAENHR TKITVVFLKD  1100
EKYHSLPIII KGSVGGLLVL IVILVILFKC GFFKRKYQQL NLESIRKAQL  1150
KSENLLEEEN  1160
References
1 Parker, C.M. et al. (1992) Proc. Natl Acad. Sci. USA 89, 1924-1928.
2 Schieferdecker, H.L. et al. (1990) J. Immunol. 144, 2541-2549.
3 Shaw, S.K. et al. (1994) J. Biol. Chem. 269, 6016-6025.
4 Cepek, K.L. et al. (1994) Nature 372, 190-193.
5 Cepek, K.L. et al. (1993) J. Immunol. 150, 3459-3470.
6 Shaw, S.K. and Brenner, M.B. (1995) Semin. Immunol. 7, 335-342.
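The Structure section of the CD103 entry describes a continuous stretch of 18 charged residues, 16 of them acidic, within the X-domain. The sketch below shows one way such a stretch could be located in a printed sequence. It is an illustration under stated assumptions: Asp, Glu, Lys and Arg are counted as charged (His is left out), the segment is the part of the CD103 sequence around the cleavage site as printed in this entry, and the function name `longest_charged_run` is hypothetical.

```python
import re

CHARGED = "DEKR"  # assumption: Asp, Glu, Lys, Arg count as charged; His is excluded
ACIDIC = "DE"


def longest_charged_run(sequence: str) -> tuple[str, int]:
    """Return the longest unbroken run of charged residues and its acidic count."""
    residues = "".join(sequence.split()).upper()  # drop spaces and line breaks
    runs = re.findall(f"[{CHARGED}]+", residues)
    if not runs:
        return "", 0
    best = max(runs, key=len)  # first run of maximal length
    acidic = sum(best.count(residue) for residue in ACIDIC)
    return best, acidic


# Part of the human CD103 sequence around the cleavage site, as printed above.
X_DOMAIN_REGION = """
YSNKEGGGED DVNTARQRR
A LEKEEEEDKE EEEDEEEEEA GTEIAIILDG
"""

run, acidic = longest_charged_run(X_DOMAIN_REGION)
print(run, len(run), acidic)
```

On the segment shown, the longest run has 18 residues, of which 16 are Asp or Glu and two are Lys, which matches the description in the text. Changing `CHARGED` (for example, to include His) changes what counts as a run, so the residue set is the main design choice to revisit for other sequences.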
CD104
Integrin β4 subunit
Molecular weights
Polypeptide
(no insert) 192 267
(insert-1) 197 979
(insert-2) 199 599
SDS-PAGE
reduced 220 kDa
unreduced 210 kDa
Carbohydrates
N-linked sites 5
O-linked sites unknown
Human gene location 17q11-qter
[Diagram: CD49f/CD104, with fibronectin type III (F3) domains]
Tissue distribution CD104 is expressed in conjunction with CD49f as the α6β4 integrin on the hemidesmosomes of stratified epithelia 1. It is also expressed on cells which do not have hemidesmosomes, including simple epithelia and Schwann cells 2,3, and a subset of endothelial cells in the mouse 4.
Structure CD104 is the integrin β4 subunit, which forms a heterodimer with the integrin α6 subunit (CD49f). It has only 48 of the 56 cysteine residues normally conserved in the extracellular portion of integrin β subunits. It also has an unusually large cytoplasmic domain of about 1000 residues, which includes four fibronectin type III domains. Three alternative forms have been identified by cDNA analysis; they contain either no inserts, or insert-1 at position 1343, or insert-2 at position 1413 (refs 5-7). Both inserts are between the second and third fibronectin type III domains and are shown in bold indented lines in the sequence below. cDNA isolated from the placenta contains a mixture of the short form (no inserts) and the one having only insert-2, whereas carcinoma cell lines appear to express only the form with insert-1 6. Immunoprecipitation studies suggest that free β4 subunit, i.e. not in association with any α subunit, may be present on carcinoma cell lines 2. The N-terminus of CD104 has been determined 5.
Ligands and associated molecules CD104 combines with CD49f to form the α6β4 integrin, which is an integral component of hemidesmosomes in stratified epithelia 1,9.
Function Hemidesmosomal CD49f/CD104 (α6β4) plays an important role in the adhesion of epithelia to basement membranes, via interactions with laminin and/or kalinin anchoring filaments 1,9,10. Unlike other integrins, CD49f/CD104 interacts with intracellular keratin filaments rather than the actin cytoskeleton. Frameshift and deletion mutations have been identified in a patient with junctional epidermolysis bullosa and pyloric atresia 11.
Database accession numbers
        PIR      SWISSPROT   EMBL/GENBANK   REFERENCE
Human   S08465   P16144      X51841         5
        A36429               X53587         6
        S12380               X52186         7
Mouse                        L04678         12
Amino acid sequence of human CD 104 MAGPRPSPWA NRCKKAPVKS VVMESSFQIT SPVDLYILMD VSVPQTDMRP NLDAPEGGFD AGIMSRNDER YSYSYYEKLH PRGLRTEVTS EDQKGNIHLK QCVCSEGWSG EGRYEGQFCE SNATCIDSNG DLRSCVQCQA DDCTYSYTME LLLLCWKYCA MLRSGNLKGR CTENLLKPDT KQDHTIVDTV DARGMVEFQE LVNITIIKEQ TQDGTAQGNR RFHVQLSNPK GAPQNPNAKA VPSVELTNLY FNVVSSTVTQ PKNRMLLIEN IIPDIPIVDA
RLLLAALISV CTECVRVDKD EETQIDTTLR FSNSMSDDLD EKLKEPWPNS AILQTAVCTR CHLDTTGTYT TYFPVSSLGV KMFQKTRTGS PSFSDGLKMD QTCNCSTGSL YDNFQCPRTS GICNGRGHCE WGTGEKKGRT GDGAPGPNST CCKACLALLP DVVRWKVTNN RECAQLRQEV LMAPRSAKPA GVELVDVRVP ARDVVSFEQP DYIPVEGELL FGAHLGQPHS AGSRKIHFNW PYCDYEMKVC LSWAEPAETN LRESQPYRYT QSGEDYDSFL
SLSGTLA CAYCTDEMFR RSQMSPQGLR NLKKMGQNLA DPPFSFKNVI DIGWRPDSTH QYRTQDYPSV LQEDSSNIVE FHIRRGEVGI AGIICDVCTC SDIQPCLREG GFLCNDRGRC CGRCHCHQQS CEECNFKVKM VLVHKKKDCP CCNRGHMVGF MQRPGFATHA EENLNEVYRQ LLKLTEKQVE LFIRPEDDDE EFSVSRGDQV FQPGEAWKEL TTIIIRDPDE LPPSGKPMGY AYGAQGEGPY GEITAYEVCY VKARNGAGWG MYSDDVLRSP
DRRCNTQAEL VRLRPGEERH RVLSQLTSDY SLTEDVDEFR LLVFSTESAF PTLVRLLAKH LLEEAFNRIR YQVQLRALEH ELQKEVRSAR EDKPCSGRGE SMGQCVCEPG LYTDTICEIN VDELKRAEEV PGSFWWLIPL KEDHYMLREN ASINPTELVP ISGVHKLQQT QRAFHDLKVA KQLLVEAIDV ARIPVIRRVL QVKLLELQEV LDRSFTSQML RVKYWIQGDS SSLVSCRTHQ GLVNDDNRPI PEREAIINLA SGSQRPSVSD
LAAGCQRESI FELEVFEPLE TIGFGKFVDK NKLQGERISG HYEADGANVL NIIPIFAVTN SNLDIRALDS VDGTHVCQLP CSFNGDFVCG CQCGHCVCYG WTGPSCDCPL YSAIHPGLCE VVRCSFRDED LLLLLPLLAL LMASDHLDTP YGLSLRLARL KFRQQPNAGK PGYYTLTADQ PAGTATLGRR DGGKSQVSYR DSLLRGRQVR SSQPPPHGDL ESEAHLLDSK EVPSEPGRLA GPMKKVLVDN TQPKRPMSIP DT
GCGWKFEP L L G E E L D L R R V T W R L P P E L I PRLSASSGRS S D A E A P T A P R T T A A R A G R A A
-1 50 100 150 200 250 300 350 400 450 500 550 600 650 700 750 800 850 900 950 1000 1050 1100 1150 1200 1250 1300 1350 1400
A V P R S A T P G P PG
EHLVNGRM DFAFPGSTNS LHRMTTTSAA AYGTHLSPHV PHRVLSTSST LTRDYNSLTR SEHSHSTTLP RDYSTLTSVS SH
1450
DSRLT QLLNGGELHR VITIESQVHP RRPNGDIVGY KVQARTTEGF TTHTSATEPF MDQQFFQT
1550 1600 1650 1700 1750 1800 1848
GLPPIWEH GRSRLPLSWA LGSRSRAQMK GFPPSRGPRD SIILAGRPAAPSWGP
AGVPDTPTRL LNIPNPAQTS QSPLCPLPGS LVTCEMAQGG GPEREGIITI LVDGPTLGAQ
VFSALGPTSL VVVEDLLPNH AFTLSTPSAP GPATAFRVDG ESQDGGPFPQ HLEAGGSLTR
RVSWQEPRCE SYVFRVRAQS GPLVFTALSP DSPESRLTVP LGSRAGLFQH HVTQEFVSRT
RPLQGYSVEY QEGWGREREG DSLQLSWERP GLSENVPYKF PLQSEYSSIT LTTSGTLSTH
1500
References
1 Garrod, D.R. (1993) Curr. Opin. Cell Biol. 5, 30-40.
2 Sonnenberg, A. et al. (1990) J. Cell Sci. 96, 207-217.
3 Natali, P.G. et al. (1992) J. Cell Sci. 103, 1243-1247.
4 Kennel, S.J. et al. (1992) J. Cell Sci. 101, 145-150.
5 Suzuki, S. and Naitoh, Y. (1990) EMBO J. 9, 757-763.
6 Tamura, R.N. et al. (1990) J. Cell Biol. 111, 1593-1604.
7 Hogervorst, F. et al. (1990) EMBO J. 9, 765-770.
8 Kajiji, S. (1989) EMBO J. 8, 673-680.
9 Dowling, J. et al. (1996) J. Cell Biol. 134, 559-572.
10 Niessen, C.M. et al. (1994) Exp. Cell Res. 211, 360-367.
11 Vidal, F. et al. (1995) Nature Genet. 10, 229-234.
12 Kennel, S.J. et al. (1993) Gene 130, 209-216.
CD105
Endoglin
Molecular weights
Polypeptide 68 095
SDS-PAGE
reduced 95 kDa
unreduced 170 kDa
Carbohydrate
N-linked sites 5
O-linked +
Human gene location
9q34.1
Tissue distribution CD105 is present on endothelial cells. It is absent from most normal B and T cells, but is present on some leukaemic cells of B lymphoid and myeloid origin and on a subset of bone marrow cells 1,2. CD105 expression can be induced on the surface of in vitro activated macrophages and macrophage cell lines 3,4.
Structure CD105 is present at the cell surface as a disulfide-linked homodimer. O-glycosylation of CD105 has been demonstrated by digestion with O-glycanase. The O-glycosylation is likely to be on the membrane-proximal region, which contains a high proportion of Thr, Ser and Pro residues 1. The N-terminus has been established by protein sequencing. The sequence shows some patches of similarity with another receptor for TGFβ, namely TGFβ receptor type III (betaglycan). The highest similarity is between the cytoplasmic domains at 70%, with an overall sequence identity of 30% 5.
Ligands and associated molecules CD105 is a receptor for transforming growth factor β (TGFβ). CD105 binds with high affinity both the β1 isoform (Kd = 50 pM) and the β3 isoform, but not the β2 isoform, of TGFβ 5. CD105 exists in two forms (L and S isoforms) derived from alternative splicing in the cytoplasmic domain, and both isoforms bind TGFβ 6.
Function CD105 is one of several receptors for the various isoforms of TGFβ, which in turn is one of a family of proteins involved in regulation of cell differentiation, migration of cells and control of the immune response. The level of CD105 increases in response to TGFβ. Cell lines transfected with CD105 show modulation of some of the effects of TGFβ although the mechanism is unclear 4.
Comments Mutations in the gene for CD105 have been found in the disease hereditary haemorrhagic telangiectasia, which is characterized by multisystemic vascular dysplasia and recurrent haemorrhage 7,8.
Database accession numbers
Human Mouse
PIR
SWISSPROT
EMBL/GENBANK
REFERENCE
S37628
P17813*
X72012 X77952
9
S42844
1
* Partial sequence (see EMBL entry for full length).
Amino acid sequence of human CD105 (L-endoglin isoform) MDRGTLPLAV ETVHCDLQPV SQLELTLQAS TFQEPPGVNT QGSLSFCMLE GHSAGPRTVT EYSFKIFPEK HASSCGGRLQ LVAHLKCTIT VVNILSSSSP QVRVSPSVSE GDPRFSFLLH SPDLSGCTSK VVAVAAPASS
ALLLASCSLS GPERGEVTYT KQNGTWPREV TELPSFPKTQ ASQDMGRTLE VKVELSCAPG NIRGFKLPDT TSPAPIQTTP GLTFWDPSCE QRKKVHCLNM FLLQLDSCHL FYTVPIPKTG GLVLPAVLGI ESSSTNHSIG
PTSLA TSQVSKGCVA LLVLSVNSSV ILEWAAERGP WRPRTPALVR DLDAVLILQG PQGLLGEARM PKDTCSPELL AEDRGDKFVL DSLSFQLGLY DLGPEGGTVE TLSCTVALRP TGGAFLIGAL STQSTPCSTS
QAPNAILEVH FLHLQALGIP ITSAAELNDP GCHLEGVAGH PPYVSWLIDA LNASIVASFV MSLIQTKCAD RSAYSSCGMQ LSPHFLQASN LIQGRAAKGN KTGSQDQEVH LTAALWYIYS SMA
VLFLEFPTGP LHLAYNSSLV QSILLRLGQA KEAHILRVLP NHNMQIWTTG ELPLASIVSL DAMTLVLKKE VSASMISNEA TIEPGQQSFV CVSLLSPSPE RTVFMRLNII HTRSPSKREP
-1 50 100 150 200 250 300 350 400 450 500 550 600 633
The C-terminal amino acid sequence of the human S-endoglin isoform 6 is as follows: SPDLSGCTSK GLVLPAVLGI TGGAFLIGAL LTAALWYIYS HTREYPRPPQ
600
References
1 Gougos, A. and Letarte, M. (1990) J. Biol. Chem. 265, 8361-8364.
2 Rokhlin, O.W. et al. (1995) J. Immunol. 154, 4456-4465.
3 Lastres, P. et al. (1992) Eur. J. Immunol. 22, 393-397.
4 Lastres, P. et al. (1996) J. Cell Biol. 133, 1109-1121.
5 Cheifetz, S. et al. (1992) J. Biol. Chem. 267, 19027-19030.
6 Bellon, T. et al. (1993) Eur. J. Immunol. 23, 2340-2345.
7 McAllister, K.A. et al. (1994) Nature Genet. 8, 345-351.
8 http://www3.ncbi.nlm.nih.gov:80/htbin-post/Omim/dispmim?131195.
9 Ge, A.Z. and Butcher, E.C. (1994) Gene 138, 201-206.
VCAM-1, INCAM-110
Molecular weights
Polypeptide 78 745
SDS-PAGE
reduced 100-110 kDa
Carbohydrate
N-linked sites 6
O-linked unknown
Human gene location and size
1p31-p32; ~25 kb 1
Domains
[Figure: domain organization of CD106 with exon boundaries]
Tissue distribution CD106 is expressed predominantly on vascular endothelium but has also been identified on follicular and interfollicular dendritic cells, some macrophages, bone marrow stromal cells, and non-vascular cell populations within joints, kidney, muscle, heart, placenta and brain 2,3. Expression on endothelial cells, as well as on many other cells, is induced by inflammatory stimuli and cytokines 2,4. Activated endothelial cells can release soluble forms of CD106, which can be detected in the blood 5.
Structure CD106 has seven IgSF C2-set domains in its extracellular portion. Domains 1-3 show substantial sequence similarity to domains 4-6, consistent with an internal duplication event during CD106 evolution. Alternative splicing can produce a six IgSF domain form lacking domain 4 or, in the mouse, a GPI-anchored form of CD106 comprising domains 1-3 2,3. The structures of domains 1 and 2 have been determined by X-ray crystallography; as a result domain 1 was classified as an I-type IgSF domain on structural grounds 6,7. The domains 1+2 and 4+5 fragments are closely related to the two IgSF domains of MAdCAM-1, which also binds the integrin α4β7 7,8. These portions of CD106 and MAdCAM-1 are also related to domains 1+2 of the β2 (CD18) integrin-binding molecules CD50, CD54 and CD102. The membrane-distal domains of all these molecules have atypical disulfides between the B-C and F-G loops as well as an (I/L)(D/E)(S/T)xL motif in the C-D loop, which forms part of the integrin binding site 7. CD106 is highly conserved between humans and rodents (~76% overall, 100% within the cytoplasmic domain).
Ligands and associated molecules CD106 binds the integrins α4β1 (CD49d/CD29, VLA-4) and α4β7 9. VLA-4 is the dominant ligand in cells expressing both integrins 9. CD106 has two independent binding sites for VLA-4, in domains 1 and 4 respectively 3. Both sites lie on the GFC β-sheets of these domains and include the conserved C-D loop motif IDSPL 3.
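The (I/L)(D/E)(S/T)xL consensus and its CD106 instance IDSPL can be checked mechanically against a sequence. The following is a minimal sketch, not a method from the cited papers: the function name `find_motif` is hypothetical, and the test string is assumed to be residues 1-50 of the mature human CD106 listing, reassembled by reading across the five printed column blocks of the first row.

```python
import re

def find_motif(sequence):
    """Return (0-based start, matched residues) for each (I/L)(D/E)(S/T)xL hit."""
    # "." stands for the unconstrained position x; the lookahead also
    # reports overlapping occurrences.
    pattern = re.compile(r"(?=([IL][DE][ST].L))")
    return [(m.start(), m.group(1)) for m in pattern.finditer(sequence)]

# Assumed reassembly of residues 1-50 of mature human CD106 (domain 1).
DOMAIN1_START = "FKIETTPESRYLAQIGDSVSLTCSTTGCESPFFSWRTQIDSPLNGKVTNE"

hits = find_motif(DOMAIN1_START)
for start, residues in hits:
    # Report 1-based positions, as used in the sequence listings.
    print(f"{residues} at residues {start + 1}-{start + 5}")
```

Under these assumptions the only hit in this stretch is IDSPL, which is in keeping with the domain 1 binding site described above.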
Function Endothelial CD106 contributes to the extravasation of lymphocytes, monocytes, basophils and eosinophils (but not neutrophils) from blood vessels, particularly at sites of inflammation 2-4. Unlike the β2 integrins, the CD106/VLA-4 interaction can mediate both the initial tethering and rolling of lymphocytes on endothelium and their subsequent arrest and firm adhesion 10. CD106 expressed in non-vascular tissues has been implicated in the interaction of haematopoietic progenitors with bone marrow stromal cells, B cell binding to follicular dendritic cells, co-stimulation of T cells, and embryonic development 2-4. Mice deficient in CD106 have defects in the development of the placenta and heart and die during embryogenesis 11.
Database accession numbers (full-length form)
                Human     Mouse     Rat
PIR             A41288    JN0581    JS0675
SWISSPROT       P19320    P29533    P29534
EMBL/GENBANK    M73255    M84487    M84488
REFERENCE       1         12        12
Amino acid sequence of human CD 106 MPGKMVVILG FKIETTPESR GTTSTLTMNP GPLEAGKPIT TKSLEVTFTP
ASNILWIMFA YLAQIGDSVS VSFGNEHSYL VKCSVADVYP VIEDIGKVLV
ASQA LTCSTTGCES CTATCESRKL FDRLEIDLLK CRAKLHIDEM
PFFSWRTQID EKGIQVEIYS GDHLMKSQEF DSVPTVRQAV
SPLNGKVTNE FPKDPEIHLS LEDADRKSLE KELQVYISPK
NTVISVNPST KnQEGGSVCM TCSSEGLPAP ~IrWSKKnnN GNLQHnSGNA TLTLIAMRME D S G I Y V C E G V N L I G K N R K E V ELIVQEKPFT.VEI.S.PGPRIA
~Q~.9~!~.~~~.~.~.~~~.!~~~.~!~~
~EN/Hw167 RDPEIEMSGG LVNGSSVTVS C K V P S V Y P L D RLEIELLKGE TILENIEFLE D T D M K S L E N K SLEMTFIPTI EDTGKALVCQ AKLHIDDMEF EPKQRQSTQT LYVNVAPRDT TVLVSPSSIL E E G S S V N M T C LSQGFPAPKI LWSRQLPNGE L Q P L S E N A T L T L I S T K M E D S G V Y L C E G I N Q A G R S R K E V E L IIQVTPKDIK L T A F P S E S V K E G D T V I I S C T CGNVPETWII L K K K A E T G D T V L K S I D G A Y T I R KA Q L K D A G V Y E C E S K N K V SQLRSLTLD V Q G R E N N K D Y FSPELLVLYF A S S L I I P A I G M I I Y F A R K A N M K G S Y S L V E A QKSKV
-1 50 100 150 200 250 300 350 400 450 500 550 600 650 700 715
Note: In the six-domain form domain 4 (dotted underlined) is replaced with an A.
References
1 Cybulsky, M.I. et al. (1991) Proc. Natl Acad. Sci. USA 88, 7859-7863.
2 Bevilacqua, M.P. (1993) Annu. Rev. Immunol. 11, 767-804.
3 Vonderheide, R.H. et al. (1994) J. Cell Biol. 125, 215-222.
4 Carlos, T.M. and Harlan, J.M. (1994) Blood 84, 2068-2101.
5 Gearing, A.J.H. and Newman, W. (1993) Immunol. Today 14, 506-512.
6 Harpaz, Y. and Chothia, C. (1994) J. Mol. Biol. 238, 528-539.
7 Jones, E.Y. et al. (1995) Nature 373, 539-544.
8 Shyjan, A.M. et al. (1996) J. Immunol. 156, 2851-2857.
9 Berlin, C. et al. (1993) Cell 74, 185-195.
10 Butcher, E.C. and Picker, L.J. (1996) Science 272, 60-66.
11 Kwee, L. et al. (1995) Development 121, 489-503.
12 Hession, C. et al. (1992) Biochem. Biophys. Res. Commun. 183, 163-169.
CD107a, CD107b
Other names
CD107a: lysosome-associated membrane protein-1 (lamp-1), lgp-120 (rat)
CD107b: lamp-2, lgp-110 (rat)
Molecular weights
Polypeptide       CD107a 41 955     CD107b 42 012
SDS-PAGE
reduced           CD107a 120 kDa    CD107b 120 kDa
unreduced         CD107a 120 kDa    CD107b 120 kDa
Carbohydrate
N-linked sites    CD107a 19         CD107b 17
O-linked          CD107a +          CD107b +
Human gene location and size
CD107a: 13q34
CD107b: Xq24; >40 kb 1
Tissue distribution CD107a is expressed by granulocytes, T cells, macrophages, dendritic cells, activated platelets, endothelial cells, tonsillar epithelium and melanoma cells 2,3. CD107b is expressed by granulocytes, activated platelets, TNFα-activated endothelial cells, tonsillar epithelium and melanoma cells 2,3.
Structure CD107a and CD107b are both type I membrane glycoproteins, with 39% amino acid sequence identity, and constitute the major sialoglycoproteins on lysosomal membranes 3,4. A smaller proportion of CD107a and CD107b can be detected on the plasma membrane. Both molecules are heavily glycosylated, containing 19 (CD107a) or 17 (CD107b) N-glycans, some of which are composed of very complex poly-N-acetyllactosamines 5. Carbohydrates constitute 55-65% of the total mass in both CD107a and CD107b. The major portion of each molecule is located on the luminal side of lysosomes, anchored by a transmembrane region with a short cytoplasmic tail 3,4. In both molecules the intraluminal portion comprises two internally related domains, separated by a hinge region rich in Pro and Ser residues (CD107a), or Pro and Thr residues (CD107b), which is O-glycosylated 6. Both CD107a and CD107b contain the sequence GYXX in their cytoplasmic tail. This motif is also present in the cytoplasmic tail of CD63 and lysosomal acid phosphatase (LAP), both of which are transported to lysosomes 3. The N-termini of the CD107a and CD107b mature proteins have been determined by amino acid sequence analysis 4.
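The quoted carbohydrate content can be compared with the molecular weights listed for this entry. The sketch below is a rough arithmetic check under two loud assumptions: that the 120 kDa SDS-PAGE value approximates the true mass of the glycoprotein (heavily glycosylated proteins often migrate anomalously), and that all mass beyond the polypeptide is carbohydrate. The function name is hypothetical.

```python
def carbohydrate_fraction(polypeptide_da, apparent_da):
    """Fraction of the apparent mass not accounted for by the polypeptide."""
    # Assumption: apparent_da (from SDS-PAGE) stands in for the true mass.
    return 1.0 - polypeptide_da / apparent_da

# Polypeptide masses and SDS-PAGE values as listed for this entry.
cd107a = carbohydrate_fraction(41955, 120000)
cd107b = carbohydrate_fraction(42012, 120000)

print(f"CD107a: {cd107a:.1%} carbohydrate by mass")
print(f"CD107b: {cd107b:.1%} carbohydrate by mass")
```

Both estimates come out at about 65%, at the upper end of the 55-65% range given in the text.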
Ligands and associated molecules Both CD107a and CD107b have been identified as ligands for galaptin, an S-type lectin (galectin) present in extracellular matrix, through its recognition of N-acetyllactosamine oligosaccharide chains 7. In addition, CD62E binds to the sialyl-Lewis x structures displayed by poly-N-acetyllactosamines on CD107a 8,9.
Function It has been suggested that CD107a and CD107b protect the inner surface of the lysosomal membrane by forming a barrier to soluble lysosomal hydrolases 3. The upregulated expression of CD107a and CD107b on the surface of several tumour cell lines has been associated with their enhanced metastatic potential, where they may increase adhesion to extracellular matrix and endothelium 7-9.
Comments Alternatively spliced forms of the CD107b gene are expressed in a tissue-specific manner 10.
Database accession numbers
                PIR       SWISSPROT   EMBL/GENBANK      REFERENCE
Human CD107a    A23656    P11279      J04182            4
Human CD107b    B23656    P13473      J04183            4
Mouse CD107a    A28067    P11438      J03881, M32015    11
Mouse CD107b    A35560    P17047      J05287            13
Rat CD107a      A30200    P14562      M34959            12
Rat CD107b      A36288    P17046      D90211            14
Amino acid sequence of human CD107a MAPRSARRPL AMFMVKNGNG SCGKENTSDP PNASSKEIKT LSNSSFSRGE GTCLLASMGL HSEGTTVLLF QATVGNSYKC ENSTLIPIAV
LLLLPVAAAR TACIMANFSA SLVIAFGRGH VESITDIRAD TRCEQDRPSP QLNLTYERKD QFGMNASSSR NAEEHVRVTK GGALAGLVLI
PHALSSA AFSVNYDTKS TLTLNFTRNA IDKKYRCVSG TTAPPAPPSP NTTVTRLLNI FFLQGIQLNT AFSVNIFKVW VLIAYLVGRK
GPKNMTFDLP T R YS V Q L M S F TQVHMNNVTV SPSPVPKSPS NPNKTSASGS ILPDARDPAF VQAFKVEGGQ RSHAGYQTI
SDATVVLNRS VYNLSDTHLF TLHDATIQAY VDKYNVSGTN CGAHLVTLEL KAANGSLRAL FGSVEECLLD
-1 50 100 150 200 250 300 350 389
HGTVTYNGSI NTGDNTTFPD WDVLVQAFVQ EAGTYSVNNG TALLRLNSST SYWDAPLGSS SADDDNFLVP
-1 50 100 150 200 250 300 350 382
Amino acid sequence of human CD107b MVCFRLFPVP LELNLTDSEN CGDDQNGPKI AEDKGILTVD NGTVSTNEFL NDTCLLATMG IKYLDFVFAV YMCNKEQTVS IAVGAALAGV
GSGLVLVCLV ATCLYAKWQM AVQFGPGFSW ELLAIRIPLN CDKDKTSTVA LQLNITQDKV KNENRFYLKE VSGAFQINTF LILVLLAYFI
LGAVRSYA NFTVRYETTN IANFTKAAST DLFRCNSLST PTIHTTVPSP ASVININPNT VNISMYLVNG DLRVQPFNVT GLKHHHAGYE
KTYKTVTISD YSNDSVSFSY LEKNDVVQHY TTTPTPKEKP THSTGSCRSH SVFSIANNNL QGKYSTAQDC QF
References
1 Sawada, R. et al. (1993) J. Biol. Chem. 268, 9014-9022 (erratum: J. Biol. Chem. 268, 13010).
2 Azorsa, D.O. et al. (1995) Leucocyte Typing V, 1351.
3 Fukuda, M. (1991) J. Biol. Chem. 266, 21327-21330.
4 Fukuda, M. et al. (1988) J. Biol. Chem. 263, 18920-18928.
5 Carlsson, S.R. et al. (1988) J. Biol. Chem. 263, 18911-18919.
6 Carlsson, S.R. et al. (1993) Arch. Biochem. Biophys. 304, 65-73.
7 Woynarowska, B. et al. (1994) J. Biol. Chem. 269, 22797-22803.
8 Sawada, R. et al. (1993) J. Biol. Chem. 268, 12675-12681.
9 Sawada, R. et al. (1994) J. Biol. Chem. 269, 1425-1431.
10 Konecki, D.S. et al. (1995) Biochem. Biophys. Res. Commun. 215, 757-767.
11 Chen, J.W. et al. (1988) J. Biol. Chem. 263, 8754-8758.
12 Howe, C.L. et al. (1988) Proc. Natl Acad. Sci. USA 85, 7577-7581.
13 Cha, Y. et al. (1990) J. Biol. Chem. 265, 5008-5013.
14 Noguchi, Y. et al. (1989) Biochem. Biophys. Res. Commun. 164, 1113-1120.
Molecular weights
SDS-PAGE
reduced 80 kDa
unreduced 75 kDa
Tissue distribution CDw108 is expressed weakly on some lymphoid, myeloid and stromal cells, and highly expressed on the leukaemic T cell line HPB-ALL 1.
Structure CDw108 is a GPI-linked glycoprotein that contains 20% of N-linked carbohydrate by mass 1.
Function Unknown.
Comments The CDw108 antigen is identical to the JMH erythrocyte blood group antigen 2.
References
1 Klickstein, L.B. and Springer, T.A. (1995) Leucocyte Typing V, 1477-1478.
2 Mudad, R. et al. (1995) Transfusion 35, 566-570.
CD109
Gov a/b alloantigen
Molecular weights
SDS-PAGE
reduced 175 kDa
unreduced 175 kDa
Carbohydrate
N-linked sites yes
O-linked sites no
Tissue distribution CD109 carries the epitopes for the Gov a/b alloantigen on platelets 1. It is also expressed on activated T cells, human umbilical vein endothelial cells and several tumour cell lines. It is not expressed on resting lymphocytes, neutrophils or erythrocytes 1,2.
Structure CD109 is a GPI-linked protein on T cells, tumour cells and endothelial cells 3. However, only about 50% of the antigen can be released from platelets by phosphatidylinositol-specific phospholipase C 1. Multiple chains are observed by immunoprecipitation, but the lower bands are proteolytic products of the single chain at 175 kDa. Two N-linked sites have been found by peptide mapping 4. No change in polypeptide chain size is observed with O-glycanase 1.
Function Unknown. Alloantibodies against CD109 have been identified in patients following multiple platelet transfusions 5. The alloantigenicity of CD109 has been implicated in post-transfusion purpura and alloimmune neonatal thrombocytopenia 1.
References
1 Smith, J.W. et al. (1995) Blood 86, 2807-2814.
2 Brashem-Stein, C. et al. (1988) J. Immunol. 140, 2330-2333.
3 Haregewoin, A. et al. (1994) Cell. Immunol. 156, 357-370.
4 Sutherland, D.R. et al. (1991) Blood 77, 84-93.
5 Kelton, J.G. et al. (1990) Blood 75, 2172-2176.
CD117
c-kit, mast/stem cell growth factor receptor, steel factor receptor
Molecular weights
Polypeptide 107 350
SDS-PAGE
reduced 145 kDa
Carbohydrate
N-linked sites 10
O-linked unknown
Human gene location and size
4q11-q12; >70 kb 1
Domains
[Figure: domain organization of CD117 with exon boundaries; the diagram carries the annotation "encoded by 12 exons"]
Tissue distribution Human CD117 is expressed on 1-4% of bone marrow cells, the majority of which (50-70%) also express CD34 and comprise pluripotent haematopoietic progenitor cells 2. CD117 is also expressed on human mast cells and acute myeloid leukaemic cells, while CD117 mRNA has been detected in melanocytes, primordial germ cells, small cell lung cancers and seminomas 2. In the mouse, CD117 is expressed on almost all haematopoietic progenitor cells, including colony-forming cells reactive to IL-3, GM-CSF and M-CSF, but not on B-lineage progenitors, which form colonies in response to IL-7 3. It is also expressed on the surface of mouse mast cells, melanocytes, spermatogonia and oocytes 3.
Structure CD117 belongs to subclass III within the family of growth factor receptors with tyrosine kinase activity, which also includes CD115 (M-CSFR) and the PDGF receptors type A and B (CD140a and CD140b) 4,5. The extracellular domain of CD117 consists of five IgSF domains (four C2-set and one V-set), followed by a transmembrane sequence and a cytoplasmic region containing a tyrosine kinase domain, which is interrupted by an insertion sequence of 77 amino acids 4.
Ligands and associated molecules The natural ligand for CD117 is c-kit ligand (also known as stem cell growth factor, steel factor and mast cell growth factor), a growth factor which is biologically active in both membrane-bound and soluble form (see c-kitL) 6. CD117 transduces transmembrane signals by interaction with phosphatidylinositol 3-kinase, Raf1 and, to some extent, phospholipase C-γ 7.
Function The interaction of c-kit ligand with CD117 is crucial for the development of haematopoietic, gonadal and pigment stem cells (reviewed in ref. 8). c-kit ligand also increases the sensitivity of human mucosal mast cells to crosslinking of the high-affinity IgE receptor 9. In mice, mutations of the W and Sl loci, which encode CD117 and c-kit ligand respectively, lead to alterations of coat colour (white spotting), anaemia and defective gonad development 8. In humans, naturally occurring mutations within the CD117 gene have been identified as the cause of piebaldism, an autosomal dominant disorder of pigmentation. These mutations are reviewed in OMIM entries 164920 and 172800 (see Chapter 1 for methods to access OMIM).
Comment The feline v-kit oncogene has lost the extracellular and transmembrane domains as well as the proximal cytoplasmic and the C-terminal amino acids. The last 49 amino acids have been replaced by five amino acids due to fusion with the feline leukaemia polymerase gene 4.
Database accession numbers
                Human     Mouse
PIR             S01426    S00474
SWISSPROT       P10721    P05532
EMBL/GENBANK    X06182    Y00864
REFERENCE       4         10
Amino acid sequence of human CD117 MRGARGAWDF GSSQPSVSPG ETNENKQNEW SLYGKEDNDT KSVKRAYHRL LREGEEFTVT ATLTISSARV VFVNDGENVD YVSELHLTRL NGMLQCVAAG VVQSSIDSSA TPLLIGFVIV
LCVLLLLLRV EPSPPSIHPG ITEKAEATNT LVRCPLTDPE CLHCSVDQEG CTIKDVSSSV NDSGVFMCYA LIVEYEAFPK KGTEGGTYTF FPEPTIDWYF FKHNGTVECK AGMMCIIVMI
QT KSDLIVRVGD GKYTCTNKHG VTNYSLKGCQ KSVLSEKFIL YSTWKRENSQ NNTFGSANVT PEHQQWIYMN LVSNSDVNAA CPGTEQRCSA AYNDVGKTSA LTYKYLQKPM
EIRLLCTDPG LSNSIYVFVR GKPLPKDLRF KVRPAFKAVP TKLQEKYNSW TTLEVVDKGF RTFTDKWEDY IAFNVYVNTK SVLPVDVQTL YFNFAFKGNN YEVQWKVVEE
FVKWTFEILD DPAKLFLVDR IPDPKAGIMI VVSVSKASYL HHGDFNYERQ INIFPMINTT PKSENESNIR PEILTYDRLV NSSGPPFGKL KEQIHPHTLF INGNNYVYID
-1 50 100 150 200 250 300 350 400 450 500 550
PTQLPYDHKW KMLKPSAHLT CCYGDLLNFL DMKPGVSYVV SYQVAKGMAF YVVKGNARLP MPVDSKFYKM LIEKQISEST HDDV
EFPRNRLSFG EREALMSELK RRKRDSFICS PTKADKRRSV LASKNCIHRD VKWMAPESIF IKEGFRMLSP NHIYSNLANC
KTLGAGAFGK VLSYLGNHMN KQEDHAEAAL RIGSYIERDV LAARNILLTH NCVYTFESDV EHAPAEMYDI SPNRQKPVVD
VVEATAYGLI IVNLLGACTI YKNLLHSKES TPAIMEDDEL GRITKICDFG WSYGIFLWEL MKTCWDADPL HSVRINSVGS
KSDAAMTVAV GGPTLVITEY SCSDSTNEYM ALDLEDLLSF LARDIKNDSN FSLGSSPYPG KRPTFKQIVQ TASSSQPLLV
References
1 Vandenbark, G.R. et al. (1992) Oncogene 7, 1259-1266.
2 Bühring, H-J. et al. (1995) Leucocyte Typing V, 1882-1888.
3 Ogawa, M. et al. (1991) J. Exp. Med. 174, 63-71.
4 Yarden, Y. et al. (1987) EMBO J. 6, 3341-3351.
5 Ullrich, A. and Schlessinger, J. (1990) Cell 61, 203-212.
6 Callard, R.E. and Gearing, A.J.H. (1994) The Cytokine FactsBook. Academic Press, London.
7 Lev, S. et al. (1991) EMBO J. 10, 647-654.
8 Witte, O.N. (1990) Cell 63, 5-6.
9 Bischoff, S.C. et al. (1992) J. Exp. Med. 175, 237-244.
10 Qiu, F. et al. (1988) EMBO J. 7, 1003-1011.
600 650 700 750 800 850 900 950 954
CD120a, CD120b
Other names
TNFRI and TNFRII (tumour necrosis factor receptors I and II)
TNF-R55 and TNF-R75
Molecular weights
Polypeptide       CD120a 48 307       CD120b 46 091
SDS-PAGE
reduced           CD120a 50-60 kDa    CD120b 75-85 kDa
Carbohydrate
N-linked sites    CD120a 3            CD120b 2
O-linked          CD120a nil          CD120b probable +
Human gene locations and size
CD120a: 12p13.2
CD120b: 1p36.3-p36.2; 43 kb 1
Domains
[Figure: domain organization of CD120a and CD120b, with exon boundaries shown for CD120b]
Tissue distribution Most cell types express both of the two receptors for TNF. CD120a is constitutively expressed at a low level, whereas CD120b expression is inducible 2,3.
Structure Both receptors are members of the TNFR superfamily and contain four cysteine-rich repeats, but are <25% identical to each other, i.e. no more similar to each other than to the other members of this family of receptors 4-6. The structure of a complex between the trimeric lymphotoxin (LTα) and three molecules of the extracellular domain of CD120a has been determined by X-ray crystallography 7. The structure of the extracellular domain of CD120a has also been determined in the absence of ligand; it forms a dimer, but it is not known whether this occurs at the membrane 8. CD120a contains a 'death domain' within the cytoplasmic region 9.
Ligands and associated molecules Both receptors bind TNFα and LTα with relatively high affinity (Kd < 10⁻⁹ M) 10. Membrane-bound and soluble TNF and LTα bind to CD120a and CD120b 2. A third TNF-like protein, lymphotoxin β (LTβ), forms heterocomplexes with LTα at the cell surface; these do not bind CD120a or CD120b but bind a separate lymphotoxin-β receptor 11. Several different proteins can associate directly or indirectly with the cytoplasmic parts of CD120a and CD120b, including TNF receptor-associated factors (TRAF), TNF receptor-associated proteins (TRAP), TNF receptor-associated kinase (TRAK) and TRADD (TNFR1-associated death domain protein) 3,12. The 'death domain' within the cytoplasmic region of CD120a self-associates 10.
Function CD120a and CD120b are receptors for both TNFα and TNFβ (LTα), which are cytokines produced primarily by macrophages/monocytes, activated T cells and NK cells in response to bacterial, viral and parasitic infections 13. TNF mediates a wide variety of effects including tumour necrosis, anorexia, fever, induction of other cytokines, cell differentiation and apoptosis. The most effective cell killing occurs when membrane-bound TNF crosslinks both CD120a and CD120b 2. Crosslinking of the TNF receptors by soluble or membrane-bound TNF leads to signalling, which is mediated by a variety of associated molecules leading to different pathways 2,3,14. Signalling is mainly mediated through CD120a 3.
Database accession numbers
Human CD120a; Human CD120b; Rat CD120a; Mouse CD120a; Mouse CD120b
PIR             A34899; A35356; B36555
SWISSPROT       P19438; P20333; P22934
EMBL/GENBANK    M33294; M35857; M63122, M75862; M60468, X59238; M60469
REFERENCE       5,6; 4; 15; 16; 16
Amino acid sequence of human CD120a MGLSTVPDLL IYPSGVIGLV CPGPGQDTDC DTVCGCRKNQ FFLRENECVS LCLLSLLFIG SFSPTPGFTP DPILATALAS KEFVRRLGLS GRVLRDMDLL
LPLVLLELLV PHLGDREKRD RECESGSFTA YRHYWSENLF CSNCKKSLEC LMYRYQRWKS TLGFSPVPSS DPIPNPLQKW DHEIDRLELQ GCLEDIEEAL
G SVCPQGKYIH SENHLRHCLS QCFNCSLCLN TKLCLPQIEN KLYSIVCGKS TFTSSSTYTP EDSAHKPQSL NGRCLREAQY CGPAALPPAP
PQNNSICCTK CSKCRKEMGQ GTVHLSCQEK VKGTEDSGTT TPEKEGELEG GDCPNFAAPR DTDDPATLYA SMLATWRRRT SLLR
CHKGTYLYND VEISSCTVDR QNTVCTCHAG VLLPLVIFFG TTTKPLAPNP REVAPPYQGA VVENVPPLRW PRREATLELL
-1 50 100 150 200 250 300 350 400 434
Amino acid sequence of human CD120b MAPVAVWAAL
LPAQVAFTPY TVCDSCEDST GWYCALSKQE TSSTDICRPH RSQHTQPTPE LLIIGVVNCV APSSSSSSLE GGHGTQVNVT FSKEECAFRS
AVGLELWAAA
APEPGSTCRL YTQLWNWVPE GCRLCAPLRK QICNVVAIPG PSTAPSTSFL IMTQVKKKPL SSASALDRRA CIVNVCSSSD QLETPETLLG
HA
REYYDQTAQM CLSCGSRCSS CRPGFGVARP NASMDAVCTS LPMGPSPPAE CLQREAKVPH PTRNQPQAPG HSSQCSSQAS STEEKPLPLG
CCSKCSPGQH DQVETQACTR GTETSDVVCK TSPTRSMAPG GSTGDFALPV LPADKARGTQ VEASGAGEAR STMGDTDSSP VPDAGMKPS
AKVFCTKTSD EQNRICTCRP PCAPGTFSNT AVHLPQPVST GLIVGVTALG GPEQQHLLIT ASTGSSDSSP SESPKDEQVP
-1 50 100 150 200 250 300 350 400 439
References
1 Santee, S. and Owen-Schaub, L. (1996) J. Biol. Chem. 271, 21151-21159.
2 Grell, M. et al. (1995) Cell 83, 793-802.
3 Vandenabeele, P. et al. (1995) Trends Cell Biol. 5, 392-399.
4 Smith, C.A. et al. (1990) Science 248, 1019-1023.
5 Schall, T.J. et al. (1990) Cell 61, 361-370.
6 Loetscher, H. et al. (1990) Cell 61, 351-359.
7 Banner, D.W. et al. (1993) Cell 73, 431-445.
8 Naismith, J. et al. (1995) J. Biol. Chem. 270, 13303-13307.
9 Boldin, M. et al. (1995) J. Biol. Chem. 270, 387-391.
10 Brockhaus, M. et al. (1990) Proc. Natl Acad. Sci. USA 87, 3127-3131.
11 Crowe, P. et al. (1994) Science 264, 707-710.
12 Darnay, B. et al. (1995) J. Biol. Chem. 270, 14867-14870.
13 Tracey, K. and Cerami, A. (1993) Annu. Rev. Cell Biol. 9, 317-343.
14 Liu, Z.-g. et al. (1996) Cell 87, 565-576.
15 Himmler, A. et al. (1990) DNA Cell Biol. 9, 705-715.
16 Lewis, M. et al. (1991) Proc. Natl Acad. Sci. USA 88, 2830-2834.
CD134
OX40, MRC OX40, ACT35 antigen
Molecular weights
Polypeptide 26 602
SDS-PAGE
reduced 47-51 kDa
unreduced 51-55 kDa
Carbohydrate
N-linked sites 2
O-linked probable +
Human gene location
1p36 1,2
Domains
[Figure: domain organization of CD134]
Tissue distribution CD134 was originally named MRC OX40 after the first antibody. CD134 is specifically expressed on activated T cells, with maximum expression at 24 h after stimulation 1-5. In the rat, CD134 is found only on activated CD4+ T cells, with 32 000 sites per cell 3, and in human it is described as predominantly on CD4+ cells 2. In mouse, CD134 is expressed on both activated CD4+ and CD4+CD8~ + cells a.
Structure CD134 is a member of the TNFR superfamily, with three complete Cys-rich repeats 1,3,6. The membrane-proximal hinge region contains possible sites for O-linked glycosylation 2,3.
Ligands and associated molecules CD134 binds to OX40 ligand (OX40L) 7 and there is no evidence for another ligand 4. The affinity of monomeric CD134 for OX40L is 190 nM and that of trimeric OX40L for CD134 on the surface of activated T cells is 0.2 nM 8. Three CD134 receptors bind one OX40L trimer 8.
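The practical meaning of the two affinities can be illustrated with a simple occupancy calculation. This is a minimal sketch assuming a one-site equilibrium binding model with the free ligand concentration held constant; the function name and the 1 nM test concentration are illustrative choices, not values from the cited work.

```python
def fraction_bound(ligand_nM, kd_nM):
    """Equilibrium receptor occupancy for simple one-site binding."""
    # Assumption: ligand_nM is the free ligand concentration.
    return ligand_nM / (ligand_nM + kd_nM)

KD_MONOMERIC_NM = 190.0  # monomeric CD134 binding OX40L (from the text)
KD_TRIMERIC_NM = 0.2     # trimeric OX40L binding cell-surface CD134 (from the text)

ligand = 1.0  # nM, hypothetical concentration chosen for illustration
low = fraction_bound(ligand, KD_MONOMERIC_NM)
high = fraction_bound(ligand, KD_TRIMERIC_NM)
print(f"monomeric interaction: {low:.2%} occupied")
print(f"trimeric interaction:  {high:.2%} occupied")
```

At this concentration the monovalent interaction leaves well under 1% of sites occupied, whereas the trimeric ligand occupies most of them, which is the expected consequence of one OX40L trimer engaging three receptors.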
Function OX40L binding to CD134 on T cells co-stimulates proliferation 1. Crosslinking OX40L on activated B cells stimulates proliferation and antibody production 5. Similarly, blocking the OX40L-CD134 interaction reduced antibody production, suggesting a role in differentiation into plasma cells 5.
Database accession numbers
PIR
SWISSPROT
EMBL/GENBANK
REFERENCE
S12783
P43489 P15725 P47741
X75962 X17037 Z21674
2 3 9
Human Rat Mouse
Amino acid sequence of human CD134 MCVGARRLGR LHCVGDTYPS KPCKPCTWCN PPGHFSPGDN TQGPPARPIT GPLAILLALY
GPCAALLLLG NDRCCHECRP LRSGSERKQL QACKPWTNCT VQPTEAWPRT LLRRDQRLPP
LGLSTVTG GNGMVSRCSR CTATQDTVCR LAGKHTLQPA SQGPSTRPVE DAHKPPGGGS
SQNTVCRPCG CRAGTQPLDS SNSSDAICED VPGGRAVAAI FRTPIQEEQA
PGFYNDVVSS YKPGVDCAPC RDPPATQPQE LGLGLVLGLL DAHSTLAKI
-1 50 100 150 200 249
References
1 Gruss, H.-J. and Dower, S.K. (1995) Blood 85, 3378-3404.
2 Latza, U. et al. (1994) Eur. J. Immunol. 24, 677-683.
3 Mallett, S. et al. (1990) EMBO J. 9, 1063-1068.
4 Al-Shamkhani, A. et al. (1996) Eur. J. Immunol. 26, 1695-1699.
5 Stuber, E. and Strober, W. (1996) J. Exp. Med. 183, 979-989.
6 van Kooten, C. and Banchereau, J. (1996) Adv. Immunol. 61, 1-77.
7 Baum, P.R. et al. (1994) EMBO J. 13, 3992-4001.
8 Al-Shamkhani, A. et al. (1997) J. Biol. Chem. 272, 5275-5282.
9 Calderhead, D.M. et al. (1993) J. Immunol. 151, 5261-5271.
CD135
STK-1, FLT3, flk-2
Molecular weights
Polypeptide 110 131
SDS-PAGE
reduced (doublet) 130/160 kDa
Carbohydrate
N-linked sites 10
O-linked sites unknown
Human gene location
13q12
Domains
[Figure: domain organization of CD135]
Tissue distribution CD135 is expressed on CD34+ haematopoietic stem cells 1. It has also been detected in mouse brain 2.
Structure CD135 has five IgSF domains, a single transmembrane segment, and a cytoplasmic region containing a protein kinase domain with an internal kinase insert domain 1,2. It is a member of the type III receptor tyrosine kinase family, which includes c-kit, M-CSFR (CD115) and PDGFR (CD140) 3. A short sequence that shows no similarity to other proteins precedes the first IgSF domain. The IgSF domains contain a number of cysteines in atypical positions and so the domain assignments are tentative. Domains 1 and 2 show substantial similarity to domains 4 and 5, suggesting an internal duplication event during evolution. Cell lines expressing CD135 show protein products of 130 and 160 kDa, probably due to differential glycosylation.
Ligands and associated molecules CD135 binds FLT3 ligand 4.
Function CD135 is likely to be involved in growth and differentiation of primitive haematopoietic cells. Antisense RNA of CD135 has been shown to inhibit colony-forming activity of long-term bone marrow culture cells 1.
Database accession numbers
PIR
Human Mouse
A39931
SWISSPROT P36888 Q00342
EMBL/GENBANK U02687 M64689 X59398
REFERENCE 1 2 5
Amino acid sequence of human CD135 MPALARDAGT NQDLPVIKCV EAAAVEVDVS SMVILKMTET ENQDALVCIS TDIRCCARNE VNHGFGLTWE YYTCSSSKHP PQIRCTWTFS FTKMFTLNIR EITEGVWNRK ILLNSPGPFP QMVQVTGSSD ATAYGISKTG LLGACTLSGP PTFQSHPNSS EEDLNVLTFE ICDFGLARDI LLWEIFSLGV AFDSRKRPSF REMDLGLLSP
VPLLVVFSAM LINHKNNDSS ASITLQVLVD QAGEYLLFIQ ESVPEPIVEW LGRECTRLFT LENKALEEGN SQSALVTIVG RKSFPCEQKG RKPQVLAEAS ANRKVFGQWV FIQDNISFYA NEYFYVDFRE VSIQVAVKML IYLIFEYCCY MPGSREVQIH DLLCFAYQVA MSDSNYVVRG NPYPGIPVDA PNLTSFLGCQ QAQVEDS
IFGTIT VGKSSSYPMV APGNISCLWV SEATNYTILF VLCDSQGESC IDLNQTPQTT YFEMSTYSTN KGFINATNSS LDNGYSISKF ASQASCFSDG SSSTLNMSEA TIGVCLLFIV YEYDLKWEFP KEKADSSERE GDLLNYLRSK PDSDQISGLH KGMEFLEFKS NARLPVKWMA NFYKLIQNGF LADAEEAMYQ
SESPEDLGCA FKHSSLNCQP TVSIRNTLLY KEESPAVVKK LPQLFLKVGE RTMIRILFAF EDYEIDQYEE CNHKHQPGEY YPLPSWTWKK IKGFLVKCCA VLTLLICHKY RENLEFGKVL ALMSELKMMT REKFHRTWTE GNSFHSEDEI CVHRDLAARN PESLFEGIYT KMDQPFYATE NVDGRVSECP
LRPQSSGTVY HFDLQNRGVV TLRRPYFRKM EEKVLHELFG PLWIRCKAVH VSSVARNDTG FCFSVRFKAY IFHAENDDAQ CSDKSPNCTE YNSLGTSCET KKQFRYESQL GSGAFGKVMN QLGSHENIVN IFKEHNFSFY EYENQKRLEE VLVTHGKVVK IKSDVWSYGI EIYIIMQSCW HTYQNRRPFS
-I 50 i00 150 200 250 300 350 400 450 500 550 600 650 7OO 75O 8OO 85O 900 95O 967
References
1 Small, D. et al. (1994) Proc. Natl Acad. Sci. USA 91, 459-463.
2 Matthews, W. et al. (1991) Cell 65, 1143-1152.
3 Ullrich, A. and Schlessinger, J. (1990) Cell 61, 203-212.
4 Hannum, C. et al. (1994) Nature 368, 643-648.
5 Rosnet, O. et al. (1991) Oncogene 6, 1641-1650.
CDw137
4-1BB, ILA
Molecular weights
Polypeptide            26 079
SDS-PAGE reduced       28-30 kDa
SDS-PAGE unreduced     28-30, 55 (major form) and 110 kDa
Carbohydrate
N-linked sites         2
O-linked               probable +
Human gene location    1p36
Domains
[Domain diagram: signal sequence (S), three TNFR repeats (Tr), transmembrane segment (TM), cytoplasmic domain (CY)]
Tissue distribution
Human CDw137 mRNA is absent from resting T cells but on activation is induced in T and B cells and in monocytes 1. It is also detected in activated non-lymphoid cells 1. Expression on the surface of activated human T cells was indicated with peptide antibodies 1. Peptide antibodies showed that mouse CDw137 is expressed on activated thymocytes and on activated splenic T cells, both CD4+ and CD8+ 2,3. Mouse CDw137 mRNA was not found in activated B cells 3.
Structure CDw137 is a transmembrane protein that contains three TNFR repeats 3-6. The second and third Cys residues in the first repeat of CDw137 are not adjacent, as they are in the first repeats of other members of this superfamily, but are separated by two amino acids, as they are in a typical second repeat 3,5,6. The cytoplasmic domain of mouse CDw137 contains a sequence, CSCRCP, which resembles the CXCP motif for the Lck recognition site found in CD4 and CD8 5,7. Human CDw137 has a slightly different sequence, CSCRFP. The cytoplasmic domain contains a high proportion of consecutive acidic residues 5,7. SDS-PAGE analysis revealed disulfide-linked forms of CDw137, but it is not clear from sequence analysis how these form or what their physiological significance is 2.
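The difference between the mouse and human cytoplasmic sequences can be checked mechanically. The following is a minimal sketch, assuming the CXCP motif is read literally as Cys, any residue, Cys, Pro in one-letter code. The function name `find_cxcp` is illustrative, and the two hexapeptides are the ones quoted in the text above.

```python
import re

# C-X-C-P read literally: Cys, any residue, Cys, Pro (one-letter codes).
CXCP = re.compile(r"C.CP")

# Hexapeptides quoted in the text above.
CANDIDATES = {"mouse CDw137": "CSCRCP", "human CDw137": "CSCRFP"}


def find_cxcp(peptide):
    """Return (0-based start, matched text) for every C-X-C-P occurrence."""
    return [(m.start(), m.group()) for m in CXCP.finditer(peptide)]


for name, peptide in CANDIDATES.items():
    print(name, peptide, find_cxcp(peptide))
```

With this literal reading, the mouse hexapeptide contains one CXCP match (CRCP) while the human hexapeptide contains none, which is the difference the text points out.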
Ligands and associated molecules CDw137 binds to 4-1BBL, a type II membrane protein of the TNF superfamily 3,4. CDw137 has been reported to bind to extracellular matrix 3. Lck is co-precipitated with mouse CDw137, and Cys20 and Cys23 of Lck are critical for the association 7.
Function 4-1BBL binding to CDw137 can co-stimulate T cell growth and this can be blocked by soluble CDw137 8,9. Reciprocally, CDw137 binding to 4-1BBL can co-stimulate B cells 1.
Database accession numbers
PIR: B32393
          SWISSPROT   EMBL/GENBANK   REFERENCE
Human     Q07011      U03397         4
Mouse     P20334      J04492         10
Amino acid sequence of human CDw137
MGNSCYNIVA TLLLVLN  -1
FERTRSLQDP CSNCPAGTFC DNNRNQICSP CPPNSFSSAG GQRTCDICRQ  50
CKGVFRTRKE CSSTSNAECD CTPGFHCLGA GCSMCEQDCK QGQELTKKGC  100
KDCCFGTFND QKRGICRPWT NCSLDGKSVL VNGTKERDVV CGPSPADLSP  150
GASSVTPPAP AREPGHSPQI ISFFLALTST ALLFLLFFLT LRFSVVKRGR  200
KKLLYIFKQP FMRPVQTTQE EDGCSCRFPE EEEGGCEL  238
References
1 Schwartz, H. et al. (1995) Blood 85, 1043-1052.
2 Pollock, K.E. et al. (1993) J. Immunol. 150, 771-781.
3 Gruss, H.-J. and Dower, S.K. (1995) Blood 85, 3378-3404.
4 Alderson, M.R. et al. (1994) Eur. J. Immunol. 24, 2219-2227.
5 Armitage, R.J. (1994) Curr. Opin. Immunol. 6, 407-413.
6 van Kooten, C. and Banchereau, J. (1996) Adv. Immunol. 61, 1-77.
7 Kim, Y-J. et al. (1993) J. Immunol. 151, 1255-1262.
8 Hurtado, J.C. et al. (1995) J. Immunol. 155, 3360-3367.
9 DeBenedette, M.A. et al. (1995) J. Exp. Med. 181, 985-992.
10 Kwon, B.S. and Weissman, S.M. (1989) Proc. Natl Acad. Sci. USA 86, 1963-1967.
CD138
Syndecan-1
Molecular weights
Polypeptide                               30 507
SDS-PAGE unreduced (immature B cells)     92 kDa
SDS-PAGE unreduced (plasma cells)         85 kDa
Carbohydrate
N-linked sites                            1
O-linked sites                            probably +
Glycosaminoglycans                        5
Human gene location                       2p23
Tissue distribution CD138 is expressed on pre-B cells, immature B cells and plasma cells, but not on mature circulating B lymphocytes. It is also expressed on the basolateral surfaces of epithelial cells, embryonic mesenchymal cells, vascular smooth muscle cells, endothelium and neural cells 1-4.
Structure CD138 has an extended backbone with five glycosaminoglycan (GAG) attachment sites 5,6. Site utilization varies: in the mouse, the three distal sites (Ser20, Ser28 and Ser30) are usually modified by heparan sulfate, whereas the two sites close to the membrane (Ser190 and Ser200; equivalent to Ser189 and Ser199 in the human) are usually occupied by chondroitin sulfate 7. In addition, the structure of the GAG chains may be cell type-specific 8. CD138 is highly conserved (>90%) between the human and the mouse. The single N-glycosylation site may serve to regulate the glycanation of the distal cluster 5,9. CD138 is closely related to syndecan-2 (fibroglycan), syndecan-3 (N-syndecan) and syndecan-4 (amphiglycan or ryudocan); although the extracellular protein cores are different, they have very similar transmembrane and cytoplasmic domains 3. Their expression patterns also vary, suggesting that they have tissue-specific functions.
Ligands and associated molecules CD138 has been shown to bind many extracellular matrix proteins through its heparan sulfate side-chains, including fibronectin, collagen types I, III and V, tenascin, thrombospondin and antithrombin III 2,3. The cytoplasmic region of CD138 is known to interact with actin-rich microfilaments 10. CD138 on the cell surface may be regulated to form oligomeric complexes, since antibody-induced clustering causes CD138 to become insoluble in non-ionic detergents 11. This aggregation is mediated by the transmembrane segment 10. CD138 has been shown to bind fibroblast growth factor 2 (FGF-2) 12.
Function CD138 is an extracellular matrix receptor 2-4. It may serve as a co-receptor for FGF and related molecules. FGF-2 appears to require an association with heparan sulfate-bearing molecules (such as CD138) to transduce signals via the FGF receptor 12,13.
Database accession numbers
          PIR       SWISSPROT   EMBL/GENBANK      REFERENCE
Human     A41176    P18827      J05392, X60306    5, 6
Mouse     S06619    P18828      X15487, Z22532    9, 14
Amino acid sequence of human CD138
MRRAALWLWL CALALSL  -1
QPALPQIVAT NLPPEDQDGS GDDSDNFSGS GAGALQDITL SQQTPSTWKD  50
TQLLTAIPTS PEPTGLEATA ASTSTLPAGE GPKEGEAVVL PEVEPGLTAR  100
EQEATPRPRE TTQLPTTHQA STTTATTAQE PATSHPHRDM QPGHHETSTP  150
AGPSQADLHT PHTEDGGPSA TERAAEDGAS SQLPAAEGSG EQDFTFETSG  200
ENTAVVAVEP DRRNQSPVDQ GATGASQGLL DRKEVLGGVI AGGLVGLIFA  250
VCLVGFMLYR MKKKDEGSYS LEEPKQANGG AYQKPTKQEE FYA  293
References
1 Sanderson, R.D. et al. (1989) Cell Regul. 1, 27-35.
2 Bernfield, M. et al. (1992) Annu. Rev. Cell Biol. 8, 365-393.
3 Couchman, J.R. and Woods, A. (1996) J. Cell. Biochem. 61, 578-584.
4 David, G. (1993) FASEB J. 7, 1023-1030.
5 Mali, M. et al. (1990) J. Biol. Chem. 265, 6884-6889.
6 Lories, V. et al. (1992) J. Biol. Chem. 267, 1116-1122.
7 Kokenyesi, R. and Bernfield, M. (1994) J. Biol. Chem. 269, 12304-12309.
8 Kato, M. et al. (1994) J. Biol. Chem. 269, 18881-18890.
9 Saunders, S. et al. (1989) J. Cell Biol. 108, 1547-1556.
10 Carey, D.J. et al. (1996) J. Biol. Chem. 271, 15253-15260.
11 Miettinen, H.M. et al. (1994) J. Cell Sci. 107, 1571-1581.
12 Yayon, A. et al. (1991) Cell 64, 841-848.
13 Aviezer, D. et al. (1994) Cell 79, 1005-1013.
14 Vihinen, T. et al. (1993) J. Biol. Chem. 268, 17261-17269.
CD147
Other names
EMMPRIN (human), M6 (human), OX-47 (rat), CE9 (rat), Basigin (mouse), gp42 (mouse), Neurothelin (chicken), HT7 (chicken), 5A11 (chicken)
Molecular weights
Polypeptide            27 563
SDS-PAGE reduced       65 kDa
SDS-PAGE unreduced     54 kDa
Carbohydrate
N-linked sites         3
O-linked               unknown
Human gene location    19p13.3
Domains
[Domain diagram: signal sequence (S), two IgSF domains, transmembrane segment (TM), cytoplasmic domain (CY)]
Tissue distribution The CD147 antigen has a broad expression pattern in both haematopoietic and non-haematopoietic tissues in all four species and is upregulated upon cell activation. CD147 is expressed weakly on resting leucocytes but is strongly upregulated on activated lymphocytes and monocytes 1. CD147 is expressed on various epithelial cells with some differences between species, e.g. rat 2,3, mouse 4 and chicken 5.
Structure The CD147 antigen consists of two IgSF domains, a transmembrane sequence containing a charged residue (Glu) and a cytoplasmic domain of 40 residues. The N-terminal IgSF domain belongs to the C2-set and the membrane-proximal domain belongs to the V-set. This is unusual since most members of the IgSF that contain V- and C2-set domains have the opposite arrangement 2. In the chicken, the second domain belongs to the C2-set 5. The extent of CD147 glycosylation is tissue-specific and is responsible for a variation in the apparent molecular mass from 40 kDa to 68 kDa 3. Within the transmembrane region, the charged residue (Glu) is noteworthy because charged residues are rarely found in the transmembrane segments of proteins with a single membrane-spanning segment, except in proteins which associate with other polypeptides in the membrane 2. The transmembrane sequence of CD147 is absolutely conserved between human and chicken 1. In the mouse, cDNA clones encoding different N-termini have been found 4. This heterogeneity may be due to alternative splicing.
Function Human CD147 (EMMPRIN - extracellular matrix metalloproteinase inducer) on tumour cells is thought to bind an unknown ligand on fibroblasts, which stimulates their production of collagenase and other extracellular matrix metalloproteinases, thus enhancing tumour cell invasion and metastasis 6. CD147 knockout mice are abnormal in their response to odour and their lymphocytes show an increased mitogenic response upon mixed lymphocyte reaction 7. An adhesion role for chicken CD147 has been proposed, since the 5A11 mAb inhibits neural extensions of retinal glial cells and reduces retinal cell re-aggregation in vitro 8.
Database accession numbers
                 PIR       SWISSPROT   EMBL/GENBANK   REFERENCE
Human M6         A46506    P35613      X64364         1
Rat CD147        A45444    P26453      X54640         2
Mouse Basigin    JX0107    P18572      D00611         9
Chicken HT7      S10147    P17790      X52751         5
Amino acid sequence of human CD147
MAAALFVLLG FALLGTHGAS G  -1
AAGTVFTTVE DLGSKILLTC SLNDSATEVT GHRWLKGGVV LKEDALPGQK  50
TEFKVDSDDQ WGEYSCVFLP EPMGTANIQL HGPPRVKAVK SSEHINEGET  100
AMLVCKSESV PPVTDWAWYK ITDSEDKALM NGSESRFFVS SSQGRSELHI  150
ENLNMEADPG QYRCNGTSSK GSDQAIITLR VRSHLAALWP FLGIVAEVLV  200
LVTIIFIYEK RRKPEDVLDD DDAGSAPLKS SGQHQNDKGK NVRQRNSS  248
References
1 Kasinrerk, W. et al. (1992) J. Immunol. 149, 847-854.
2 Fossum, S. et al. (1991) Eur. J. Immunol. 21, 671-679.
3 Nehme, C.L. et al. (1995) Biochem. J. 310, 693-698.
4 Kanekura, T. et al. (1991) Cell Struct. Funct. 16, 23-30.
5 Seulberger, H. et al. (1990) EMBO J. 9, 2151-2158.
6 Biswas, C. et al. (1995) Cancer Res. 55, 434-439.
7 Igakura, T. et al. (1996) Biochem. Biophys. Res. Commun. 224, 33-36.
8 Fadool, J.M. and Linser, P.J. (1993) Dev. Dynamics 196, 252-262.
9 Altruda, F. et al. (1989) Gene 85, 445-451.
CD148
HPTPη, DEP-1
Molecular weights
Polypeptide            141 815
SDS-PAGE reduced       220-250 kDa
Carbohydrate
N-linked sites         34
O-linked               unknown
Human gene location    11p11.2
Domains
[Domain diagram: signal sequence (S), fibronectin type III domains (F3), transmembrane segment (TM), phosphatase domain (P)]
Tissue distribution Messenger RNA analysis indicates that CD148 is expressed on a variety of cell types. Protein expression has been demonstrated in haematopoietic, particularly myeloid, cell lines by immunoblotting 1-3.
Structure CD148 is a type I membrane glycoprotein with eight fibronectin type III (Fn3) domains and a single cytoplasmic phosphotyrosine phosphatase domain 4. A further two Fn3 domains have been proposed in the membrane-proximal region but the alignments are not typical of Fn3 domains 1. CD148 has a particularly high number of N-linked glycosylation sites.
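Potential N-linked sites of the kind counted in this entry are conventionally located by scanning for the sequon Asn-X-Ser/Thr, where X is any residue except Pro. The sketch below shows such a scan on a short made-up peptide. The function name and the toy sequence are illustrative only, and no attempt is made here to reproduce the figure of 34 sites, which may have been counted over a specific region of the protein.

```python
import re

# Sequon Asn-X-Ser/Thr with X != Pro. The lookahead lets overlapping
# sequons (for example in "NNSS") be reported individually.
SEQUON = re.compile(r"(?=N[^P][ST])")


def sequon_positions(seq):
    """Return 1-based positions of the Asn in each potential N-linked sequon."""
    return [m.start() + 1 for m in SEQUON.finditer(seq)]


# Hypothetical toy peptide, not taken from CD148.
toy = "ANGTNPSNNSS"
print(sequon_positions(toy))
```

In the toy peptide, NGT, NNS and NSS are reported, while NPS is skipped because of the proline. A sequon only marks a potential site; whether it is actually glycosylated has to be established experimentally.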
Function The levels of expression of CD148 increase on contact between epithelial cell lines, suggesting a role in contact inhibition of cell growth 2,5.
Database accession numbers
          PIR   SWISSPROT   EMBL/GENBANK   REFERENCE
Human                       D37781         2
Mouse                       D45212         3
Amino acid sequence of human CD148
MKPAAREARL PPRSPGLRWA LPLLLLLLRL GQILCA  -1
GGTPSPIPDP SVATVATGEN GITQISSTAE SFHKQNGTGT PQVETNTSED  50
GESSGANDSL RTPEQGSNGT DGASQKTPSS TGPSPVFDIK AVSISPTNVI  100
LTWKSNDTAA SEYKYVVKHK MENEKTITVV HQPWCNITGL RPATSYVFSI  150
TPGIGNETWG DPRVIKVITE PIPVSDLRVA LTGVRKAALS WSNGNGTASC  200
RVLLESIGSH EELTQDSRLQ VNISGLKPGV QYNINPYLLQ SNKTKGDPLG  250
TEGGLDASNT ERSRAGSPTA PVHDESLVGP VDPSSGQQSR DTEVLLVGLE  300
PGTRYNATVY SQAANGTEGQ PQAIEFRTNA IQVFDVTAVN ISATSLTLIW  350
KVSDNESSSN YTYKIHVAGE TDSSNLNVSE PRAVIPGLRS STFYNITVCP  400
VLGDIEGTPG FLQVHTPPVP VSDFRVTVVS TTEIGLAWSS HDAESFQMHI  450
TQEGAGNSRV EITTNQSIII GGLFPGTKYC FEIVPKGPNG TEGASRTVCN  500
RTVPSAVFDI HVVYVTTTEM WLDWKSPDGA SEYVYHLVIE SKHGSNHTST  550
YDKAITLQGL IPGTLYNITI SPEVDHVWGD PNSTAQYTRP SNVSNIDVST  600
NTTAATLSWQ NFDDASPTYS YCLLIEKAGN SSNATQVVTD IGITDATVTE  650
LIPGSSYTVE IFAQVGDGIK SLEPGRKSFC TDPASMASFD CEVVPKEPAL  700
VLKWTCPPGA NAGFELEVSS GAWNNATHLE SCSSENGTEY RTEVTYLNFS  750
TSYNISITTV SCGKMAAPTR NTCTTGITDP PPPDGSPNIT SVSHNSVKVK  800
FSGFEASHGP IKAYAVILTT GEAGHPSADV LKYTYDDFKK GASDTYVTYL  850
IRTEEKGRSQ SLSEVLKYEI DVGNESTTLG YLQWEAGTSG LLPACVAGFT  900
NITFHPQNKG LIDGAESYVS FSRYSDAVSL PQDPGVICGA VFGCIFGALV  950
IVTVGGFIFW RKKRKDAKNN EVSFSQIKPK KSKLIRVENF EAYFKKQQAD  1000
SNCGFAEEYE DLKLVGISQP KYAAELAENR GKNRYNNVLP YDISRVKLSV  1050
QTHSTDDYIN ANYMPGYHSK KDFIATQGPL PNTLKDFWRM VWEKNVYAII  1100
MLTKCVEQGR TKCEEYWPSK QAQDYGDITV AMTSEIVLPE WTIRDFTVKN  1150
IQTSESHPLR QFHFTSWPDH GVPDTTDLLI NFRYLVRDYM KQSPPESPIL  1200
VHCSAGVGRT GTFIAIDRLI YQIENENTVD VYGIVYDLRM HRPLMVQTED  1250
QYVFLNQCVL DIVRSQKDSK VDLIYQNTTA MTIYENLAPV TTFGKTNGYI  1300
A  1301
References
1 Honda, H. et al. (1994) Blood 84, 4186-4194.
2 Ostman, A. et al. (1994) Proc. Natl Acad. Sci. USA 91, 9680-9684.
3 Kuramochi, S. et al. (1996) FEBS Lett. 378, 7-14.
4 Fauman, E.B. and Saper, M.A. (1996) Trends Biochem. Sci. 21, 413-417.
5 Keane, M.M. et al. (1996) Cancer Res. 56, 4236-4243.
CDw150
SLAM
Molecular weights
Polypeptide         34 486
SDS-PAGE reduced    70 kDa
Carbohydrate
N-linked sites      8
O-linked            unknown
Domains
[Domain diagram: signal sequence (S), V-set IgSF domain (V), C2-set IgSF domain (C2), transmembrane segment (TM), cytoplasmic domain (CY)]
Tissue distribution Constitutively expressed on immature thymocytes, some CD45ROhigh memory T cells, and a proportion of B cells. Rapidly induced on all T and B cells following activation 1.
Structure
CDw150 contains two highly glycosylated IgSF domains and has structural features placing it within the CD2 family, which includes CD48, CD58, 2B4 and Ly-9 1,2. Two variant cDNAs have been identified which correspond to alternatively spliced mRNA transcripts 1. One encodes a soluble protein lacking the transmembrane region (207-236). The second encodes a truncated molecule in which residue 261 is followed by the sequence DTHHQTSDLF. The cytoplasmic domain contains three Tyr-containing motifs (Y/hydrophobic/X/hydrophobic) suggestive of SH2-binding sites.
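The Tyr-containing motif described above can be written as a simple pattern and scanned against the cytoplasmic region. This is a minimal sketch under two stated assumptions: the set of residues treated as hydrophobic (A, V, I, L, M, F, W) is a choice made here, not one given in the entry, and the cytoplasmic region is taken to be the 72 residues following the transmembrane region 207-236 in the sequence printed for this entry.

```python
import re

# Assumption: residues counted as "hydrophobic" for this scan.
HYDROPHOBIC = "AVILMFW"

# C-terminal 72 residues of the human CDw150 sequence printed in this entry,
# taken here as the cytoplasmic region (everything after residue 236).
CYTOPLASMIC_TAIL = (
    "GKTNHYQTTVEKKSLTIYAQVQKPGPLQKKLDSF"
    "PAQDPCTTIYVAATEPVPESVQETNSITVYASVTLPES"
)

# Y / hydrophobic / X / hydrophobic, wrapped in a lookahead so that
# overlapping candidates would also be reported.
MOTIF = re.compile(rf"(?=(Y[{HYDROPHOBIC}].[{HYDROPHOBIC}]))")


def tyr_motifs(seq):
    """Return every Y-hydrophobic-X-hydrophobic tetrapeptide found in seq."""
    return [m.group(1) for m in MOTIF.finditer(seq)]


print(tyr_motifs(CYTOPLASMIC_TAIL))
```

With this particular hydrophobic set the scan returns three tetrapeptides (YAQV, YVAA and YASV), in line with the count given in the text, and skips the tyrosine in HYQT. A broader or narrower definition of "hydrophobic" could change the result.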
Function The CDw150 mAb A12 enhances Ag-induced proliferation of CD4+ T cells and can directly stimulate proliferation of previously activated T cells even as an Fab fragment 1. This mAb increases IFNγ, but not IL-4, production by Ag-activated TH0, TH1 and TH2 clones.
Database accession numbers
          PIR   SWISSPROT   EMBL/GENBANK   REFERENCE
Human                       U33017         1
Amino acid sequence of human CDw150
MDPKGLLSLT FVLFLSLAFG ASYGTGG  -1
RMMNCPKILR QLGSKVLLPL TYERINKSMN KSIHIVVTMA KSLENSVENK  50
IVSLDPSEAG PPRYLGDRYK FYLENLTLGI RESRKEDEGW YLMTLEKNVS  100
VQRFCLQLRL YEQVSTPEIK VLNKTQENGT CTLILGCTVE KGDHVAYSWS  150
EKAGTHPLNP ANSSHLLSLT LGPQHADNIY ICTVSNPISN NSQTFSPWPG  200
CRTDPSETKP WAVYAGLLGG VIMILIMVVI LQLRRRGKTN HYQTTVEKKS  250
LTIYAQVQKP GPLQKKLDSF PAQDPCTTIY VAATEPVPES VQETNSITVY  300
ASVTLPES  308
References
1 Cocks, B.G. et al. (1995) Nature 376, 260-263.
2 Davis, S.J. and van der Merwe, P.A. (1996) Immunol. Today 17, 177-187.
CD151
PETA-3
Molecular weights
Polypeptide         28 295
SDS-PAGE reduced    27 kDa
Carbohydrate
N-linked sites      1
O-linked            nil
Tissue distribution CD151 is expressed by platelets, megakaryocytes and monocytes, but not by lymphocytes, granulocytes or haematopoietic progenitor cells. The expression of this antigen is not confined to haematopoietic cells, since epithelial and endothelial cells express CD151 1.
Structure CD151 is a member of the TM4 superfamily and is predicted to have four transmembrane regions, short cytoplasmic N- and C-termini, and two extracellular regions (reviewed in ref. 2).
Function The CD151 mAb, 14A2.H1, activates platelets in vitro 3.
Database accession numbers
          PIR   SWISSPROT   EMBL/GENBANK   REFERENCE
Human           P48509      U14650         4
Amino acid sequence of human CD151
MGEFNEKKTT CGTVCLKYLL FTYNCCFWLA GLAVMAVGIW TLALKSDYIS  50
LLASGTYLAT AYILVVAGTV VMVTGVLGCC ATFKERRNLL RLYFILLLII  100
FLLEIIAGIL AYAYYQQLNT ELKENLKDTM TKRYHQPGHE AVTSAVDQLQ  150
QEFHCCGSNN SQDWRDSEWI RSQEAGGRVV PDSCCKTVVA LCGQRDHASN  200
IYKVEGGCIT KLETFIQEHL RVIGAVGIGI ACVQVFGMIF TCCLYRSLKL  250
EHY  253
References
1 Ashman, L.K. et al. (1991) Br. J. Haematol. 79, 263-270.
2 Wright, M.D. and Tomlinson, M.G. (1994) Immunol. Today 15, 588-594.
3 Roberts, J.J. et al. (1995) Br. J. Haematol. 89, 853-860.
4 Fitter, S. et al. (1995) Blood 86, 1348-1355.
CD152
CTLA-4
Molecular weights
Polypeptide           20 343
SDS-PAGE reduced      33 kDa
SDS-PAGE unreduced    50 kDa
Carbohydrate
N-linked sites        1
O-linked              unknown
Human gene location and size
2q33; 6 kb 1,2
Domain
[Domain diagram: signal sequence (S), V-set IgSF domain (V), transmembrane segment (TM), cytoplasmic domain (CY); exon boundaries marked at KAM, IDP and MLK]
Tissue distribution CD152 is expressed on activated but not resting T lymphocytes. Expression peaks at ~24 h and then subsides by 72 h 3, but is always 30-50-fold lower than that of CD28 4. CD28 ligation is particularly effective in inducing CD152 expression 3. CD152 mRNA is frequently present in the absence of cell surface CD152 protein, suggesting post-transcriptional regulation of cell surface expression 3.
Structure CD152 is structurally similar to CD28 (31% identity), binds the same ligands (see below), and the two genes are less than 150 kb apart 5, suggesting that they share a common ancestor in evolution. CD152 has been particularly highly conserved during evolution; the cytoplasmic domains of human and mouse CD152 are identical 2. CD152 exists primarily as a disulfide-linked dimer 6 but non-disulfide-linked forms have been reported 3.
Ligands and associated molecules Like CD28, CD152 binds both CD80 and CD86. The binding site for CD80 includes a highly conserved motif (MYPPPY) in the CDR3-like loop 7. CD80 and CD86 bind CD152 with Kds of 0.4 and 2.2 µM, respectively, and dissociate rapidly (koff >0.4 s-1) 8,9. The CD152 cytoplasmic domain has been reported to interact with the SH2 domain of the tyrosine phosphatase SHP-2 (PTP1D, SYP) through the phosphotyrosine motif pYVKM 10. An association with phosphatidylinositol 3-kinase has also been reported, apparently through the same motif 11.
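The binding constants quoted above imply bounds on two further quantities. The sketch below assumes a simple 1:1 binding model, in which the dissociation constant equals koff/kon and the half-life of a complex is ln 2/koff, and reads the quoted constants as equilibrium dissociation constants. Because only a lower bound on koff is given, the results are an upper bound on the half-life and lower bounds on the association rate constants. The function names are illustrative.

```python
import math

# Values quoted in the text; units converted to mol/L and 1/s.
KD_M = {"CD80": 0.4e-6, "CD86": 2.2e-6}  # dissociation constants
KOFF_MIN = 0.4                           # lower bound on koff, s^-1


def half_life(koff):
    """Half-life (s) of a 1:1 complex that dissociates with rate constant koff."""
    return math.log(2) / koff


def kon_from(kd, koff):
    """Association rate constant (M^-1 s^-1) implied by Kd = koff / kon."""
    return koff / kd


print(f"half-life of the complex: at most {half_life(KOFF_MIN):.2f} s")
for ligand, kd in KD_M.items():
    print(f"{ligand}: kon at least {kon_from(kd, KOFF_MIN):.2e} M^-1 s^-1")
```

A half-life of under two seconds illustrates what "dissociate rapidly" means in practice for these interactions.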
Function Unlike CD28, CD152 negatively regulates T cell activation. Anti-CD152 monoclonal antibodies enhance T cell responses in vitro and in vivo 12,13 and CD152-deficient mice develop a fatal lymphoproliferative disorder 14,15. Although the mechanism of this effect is not known, CD152 may inhibit tyrosine kinase signalling through the TCR by associating with tyrosine phosphatases such as SHP-2 10.
Database accession numbers
          PIR       SWISSPROT   EMBL/GENBANK   REFERENCE
Human     S08614    P16410      M74363         2
Mouse     A29063    P09793      X05719         16
Amino acid sequence of human CD152
MACLGFQRHK AQLNLATRTW PCTLLFFLLF IPVFCKA  -1
MHVAQPAVVL ASSRGIASFV CEYASPGKAT EVRVTVLRQA DSQVTEVCAA  50
TYMMGNELTF LDDSICTGTS SGNQVNLTIQ GLRAMDTGLY ICKVELMYPP  100
PYYLGIGNGA QIYVIDPEPC PDSDFLLWIL AAVSSGLFFY SFLLTAVSLS  150
KMLKKRSPLT TGVYVKMPPT EPECEKQFQP YFIPIN  186
References
1 Dariavach, P. et al. (1988) Eur. J. Immunol. 18, 1901-1905.
2 Harper, K. et al. (1991) J. Immunol. 147, 1037-1044.
3 Lenschow, D.J. et al. (1996) Annu. Rev. Immunol. 14, 233-258.
4 Linsley, P.S. and Ledbetter, J.A. (1993) Annu. Rev. Immunol. 11, 191-211.
5 Buonavista, N. et al. (1992) Genomics 13, 856-861.
6 Linsley, P.S. et al. (1995) J. Biol. Chem. 270, 15417-15424.
7 Peach, R.J. et al. (1994) J. Exp. Med. 180, 2049-2058.
8 Greene, J.L. et al. (1996) J. Biol. Chem. 271, 26762-26771.
9 van der Merwe, P.A. et al. (1997) J. Exp. Med. 185, 393-403.
10 Marengère, L.E.M. et al. (1996) Science 272, 1170-1173.
11 Schneider, H. et al. (1995) J. Exp. Med. 181, 351-355.
12 Kearney, E.R. et al. (1995) J. Immunol. 155, 1032-1036.
13 Leach, D.R. et al. (1996) Science 271, 1734-1736.
14 Tivol, E.A. et al. (1995) Immunity 3, 541-547.
15 Waterhouse, P. et al. (1995) Science 270, 985-988.
16 Brunet, J.F. et al. (1987) Nature 328, 267-270.
CD153
CD30L
Molecular weights
Polypeptide           26 017
SDS-PAGE reduced      40 kDa
SDS-PAGE unreduced    40 kDa plus dimers and higher oligomers (with mouse CD153)
Carbohydrate
N-linked sites        5
O-linked              unknown
Human gene location   9q33 1
Domain
[Domain diagram: cytoplasmic domain (CY), transmembrane segment (TM), TNF domain (T); figure label "or oligomers"]
Tissue distribution CD153 is mainly expressed on activated peripheral blood T cells and macrophages 1,2. Messenger RNA for human CD153 is restricted to T cells and monocyte/macrophages activated in a particular way and was not found in activated B cells 1. Unlike CD30, CD153 is not expressed on Hodgkin's lymphoma-derived cell lines 2.
Structure CD153 is a member of the TNF superfamily 1-4. Like other members of this superfamily, it is a type II membrane protein with the C-terminal extracellular region showing similarity to TNF 1-4. CD153 has two cysteines that do not align with other members of the family and that may form disulfide-linked multimers 1. It is not clear whether the protein is normally a dimer, trimer or higher oligomer 1.
Ligands and associated molecules CD153 binds to CD30, a member of the TNFR superfamily.
Function The CD153-CD30 interaction co-stimulates T cell proliferation 1,2, upregulates expression of adhesion molecules and stimulates cytokine release 2. A role for the CD153-CD30 interaction in deletion of thymocytes in the thymus is suggested from studies with CD30-deficient mice 5. CD30+ T cell clones produce TH2-type cytokines. A role for the CD153-CD30 interaction in TH2-type autoimmune disease has been suggested 2,6.
Database accession numbers
          PIR       SWISSPROT   EMBL/GENBANK   REFERENCE
Human     A40710    P32971      L09753         1
Mouse     B40710    P32972      L09754         1
Amino acid sequence of human CD153
MDPGLQQALN GMAPPGDTAM HVPAGSVASH LGTTSRSYFY LTTATLALCL  50
VFTVATIMVL VVQRTDSIPN SPDNVPLKGG NCSEDLLCIL KRAPFKKSWA  100
YLQVAKHLNK TKLSWNKDGI LHGVRYQDGN LVIQFPGLYF IICQLQFLVQ  150
CPNNSVDLKL ELLINKHIKK QALVTVCESG MQTKHVYQNL SQFLLDYLQV  200
NTTISVNVDT FQYIDTSTFP LENVLSIFLY SNSD  234
References
1 Smith, C.A. et al. (1993) Cell 73, 1349-1360.
2 Gruss, H.-J. and Dower, S.K. (1995) Blood 85, 3378-3404.
3 Armitage, R.J. (1994) Curr. Opin. Immunol. 6, 407-413.
4 van Kooten, C. and Banchereau, J. (1996) Adv. Immunol. 61, 1-77.
5 Amakawa, R. et al. (1996) Cell 84, 551-562.
6 Del Prete, G. et al. (1995) Immunol. Today 16, 76-80.
CD154
CD40L
Molecular weights
Polypeptide            29 274
SDS-PAGE reduced       33 kDa
Carbohydrate
N-linked sites         1
O-linked               unknown
Human gene location    Xq26.3-27.1; 13 kb 1
Domain
[Domain diagram: cytoplasmic domain (CY), transmembrane segment (TM), TNF domain (T); exon boundaries marked at DKIE, VKDI, GDQ and VLQ]
Tissue distribution CD154 is absent from resting lymphocytes but is rapidly expressed on activation. It is present mostly on CD4+ cells but also on a small population of CD8+ cells and γδ T cells 1-3. It is also expressed on activated basophils and mast cells 1-3.
Structure CD154 is a member of the TNF superfamily 1-3. Like other members of this superfamily, it is a type II membrane protein expressed as a trimer, with the similarity to TNF being in the C-terminal extracellular region 1-5. The crystal structure of the extracellular region has confirmed the similarity to TNF, with differences in loop regions predicted to be involved in CD40 binding 4. Trimeric CD154 is predicted to bind three CD40 molecules 3.
Ligands and associated molecules CD154 binds to CD40, a member of the TNFR superfamily.
Function CD154 binding to CD40 on B cells is required for secondary immune responses and germinal centre formation 1-3. Mutations in CD154 which abolish binding to CD40 cause the immunodeficiency disease hyper-IgM syndrome, which is characterized by lack of isotype switching in Ig production and lack of germinal centres 1-3. There is evidence for a role for the CD154-CD40 interaction in negative selection and peripheral tolerance 1-3. Mice deficient in CD40 or CD154 have increased susceptibility to parasite infection, pointing to a role in cell-mediated immunity as well as the humoral response 5.
Database accession numbers
          PIR       SWISSPROT   EMBL/GENBANK   REFERENCE
Human     S28017    P29965      Z15017         6
Mouse     S21738    P27548      X65453         7
Amino acid sequence of human CD154
MIETYNQTSP RSAATGLPIS MKIFMYLLTV FLITQMIGSA LFAVYLHRRL  50
DKIEDERNLH EDFVFMKTIQ RCNTGERSLS LLNCEEIKSQ FEGFVKDIML  100
NKEETKKENS FEMQKGDQNP QIAAHVISEA SSKTTSVLQW AEKGYYTMSN  150
NLVTLENGKQ LTVKRQGLYY IYAQVTFCSN REASSQAPFI ASLCLKSPGR  200
FERILLRAAN THSSAKPCGQ QSIHLGGVFE LQPGASVFVN VTDPSQVSHG  250
TGFTSFGLLK L  261
References
1 van Kooten, C. and Banchereau, J. (1996) Adv. Immunol. 61, 1-77.
2 Gruss, H.-J. and Dower, S.K. (1995) Blood 85, 3378-3404.
3 Foy, T.M. et al. (1996) Annu. Rev. Immunol. 14, 591-617.
4 Karpusas, M. et al. (1995) Structure 3, 1031-1039.
5 Noelle, R.J. (1996) Immunity 4, 415-419.
6 Hollenbaugh, D. et al. (1992) EMBO J. 11, 4313-4321.
7 Armitage, R.J. et al. (1992) Nature 357, 80-82.
CD161
NKR-P1 family
Members 1
NKR-P1A (CD161, NKR-P1 gene 2, mNKR-P1.7 (mouse); 3.2.3, NKR-P1 (rat))
NKR-P1B (NKR-P1 gene 34 (mouse))
NKR-P1C (NK1.1, NKR-P1 gene 40, mNKR-P1.9 (mouse))
Molecular weights (human CD161)
Polypeptide            25 415
SDS-PAGE reduced       ~40-44 kDa
SDS-PAGE unreduced     ~80-85 kDa
Carbohydrate (human CD161)
N-linked sites         4
O-linked               none
Human gene location    12p12.3-p13.1
Domain
[Domain diagram: cytoplasmic domain (CY), transmembrane segment (TM), C-type lectin domain (C)]
Tissue distribution These molecules are found on most natural killer (NK) cells and a subset of T cells. In humans the only NKR-P1 molecule identified (CD161, NKR-P1A) is present on ~90% of NK cells, ~25% of T cells (both CD4 and CD8 cells) 4, and a subset of very immature thymocytes 5. In the rat, NKR-P1A is found on all NK cells at high levels and at low levels on most neutrophils and a subset of T cells 6,7. In the mouse NKR-P1C (NK1.1) is expressed on all NK cells as well as subsets of thymocytes and peripheral T cells. Transcripts of all three mouse genes have been detected in a single cell, suggesting that multiple NKR-P1 molecules can be expressed simultaneously by NK cells 1. T lineage cells expressing NK1.1 form a functionally distinct group (termed 'natural T' or NT cells) characterized by expression of a TCR with an invariant α chain (Vα14-Jα281), restriction by CD1, and rapid expression of cytokines (including IL-4, -5 and -10) upon activation 8. Several common laboratory strains of mice (e.g. BALB/c) appear not to express NKR-P1 molecules 3.
Structure The NKR-P1 family members are structurally related to several other proteins encoded within the NK gene complex, including Ly-49 proteins, CD69, CD94 and NKG2 9. The NKR-P1 locus lies 0.41 cM distal to the Ly-49 locus on mouse chromosome 6 1,2. These molecules are all type II transmembrane glycoproteins with a C-type lectin domain in their extracellular portion and are expressed as disulfide-linked homodimers on the cell surface (human NKR-P1A may have a monomeric form as well 5). Three distinct genes have been cloned in the mouse and there are preliminary reports of at least four distinct genes existing in the rat 1, but only one gene has been identified thus far in humans. The three mouse proteins show 73-87% amino acid identity with each other, 61-74% identity with rat NKR-P1A, and 46-47% identity with human NKR-P1A 4,10. Rat NKR-P1A is most closely related to mouse NKR-P1A 1, but it is not clear which mouse gene is the homologue of human NKR-P1A 4. There is evidence for alternatively spliced mouse NKR-P1A transcripts 1. The cytoplasmic domains contain conserved potential Tyr- (YXXL) and Ser- (SPXSLXXDXC) phosphorylation sites 4.
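Identity percentages such as those quoted above are computed from a pairwise alignment. The sketch below shows one common convention, assuming two pre-aligned strings of equal length with "-" marking gaps, and counting identical residues over the columns in which neither sequence has a gap. Other conventions use a different denominator, and the convention behind the figures in this entry is not stated. The aligned fragments are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over the gap-free columns of a pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Keep only columns where both sequences have a residue.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if a != "-" and b != "-"]
    if not columns:
        raise ValueError("no gap-free columns to compare")
    identical = sum(a == b for a, b in columns)
    return 100.0 * identical / len(columns)


# Hypothetical aligned fragments, for illustration only.
print(f"{percent_identity('CQGSPWHQFA', 'CQGS-WHKFA'):.1f}%")
```

In the example, nine columns are gap-free and eight of them are identical, so the function reports about 88.9%.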
Ligands and associated molecules The C-type lectin domain of rat NKR-P1A has been reported to bind to a variety of carbohydrate structures 11 but the biological relevance of these interactions is under re-evaluation 12.
Function Studies in the rat suggest that the NKR-P1A molecule functions as a specific receptor for certain NK cell targets, with ligation of NKR-P1A activating NK cell killing 1,13. T cells expressing NKR-P1A are capable of cytotoxicity following culture in IL-2 7. In the mouse, antibodies to NKR-P1C can (1) activate NK cells and (2) block NK cell killing of target cells 14. However, certain common mouse strains appear not to express NKR-P1 gene products and yet exhibit normal NK cell function 1,3. NKR-P1A has also been implicated in NK cell function in humans but studies with antibodies indicate that receptor ligation may activate or inhibit lytic function in different NK cell clones 4. NKR-P1 is not essential for the development of natural T cells since this cell population is unchanged in mice lacking NKR-P1 8.
Database accession numbers
                         PIR       SWISSPROT   EMBL/GENBANK   REFERENCE
Human NKR-P1A (CD161)    I38700                U11276         4
Mouse NKR-P1A            A46467    P27811      M77676         10
Mouse NKR-P1B            B46467    P27812      M77677         10
Mouse NKR-P1C            C46467    P27814      M77678         10
Rat NKR-P1A              A35917    P27471      M62891         15
Amino acid sequence of human NKR-P1A (CD161)
MDQQAIYAEL NLPTDSGPES SSPSSLPRDV CQGSPWHQFA LKLSCAGIIL  50
LVLVVTGLSV SVTSLIQKSS IEKCSVDIQQ SRNKTTERPG LLNCPIYWQQ  100
LREKCLLFSH TVNPWNNSLA DCSTKESSLL LIRDKDELIH TQNLIRDKAI  150
LFWIGLNFSL SEKNWKWING SFLNSNDLEI RGDAKENSCI SISQTSVYSE  200
YCSTEIRWIC QKELTPVRNK VYPDS  225
References
1 Yokoyama, W.M. and Seaman, W.E. (1993) Annu. Rev. Immunol. 11, 613-635.
2 Yokoyama, W.M. et al. (1991) J. Immunol. 147, 3229-3236.
3 Giorda, R. et al. (1992) J. Immunol. 149, 1957-1963.
4 Lanier, L.L. et al. (1994) J. Immunol. 153, 2417-2427.
5 Poggi, A. et al. (1996) Eur. J. Immunol. 26, 1266-1272.
6 Chambers, W.H. et al. (1989) J. Exp. Med. 169, 1373-1389.
7 Brissette-Storkus, C. et al. (1994) J. Immunol. 152, 388-396.
8 Bix, M. and Locksley, R.M. (1995) J. Immunol. 155, 1020-1022.
9 Gumperz, J.E. and Parham, P. (1995) Nature 378, 245-248.
10 Giorda, R. and Trucco, M. (1991) J. Immunol. 147, 1701-1708.
11 Bezouska, K. et al. (1994) Nature 372, 150-157.
12 Feizi, T. (1996) Nature 380, 559.
13 Ryan, J.C. et al. (1995) J. Exp. Med. 181, 1911-1915.
14 Kung, S.K.P. and Miller, R.G. (1995) J. Immunol. 154, 1624-1633.
15 Giorda, R. et al. (1990) Science 249, 1298-1300.
CD162
P-selectin glycoprotein ligand 1, PSGL-1
Molecular weights
Polypeptide            41 301; 38 608 (without propeptide)
SDS-PAGE reduced       ~120 kDa
SDS-PAGE unreduced     ~220 kDa
Carbohydrate
N-linked sites         3
O-linked               +++
Human gene location    12q24; ~11 kb 1
[Diagram labels: tyrosine sulfation sites; decamer repeats]
Tissue distribution CD162 is expressed on neutrophils, monocytes and most lymphocytes 2,3. On neutrophils CD162 is concentrated on the tips of microvilli 2. In the mouse CD162 mRNA is detected in most tissues 4.
Structure CD162 is a highly extended (~50 nm long 5) mucin-like type I transmembrane glycoprotein which is expressed as a disulfide-linked homodimer 2. A 23-residue propeptide is cleaved off at a consensus cleavage site (RxRR) for paired basic amino acid-converting enzyme (PACE) 5. The region immediately following the propeptide contains many negatively charged residues as well as three tyrosines, within a tyrosine-sulfation consensus site 6. This sulfotyrosine region is poorly conserved in the mouse but retains a negative charge and two tyrosine residues 4. It is followed by a heavily O-glycosylated, mucin-like segment which includes 16 repeats of the decamer A(T/M)EAQTTX(P/L)(A/T) 1,7. A human CD162 variant which lacks one of the decameric repeats has been identified in cell lines 1,7. This is not due to alternative splicing as the CD162 protein is encoded by a single exon 1. Only 10 decamer repeats are present in mouse CD162 4.
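The decamer consensus quoted above can be written as a regular expression, which makes it straightforward to scan a mucin-like segment for repeat units. The following is a minimal sketch only: the function name `find_decamers` and the short demonstration string are illustrative assumptions (the string is three consensus-matching decamers joined by hand, not the full CD162 sequence), and X is read as "any residue".

```python
import re

# Consensus from the text: A(T/M)EAQTTX(P/L)(A/T), with X read as any residue.
DECAMER = re.compile(r"A[TM]EAQTT.[PL][AT]")

def find_decamers(seq):
    """Return (start, unit) pairs for non-overlapping consensus decamers."""
    return [(m.start(), m.group()) for m in DECAMER.finditer(seq)]

# Hypothetical demonstration string: three decamers plus unrelated residues.
demo = "ATEAQTTPLA" + "AMEAQTTPPA" + "ATEAQTTQPT" + "GLEAQ"
hits = find_decamers(demo)
for start, unit in hits:
    print(start, unit)
```

Repeat units that depart from the consensus would be missed by this strict pattern, so a count obtained this way should be treated as a lower bound rather than as a replacement for the published repeat numbers.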
Ligands and associated molecules
CD162 binds to CD62P (P-selectin) 2,8, CD62E (E-selectin) 2, and CD62L (L-selectin) 9,10. CD62P binds with a relatively high affinity, apparently utilizing both carbohydrate (fucosylated, sialylated O-glycans) and protein determinants, with the latter including the sulfotyrosine-containing N-terminal region 2,6,8. CD62L binding also requires this sulfotyrosine-containing region 10. In contrast, CD62E binds with a ~50-fold lower avidity than CD62P and does not require the sulfotyrosine-containing region 2,6,8. CD62P does not bind all cells which express CD162, presumably because of differences in glycosylation and/or tyrosine sulfation 3. For example, CD62P will bind activated but not resting T cells, despite no change in the CD162 expression level 3.
Function CD162 is the major CD62P ligand on neutrophils 2 and T lymphocytes 3. This interaction mediates the tethering and rolling of these cells on endothelial cells under physiological flow 2,11,12, an important initial step in leucocyte extravasation 2. Interactions between CD162 and CD62L can mediate neutrophil-neutrophil interactions, which may amplify neutrophil extravasation 9.
Database accession numbers
         PIR      SWISSPROT   EMBL/GENBANK   REFERENCE
Human    A57468               U02297         7
                              U25956         1
Mouse                         X91144         4
Amino acid sequence of human CD162 1
MPLQLLLLLI LLGPGNSL                                      -1
QLWDTWADEA EKALGPLLAR DRRQATEYEY LDYDFLPETE PPEMLRNSTD   50
TTPLTGPGTP ESTTVEPAAR RSTGLDAGGA VTELTTELAN MGNLSTDSAA  100
MEIQTTQPAA TEAQTTQPVP TEAQTTPLAA TEAQTTRLTA TEAQTTPLAA  150
TEAQTTPPAA TEAQTTQPTG LEAQTTAPAA MEAQTTAPAA MEAQTTPPAA  200
MEAQTTQTTA MEAQTTAPEA TEAQTTQPTA TEAQTTPLAA MEALSTEPSA  250
TEALSMEPTT KRGLFIPFSV SSVTHKGIPM AASNLSVNYP VGAPDHISVK  300
QCLLAILILA LVATIFFVCT VVLAVRLSRK GHMYPVRNYS PTEMVCISSL  350
LPDGGEGPSA TANGGLSKAK SPGLTPEPRE DREGDDLTLH SFLP        394
In the printed sequence the propeptide cleaved by PACE is in bold and the decamer repeat absent in one CD162 variant is dotted underlined 1.
References
1 Veldman, G.M. et al. (1995) J. Biol. Chem. 270, 16470-16475.
2 McEver, R.P. et al. (1995) J. Biol. Chem. 270, 11025-11028.
3 Vachino, G. et al. (1995) J. Biol. Chem. 270, 21966-21974.
4 Yang, J. et al. (1996) Blood 87, 4176-4186.
5 Li, F. et al. (1996) J. Biol. Chem. 271, 6342-6348.
6 Li, F. et al. (1996) J. Biol. Chem. 271, 3255-3265.
7 Sako, D. et al. (1993) Cell 75, 1179-1186.
8 Rosen, S.D. and Bertozzi, C.R. (1996) Curr. Biol. 6, 261-264.
9 Walcheck, B. et al. (1996) J. Clin. Invest. 98, 1081-1087.
10 Spertini, O. et al. (1996) J. Cell Biol. 135, 523-531.
11 Alon, R. et al. (1994) J. Cell Biol. 127, 1485-1495.
12 Norman, K.E. et al. (1995) Blood 86, 4417-4421.
CD163
M130 antigen, GHI/61, Ber-Mac3, Ki-M8, SM4
Molecular weights
Polypeptide 116 654
variant 1 120 495
variant 2 121 027
SDS-PAGE
reduced 130 kDa
unreduced 110 kDa
Carbohydrate
N-linked sites 11
O-linked sites unknown
Domains
[Figure: domain diagram of CD163]
Tissue distribution CD163 is restricted to the monocyte/macrophage lineage. It is present on all circulating monocytes and most tissue macrophages 1, with the exceptions of tingible body macrophages, macrophages in the mantle zone and germinal centers of lymphoid follicles, interdigitating reticulum cells, and Langerhans cells 2. Multinucleated cells within inflammatory lesions in vivo do not express CD163 3.
Structure CD163 is a type I membrane protein 4. The extracellular domain consists of nine SRCR domains. Domains 5-9 show similarity to a long-range repeat of domains 2-6 and 7-11 of the WC1 antigen 5, with a characteristic insertion (31 amino acids in CD163) between domains 6 and 7. Three membrane-bound forms of CD163 identified at the cDNA level are generated by alternative splicing within the cytoplasmic region; the predominant form is the short form 4. The N-terminus has been determined by protein sequencing 4.
Function Unknown.
Database accession numbers
                  PIR      SWISSPROT   EMBL/GENBANK   REFERENCE
Human             S36077               Z22968         4
Human variant 1   S36078               Z22969         4
Human variant 2   S36079               Z22970         4
Amino acid sequence of human CD163
MVLLEDSGSA DFRRHFVNLS PFTITVVLLL SACFVTSSLG               -1
GTDKELRLVD GENKCSGRVE VKVQEEWGTV CNNGWSMEAV SVICNQLGCP    50
TAIKAPGWAN SSAGSGRIWM DHVSCRGNES ALWDCKHDGW GKHSNCTHQQ   100
DAGVTCSDGS NLEMRLTRGG NMCSGRIEIK FQGRWGTVCD DNFNIDHASV   150
ICRQLECGSA VSFSGSSNFG EGSGPIWFDD LICNGNESAL WNCKHQGWGK   200
HNCDHAEDAG VICSKGADLS LRLVDGVTEC SGRLEVRFQG EWGTICDDGW   250
DSYDAAVACK QLGCPTAVTA IGRVNASKGF GHIWLDSVSC QGHEPAVWQC   300
KHHEWGKHYC NHNEDAGVTC SDGSDLELRL RGGGSRCAGT VEVEIQRLLG   350
KVCDRGWGLK EADVVCRQLG CGSALKTSYQ VYSKIQATNT WLFLSSCNGN   400
ETSLWDCKNW QWGGLTCDHY EEAKITCSAH REPRLVGGDI PCSGRVEVKH   450
GDTWGSICDS DFSLEAASVL CRELQCGTVV SILGGAHFGE GNGQIWAEEF   500
QCEGHESHLS LCPVAPRPEG TCSHSRDVGV VCSRYTEIRL VNGKTPCEGR   550
VELKTLGAWG SLCNSHWDIE DAHVLCQQLK CGVALSTPGG ARFGKGNGQI   600
WRHMFHCTGT EQHMGDCPVT ALGASLCPSE QVASVICSGN QSQTLSSCNS   650
SSLGPTRPTI PEESAVACIE SGQLRLVNGG GRCAGRVEIY HEGSWGTICD   700
DSWDLSDAHV VCRQLGCGEA INATGSAHFG EGTGPIWLDE MKCNGKESRI   750
WQCHSHGWGQ QNCRHKEDAG VICSEFMSLR LTSEASREAC AGRLEVFYNG   800
AWGTVGKSSM SETTVGVVCR QLGCADKGKI NPASLDKAMS IPMWVDNVQC   850
PKGPDTLWQC PSSPWEKRLA SPSEETWITC DNKIRLQEGP TSCSGRVEIW   900
HGGSWGTVCD DSWDLDDAQV VCQQLGCGPA LKAFKEAEFG QGTGPIWLNE   950
VKCKGNESSL WDCPARRWGH SECGHKEDAA VNCTDISVQK TPQKATTGRS  1000
SRQSSFIAVG ILGVVLLAIF VALFFLTKKR RQRQRLAVSS RGENLVHQIQ  1050
YREMNSCLNA DDLDLMNSSG GHSEPH                            1076
v1 YREMNSCLNA DDLDLMNSSE NSHESADFSA AELISVSKFL PISGMEKEAI  1100
v2 YREMNSCLNA DDLDLMNSSG LWVLGGSIAQ GFRSVAAVEA QTFYFDKQLK  1100
v1 LSHTEKENGN L                                            1111
v2 KSKNVIGSLD AYNGQE                                       1116
Note: v1 and v2 indicate the cytoplasmic sequences of variants 1 and 2, respectively.
References
1 Radzun, H.J. et al. (1987) Blood 69, 1320-1327.
2 Pulford, K. et al. (1992) Immunology 75, 588-595.
3 Backe, E. et al. (1991) J. Clin. Pathol. 44, 936-945.
4 Law, S.K.A. et al. (1993) Eur. J. Immunol. 23, 2320-2325.
5 Wijngaard, P.L.J. et al. (1992) J. Immunol. 149, 3273-3277.
CD166
ALCAM, CD6L, BEN, SC-1, DM-GRASP, neurolin, KG-CAM
Molecular weights
Polypeptide 62 293
SDS-PAGE
reduced 105 kDa
unreduced 105 kDa
Carbohydrate
N-linked sites 8
O-linked unknown
Human gene location 3q13.1-q13.2 1
Domains
[Figure: domain diagram of CD166]
Tissue distribution CD166 has a broad tissue distribution including cortical and medullary thymic epithelial cells 2 and activated T cells 1. The avian equivalent, BEN 3, is expressed on epithelial cells of the Bursa. In the nervous system, CD166 is expressed on neurons and the distribution of BEN is well characterized on developing sensory and motor neurons 3,4.
Structure CD166 is a member of the IgSF with the Ig domains arranged as V-V-C2-C2-C2, as in MUC18 1,3,4. The IgSF domains are followed by a transmembrane region and a short (32 amino acid) cytoplasmic domain 1.
Ligands and associated molecules CD166 on thymic epithelial cells, via its N-terminal IgSF V-set domain, mediates binding to the membrane-proximal scavenger receptor domain of CD6 on thymocytes 1,5. A homophilic interaction has been described for the chicken homologue 6.
Function Through its interaction with CD166 in the thymus, CD6 may have a role in T cell development. A role in axonal guidance is postulated based on inhibitory effects with mAbs 4,7.
Database accession numbers
           PIR   SWISSPROT   EMBL/GENBANK   REFERENCE
Human                        L38608         1
Mouse                        L25274         4
Chicken          P42292      X64301         3
Amino acid sequence of human CD166
MESKGASSCR LLFCLLISAT VFRPGLG                            -1
WYTVNSAYGD TIIIPCRLDV PQNLMFGKWK YEKPDGSPVF IAFRSSTKKS   50
VQYDDVPEYK DRLNLSENYT LSISNARISD EKRFVCMLVT EDNVFEAPTI  100
VKVFKQPSKP EIVSKALFLE TEQLKKLGDC ISEDSYPDGN ITWYRNGKVL  150
HPLEGAVVII FKKEMDPVTQ LYTMTSTLEY KTTKADIQMP FTCSVTYYGP  200
SGQKTIHSEQ AVFDIYYPTE QVTIQVLPPK NAIKEGDNIT LKCLGNGNPP  250
PEEFLFYLPG QPEGIRSSNT YTLMDVRRNA TGDYKCSLID KKSMIASTAI  300
TVHYLDLSLN PSGEVTRQIG DALPVSCTIS ASRNATVVWM KDNIRLRSSP  350
SFSSLHYQDA GNYVCETALQ EVEGLKKRES LTLIVEGKPQ IKMTKKTDPS  400
GLSKTIICHV EGFPKPAIQW TITGSGSVIN QTEESPYING RYYSKIIISP  450
EENVTLTCTA ENQLERTVNS LNVSAISIPE HDEADEISDE NREKVNDQAK  500
LIVGIVVGLL LAALVAGVVY WLYMKKSKTA SKHVNKDLGN MEENKKLEEN  550
NHKTEA                                                  556
References
1 Bowen, M.A. et al. (1995) J. Exp. Med. 181, 2213-2220.
2 Patel, D.D. et al. (1995) J. Exp. Med. 181, 1563-1568.
3 Pourquie, O. et al. (1992) Proc. Natl Acad. Sci. USA 89, 5261-5265.
4 Kanki, J.P. et al. (1994) J. Neurobiol. 25, 831-845.
5 Bajorath, J. et al. (1995) Protein Sci. 4, 1644-1647.
6 Tanaka, H. et al. (1991) Neuron 7, 535-545.
7 Burns, F.R. et al. (1991) Neuron 7, 209-220.
114/A10
Molecular weights
Polypeptide 56 915
SDS-PAGE
reduced 150 kDa (mean); 100-300 kDa (cell type variation)
Carbohydrate
N-linked sites 3
O-linked probable +++
Domains
[Figure: domain diagram of 114/A10, including the SEA module and the truncated EGF domain]
Tissue distribution The 114/A10 antigen is expressed on mouse haematopoietic cells and cell lines that are responsive to IL-3, including primary erythroid, myeloid and multipotent progenitors 1.
Structure The extracellular N-terminus consists of eight highly conserved Ser/Thr-rich repeats of about 27 residues each, followed by three complete EGF domains and one truncated EGF domain. Between EGF domains 1 and 3 there is a region of 120 residues with similarity to the recently defined SEA module, which is commonly found in proteins with O-glycosylation 2. Each Ser/Thr-rich repeat contains one Ser-Gly motif which could potentially serve as an attachment site for glycosaminoglycan side-chains 1. However, digestions with glycosidases fail to demonstrate the presence of glycosaminoglycans. Instead it is suggested that the size heterogeneity of the 114/A10 molecule is the result of differential post-translational modification with sialylated O-linked carbohydrates, which are probably present in the repeats 3.
Function The tissue distribution has prompted speculation that 114/A10 might play a regulatory role in the cellular response to IL-3. Further speculation has concerned the group of Arg residues, between the first and second EGF domains, that may be a site for cleavage by proteases. This would release a soluble product, containing the Ser/Thr repeats and a single EGF domain, which might function as a cytokine 1.
Database accession numbers
         PIR      SWISSPROT   EMBL/GENBANK   REFERENCE
Mouse    A33533   P19467      J04634         1
Amino acid sequence of mouse 114/A10
MKGFLLLSLS LLLVTVG                                       -1
SSSQASSTTS SSGGTSPPTT VQSQSPGSSS QASTTTSSSG GASPPTTVQS   50
QSPGSSSQAS TTTSSSGGAS PPTTVQSQSP GSSSQASTTT SSSGGASPPT  100
TVQSQSPGSS SQASTTTSSS GGASPPTTVQ SQSPGSSSQA STTTSSSGGA  150
SPPTTVQSQS PGSSSQVSTT TSSSGGASPP TTVQSQSPGS SSQPGPTQPS  200
GGASSSTVPS GGSTGPSDLC NPNPCKGTAS CVKLHSKHFC LCLEGYYYNS  250
SLSSCVKGTT FPGDISMSVS ETANLEDENS VGYQELYNSV TDFFETTFNK  300
TDYGQTVIIK VSTAPSRSAR SAMRDATKDV SVSVVNIFGA DTKETEKSVS  350
SAIETAIKTS GNVKDYVSIN LCDHYGCVGN DSSKCQDILQ CTCKPGLDRL  400
NPQVPFCVAV TCSQPCNAEE KEQCLKMDNG VMDCVCMPGY QRANGNRKCE  450
ECPFGYSGMN CKDQFQLILT IVGTIAGALI LILLIAFIVS ARSKNKKKDG  500
EEQRLIEDDF HNLRLRQTGF SNLGADNSIF PKVRTGVPSQ TPNPYANQRS  550
MPRPDY                                                  556
References
1 Dougherty, G.J. et al. (1989) J. Biol. Chem. 264, 6509-6514.
2 Bork, P. and Patthy, L. (1995) Protein Sci. 4, 1421-1425.
3 Kay, R. et al. (1990) J. Biol. Chem. 265, 4962-4968.
2B4
Molecular weights
Polypeptide 43 091
SDS-PAGE
reduced 66 kDa
unreduced 66 kDa
Carbohydrate
N-linked sites 7
O-linked nil
Domains
[Figure: domain diagram of 2B4]
Tissue distribution Expressed in mice on all NK cells and T cells capable of non-MHC-restricted cytotoxicity 1,2. The latter include dendritic epidermal (γδ) T cells and a subset of T cells cultured in IL-2 1,2.
Structure The extracellular portion of this IgSF domain-containing glycoprotein exhibits several structural features that place it within the CD2 family of the IgSF (see CD2), which probably arose from a series of gene duplication events 3,4. In addition to CD2 and 2B4 this family includes CD48, CD58, Ly-9, and CD150. The mouse 2B4 gene is situated on mouse chromosome 1 near the gene for Ly-17 (mouse CD32) 5, placing it close to the Ly-9 and CD48 loci 5,6. The membrane-proximal IgSF domain is particularly highly glycosylated.
Function The expression of 2B4 on all NK cells as well as T cells capable of non-MHC-restricted cytotoxicity suggests that it may contribute to the latter process. Treatment with a 2B4 mAb activates these cells and augments non-MHC-restricted cytotoxicity 1,2,5,7.
Database accession numbers
         PIR   SWISSPROT   EMBL/GENBANK   REFERENCE
Mouse                      L19057         5
Amino acid sequence of mouse 2B4
MLGQAVLFTT FLLLRAHQ                                      -1
GQDCPDSSEE VVGVSGKPVQ LRPSNIQTKD VSVQWKKTEQ GSHRKIEILN   50
WYNDGPSWSN VSFSDIYGFD YGDFALSIKS AKLQDSGHYL LEITNTGGKV  100
CNKNFQLLIL DHVETPNLKA QWKPWTNGTC QLFLSCLVTK DDNVSYAFWY  150
RGSTLISNQR NSTHWENQID ASSLHTYTCN VSNRASWANH TLNFTHGCQS  200
VPSNFRFLPF GVIIVILVTL FLGAIICFCV WTKKRKQLQF SPKEPLTIYE  250
YVKDSRASRD QQGCSRASGS PSAVQEDGRG QRELDRRVSE VLEQLPQQTF  300
PGDRGTMYSM IQCKPSDSTS QEKCTVYSVV QPSRKSGSKK RNQNYSLSCT  350
VYEEVGNPWL KAHNPARLSR RELENFDVYS                        380
References
1 Garni-Wagner, B.A. et al. (1993) J. Immunol. 151, 60-70.
2 Schuhmachers, G. et al. (1995) J. Invest. Dermatol. 105, 592-596.
3 Wong, Y.W. et al. (1990) J. Exp. Med. 171, 2115-2130.
4 Davis, S.J. and van der Merwe, P.A. (1996) Immunol. Today 17, 177-187.
5 Mathew, P.A. et al. (1993) J. Immunol. 151, 5328-5337.
6 Kingsmore, S.F. et al. (1995) Immunogenetics 42, 59-62.
7 Schuhmachers, G. et al. (1995) Eur. J. Immunol. 25, 1117-1120.
CDw137L
Molecular weights
Polypeptide 26 625
SDS-PAGE
reduced 50 kDa
nonreduced 97 kDa
Carbohydrate
N-linked sites 0
O-linked probable
Human gene location 19p13.3 1
Domain
[Figure: domain diagram of 4-1BBL]
Tissue distribution 4-1BBL is expressed on activated T and B lymphocytes 2,3. On cell lines, highest expression was found on IgG+ B cell lymphomas, with an estimated site number of 3680, and on macrophage lines 3. Messenger RNA for 4-1BBL is widespread 1.
Structure 4-1BBL is a member of the TNF superfamily 3,5,6. Like other members of this superfamily, it is a type II membrane protein with the similarity to TNF being in the C-terminal extracellular region. Unusually, 4-1BBL is expressed as a disulfide-linked homodimer, not a trimer 1. Dimerization probably occurs through the cysteine in the serine/proline-rich membrane-proximal stalk 1. Mouse 4-1BBL has three potential N-linked glycosylation sites 3.
Ligands and associated molecules 4-1BBL binds to CDw137, a member of the TNFR superfamily.
Function Cells expressing 4-1BBL can co-stimulate T cell growth and this can be inhibited by soluble 4-1BBL 4,7,8. Reciprocally, cells expressing recombinant CDw137 can co-stimulate proliferation by B cells 2. These experiments suggest a role for the 4-1BBL/CDw137 interaction in T and B cell growth 2,7,8.
Database accession numbers
         PIR   SWISSPROT   EMBL/GENBANK   REFERENCE
Human          P41273      U03398         1
Mouse          P41274      L15435         4
Amino acid sequence of human 4-1BBL
MEYASDASLD PEAPWPPAPR ARACRVLPWA LVAGLLLLLL LAAACAVFLA   50
CPWAVSGARA SPGSAASPRL REGPELSPDD PAGLLDLRQG MFAQLVAQNV  100
LLIDGPLSWY SDPGLAGVSL TGGLSYKEDT KELVVAKAGV YYVFFQLELR  150
RVVAGEGSGS VSLALHLQPL RSAAGAAALA LTVDLPPASS EARNSAFGFQ  200
GRLLHLSAGQ RLGVHLHTEA RARHAWQLTQ GATVLGLFRV TPEIPAGLPS  250
PRSE                                                    254
References
1 Alderson, M.R. et al. (1994) Eur. J. Immunol. 24, 2219-2227.
2 Pollock, K. et al. (1994) Eur. J. Immunol. 24, 367-374.
3 Gruss, H.-J. and Dower, S.K. (1995) Blood 85, 3378-3404.
4 Goodwin, R.G. et al. (1993) Eur. J. Immunol. 23, 2631-2641.
5 Armitage, R.J. (1994) Curr. Opin. Immunol. 6, 407-413.
6 van Kooten, C. and Banchereau, J. (1996) Adv. Immunol. 61, 1-77.
7 DeBenedette, M.A. et al. (1995) J. Exp. Med. 181, 985-992.
8 Hurtado, J.C. et al. (1995) J. Immunol. 155, 3360-3367.
APA
Other names
Glutamyl aminopeptidase (EC 3.4.11.7)
BP-1/6C3 (mouse)
gp160
Molecular weights
Polypeptide 109 245
SDS-PAGE
reduced 160 kDa
unreduced 280-500 kDa
Carbohydrate
N-linked sites 13
O-linked unknown
[Figure: schematic of APA with the NH2 termini labelled]
Tissue distribution Aminopeptidase A (APA) is expressed in mice on early B lineage cells and on a population of thymic cortical epithelial cells, but not on mature lymphocytes (reviewed in refs 1 and 2). IL-7 has been shown to selectively induce APA expression on pre-B cells coincident with their growth 3. Expression on bone marrow stromal cell lines correlates with their ability to support growth of B lineage cells 4. The molecule is widely expressed on other tissues including vascular endothelium, kidney glomeruli and tubules, and the brush border of small intestine 1,2,5. Northern blot analysis indicates a similar expression pattern in humans 6.
Structure APA is a disulfide-linked homodimer of a type II integral membrane protein 2,6,7. The extracellular region contains a typical zinc binding motif (shown below). Human APA shows sequence similarity to human CD13 (aminopeptidase N or APN). The molecule can be phosphorylated 8.
Ligands and associated molecules APA binds to a wide range of oligopeptides 9.
Function APA is a zinc-dependent metallopeptidase that cleaves N-terminal Glu or Asp residues from peptides 6,9,10. For example, in the kidney APA removes the N-terminal Asp residues of angiotensins I and II, rendering them less potent as vasoconstrictors. In mice the BP-1 mAb blocks APA enzymatic activity and inhibits IL-7-driven proliferation of pre-B cells in the context of the bone marrow microenvironment, but does not inhibit the growth of purified pre-B cells in response to IL-7 11. This suggests that APA cleaves, and inactivates, a peptide which serves as a natural inhibitor of B cell precursor proliferation.
Zinc binding motif of APA (amino acids 391-399) and related proteins
APA                VAHELVHQW
Human CD13 (APN)   IAHELAHQW
Rat APN            IAHELAHQW
E. coli APN        IGHEYFHNW
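The segments listed above share the His-Glu-x-x-His (HExxH) signature that is characteristic of this class of zinc metallopeptidases. As a minimal sketch, a regular expression can confirm that each listed segment carries the signature. The dictionary simply restates the four nine-residue segments as read from the table, and the helper name `has_hexxh` is an illustrative assumption rather than part of any published tool.

```python
import re

# Nine-residue segments as listed in the table (APA residues 391-399).
MOTIFS = {
    "APA": "VAHELVHQW",
    "Human CD13 (APN)": "IAHELAHQW",
    "Rat APN": "IAHELAHQW",
    "E. coli APN": "IGHEYFHNW",
}

HEXXH = re.compile(r"HE..H")  # His-Glu-x-x-His

def has_hexxh(segment):
    """True if the segment contains a His-Glu-x-x-His signature."""
    return HEXXH.search(segment) is not None

results = {name: has_hexxh(seg) for name, seg in MOTIFS.items()}
for name, ok in results.items():
    print(f"{name:18s} {MOTIFS[name]}  HExxH: {ok}")
```

A pattern this short will also match by chance in unrelated proteins, so it is useful for checking an alignment such as the one above but not, on its own, for predicting zinc binding.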
Database accession numbers
PIR: A48287, S30398
         SWISSPROT   EMBL/GENBANK   REFERENCE
Human    Q07075      L14721         6
Mouse    P16406      M29961         1
Rat      P50123      S73583         12
Amino acid sequence of human aminopeptidase A
MNFAEREGSK RYCIQTKHVA ILCAVVVGVG LIVGLAVGLT RSCDSSGDGG   50
PGTAPAPSHL PSSTASPSGP PAQDQDICPA SEDESGQWKN FRLPDFVNPV  100
HYDLHVKPLL EEDTYTGTVS ISINLSAPTR YLWLHLRETR ITRLPELKRP  150
SGDQVQVRRC FEYKKQEYVV VEAEEELTPS SGDGLYLLTM EFAGWLNGSL  200
VGFYRTTYTE NGRVKSIAAT DHEPTDARKS FPCFDEPNKK ATYTISITHP  250
KEYGALSNMP VAKEESVDDK WTRTTFEKSV PMSTYLVCFA VHQFDSVKRI  300
SNSGKPLTIY VQPEQKHTAE YAANITKSVF DYFEEYFAMN YSLPKLDKIA  350
IPDFGTGAME NWGLITYRET NLLYDPKESA SSNQQRVATV VAHELVHQWF  400
GNIVTMDWWE DLWLNEGFAS FFEFLGVNHA ETDWQMRDQM LLEDVLPVQE  450
DDSLMSSHPI IVTVTTPDEI TSVFDGISYS KGSSILRMLE DWIKPENFQK  500
GCQMYLEKYQ FKNAKTSDFW AALEEASRLP VKEVMDTWTR QMGYPVLNVN  550
GVKNITQKRF LLDPRANPSQ PPSDLGYTWN IPVKWTEDNI TSSVLFNRSE  600
KEGITLNSSN PSGNAFLKIN PDHIGFYRVN YEVATWDSIA TALSLNHKTF  650
SSADRASLID DAFALARAQL LDYKVALNLT KYLKREENFL PWQRVISAVT  700
YIISMFEDDK ELYPMIEEYF QGQVKPIADS LGWNDAGDHV TKLLRSSVLG  750
FACKMGDREA LNNASSLFEQ WLNGTVSLPV NLRLLVYRYG MQNSGNEISW  800
NYTLEQYQKT SLAQEKEKLL YGLASVKNVT LLSRYLDLLK DTNLIKTQDV  850
FTVIRYISYN SYGKNMAWNW IQLNWDYLVN RYTLNNRNLG RIVTIAEPFN  900
TELQLWQMES FFAKYPQAGA GEKPREQVLE TVKNNIEWLK QHRNTIREWF  950
FNLLESG                                                 957
References
1 Wu, Q. et al. (1990) Proc. Natl Acad. Sci. USA 87, 993-997.
2 Adkins, B. et al. (1988) Immunogenetics 27, 180-186.
3 Welch, P.A. et al. (1990) Int. Immunol. 2, 697-705.
4 Whitlock, C.A. et al. (1987) Cell 48, 1009-1021.
5 Li, L. et al. (1993) Tissue Antigens 42, 488-496.
6 Li, L. et al. (1993) Genomics 17, 657-664.
7 Cooper, M.D. et al. (1986) Nature 321, 616-618.
8 Wu, Q. et al. (1989) J. Immunol. 143, 3303-3308.
9 Nanus, D.M. et al. (1993) Proc. Natl Acad. Sci. USA 90, 7069-7073.
10 Wu, Q. et al. (1991) Proc. Natl Acad. Sci. USA 88, 676-680.
11 Welch, P.A. (1995) Int. Immunol. 7, 737-746.
12 Song, L. et al. (1994) Am. J. Physiol. 267, F546-F557.
B-G
Molecular weights
Polypeptide (bg14/8) 41 576
SDS-PAGE
reduced 35-55 kDa
unreduced 90-160 kDa
Carbohydrate
N-linked sites nil
O-linked nil
Domain
[Figure: domain diagram of B-G]
Tissue distribution Chicken B-G molecules are expressed on red blood cells, thrombocytes, B and T lymphocytes, bursal B cells, thymocytes and various non-haematopoietic cells 1,2.
Structure The multigene family of chicken B-G molecules is encoded within the chicken MHC 1-3. Structural information is given for erythrocyte B-G, which is expressed mainly as a disulfide-linked dimer and contains no oligosaccharide 1,2. On other cells B-G molecules may be monomeric or multimeric and be glycosylated 1,2. The extracellular domain consists of a single IgSF V-set domain which shows polymorphism and similarity to myelin oligodendrocyte glycoprotein (MOG) 3,4. A subfamily of the IgSF containing BT, MOG, B-G, CD80 and CD86 has been suggested 4. The cytoplasmic domains contain different numbers of seven amino acid helical repeats, giving rise to size polymorphism 1,2. B-G has been identified as a polymorphic family in another species of bird, namely the pheasant 5.
Function The function of B-G molecules is unknown 2.
Database accession numbers
                  PIR   SWISSPROT   EMBL/GENBANK   REFERENCE
Chicken bg14/8                      M61860         6
Amino acid sequence of chicken B-G (bg14/8)
MAFTSGCNHP SFTLPWRTLL PYLVALHLLQ PGSA                    -1
QITVVAPSLR VTAIVGQDVV LRCHLSPCKD VRNSDIRWIQ QRSSRLVHHY   50
RNGVDLGQME EYKGRTELLR DGLSDGNLDL RITAVTSSDS GSYSCAVQDG  100
DAYAEAVVNL EVSDPFSMII LYWTVALAVI ITLLVGSFVV NVFLHRKKVA  150
QSRELKRKDA ELVEKAAALE RKDAELAEQA AQSKQRDAML DKHVLKLEEK  200
TDEVENWNSV LKKDSEEMGY GFGDLKKLAA ELEKHSEEMG TRDLKLERLA  250
AKLEHQTKEL EKQHSQFQRH FQNMYLSAGK QKKMVTKLEE HCEWMVRRNV  300
KLEIPAVKVG QQAKESEEQK SELKEHHEET GQQAKESEKQ KSELKERHEE  350
MAEQTEAVVV ETEE                                         364
References
1 Kaufman, J. et al. (1991) CRC Crit. Rev. Immunol. 11, 113-143.
2 Kaufman, J. and Salomonsen, J. (1992) Immunol. Today 13, 1-3.
3 Trowsdale, J. (1995) Immunogenetics 41, 1-17.
4 Linsley, P.S. et al. (1994) Protein Sci. 3, 1341-1343.
5 Jarvi, S.I. et al. (1996) Immunogenetics 43, 125-135.
6 Miller, M.M. et al. (1991) Proc. Natl Acad. Sci. USA 88, 4377-4381.
The complex family of chemokine receptors has been extensively detailed in two other books in the FactsBook series 1,2 and recently reviewed in ref. 3. A separate entry is included for one well-characterized example for which antibodies are available, IL-8R (CXCR1/CDw128).
Nomenclature An amended nomenclature was adopted for the chemokine receptors at the Gordon Conference on Chemotactic Cytokines (June 1996). The C-C chemokines (β chemokine family) share a group of five known receptors designated CCR1-CCR5. The C-X-C chemokines (α chemokine family) bind to the four receptors designated CXCR1-CXCR4. In addition, the Duffy antigen on erythrocytes acts as a receptor for a wide range of C-C and C-X-C chemokines.
Molecular weights Apparent Mr of 46-52 kDa.
[Figure: schematic of the CCR5 molecule]
Tissue distribution The tissue distribution of chemokine receptors has not yet been well defined, owing to the lack of specific monoclonal antibodies. However, mRNA analyses suggest that these receptors are widely distributed on leucocytes. The Duffy antigen is expressed on erythrocytes, T cells and endothelial cells in the spleen, lung, brain and kidney.
Structure The chemokine receptors are seven transmembrane-spanning G protein-linked receptors. The CCR5 molecule is depicted in the figure.
Ligands and associated molecules One feature of the chemokine receptors is that a number of receptors can bind several ligands and vice versa. The CCR1-CCR5 receptors bind to members of the C-C chemokine family, such as RANTES (CCR1, CCR3, CCR4 and CCR5), MIP-1α (CCR1, CCR4 and CCR5) and MCP-1 (CCR2) 3. The CXCR1-CXCR4 receptors bind to members of the C-X-C chemokine family, either specifically, such as IL-8 binding to CXCR1 (CDw128) and SDF-1 binding to CXCR4 (LESTR/fusin), or in a shared manner, such as CXCR3 binding to both IP10 and Mig 3. These interactions are usually of high affinity but span a range of affinities; for example, MIP-1α binds to CCR5 with Kd = 5 nM whereas RANTES binds to CCR5 with Kd = 470 nM. The chemokine receptors are coupled to heterotrimeric G proteins (believed to be of the Giα class) located at the cytoplasmic face of the cell membrane.
Function Chemokine-chemokine receptor interactions play an important role in the chemotactic recruitment of leucocytes to sites of infection and tissue damage. The signalling process involves G protein activation, leading to the generation of inositol trisphosphate with subsequent release of intracellular Ca2+, opening of Ca2+ channels and activation of protein kinase C 4. A recent development has been the finding that several chemokine receptors can function as cofactors (in concert with CD4) for HIV-1 infection of CD4+ T cells and macrophages: CXCR4 (LESTR/fusin) acts as a cofactor for fusion and entry of T cell line-tropic strains of HIV-1, whereas CCR5 and, for some HIV-1 strains, CCR3 act as cofactors for macrophage-tropic isolates of HIV-1 5-10. Dual-tropic strains of HIV-1 may utilize CXCR4, CCR5 or CCR2 as cofactors for infection of target cells 10.
Comments There are examples of viral gene products, such as US28 (cytomegalovirus) and ECRF3 (herpes saimiri virus), which have been shown to bind chemokines 3. The Duffy antigen functions as a receptor for the malarial parasite Plasmodium vivax 11.
References
1 Vaddi, K. et al. (1997) The Chemokine FactsBook. Academic Press, London.
2 Callard, R.E. and Gearing, A.J.H. (1993) The Cytokine FactsBook. Academic Press, London.
3 Premack, B.A. and Schall, T.J. (1996) Nature Med. 2, 1174-1178.
4 Bokoch, G.M. (1995) Blood 86, 1649-1660.
5 Feng, Y. et al. (1996) Science 272, 872-877.
6 Deng, H.K. et al. (1996) Nature 381, 661-666.
7 Dragic, T. et al. (1996) Nature 381, 667-673.
8 Alkhatib, G. et al. (1996) Science 272, 1955-1958.
9 Choe, H. et al. (1996) Cell 85, 1135-1148.
10 Doranz, B.J. et al. (1996) Cell 85, 1149-1158.
11 Horuk, R. et al. (1993) Science 261, 1182-1184.
c-kitL
c-kit ligand, mast/stem cell growth factor, steel factor
Molecular weights
Polypeptide 27 906 (longer membrane form); 18 458 or 18 529 (predominant soluble form)
SDS-PAGE
reduced 28-36 kDa (soluble form)
unreduced 28-36 kDa (soluble form)
Carbohydrate
N-linked sites 4 or 5
O-linked +
Human gene location 12q22
[Figure: schematic of c-kitL with the COOH termini labelled]
Tissue distribution c-kitL is expressed by bone marrow stromal cells, fibroblasts, oocytes and in a range of tissues such as the liver, lung, kidney, testis and brain 1,2.
Structure c-kitL is expressed both on the cell surface and, following proteolytic cleavage, in a soluble form 1,2. Both membrane-bound and secreted c-kitL are biologically active. The soluble form, and presumably also the membrane-bound form, exists as a dimer 3. Intramolecular disulfide bonds are formed between Cys4-89 and 43-138 (mature numbering) and the molecule contains considerable secondary structure including α helices and β sheets 1,3. Together, N- and O-linked glycosylation account for approximately 30% of the total weight of c-kitL 3. Alternative splicing gives rise to a second form of the molecule that lacks 28 amino acids (150-177; mature protein numbering), including one of the five potential N-linked glycosylation sites and the protease recognition site; this form therefore yields soluble c-kitL less efficiently and is predominantly membrane bound 1.
Ligands and associated molecules c-kitL binds to CD117 (c-kit), a cell surface receptor with Tyr kinase activity (see CD117).
Function c-kitL is an early-acting haematopoietic growth factor, critical to the development of several distinct lineages from haematopoietic progenitors 1,2. c-kitL also stimulates the proliferation of mast cells, as well as myeloid and lymphoid progenitors in bone marrow cultures, and functions as a survival factor for primordial germ cells 1,4.
Comment c-kitL is analogous to PDGF and M-CSF, insofar as each growth factor is dimeric and their receptors (CD117 (c-kit), PDGFRA/PDGFRB (CD140a/CD140b) and CD115 (c-fms)) constitute subclass III within the family of growth factor receptors with Tyr kinase activity 5. Mutations at the Sl locus, which encodes mouse c-kitL, lead to alterations of coat colour (white coats), anaemia, defective mast cell development and defective gonad development 6,7.
Database accession numbers
         PIR      SWISSPROT   EMBL/GENBANK   REFERENCE
Human    A35974   P21583      M59964         8
Mouse    A35972   P20826      M38436         9
Rat      B35974   P21581      M59966         8
Amino acid sequence of human c-kitL
MKKTQTWILT CIYLQLLLFN PLVKT                              -1
EGICRNRVTN NVKDVTKLVA NLPKDYMITL KYVPGMDVLP SHCWISEMVV   50
QLSDSLTDLL DKFSNISEGL SNYSIIDKLV NIVDDLVECV KENSSKDLKK  100
SFKSPEPRLF TPEEFFRIFN RSIDAFKDFV VASETSDCVV SSTLSPEKDS  150
RVSVTKPFML PPVAASSLRN DSSSSNRKAK NPPGDSSLHW AAMALPALFS  200
LIIGFAFGAL YWKKRQPSLT RAVENIQINE EDNEISMLQE KEREFQEV    248
The longer membrane form of c-kitL (shown above) is cleaved following Ala164 or Ala165 (mature numbering) to yield soluble c-kitL.
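The numbering used in this entry can be cross-checked with a few lines of code. The sketch below is illustrative only: it restates residues 151-200 of the mature sequence given above (the variable and function names are assumptions), confirms that positions 164 and 165 are both alanine, and checks that both cleavage positions fall inside the 28-residue segment (150-177) absent from the alternatively spliced form. The alanine residue mass of 71.08 Da is a standard average value and is not taken from the text.

```python
# Residues 151-200 of mature human c-kitL, as given in the sequence above.
ROW_START = 151
ROW_151_200 = "RVSVTKPFMLPPVAASSLRNDSSSSNRKAKNPPGDSSLHWAAMALPALFS"

def residue(pos):
    """Residue at a mature-numbering position within 151-200."""
    return ROW_151_200[pos - ROW_START]

cleavage_sites = (164, 165)      # cleavage follows Ala164 or Ala165
spliced_out = range(150, 178)    # residues 150-177 inclusive

sites_are_ala = all(residue(p) == "A" for p in cleavage_sites)
sites_in_deleted_segment = all(p in spliced_out for p in cleavage_sites)

# The two soluble-form polypeptide weights differ by about one alanine residue.
ALA_RESIDUE_MASS = 71.08         # standard average residue mass (assumption)
mass_gap = 18529 - 18458
gap_matches_one_ala = abs(mass_gap - ALA_RESIDUE_MASS) < 0.5

print(len(spliced_out), sites_are_ala, sites_in_deleted_segment, gap_matches_one_ala)
```

Both cleavage positions lying within the deleted segment is consistent with the statement that the shorter splice form is released less efficiently, and the 71 Da difference between the two soluble-form weights is what a single additional alanine would contribute.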
References
1 Galli, S.J. et al. (1994) Adv. Immunol. 55, 1-96.
2 Callard, R.E. and Gearing, A.J.H. (1994) The Cytokine FactsBook. Academic Press, London.
3 Arakawa, T. et al. (1991) J. Biol. Chem. 266, 18942-18948.
4 Godin, I. et al. (1991) Nature 352, 807-809.
5 Ullrich, A. and Schlessinger, J. (1990) Cell 61, 203-212.
6 Witte, O.N. (1990) Cell 63, 5-6.
7 Copeland, N.G. et al. (1990) Cell 63, 175-183.
8 Martin, F.H. et al. (1990) Cell 63, 203-211.
9 Anderson, D.M. et al. (1990) Cell 63, 235-243.
CMRF35
Molecular weight
Polypeptide 22 606
SDS-PAGE unknown
Carbohydrate
N-linked sites 2
O-linked probable +
Note No protein data has thus far been obtained.
Domain
[Figure: domain diagram of CMRF35]
Tissue distribution The CMRF35 antigen is expressed on human granulocytes, monocytes, neutrophils, NK cells, 25% of circulating T cells and 15% of circulating B cells 1 Structure The extracellular domain of the CMRF35 antigen contains a single IgSF V-set domain closely related to the first, third and fourth IgSF domains of the poly Ig receptor on epithelial cells 2. This suggests that the two molecules evolved from a common precursor 2. A Pro- and Ser-rich sequence that is likely to be O-glycosylated forms a hinge-like region linking the V-set domain to a transmembrane sequence that contains a charged residue (Glu) 2. The presence of this charged residue suggests that the CMRF35 antigen may associate with other molecules in the membrane.
I
Function Mitogenic stimulation of peripheral blood lymphocytes reduces both mRNA levels and cell surface expression of the CMRF35 antigen 1,2.

Amino acid sequence of human CMRF35
MTAR~WASWR SSALLLLLVP CYFPLSHPMT VmPVCGSLS VQCRYEK~HR TLNKFWCRPP QILRCDKIVE TKGSAGKRNG RVSIRDSPAN LSFTVTLENL TEEDAGTYWC GVDTPWLRDF HDPIWWVS W P A C T T T A S S P Q S S M G T S C P P T K L P W T W PSVTRKDSPE PSP~PCSLFS N W F L L L V L L E ~ P L L L S M L C m L W W R P Q R SSRSRQNWPK GENQ
-1 50 100 150 200 204
References
1 Daish, A. et al. (1993) Immunology 79, 55-63.
2 Jackson, D.G. et al. (1992) Eur. J. Immunol. 22, 1157-1163.
DEC-205

Molecular weights
Polypeptide 194 529
SDS-PAGE reduced 205 kDa

Carbohydrate
N-linked sites 17
O-linked nil

Domains (diagram not reproduced): Cys-rich domain, fibronectin type II domain, ten C-type lectin domains, transmembrane and cytoplasmic regions.
Tissue distribution Mouse DEC-205 is recognized by the monoclonal antibody NLDC-145 1. Using this antibody, the antigen was originally reported to be expressed on dendritic cells in the spleen, lymph nodes, Peyer's patches, thymic medulla and skin (Langerhans cells) and on epithelial cells in the thymic cortex and intestinal villi 1. A rabbit polyclonal antibody raised against purified DEC-205 has subsequently detected low levels of expression on mature B cells, granulocytes, thioglycollate-elicited macrophages, thymocytes, mature T cells from spleen and lymph node, bone marrow stromal cells, the epithelia of the lung and on brain capillaries 2-4.
Structure DEC-205 is a type I membrane glycoprotein comprising an N-terminal Cys-rich domain that shows sequence similarity to the B subunit of the plant protein ricin D, followed by a fibronectin type II domain, ten C-type lectin domains, and transmembrane and cytoplasmic regions 5. The molecule shows overall sequence similarity to the macrophage mannose receptor, the M-type receptor for secretory phospholipases A2 and a fourth, widely expressed, member of this C-type lectin family 6. The N-terminus of DEC-205 has been confirmed by protein sequencing 5.
Ligands and associated molecules Unknown.
Function Electron microscopy studies using gold-labelled monoclonal and polyclonal anti-DEC-205 antibodies have shown that dendritic cells rapidly internalize the antigen by means of coated pits and vesicles 5. The molecule is delivered to a multivesicular compartment that resembles MHC Class II-containing vesicles. Dendritic cells treated with rabbit anti-DEC-205 antibodies were also found to induce activation of rabbit IgG-specific T cell clones more efficiently than cells treated with non-immune rabbit IgG 5. Together these data suggest that DEC-205 contributes to antigen presentation.

Database accession numbers
        PIR   SWISSPROT   EMBL/GENBANK   REFERENCE
Mouse                     U19271         5
Amino acid sequence of mouse DEC-205
MRTGRVTPGL AAGLLLLLLR SESSGNDPFT IVHENTGKCI LESQKCLGLD ITKATDNLRM GYAVANTNTS DVWKKGGSEE TWYHDCIHDE DHSGPWCATT SCYQFNNQEI LSWKEAYVSC LGLNQLYSAR GWEWSDFRPL QSVSCESQQP YVCKKPLNNT ESSSWDAAHL KCKAFGADLI TNSPALFQWS DGTEVTLTYW EKKLRYVCKK KGEITKDAES CNLTITSRFE QEFLNYMMKN KQAVTFSNWN FLEPASPGGC EPQEPEEAAP KPDDPCPEGW CQALGAHLPS FSRREEIKDF WSDRTPVSAV MMEPEFQQDF KPFACDAKLEWVCQIPKGST DPHLNYEEAV LYCASNHSFL SENPIDRYFL GSRRRLWHHF PFICERYNVS SLEKYSPDPA SGICHSYGGT LPSVLSRGEQ DNRELTYSNF HPLLVGRRLS FTSCSERHSL SLCQKYSETE ALKECMKEKM RLVSITDPYQ
SFGLVEP QPLSDWVVAQ FSCDSTVMLW NLCAQPYHEI LSYEYDQKWG QNQGADLLSI KFLNWDPGTP LELPDVWTYT SMHSLADVEV NENEPSVPFN DKLCPPDEGW YDKSLRKYFW VAMSTGKTLG HTFPSSLSCY VHLLKDQFSG DIRDCAAIKV PQMPDWYNPE ATITSFTGLK PMTFGDECLH AKVQCTEKWI DFIISLLPEM IPTNFFDDES DGQPWENTSK QAFLAVQATL
DCSGTNNMLW WKCEHHSLYT YTRDGNSYGR ICLLPESGCE HSAAELAYIT VAPVIGGSSC DTHCHVGWLP VVTKLHNGDV KTPNCVSYLG KRHGETCYKI TGLRDPDSRG KWEVKNCRSF KVFHIERIVR QRWLWIGLNK LDVPWRRVWH RTGIHGPPVI AIKNKLANIS MSAKTWLVDL PFQNKCFLKV EASLWIGLRW HFHCALILNL TVKYLNNLYK RNSSFWIGLS
KWVSQHRLFH AAQYRLALKD PCEFPFLIGE GNWEKNEQIG GKEDIARLVW ARMDTESGLW NNGFCYLLAN KKEIWTGLKN KLGQWKVQSC YEKEAPFGTN EYSWAVAQGV RALSICKKVS KRNWEEAERF RSPDLQGSWQ LYEDKDYAYW IEGSEYWFVA GEEQKWWVKT SKRADCNAKL NSGPVTFSQA TAYERINRWT KKSPLTGTWN IISKPLTWHG SQDDELNFGW
-1 50 100 150 200 250 300 350 400 450 500 550 600 650 700 750 800 850 900 950 1000 1050 1100 1150
SDGKRLQFSN TEEEVRALDT CEKLHPKAHS DKTALSYTHW ISACKIEMVD GELASVHNPN VPWQSLQSPG SKCPVAKRDG DENENKFVSR KTKDGDGKCS LCLLGLISLA
WAGSNEQLDD AKCPSPVQST LSIRNEEENT RTGRPTVKNG YEDKHNGTLP GKLFLEDIVN DCVVLYPKGI PQWVQYGGHC LMRENYNITM ILIASNETWR IWFLLQRSHI
CVILDTDGFW PWIPFQNSCY FVVEQLLYFN KFLAGLSTDG QFIPYKDGVY RDGFPLWVGL WRREKCLSVK YASDQVLHSF RVWLGLSQHS KVHCSRGYAR RWTGFSSVRY
KTADCDDNQP NFMITNNRHK YIASWVMLGI FWDIQSFNVI SVIQKKVTWY SSHDGSESSF DGAICYKPTK SEAKQVCQEL LDQSWSWLDG AVCKIPLSPD EHGTNEDEVM
GAICYYPGNE TVTPEEVQST TYENNSLMWF EETLHFYQHS EALNACSQSG EWSDGRAFDY DKKLIFHVKS DHSATVVTIA LDVTFVKWEN YTGIAILFAV LPSFHD
References
1 Kraal, G. et al. (1986) J. Exp. Med. 163, 981-997.
2 Swiggard, W.J. (1995) Cell. Immunol. 165, 302-311.
3 Inaba, K. et al. (1995) Cell. Immunol. 163, 148-156.
4 Witmer-Pack, M.D. et al. (1995) Cell. Immunol. 163, 157-162.
5 Jiang, W. et al. (1995) Nature 375, 151-155.
6 Wu, K. et al. (1996) J. Biol. Chem. 271, 21323-21330.
1200 1250 1300 1350 1400 1450 1500 1550 1600 1650 1696
DNAM-1
DNAX accessory molecule 1
Molecular weights
Polypeptide 36 497
SDS-PAGE
reduced 65 kDa
unreduced 65 kDa

Carbohydrate
N-linked sites 8
O-linked nil

Domains (diagram not reproduced): two IgSF V-set domains.
Human gene location 18q22.3
Tissue distribution DNAM-1 is expressed on the majority of T cells, NK cells, monocytes, some B cells and some thymocytes. It is absent from erythrocytes and granulocytes 1.
Structure The extracellular region consists of two IgSF domains which are unusually both V-set. The N-terminus has been established by protein sequencing 1.
Ligands and associated molecules COS cells transfected with DNAM-1 bind to a variety of haematopoietic and non-haematopoietic cells indicating the presence of a widely distributed ligand 1.
Function DNAM-1 mAb blocks NK and cytotoxic T cell killing 1. Crosslinking of DNAM-1 causes phosphorylation of tyrosine residues in DNAM-1 indicating a possible role in signalling 1.
Database accession number
        PIR   SWISSPROT   EMBL/GENBANK   REFERENCE
Human                     U56102         1
Amino acid sequence of human DNAM-1
MDYPTLLLAL EEVLWHTSVP MVIRKPYAER QKVIQVVQSD QPRQIDLLTY LYRCYLQASA TIIVIFLNRR EDIYVNYPTF
LHVYRALC FAENMSLECV VYFLNSTMAS SFEAAVPSNS CNLVHGRNFT GENETFVMRL RRRERRDLFT SRRPKTRV
YPSMGILTQV NNMTLFFRNA HIVSEPGKNV SKFPRQIVSN TVAEGKTDNQ ESWDTQKAPN
EWFKIGTQQD SEDDVGYYSC TLTCQPQMTW CSHGRWSVIV YTLFVAGGTV NYRSPISTGQ
SIAIFSPTHG SLYTYPQGTW PVQAVRWEKI IPDVTVSDSG LLLLFVISIT PTNQSMDDTR
-1 50 100 150 200 250 300 318
Reference
1 Shibuya, A. (1996) Immunity 4, 573-581.
ESL-1
E-selectin ligand 1, MG-160, cysteine-rich FGF receptor
Molecular weights
Polypeptide 131 023
SDS-PAGE
reduced 150 kDa
unreduced 130 kDa

Carbohydrate
N-linked sites 5
O-linked unknown

Human gene location 16q22-q23 1

Domains (diagram not reproduced): Gln-rich region followed by cysteine-rich repeats, transmembrane and cytoplasmic regions.
Tissue distribution ESL-1 is expressed in virtually all cells 1-3. However, the glycoform which binds to E-selectin has only been detected in cells of the myeloid lineage 3.
Structure ESL-1 is a large cysteine-rich type I membrane glycoprotein with a short cytoplasmic domain and five potential N-linked glycosylation sites in the extracellular region. A glutamine-rich N-terminal segment (~70 amino acids) is followed by 16 repeats of a novel cysteine-rich motif, 50-60 amino acids long 2. Mouse ESL-1 is likely to be the homologue (98% identical) of the rat Golgi glycoprotein MG-160 2. ESL-1 may also be a splice variant of the cysteine-rich fibroblast growth factor (FGF) receptor, identified in the chicken 4 and the human (Genbank U28811). Apart from the glutamine-rich N-terminal segment, ESL-1 is >98% identical to the human cysteine-rich FGF receptor 3.
Ligands and associated proteins E-selectin (CD62E) binds to a glycoform of ESL-1 expressed on myeloid cells (but not other cells) 3. Binding requires N-linked carbohydrates on ESL-1 containing both sialic acid and fucose 3. Proteins (MG-160, cysteine-rich FGF receptor) which may be species homologues of ESL-1 bind to fibroblast growth factor 1,4.
Function Studies in vitro suggest that ESL-1 is a major E-selectin ligand on myeloid cells 3 but the functional significance of this interaction has not been tested in vivo. It is not known what role, if any, the protein has in Golgi function.

Database accession numbers
             PIR   SWISSPROT   EMBL/GENBANK   REFERENCE
Mouse                          X84037         3
Rat MG-160                     U08136         2

Amino acid sequence of mouse ESL-1
MAVCGRVRGM QNGHGQGQGP QQQQQSLFAA SNNLAVLECL STISEIKECA RLICGFMDDC EPKIQVSELC GRVYKCLFNH KYRCNVENLP EDFSLSPEII QALQTLIQET HLYTEKMVED ETSELMPPGA DPALQDKCLI ESEDIQIEAL EKCAIGVTHF TTVRNDTLQE NYCSTVQYGN RVCKQMIKRF YRLNPVLRKA SDCEDQIRII ECLKVNLLKI ITPGRGRQMS SDLAMQVMTS
FRLSAALPLL GTNFGPFPGQ GGLPARRGGA QDVREPENEI EEPVGKGYMV KNDINLLKCG KKAILRVAEL KFEESMSEKC RSREARLSYL LSCRGEIEHH DPGADYRIDR CEHRLLELQY VFSCLYRHAY DLGKWCSEKT LMRACEPIIQ QLVQMKDFRF AKEHRVSLKC AQIIECLKEN CPEADSKTML CKADIPKFCH IQESALDYRL KTELCKKEVL CLMEALEDKR PSKNYILSVI
LLAAAGA GGGGSPAGQQ GPGGTGGGWK SSDCNHLLWN SCLVDHRGNI SIRLGEKDAH SSDDFHLDRH REALTTRQKL LMCLESAVHR CSGLHRKGRT ALNEACESVI FISRDWKLDP RTEEQGRRLS ETGQELECLQ NFCHDVADNQ SYKFKMACKE RKQLRVEELE KKQLSTRCHQ QCLKQNKNSE GILTKAKDDS DPQLQLHCSD NMLKESKADI VRLQPECKKR SGSICILFLI
PPQQPQLSQQ LAEEESCRED YKLNLTTDPK TEYQCHQYIT SQGEVVSCLE LYFACRDDRE IAQDYKVSYS GRQVSSECQG LHCLMKVVRG QTACKHIRSG VLYRKCQGDA RECRAEVQRI DHLDDLAVEC IDSGDLMECL DVLKLCPNIK MTEDIRLEPD KVFKLQETEM LMDPKCKQMI ELEGQVISCL EIANLCAEEA FVDPVLHTAC LNDRIEMWSY GLMCGRITKR
QQQPPPQQQQ VTRVCPKHTW FESVAREVCK KMTAIIFSDY KGLVKEAEEK RFCENTQAGE LAKSCKSDLK EMLDYRRMLM EKGNLGMNCQ DPMILSCLME SRLCHTHGWN LHQRAMDVKL RDIVGNLTEL IQNKHQKDMN KKVDVVICLS LYEACKSDIK MDPELDYTLM TKRQITQNTD KLRYADQRLS AAQEQTGQVE ALDIKHHCAA AAKVAPADGF VTRELKDR
-1 50 100 150 200 250 300 350 400 450 500 550 600 650 700 750 800 850 900 950 1000 1050 1100 1148
References
1 Mourelatos, Z. et al. (1995) Genomics 28, 354-355.
2 Gonatas, J.O. et al. (1995) J. Cell Sci. 108, 457-467.
3 Steegmaier, M. et al. (1995) Nature 373, 615-620.
4 Burrus, L.W. et al. (1992) Mol. Cell. Biol. 12, 5600-5609.
F4/80
EGF module-containing mucin-like hormone receptor 1 (EMR1) (human)

Molecular weights
Polypeptide 98 892
SDS-PAGE
reduced 160 kDa
unreduced 160 kDa

Carbohydrate
N-linked sites 10
O-linked +
Glycosaminoglycan

Human gene location (EMR1) 19p13.3
Tissue distribution The F4/80 monoclonal antibody has been used to detect mouse macrophage populations in a wide range of tissues 1. F4/80+ macrophages are located in the splenic red pulp, liver (Kupffer cells), medullary region of lymph nodes, brain (microglia), gut (lamina propria), bone marrow stroma, skin (Langerhans cells) and peritoneum 2. Blood monocytes, thymic macrophages and macrophages in the lung express lower levels of the F4/80 molecule. The sequence of human EMR1 is available; however, antibodies specific for EMR1 have not yet been developed. Limited RT-PCR analysis suggests that the human molecule is highly expressed on peripheral blood monocytes and a number of haematopoietic cell lines 3.
Structure F4/80 is a member of the recently described EGF-TM7 family which also includes the CD97 molecule 4. The deduced amino acid sequence of F4/80 comprises seven extracellular EGF domains, followed by a region of 277 amino acids and seven transmembrane-spanning regions with sequence similarities to the G protein-coupled transmembrane receptors 5. F4/80 has also been shown to be extensively N-glycosylated and moderately O-glycosylated, and to contain a chondroitin sulfate glycosaminoglycan attachment 6.
Ligands and associated molecules Unknown.

Function Unknown.

Comments The mouse F4/80 and human EMR1 polypeptide sequences share 68% overall amino acid identity, although mouse F4/80 contains seven extracellular EGF domains compared to the six in EMR1 4,5. F4/80 and EMR1 are either species homologues or are members of a closely related group of proteins.

Database accession numbers
             PIR   SWISSPROT   EMBL/GENBANK   REFERENCE
Human EMR1                     X81479         3
Mouse                          X93328         5
Amino acid sequence of mouse F4/80
MWGFWLLLFW QTLGGVNECQ ECQDVNECLQ NFLCADVDEC VTRDVCPEHA RNSTLCGPTF DTCPLNSSCT LQCGLNSVCT ILQSEQIQQC ATSVSLVLEQ MIQTEYLDIE VAFVSFAHME IYTLQHIQPK RMANLAIIMA HNTYMHLHLC MLVEAVMLFL PRGYGMHNRC VSSEVSKLKD NSLQGAFIFL SKMG
GFSGMYRWGM DTTTCPAYAT SDSPCGPNSV LTIGICPKYS TCHNTLGSYY ICINTLGSYS NTIGSYFCTC NVPGSYICGC QAVQGRDLGY ATTWFELSKE SKVINEECKE SVLNERFFED QKSERPICVS SGELTMEFSL VCLFLAKILF MVRNLKVVNY WLNTETGFIW TRLLTFKAIA IHCLLNRQVR
TTLPTLG CTDTTDSYYC CTNILGRAKC NCSNSVGSYS CTCNSGLESS CSCPAGFSLP HPGFASSNGQ LPDFQMDPEG ASFCTLVNAT ETSTLGTILL NESINLAARG GQSFRKLRMN WNTDVEDGRW YIISHVGTVI LTGIDKTDNQ FSSRNIKMLH SFLGPVCMII QIFILGCSWV DEYKKLLTRK
TCKRGFLSSN SCLRGFSSST CTCQPGFVLN GGGPMFQGLD TFQILGHPAD LNFKDLEVTC SQGYGNFNCK FTILDNTCEN ETVESTMLAA DKMNVGCFII SRVVGGTVTG TPSGCEIVEA SLVCLALAIA TACAIIAGFL LCAFGYGLPV TINSVLLAWT LGIFQIGPLA TDLSSHSQTS
GQTNFQGPGV GKDWILGSLD GSICEDEDEC ESCEDVDECS GNCTDIDECD EDIDECTQDP RILFKCKEDL KSAPVSLQSA LLIPSGNASQ KESVSTGAPG EKKEDFSKPI SETHTVCSCN TFLLCRAVQN HYLFLACFFW LVVIISASVQ LWVLRQKLCS SIMAYLFTII GILLSSMPST
References
1 Austyn, J.M. and Gordon, S. (1981) Eur. J. Immunol. 11, 805-815.
2 Gordon, S. et al. (1992) Curr. Top. Microbiol. Immunol. 181, 1-37.
3 Baud, V. et al. (1995) Genomics 26, 334-344.
4 McKnight, A.J. and Gordon, S. (1996) Immunol. Today 17, 283-287.
5 McKnight, A.J. et al. (1996) J. Biol. Chem. 271, 486-489.
6 Haidl, I.D. and Jefferies, W.A. (1996) Eur. J. Immunol. 26, 1139-1146.
-1 50 100 150 200 250 300 350 400 450 500 550 600 650 700 750 800 850 900 904
FasL

Molecular weights
Polypeptide 31 485
SDS-PAGE reduced 38-42 kDa

Carbohydrate
N-linked sites 3
O-linked probable ++

Human gene location and size 1q23; 8 kb 1

Domain (diagram with exon boundaries not reproduced): cytoplasmic and transmembrane regions followed by the TNF-like extracellular region.
Tissue distribution In contrast to Fas, FasL is mainly restricted to activated T lymphocytes and is induced rapidly 2. FasL mRNA has been identified in rodent testis and eye and expression confirmed in the latter with a mAb 2.
Structure FasL is a member of the TNF superfamily 2-5 and its amino acid sequence is highly conserved across species 1. Like other members of this superfamily, it is a type II membrane protein expressed as a trimer, with the similarity to TNF being in the C-terminal extracellular region 4,5. The cytoplasmic domain is proline rich and contains a consensus SH3 binding motif 1. The TNF-like extracellular region can be found in soluble form 3. Other members of the TNF superfamily are clustered on chromosome 1 1.
Ligands and associated molecules The extracellular region of FasL binds to CD95 (Fas), a member of the TNFR superfamily.
Function FasL binding to CD95 induces apoptosis in activated mature lymphocytes and thus has a role in maintaining peripheral tolerance, but it does not appear critical in development 2,7. Autoimmune disease in the gld mouse is due to mutations in FasL 2,7. FasL on cytotoxic T cells can induce cytolysis of CD95-expressing target cells 7,8.
Database accession numbers
        PIR   SWISSPROT   EMBL/GENBANK   REFERENCE
Human         P48023      U11821         1
Mouse         P41047      S76752         6
Rat           P36940      U03470         5
Amino acid sequence of human FasL MQQPFNYPYP PPPLPPPPPP MFQLFHLQKE GKSNSRSMPL SCNNLPLSHK LTSADHLYVN
QIYWVDSSAS PPLPPLPLPP LAELRESTSQ EWEDTYGIVL VYMRNSKYPQ VSELSLVNFE
SPWAPPGTVL LKKRGNHSTG MHTASSLEKQ LSGVKYKKGG DLVMMEGKMM ESQTFFGLYK
PCPTSVPRRP LCLLVMFFMV IGHPSPPPEK LVINETGLYF SYCTTGQMWA L
GQRRPPPPPP LVALVGLGLG KELRKVAHLT VYSKVYFRGQ RSSYLGAVFN
50 100 150 200 250 281
References
1 Takahashi, T. et al. (1994) Int. Immunol. 6, 1567-1574.
2 van Parijs, L. and Abbas, A.K. (1996) Curr. Opin. Immunol. 8, 355-361.
3 Gruss, H.-J. and Dower, S.K. (1995) Blood 85, 3378-3404.
4 Nagata, S. and Golstein, P. (1995) Science 267, 1449-1456.
5 Suda, T. et al. (1993) Cell 75, 1169-1178.
6 Lynch, D.H. et al. (1994) Immunity 1, 131-136.
7 Lynch, D.H. et al. (1995) Immunol. Today 16, 569-574.
8 Takayama, H. et al. (1995) Adv. Immunol. 60, 289-321.
FcεRI
High-affinity receptor for IgE

Molecular weights
                 α            β         γ
Polypeptide      29 596       26 533    9667
SDS-PAGE
reduced          45-65 kDa    27 kDa    7-10 kDa
unreduced        45-65 kDa    27 kDa    20 kDa

Carbohydrate
                 α            β         γ
N-linked sites   7            0         0
O-linked         probable +   unknown   nil

Human gene location and size
α chain: 1q23
β chain: 11q13; 10 kb 1
γ chain: 1q23; 4 kb 2

Domains (diagram of human FcεRI with exon boundaries not reproduced).
Tissue distribution FcεRI was once thought to be expressed only on mast cells and basophils 3, but has recently been identified on eosinophils 4, monocytes 5, and Langerhans cells in the skin 6.
Structure FcεRI is a multisubunit structure composed of an α chain, a β chain and a disulfide-linked γ homodimer 3. In the human, but not the mouse, FcεRI is thought to exist as both αβγ2 and αγ2 complexes 7. FcεRIα has an extracellular region composed of two IgSF domains followed by a highly conserved transmembrane region. Within the transmembrane region a sequence of nine amino acids is conserved in human, rat and mouse, of which eight residues are also identical in the CD16 transmembrane region 8. FcεRIβ is a member of the CD20/FcεRIβ superfamily which includes CD20 and HTm4. These molecules are predicted to have four transmembrane regions, cytoplasmic N- and C-termini, and short extracellular loops 9. FcεRIγ is identical to the γ subunit of CD16 8. FcεRIγ has 86% amino acid identity between human, rat and mouse and is related to the ζ and η chains that are associated with the CD3/TCR complex 2. The C-terminal cytoplasmic regions of FcεRIβ and FcεRIγ each contain an immunoreceptor tyrosine-based activation motif (ITAM) 10. FcεRI shows no sequence similarity to the low-affinity IgE receptor CD23 8.
Ligands and associated molecules FcεRIα binds to monomeric IgE with high affinity (Ka of 10^10 M^-1) and a stoichiometry of 1:1. The binding site involves the second IgSF domain of FcεRIα, which interacts with the N-terminal segment of the Cε3 domain of IgE 3. The cytoplasmic tails of FcεRIβ and FcεRIγ associate non-covalently with the non-receptor protein tyrosine kinases Lyn and Syk, respectively 7.
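The affinity above is quoted as an association constant (read here as Ka = 10^10 M^-1), whereas other entries quote dissociation constants. The two are reciprocals, and the following minimal sketch makes the conversion explicit. The function name is illustrative only and does not come from the source.

```python
def ka_to_kd(ka_per_molar: float) -> float:
    """Convert an association constant (M^-1) into a dissociation constant (M)."""
    if ka_per_molar <= 0:
        raise ValueError("Ka must be positive")
    return 1.0 / ka_per_molar


# Ka for monomeric IgE binding to the receptor alpha chain, as read from the entry.
kd_molar = ka_to_kd(1e10)
print(f"Kd = {kd_molar:.1e} M ({kd_molar * 1e12:.0f} pM)")
```

On this reading the high-affinity IgE receptor has a Kd of about 100 pM.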
Function As the high-affinity receptor for IgE, FcεRI on basophils and mast cells plays a central role in allergic reactions. When a multivalent allergen binds to FcεRI-bound IgE, FcεRI molecules are crosslinked and a signalling response is initiated. The result is cellular degranulation, a rapid release of histamine and other stored mediators, and the secretion of pro-inflammatory cytokines. These factors combine to induce the symptoms of immediate hypersensitivity 3. The precise functions of the α, β and γ FcεRI subunits in this process are becoming clear. The α chain performs the ligand binding role by binding to IgE 3, whereas the β chain serves to amplify signals that are transduced through the γ homodimer 7. The function of FcεRI on monocytes and Langerhans cells, in which the β chain is not expressed, is not clear, although FcεRI on these cells is upregulated in atopic individuals 5,6. On eosinophils, FcεRI can mediate a cytotoxicity response against a metazoan parasite 4. Indeed, the physiological role of FcεRI in a normal immune response is thought to be in protection against parasites.
Comments
1 The α and γ genes of FcεRI are linked to other Fc receptor genes on chromosome 1 8.
2 A common variant of FcεRIβ, Ile181Leu within the fourth transmembrane region, shows a strong association with atopic IgE responses 11.
Database accession numbers
Entries, in order: Human α, Human β, Human γ, Rat α, Rat β, Rat γ, Mouse α, Mouse β, Mouse γ
PIR: S00682 S21154 A35241 A27116 A31231 S02118 A34342 B34342
SWISSPROT: P12319 Q01362 P30273 P12371 P13386 P20411 P20489 P20490 P20491
EMBL/GENBANK: X06948 D10583 M33195 M17153 M22923 J05018 J05019 J05020
REFERENCE: 8 1 8 8 8 8 8 8 8
Amino acid sequence of human FcεRIα
MAPAMESPTL VPQKPKVSLN SSLNIVNAKF QPLFLRCHGW CTGKVWQLDY TQQQVTFLLK
LCVALLFFAP PPWNRIFKGE EDSGEYKCQH RNWDVYKVIY ESEPLNITVI IKRTRKGFRL
DGVLA NVTLTCNGNN QQVNESEPVY YKDGEALKYW KAPREKYWLQ LNPHPKPNPK
HNGSLSEETN QASAEVVMEG ATVEDSGTYY FAVDTGLFIS
-1 50 100 150 200 232
SASSPPLHTW DIFSSFKAGY GGTGITILII LGLGSAVSLT PIDL
50 100 150 200 244
MIPAVVLLLL LLVEQAAA LGEPQLCYIL DAILFLYGIV LTLLYCRLKI QVRKAAITSY EKSDGVYTGL STRNQETYET LKHEKPPQ
-1 50 68
FFEVSSTKWF LEVFSDWLLL YENHNISITN FFIPLLVVIL NN
Amino acid sequence of human FcεRIβ
MDTESNRRAN LTVLKKEQEF PFWGAIFFSI NLKKSLAYIH ICGAGEELKG
LALPQEPSSV LGVTQILTAM SGMLSIISER IHSCQKFFET NKVPEDRVYE
PAFEVLEISP ICLCFGTVVC RNATYLVRGS KCFMASFSTE ELNIYSATYS
QEVSSGRLLK SVLDISHIEG LGANTASSIA IVVMMLFLTI ELEDPGEMSP
Amino acid sequence of human FcεRIγ
References
1 Kuster, H. et al. (1992) J. Biol. Chem. 267, 12782-12787.
2 Kuster, H. et al. (1990) J. Biol. Chem. 265, 6448-6452.
3 Sutton, B.J. and Gould, H.J. (1993) Nature 366, 421-428.
4 Gounni, A.S. et al. (1994) Nature 367, 183-186.
5 Maurer, D. et al. (1994) J. Exp. Med. 179, 745-750.
6 Bieber, T. (1994) Immunol. Today 15, 52-53.
7 Lin, S. et al. (1996) Cell 85, 985-995.
8 Ravetch, J.V. and Kinet, J.-P. (1991) Annu. Rev. Immunol. 9, 457-492.
9 Adra, C.N. et al. (1994) Proc. Natl Acad. Sci. USA 91, 10178-10182.
10 Jouvin, M.-H. et al. (1995) Semin. Immunol. 7, 29-35.
11 Shirakawa, T. et al. (1994) Nature Genet. 7, 125-130.
FLT3 ligand
flk-2 (fetal liver kinase 2) ligand

Molecular weights
Polypeptide 23 716
SDS-PAGE soluble: reduced 30 kDa

Carbohydrate
N-linked 2
O-linked unknown

Human gene location and size 19q13.3; 5.9 kb 1

Domain (diagram with exon boundaries not reproduced): extracellular region followed by transmembrane and cytoplasmic regions.
Tissue distribution mRNA for human and mouse FLT3 ligand is widespread in adult and fetal tissue with highest expression in spleen and lung 2. Human FLT3 mRNA is high in peripheral blood mononuclear cells 1. Surface expression has been demonstrated on cell lines by binding of recombinant FLT3 3. A mouse thymic stromal line secretes a soluble form of FLT3 ligand 2.
Structure FLT3 ligand has an organization similar to c-kit ligand and macrophage colony-stimulating factor (M-CSF) and contains a four helix bundle, a single transmembrane and a short cytoplasmic domain 1,4. Alternative splicing of exon 6 gives rise to a soluble form 2. The native ligand has been reported to form a homodimer 2.
Ligands and associated molecules FLT3 ligand binds to FLT3 (flk-2) receptor.
Function FLT3 ligand binding to its receptor stimulates growth of primitive haematopoietic cells 5.

Database accession numbers
        PIR   SWISSPROT   EMBL/GENBANK   REFERENCE
Human         P49771      U04806         2
Mouse         P49772      U04807         2
Amino acid sequence of human FLT3 ligand MTVLAPAWSP TQDCSFQHSP VLAQRWMERL ISRLLQETSE TAPTAPQPPL PQDLLLVEH
TTYLLLLLLL ISSDFAVKIR KTVAGSKMQG QLVALKPWIT LLLLLLPVGL
SSGLSG ELSDYLLQDY LLERVNTEIH RQNFSRCLEL LLLAAAWCLH
PVTVASNLQD FVTKCAFQPP QCQPDSSTLP WQRTRRRTPR
EELCGGLWRL PSCLRFVQTN PPWSPRPLEA PGEQVPPVPS
References
1 Lyman, S.D. et al. (1995) Oncogene 11, 1165-1172.
2 Hannum, C. et al. (1994) Nature 368, 643-648.
3 Brasel, K. et al. (1995) Leukemia 9, 1212-1218.
4 Mott, H. and Campbell, I.D. (1995) Curr. Opin. Struct. Biol. 5, 114-121.
5 Lyman, S.D. et al. (1995) Curr. Opin. Hematol. 2, 177-181.
-1 50 100 150 200 209
FPR
N-Formyl peptide receptor, fMLP receptor (fMLPR)
Molecular weights
Polypeptide 38 402
SDS-PAGE reduced 55-70 kDa

Carbohydrate
N-linked sites 3
O-linked unknown

Human gene location and size 19; 6 kb 1
Tissue distribution FPR is expressed by neutrophils, monocytes, macrophages and liver parenchymal cells 2-4.
Structure FPR contains seven hydrophobic membrane spanning regions and is a member of the rhodopsin superfamily of G protein-coupled receptors 2,3,6. It is closely related to the other chemoattractant receptors of the family, such as the IL-8, C5a and MIP-1α receptors. The third cytoplasmic loop contains a potential protein kinase A phosphorylation site and the C-terminal region is rich in Ser and Thr residues 2. The cDNA clone originally reported to encode the rabbit FPR is actually the rabbit homologue of the IL-8 receptor (type A; CDw128) 6,7. There are two human FPR isoforms that differ by two amino acids (Leu101 and Ala346 in the sequence shown below are replaced by Val and Glu, respectively) and by significant differences in the 5' and 3' untranslated regions. These probably represent allelic variants 2.
Ligands and associated molecules FPR binds N-formyl peptides of bacterial and mitochondrial origin, such as the prototype fMLP (formyl-Met-Leu-Phe) (Kd = 1-2 nM). Putative natural ligands for FPR have also been described 8. The reconstitution of a functional human FPR in Xenopus oocytes requires a complementary human factor 3.
Function N-formyl peptides interact with FPR to induce neutrophil chemotaxis, phagocytosis, production of superoxide radicals and release of proteolytic enzymes from intracellular granules 2,3. The binding of fMLP to FPR activates phospholipase C, via a pertussis toxin-sensitive G protein. The resulting production of diacylglycerol and phosphoinositides induces the activation of protein kinase C and mobilization of intracellular calcium. Other activation pathways of the receptor include the stimulation of phospholipases A2 and D 3.
Comments The genes for two FPR-like receptors, FPRL1 and FPRL2, which show 69% and 56% amino acid sequence identity to FPR, have also been mapped to human chromosome 19 9. The FPRL1 molecule is a low-affinity receptor for fMLP (Kd = 430 nM), whereas FPRL2 does not bind fMLP and has no known ligand 10,11.
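To give a feel for what the difference between Kd = 1-2 nM (FPR) and Kd = 430 nM (FPRL1) means, the sketch below estimates fractional receptor occupancy for a simple one-site binding model. The model, the chosen fMLP concentration of 10 nM and the function name are assumptions made for illustration; they are not taken from the entry.

```python
def fractional_occupancy(ligand_nM: float, kd_nM: float) -> float:
    """Fraction of receptors occupied at equilibrium for one-site binding.

    Assumes free ligand is in excess over receptor, so occupancy = L / (L + Kd).
    """
    if ligand_nM < 0 or kd_nM <= 0:
        raise ValueError("ligand must be >= 0 and Kd must be > 0")
    return ligand_nM / (ligand_nM + kd_nM)


fmlp_nM = 10.0  # hypothetical fMLP concentration, chosen for illustration only
for receptor, kd_nM in [("FPR (Kd 1 nM)", 1.0), ("FPR (Kd 2 nM)", 2.0), ("FPRL1 (Kd 430 nM)", 430.0)]:
    print(f"{receptor}: {fractional_occupancy(fmlp_nM, kd_nM):.1%} occupied")
```

Under these assumptions FPR is mostly occupied at 10 nM fMLP while FPRL1 is only a few per cent occupied, which is what the description of FPRL1 as a low-affinity receptor implies.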
Database accession numbers
        PIR      SWISSPROT   EMBL/GENBANK     REFERENCE
Human   A35495   P21462      M60626, M60627   2
Mouse            P33766      L22181           10
Amino acid sequence of human FPR METNSSLPTN AGFRMTHTVT LFTIVDINLF VMALLLTLPV RGIIRFIIGF FLCWSPYQVV YVFMGQDFRE
ISGGTPAVSA TISYLNLAVA GSVFLIALIA IIRVTTVPGK SAPMSIVAVS ALIATVRIRE RLIHALPASL
GYLFLDIITY DFCFTSTLPF LDRCVCVLHP TGTVACTFNF YGLIATKIHK LLQGMYKEIG ERALTEDSTQ
LVFAVTFVLG FMVRKAMGGH VWTQNHRTVS SPWTNDPKER QGLIKSSRPL IAVDVTSALA TSDTATNSTL
VLGNGLVIWV WPFGWFLCKF LAKKVIIGPW INVAVAMLTV RVLSFVAAAF FFNSCLNPML PSAEVALQAK
50 100 150 200 250 300 350
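Counts of N-linked sites such as the three listed for FPR are normally derived by scanning the sequence for the N-X-S/T sequon (X being any residue except Pro). That rule is the conventional one and is assumed here, since this entry does not restate it. The sketch below applies it to the first two ten-residue blocks of the FPR sequence, taken to be contiguous when the printed columns are read row-wise.

```python
import re

# N-X-S/T sequon with X != Pro; the lookahead keeps adjacent sequons from hiding each other.
SEQUON = re.compile(r"N(?=[^P][ST])")


def n_linked_sequons(sequence: str) -> list[int]:
    """Return 1-based positions of Asn residues that start an N-X-S/T sequon."""
    return [match.start() + 1 for match in SEQUON.finditer(sequence)]


# First 20 residues of human FPR as printed above (assumed row-wise reading of the columns).
fpr_n_terminus = "METNSSLPTN" + "ISGGTPAVSA"
print(n_linked_sequons(fpr_n_terminus))  # [4, 10]
```

A sequon only marks a potential site; whether it is glycosylated depends on its location in the folded, membrane-inserted protein.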
References
1 Murphy, P.M. et al. (1993) Gene 133, 285-290.
2 Boulay, F. et al. (1990) Biochemistry 29, 11123-11133.
3 Murphy, P.M. and McDermott, D. (1991) J. Biol. Chem. 266, 12560-12567.
4 McCoy, R. et al. (1995) J. Exp. Med. 182, 207-217.
5 Dohlman, H.G. et al. (1991) Annu. Rev. Biochem. 60, 653-688.
6 Thomas, K.M. et al. (1990) J. Biol. Chem. 265, 20061-20064.
7 Thomas, K.M. et al. (1991) J. Biol. Chem. 266, 14839-14841.
8 Gao, J-L. et al. (1994) J. Exp. Med. 180, 2191-2197.
9 Bao, L. et al. (1992) Genomics 13, 437-440.
10 Gao, J-L. and Murphy, P.M. (1993) J. Biol. Chem. 268, 25395-25401.
11 Durstin, M. et al. (1994) Biochem. Biophys. Res. Commun. 201, 174-179.
Galectin 3
Mac-2, εBP, IgEBP, CBP-35, CBP-30, RL29, L29, L31, L34, LBL etc.

Molecular weights
Polypeptide 27 482
SDS-PAGE
reduced 35 kDa
unreduced 35 kDa, 67 kDa, 80 kDa

Carbohydrate
N-linked sites 0
O-linked unknown

Human gene location 1p13 1

Domain (diagram not reproduced): N-terminal region containing PGAYPG repeats followed by the carbohydrate binding domain.
Tissue distribution The highest levels of galectin 3 are found on thioglycollate-elicited macrophages (1.7 x 10^5 sites/cell) 2, basophils, mast cells, some epithelial cells and some sensory neurons 1-3. It is likely that cell surface expression is due to the carbohydrate recognition domain binding to cell surface glycoproteins as surface expression is reduced by inhibitory sugars 3.
Structure Galectins, previously known as S-type lectins, contain carbohydrate binding domains which have sequence similarity in the carbohydrate binding sites and have affinity for β-galactoside sugars 1. Galectin 3 consists of two distinct regions: the N-terminal region is rich in Pro and Gly residues and contains multiple repeats of the sequence PGAYPG or slight variations thereof, and the C-terminal region contains the carbohydrate binding domain 1,3,4. The protein contains no hydrophobic sequences that may function as signal sequences or transmembrane domains and is secreted by unknown mechanisms 1,3. Galectin 3 can form multimers through its N-terminal region and this is not dependent on disulfide formation 3,4.
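The PGAYPG repeats can be located directly in the printed sequence. The sketch below is illustrative only: it assumes that the first two rows of the human galectin 3 sequence given further on (residues 1-100) are obtained by reading the printed columns row-wise, and it counts exact copies of the motif, so the "slight variations" mentioned above are not included.

```python
MOTIF = "PGAYPG"

# Residues 1-100 of human galectin 3, assembled row-wise from the printed
# ten-residue blocks (an assumption about the original page layout).
blocks = [
    "MADNFSLHDA", "LSGSGNPNPQ", "GWPGAWGNQP", "AGAGGYPGAS", "YPGAYPGQAP",
    "PGAYPGQAPP", "GAYHGAPGAY", "PGAPAPGVYP", "GPPSGPGAYP", "SSGQPSAPGA",
]
n_terminal_region = "".join(blocks)

# 1-based start positions of every exact occurrence of the motif.
repeat_starts = [
    index + 1
    for index in range(len(n_terminal_region))
    if n_terminal_region.startswith(MOTIF, index)
]
print(len(repeat_starts), repeat_starts)  # 3 [42, 51, 67]
```

Near-matches such as PGAYPS (starting at residue 86 on the same reading) would need a looser pattern, for example a regular expression that tolerates one substitution.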
Ligands and associated molecules Galectin 3 binds IgE and FcεRI 4. Another ligand is a secreted glycoprotein, Mac-2 binding protein, which contains a scavenger receptor cysteine-rich domain 3. Binding activities can be inhibited by galactose 1,3.
Function A role for galectin 3 in crosslinking IgE receptors is postulated 4. Galectin 3 is the major non-integrin laminin binding protein of inflammatory macrophages and an antiadhesive role is suggested 3. A recent study describes the activation of neutrophils by recombinant galectin 3 which is dependent on both its lectin binding activity and the N-terminal region 5. An intracellular role for galectin 3 has been suggested based on localization of the protein in the nucleus 1. There is evidence for galectin 3 as a factor in RNA splicing 6. Activity was carbohydrate-dependent, in contrast to a report describing RNA binding by galectin 3 7.
Database accession numbers
        PIR      SWISSPROT   EMBL/GENBANK   REFERENCE
Human   A35820   P17931      J02921         8
Rat     A23148   P08699      J02962         9
Mouse   A28651   P16110      X16834         10
Amino acid sequence of human galectin 3 MADNFSLHDA PGAYPGQAPP YPATGPYGAP RGNDVAFHFN QVLVEPDHFK
LSGSGNPNPQ GAYHGAPGAY AGPLIVPYNL PRFNENNRRV VAVNDAHLLQ
GWPGAWGNQP PGAPAPGVYP PLPGGVVPRM IVCNTKLDNN YNHRVKKLNE
AGAGGYPGAS GPPSGPGAYP LITILGTVKP WGREERQSVF ISKLGISGDI
YPGAYPGQAP SSGQPSAPGA NANRIALDFQ PFESGKPFKI DLTSASYTMI
References
1 Barondes, S.H. et al. (1994) J. Biol. Chem. 269, 20807-20810.
2 Ho, M.-K. and Springer, T.A. (1982) J. Immunol. 128, 1221-1227.
3 Hughes, R.C. (1994) Glycobiology 4, 5-12.
4 Liu, F-T. (1993) Immunol. Today 14, 486-490.
5 Yamaoka, A. et al. (1995) J. Immunol. 154, 3479-3487.
6 Dagher, S.F. et al. (1995) Proc. Natl Acad. Sci. USA 92, 1213-1217.
7 Wang, L. et al. (1995) Biochem. Biophys. Res. Commun. 217, 292-303.
8 Cherayil, B.J. et al. (1990) Proc. Natl Acad. Sci. USA 87, 7324-7328.
9 Albrandt, K. et al. (1987) Proc. Natl Acad. Sci. USA 84, 6859-6863.
10 Cherayil, B.J. et al. (1989) J. Exp. Med. 170, 1959-1972.
50 100 150 200 250
G-CSFR
Granulocyte colony-stimulating factor receptor, CD114

Molecular weights
Polypeptide G-CSFR1 89 618

Carbohydrate
N-linked sites 9
O-linked unknown

Human gene location and size 1p35-p34.3; ~16.5 kb 1

Domains (diagram with exon boundaries not reproduced): IgSF C2-set domain, cytokine receptor domain, four fibronectin type III domains, transmembrane and cytoplasmic regions.
Tissue distribution CD114 is expressed on neutrophils and their bone marrow precursors, endothelial cells, platelets, myeloid leukaemias and carcinoma cell lines 2,3.

Structure The extracellular region of CD114 consists of an IgSF C2-set domain, followed by a cytokine receptor domain and four fibronectin type III domains, the first of which contains the WSXWS motif characteristic of many Class I cytokine receptor family members 2. Four forms of human CD114 (G-CSFR1, G-CSFR2, G-CSFR3 and G-CSFR4/D7) have been identified by cDNA cloning that differ at the C-terminal region of the molecule and are probably generated by alternative splicing 2,4. The polypeptide encoded by G-CSFR2 appears to be a secreted form of CD114. The cytoplasmic domains of the remaining three forms have a high content of Ser and Pro residues. Based on crosslinking experiments, the estimated molecular weight of CD114 is 150 kDa 2.
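The WSXWS motif (Trp-Ser-any residue-Trp-Ser) is easy to locate with a regular expression. The sketch below is illustrative only: the helper name is hypothetical and the test fragment PGHWSDWSPS is a single ten-residue block copied from the human G-CSFR1 sequence printed later in this entry, so the reported position is relative to that fragment and not to the mature protein numbering.

```python
import re

# W-S-X-W-S, where X is any single residue.
WSXWS = re.compile(r"WS.WS")


def find_wsxws(sequence: str) -> list[tuple[int, str]]:
    """Return (1-based start position, matched text) for each WSXWS motif."""
    return [(match.start() + 1, match.group()) for match in WSXWS.finditer(sequence)]


# One printed block from the human G-CSFR1 sequence that contains the motif.
fragment = "PGHWSDWSPS"
print(find_wsxws(fragment))  # [(4, 'WSDWS')]
```

Run against a full-length sequence, the same function would report the position of the motif within the extracellular region.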
Ligands and associated molecules A single class of high-affinity binding sites for G-CSF (Kd = 100-500 pM) is detected on the cell surface 2. The monomeric form of purified murine CD114 binds G-CSF with low affinity while its oligomeric forms show high-affinity binding, which suggests that the high-affinity receptor may be formed by a homodimer of the CD114 protein 3,5. Following the binding of G-CSF to CD114, the Janus family Tyr kinases Jak1 and Jak2 and the transcriptional activator Stat3 are Tyr phosphorylated 6. Mutational analysis has shown that the cytokine receptor domain and the WSXWS motif are necessary for G-CSF binding, and that a region of the cytoplasmic domain is essential for signal transduction 3.
Function G-CSF stimulates the proliferation and differentiation of neutrophils from their bone marrow precursors, activates mature neutrophils and causes proliferation and migration of endothelial cells 7-9. Database accession numbers
        PIR      SWISSPROT   EMBL/GENBANK   REFERENCE
Human   JH0330   Q99062      X55721         2
Mouse   A34898   P40223      M32699         10
The accession numbers for the alternatively spliced forms of human CD114 are M59819 (G-CSFR2), M59820 (G-CSFR3) and X5572 (G-CSFR4/D7).
Amino acid sequence of human CD114 (G-CSFR1)
MARLGNCSLT WAALIILLLP GSLE                                -1
ECGHISVSAP IVHLGDPITA SCIIKQNCSH LDPEPQILWR LGAELQPGGR    50
QQRLSDGTQE SIITLPHLNH TQAFLSCCLN WGNSLQILDQ VELRAGYPPA   100
IPHNLSCLMN LTTSSLICQW EPGPETHLPT SFTLKSFKSR GNCQTQGDSI   150
LDCVPKDGQS HCCIPRKHLL LYQNMGIWVQ AENALGTSMS PQLCLDPMDV   200
VKLEPPMLRT MDPSPEAAPP QAGCLQLCWE PWQPGLHINQ KCELRHKPQR   250
GEASWALVGP LPLEALQYEL CGLLPATAYT LQIRCIRWPL PGHWSDWSPS   300
LELRTTERAP TVRLDTWWRQ RQLDPRTVQL FWKPVPLEED SGRIQGYVVS   350
WRPSGQAGAI LPLCNTTELS CTFHLPSEAQ EVALVAYNSA GTSRPTPVVF   400
SESRGPALTR LHAMARDPHS LWVGWEPPNP WPQGYVIEWG LGPPSASNSN   450
KTWRMEQNGR ATGFLLKENI RPFQLYEIIV TPLYQDTMGP SQHVYAYSQE   500
MAPSHAPELH LKHIGKTWAQ LEWVPEPPEL GKSPLTHYTI FWTNAQNQSF   550
SAILNASSRG FVLHGLEPAS LYHIHLMAAS QAGATNSTVL TLMTLTPEGS   600
ELHIILGLFG LLLLLTCLCG TAWLCCSPNR KNPLWPSVPD PAHSSLGSWV   650
PTIMEEDAFQ LPGLGTPPIT KLTVLEEDEK KPVPWESHNS SETCGLPTLV   700
QTYVLQGDPR AVSTQPQSQS GTSDQVLYGQ LLGSPTSPGP GHYLRCDSTQ   750
PLLAGLTPSP KSYENLWFQA SPLGTLVTPA PSQEDDCVFG PLLNFPLLQG   800
IRVHGMEALG SF                                            812
References
1 Seto, Y. et al. (1992) J. Immunol. 148, 259-266.
2 Larsen, A. et al. (1990) J. Exp. Med. 172, 1559-1570.
3 Fukunaga, R. et al. (1991) EMBO J. 10, 2855-2865.
4 Fukunaga, R. et al. (1990) Proc. Natl Acad. Sci. USA 87, 8702-8706.
5 Fukunaga, R. et al. (1990) J. Biol. Chem. 265, 14008-14015.
6 Tian, S.S. et al. (1994) Blood 84, 1760-1764.
7 Nicola, N.A. (1989) Annu. Rev. Biochem. 58, 45-77.
8 Arai, K. et al. (1990) Annu. Rev. Biochem. 59, 783-836.
9 Callard, R.E. and Gearing, A.J.H. (1994) The Cytokine FactsBook. Academic Press, London.
10 Fukunaga, R. et al. (1990) Cell 61, 341-350.
GM-CSFR
Granulocyte-macrophage colony-stimulating factor receptor
Subunits
CD116 (α chain)
CDw131 (βc chain)
Molecular weights
Polypeptide        CD116    43 777
                   CDw131   95 707
Carbohydrate
N-linked sites     CD116    11
                   CDw131   3
O-linked           CD116    unknown
                   CDw131   unknown
Human gene location and size
CD116: Xp22.32, Yp11.3; ≥45 kb 1
CDw131: 22q12.2-q13.1
[Figure: domain organization of CD116 and CDw131]
Tissue distribution The GM-CSFR is expressed on monocytes, neutrophils, eosinophils, fibroblasts and endothelial cells. It is also present on myelocytic and promyelocytic cell lines, osteogenic sarcoma cell lines, osteoblast-like cells and breast and lung carcinoma cell lines 2.
Structure The human GM-CSF receptor is formed by the association of CD116 and CDw131, which is the β chain common also to the IL-3R and IL-5R 3. Crosslinking experiments show that the molecular weights of CD116 and CDw131 are 70-85 kDa and 120-140 kDa respectively 2,4. The extracellular region of CD116 consists of an N-terminal region of about 100 amino acids with sequence similarities to that present in CDw123 (IL-3R α chain) and CDw125 (IL-5R α chain), followed by a cytokine receptor domain and a fibronectin type III domain that contains the WSXWS motif 2,3. A soluble form of CD116 which binds GM-CSF with relatively low affinity (Kd = 3 4 nM) has been identified by PCR cloning 5-7. In addition, an alternatively spliced form of CD116 with an altered cytoplasmic tail has been described 7. The structure of CDw131 is described in the entry for the IL-3R.
Ligands and associated molecules CD116 binds GM-CSF with low affinity (Kd = 1-8 nM) 2. CDw131, which does not bind GM-CSF, associates with CD116 to generate a high-affinity receptor for GM-CSF (Kd = 30-120 pM) 4. The GM-CSFR is believed to bind and mediate phosphorylation of the Janus family Tyr kinase, Jak2 8. In the mouse it has been shown that CDw131 is essential for signal transduction 9.
Function GM-CSF promotes the growth and differentiation of neutrophils, eosinophils and monocytes from multipotential bone marrow precursors. It is also a growth factor for erythroid progenitors, endothelial cells, megakaryocytes and T cells 10-12. GM-CSF induces the tyrosine phosphorylation of a similar set of proteins as IL-3 13. GM-CSF activates p21ras and induces glucose transport, ion fluxes and the expression of a variety of genes 14,15.
Comment The human CD116 gene has been mapped to the pseudoautosomal region of the X and Y chromosomes 16.
Database accession numbers
              PIR      SWISSPROT   EMBL/GENBANK   REFERENCE
Human CD116   S06945   P15509      X17648         2
Amino acid sequence of human CD116
MLLLVTSLLL CELPHPAFLL IP                                  -1
EKSDLRTVAP ASSLNVRFDS RTMNLSWDCQ ENTTFSKCFL TDKKNRVVEP    50
RLSNNECSCT FREICLHEGV TFEVHVNTSQ RGFQQKLLYP NSGREGTAAQ   100
NFSCFIYNAD LMNCTWARGP TAPRDVQYFL YIRNSKRRRE IRCPYYIQDS   150
GTHVGCHLDN LSGLTSRNYF LVNGTSREIG IQFFDSLLDT KKIERFNPPS   200
NVTVRCNTTH CLVRWKQPRT YQKLSYLDFQ YQLDVHRKNT QPGTENLLIN   250
VSGDLENRYN FPSSEPRAKH SVKIRAADVR ILNWSSWSEA IEFGSDDGNL   300
GSVYIYVLLI VGTLVCGIVL GFLFKRFLRI QRLFPPVPQI KDKLNDNHEV   350
EDEIIWEEFT PEEGKGYREE VLTVKEIT                           378
The accession numbers and amino acid sequence of CDw131, which is common to the IL-3R, IL-5R and GM-CSFR, are given in the IL-3R entry.
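The "N-linked sites" figures quoted in these entries can be related to the sequences by scanning for the consensus sequon N-X-S/T, where X is any residue except Pro. The sketch below is illustrative only: it assumes the counts refer to potential sequons, the function name `n_linked_sequons` is hypothetical, and it does not restrict the scan to the extracellular region, which a real count would need to do.

```python
import re

# Asn, then any residue except Pro, then Ser or Thr.
# The lookahead lets adjacent or overlapping sequons all be reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def n_linked_sequons(sequence: str) -> list[int]:
    """Return 1-based positions of the Asn in each potential N-X-S/T sequon."""
    seq = "".join(sequence.split()).upper()
    return [m.start() + 1 for m in SEQUON.finditer(seq)]


# A ten-residue block from the CD116 sequence given above.
print(n_linked_sequons("NVTVRCNTTH"))
```

A sequon is only a potential site; whether it is glycosylated in the mature protein has to be established experimentally.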
References
1 Rappold, G. et al. (1992) Genomics 14, 455-461.
2 Gearing, D.P. et al. (1989) EMBO J. 8, 3667-3676.
3 Nicola, N.A. and Metcalf, D. (1991) Cell 67, 1-4.
4 Hayashida, K. et al. (1990) Proc. Natl Acad. Sci. USA 87, 9655-9659.
5 Ashworth, A. and Kraft, A. (1990) Nucleic Acids Res. 18, 7178.
6 Raines, M.A. et al. (1991) Proc. Natl Acad. Sci. USA 88, 8203-8207.
7 Crosier, K.E. et al. (1991) Proc. Natl Acad. Sci. USA 88, 7744-7748.
8 Witthuhn, B.A. et al. (1993) Cell 74, 227-236.
9 Kitamura, T. et al. (1991) Proc. Natl Acad. Sci. USA 88, 5082-5086.
10 Nicola, N.A. et al. (1989) Annu. Rev. Biochem. 58, 45-77.
11 Arai, K. et al. (1990) Annu. Rev. Biochem. 59, 783-836.
12 Callard, R.E. and Gearing, A.J.H. (1994) The Cytokine FactsBook. Academic Press, London.
13 Isfort, R.J. and Ihle, J.N. (1990) Growth Factors 2, 213-220.
14 Satoh, T. et al. (1991) Proc. Natl Acad. Sci. USA 88, 3314-3318.
15 Vairo, G. and Hamilton, J.A. (1991) Immunol. Today 12, 362-369.
16 Gough, N.M. et al. (1990) Nature 345, 734-736.
GlyCAM-1
Glycosylation-dependent cell adhesion molecule 1, Sgp50
Molecular weights
Polypeptide        14 154
SDS-PAGE
reduced            50 kDa
unreduced          50 kDa
Carbohydrate
N-linked sites     1
O-linked           +++
Mouse gene location and size
15; ~2.5 kb 1
[Figure: domain organization and exon boundaries of GlyCAM-1]
Tissue distribution GlyCAM-1 is expressed by peripheral and mesenteric lymph node (LN) high endothelial venules (HEVs), the lactating mammary gland, and unknown cells in the lung 2,3. Expression has also been reported by HEV-like vessels at sites of chronic inflammation 4. GlyCAM-1 is a secreted molecule and is detectable in blood and milk 3.
Structure GlyCAM-1 is a heavily O-glycosylated soluble secreted glycoprotein. Two segments within the extracellular region (residues 23-44 and 74-103) contain a high proportion (~50%) of serine and threonine residues and are likely to possess a mucin-like structure 2. The N-terminus of the mature polypeptide has been established by protein sequencing 2.
Ligands and associated molecules CD62L binds to carbohydrate structures present in some glycoforms of GlyCAM-1 expressed by HEVs. The exact ligand has yet to be identified but binding requires sialylation, sulfation and (probably) fucosylation of GlyCAM-1 carbohydrates. Notably, O-linked carbohydrates of GlyCAM-1 from LN HEV contain sulfated forms of sialylated Lewis x (sLex, Siaα2→3Galβ1→4(Fucα1→3)GlcNAc), which contain all three groups 5. In contrast, milk GlyCAM-1, which does not bind CD62L, is not sulfated 3.
Function As a ligand for CD62L, GlyCAM-1 was initially proposed to enhance lymphocyte adhesion to LN HEVs. However, the identification of GlyCAM-1 as a secreted molecule suggests that it may function instead to modulate CD62L-mediated adhesion 6. Consistent with an inhibitory role, expression of GlyCAM-1 in the draining LN is dramatically decreased following antigen priming 7. Soluble GlyCAM-1 can induce, via an interaction with cell surface CD62L, rapid activation of β2 (CD18) integrins on naive T cells 8.
Database accession numbers
        PIR      SWISSPROT   EMBL/GENBANK   REFERENCE
Mouse   A41908   Q02596      M93428         2
Rat     A47167   Q04807      L08100         9
Amino acid sequence of mouse GlyCAM-1
MKFFTVLLFV SLAATSLAL                                      -1
LPGSKDELQM KTQPTDAIPA AQSTPTSYTS EESTSSKDLS KEPSIFREEL    50
ISKDNVVIES TKPENQEAQD GLRSGSSQLE ETTRPTTSAA TTSEENLTKS   100
SQTVEEELGK IIEGFVTGAE DIISGASRIT KS                      132
References
1 Dowbenko, D. et al. (1993) J. Biol. Chem. 268, 4525-4529.
2 Lasky, L.A. et al. (1992) Cell 69, 927-938.
3 Lasky, L.A. (1995) Annu. Rev. Biochem. 64, 113-139.
4 Onrust, S.V. et al. (1996) J. Clin. Invest. 97, 54-64.
5 Hemmerich, S. et al. (1995) J. Biol. Chem. 270, 12035-12047.
6 Brustein, M. et al. (1992) J. Exp. Med. 176, 1415-1419.
7 Hoke, D. et al. (1995) Curr. Biol. 5, 670-678.
8 Hwang, S.T. et al. (1996) J. Exp. Med. 184, 1343-1348.
9 Dowbenko, D. et al. (1993) J. Biol. Chem. 268, 14399-14403.
Molecular weights
Polypeptide        21 217
SDS-PAGE
reduced            40-45 kDa
unreduced          40-45 kDa
Carbohydrate
N-linked sites     3
O-linked           unknown
[Figure: domain organization of gp42]
Tissue distribution The antigen is not expressed on freshly isolated rat NK cells or T cells but its expression is selectively induced on NK cells in the presence of high concentrations of IL-2 in vitro 1.
Structure gp42 is a GPI-anchored antigen consisting of two IgSF C2-set domains. The N-terminus and site of addition of the GPI anchor have not been determined but are predicted below.
Function The function of gp42 is unknown. However, like other GPI-anchored molecules, gp42 may be capable of transmitting intracellular signals since anti-gp42 mAbs cross-linked with secondary antibodies induce increases in intracellular Ca2+ and inositol phosphates in a rat leukaemic cell line 2. gp42 mAbs do not block the natural killer activity of NK cells 1.
Database accession numbers
      PIR      SWISSPROT   EMBL/GENBANK   REFERENCE
Rat   JH0372   P23505      X56448         2
Amino acid sequence of rat gp42
MLLWMVLLLC VSMTEA                                         -1
QELFQDPVLS RLNSSETSDL LLKCTTKVDP NKPASELFYS FYKDNHIIQN    50
RSHNPLFFIS EANEENSGLY QCVVDAKDGT IQKKSDYLDI DLCTSVSQPV   100
LTLQHEATNL AEGDKVKFLC ETQLGSLPIL YSFYMDGEIL GEPLAPSGRA   150
ASLLISVKAE WSGKNYSCQA ENKVSRDISE PKKFPLVVSG T            191
ASMKSTTV VIWLPVSCLV GWPWLLRF                             +26
References
1 Imboden, J.B. et al. (1989) J. Immunol. 143, 3100-3103.
2 Seaman, W.E. et al. (1991) J. Exp. Med. 173, 251-260.
Molecular weights
Polypeptide        gp49B1   35 032
SDS-PAGE
reduced            gp49A    49 kDa
unreduced          gp49A    49 kDa
Carbohydrate
N-linked sites     gp49B1   2
O-linked           nil
[Figure: domain organization and exon boundaries of gp49B1]
Tissue distribution gp49 is present on mouse mast cells and their precursors 2.
Structure The extracellular region consists of two IgSF domains, followed by a hydrophobic transmembrane domain and a cytoplasmic domain 2. The N-terminus of gp49A has been determined by protein sequencing 2. gp49A and gp49B1 differ in amino acid sequence by 9% except for an extra 32 residues at the C-terminal of gp49B1. The cytoplasmic domain of gp49B1 contains an ITIM motif. The 5.6 kb gene which encodes gp49B1 is alternatively spliced to produce gp49B2 which lacks a transmembrane region 3.
Function The activation of bone marrow cultured mast cells by crosslinking of the FcεRI, or with a calcium ionophore or with PMA leads to phosphorylation of serine residues on gp49 1.
Database accession numbers
               PIR     SWISSPROT   EMBL/GENBANK   REFERENCE
Mouse gp49B1   A5343               U05264         3
Amino acid sequence of mouse gp49B1
MIAMLTVLLY LGLILEPRTA VQA                                 -1
GHLPKPIIWA EPGSVIAAYT SVITWCQGSW EAQYYHLYKE KSVNPWDTQV    50
PLETRNKAKF NIPSMTTSYA GIYKCYYESA AGFSEHSDAM ELVMTGAYEN   100
PSLSVYPSSN VTSGVSISFS CSSSIVFGRF ILIQEGKHGL SWTLDSQHQA   150
NQPSYATFVL DAVTPNHNGT FRCYGYFRNE PQVWSKPSNS LDLMISETKD   200
QSSTPTEDGL ETYQKILIGV LVSFLLLFFL LLFLILIGYQ YGHKKKANAS   250
VKNTQSENNA ELNSWNPQNE DPQGIVYAQV KPSRLQKDTA CKETQDVTYA   300
QLCIRTQEQN NS                                            312
References
1 Katz, H.R. et al. (1989) J. Immunol. 142, 919-926.
2 Arm, J.P. et al. (1991) J. Biol. Chem. 266, 15966-15973.
3 Castells, M.C. et al. (1994) J. Biol. Chem. 266, 8393-8401.
HTm4
Molecular weights
Polypeptide        22 971
Carbohydrate
N-linked sites     nil
O-linked           unknown
Human gene location
11q12-q13.1
[Figure: membrane organization of HTm4]
Tissue distribution Northern blot and reverse transcriptase-PCR analyses suggest that HTm4 mRNA expression is restricted to cells of haematopoietic origin 1.
Structure HTm4 is a member of the CD20/FcεRIβ superfamily of leucocyte surface antigens which includes CD20 and the β subunit of the high-affinity receptor for IgE (FcεRIβ). These molecules are predicted to have four transmembrane regions, cytoplasmic N- and C-termini, and short extracellular loops 1. The gene for HTm4 maps to the same region of the genome as CD20 and FcεRIβ 1. The CD20/FcεRIβ superfamily shares no sequence similarity with members of the TM4SF which also have four transmembrane regions.
Function HTm4 cDNA was identified as a marker of haematopoietic cells by subtractive hybridization screening of cDNA libraries. The function of HTm4 is not known, although the sequence similarity to CD20 and FcεRIβ would suggest a role in signal transduction and/or as a subunit of receptor complexes 1.
Database accession numbers
        PIR   SWISSPROT   EMBL/GENBANK   REFERENCE
Human                     L35848         1
Amino acid sequence of human HTm4
MASHEVDNAE LGSASAHGTP GSETGPEELN TSVYHPINGS PDYQKAKLQV    50
LGAIQILNAA MILALGVFLG SLQYPYHFQK HFFFFTFYTG YPIWGAVFFC   100
SSGTLSVVAG IKPTRTWIQN SFGMNIASAT IALVGTAFLS LNIAVNIQSL   150
RSCHSSSESP DLCNYMGSIS NGMVSLLLIL TLLELCVTIS TIAMWCNANC   200
CNSREEISSP PNSV                                          214
Reference
1 Adra, C.N. et al. (1994) Proc. Natl Acad. Sci. USA 91, 10178-10182.
Interferon γ receptor
Subunits
CD119 (α chain)
IFNγ accessory factor 1 (IFNγ AF-1)
Molecular weights
Polypeptide        CD119       52 563
                   IFNγ AF-1   35 034
SDS-PAGE
reduced            CD119       90-100 kDa
Carbohydrate
N-linked sites     CD119       5
                   IFNγ AF-1   6
O-linked           CD119       probable +
                   IFNγ AF-1   probable +
Human gene location
CD119: 6q23-q24
IFNγ AF-1: 21q22.1-q22.2
[Figure: domain organization of CD119 and IFNγ AF-1]
Tissue distribution The IFNγR is expressed on monocytes, macrophages, T cells, B cells, NK cells, neutrophils, fibroblasts, epithelial cells, endothelium and a wide range of tumour cells 1,2.
Structure The functional IFNγR consists of a complex formed between CD119 and IFNγ AF-1. Both CD119 and IFNγ AF-1 belong to the Class II cytokine receptor family, each consisting of two extracellular fibronectin type III domains, followed by a transmembrane region and a 222 amino acid (CD119) or 69 amino acid (IFNγ AF-1) cytoplasmic tail 1,3. The WSXWS motif, characteristic of Class I cytokine receptors, is absent from both fibronectin type III domains in CD119 and IFNγ AF-1.
Ligands and associated molecules The IFNγR binds IFNγ with high affinity (Kd = 1 nM-10 pM) 4. Although CD119 alone can bind IFNγ with high affinity (Kd = 50 pM), the species-specific accessory factor (IFNγ AF-1), which interacts with the extracellular region of CD119, is required for signal transduction 1,3. In Colo-205 cells, a human adenocarcinoma cell line, the IFNγR is constitutively phosphorylated on Ser and Thr residues and its phosphorylation is enhanced by IFNγ or phorbol ester 5.
Function IFNγ plays a key role in the initiation and effector phases of immune responses, including macrophage activation, B and T cell differentiation, activation of NK cells and upregulating the expression of MHC Class I and II antigens in several cell types 2,6.
Comments The IFNγ AF-1 protein reconstitutes IFNγ-induced Class I MHC expression, but not viral resistance, when transfected into heterologous cells together with the human CD119 and HLA-B7 genes 3. This suggests that other accessory factors are necessary for IFNγ-mediated activities such as antiviral responses. Despite normal T cell responses, CD119-deficient mice show increased susceptibility to infection by Listeria monocytogenes and vaccinia virus, but are resistant to endotoxic shock 7,8.
Database accession numbers
                  PIR      SWISSPROT   EMBL/GENBANK     REFERENCE
Human CD119       A31555   P15260      J03143           1
Human IFNγ AF-1            P38484      U05875, U05877   3
Mouse CD119       A34368   P15261      M26711           9
Amino acid sequence of human CD119
MALLFLLPLV MQGVSRA                                        -1
EMGTADLGPS SVPTPTNVTI ESYNMNPIVY WEYQIMPQVP VFTVEVKNYG    50
VKNSEWIDAC INISHHYCNI SDHVGDPSNS LWVRVKARVG QKESAYAKSE   100
EFAVCRDGKI GPPKLDIRKE EKQIMIDIFH PSVFVNGDEQ EVDYDPETTC   150
YIRVYNVYVR MNGSEIQYKI LTQKEDDCDE IQCQLAIPVS SLNSQYCVSA   200
EGVLHVWGVT TEKSKEVCIT IFNSSIKGSL WIPVVAALLL FLVLSLVFIC   250
FYIKKINPLK EKSIILPKSL ISVVRSATLE TKPESKYVSL ITSYQPFSLE   300
KEVVCEEPLS PATVPGMHTE DNPGKVEHTE ELSSITEVVT TEENIPDVVP   350
GSHLTPIERE SSSPLSSNQS EPGSIALNSY HSRNCSESDH SRNGFDTDSS   400
CLESHSSLSD SEFPPNNKGE IKTEGQELIT VIKAPTSFGY DKPHVLVDLL   450
VDDSGKESLI GYRPTEDSKE FS                                 472
Amino acid sequence of human IFNγ AF-1
MRPTLLWSLL LLLGVFAAAA AAPPDPL                             -1
SQLPAPQHPK IRLYNAEQVL SWEPVALSNS TRPVVYRVQF KYTDSKWFTA    50
DIMSIGVNCT QITATECDFT AASPSAGFPM DFNVTLRLRA ELGALHSAWV   100
TMPWFQHYRN VTVGPPENIE VTPGEGSLII RFSSPFDIAD TSTAFFCYYV   150
HYWEKGGIQQ VKGPFRSNSI SLDNLKPSRV YCLQVQAQLL WNKSNIFRVG   200
HLSNISCYET MADASTELQQ VILISVGTFS LLSVLAGACF FLVLKYRGLI   250
KYWFHTPPSI PLQIEEYLKD PTQPILEALD KDSSPKDDVW DSVSIISFPE   300
KEQEDVLQTL                                               310
References
1 Aguet, M. et al. (1988) Cell 55, 273-280.
2 Callard, R.E. and Gearing, A.J.H. (1994) The Cytokine FactsBook. Academic Press, London.
3 Soh, J. et al. (1994) Cell 76, 793-802.
4 Langer, J.A. and Pestka, S. (1988) Immunol. Today 9, 393-400.
5 Khurana Hershey, G.K. et al. (1990) J. Biol. Chem. 265, 17868-17875.
6 Arai, K. et al. (1990) Annu. Rev. Biochem. 59, 783-836.
7 Huang, S. et al. (1993) Science 259, 1742-1745.
8 Car, B.D. et al. (1994) J. Exp. Med. 179, 1437-1444.
9 Gray, P.W. et al. (1989) Proc. Natl Acad. Sci. USA 86, 8497-8501.
IL-1R
Interleukin 1 receptor
Subunits
CD121a (IL-1R type I)
CDw121b (IL-1R type II)
IL-1R accessory protein (IL-1R AcP)
Molecular weights
Polypeptide        CD121a      63 487
                   CDw121b     43 988
                   IL-1R AcP   63 428
SDS-PAGE
reduced, unreduced CD121a      80 kDa
                   CDw121b     60-70 kDa
                   IL-1R AcP   70-90 kDa
Carbohydrate
N-linked sites     CD121a      6
                   CDw121b     5
                   IL-1R AcP   7
O-linked           CD121a      unknown
                   CDw121b     unknown
                   IL-1R AcP   unknown
Human gene location and size
CD121a: 2q12; ~75 kb
CDw121b: 2q12-q22; ~38 kb 1
[Figure: domain organization of CD121a, CDw121b and IL-1R AcP, with exon boundaries for CD121a and CDw121b]
Tissue distribution CD121a is expressed by T cells, thymocytes, fibroblasts, chondrocytes, synovial cells, hepatocytes, endothelial cells and keratinocytes 2. CDw121b is predominantly expressed by B cells, monocytes, macrophages and neutrophils 3,4. CDw121b mRNA is present in a number of cells, including T cells 4.
Structure The extracellular region of both CD121a and CDw121b consists of three C2-set IgSF domains and the molecules have 28% amino acid sequence identity 2,4. CD121a has a 213 amino acid cytoplasmic tail that is highly conserved across species (78% sequence identity between human and mouse) and is necessary for signal transduction 2,5. In contrast, the cytoplasmic tail of CDw121b is only 29 amino acids long 4. The N-terminal IgSF domains of CD121a and CDw121b are encoded by an exon flanked with the normal phase 1 intron/exon boundary at the N-terminus, but with an unusual phase 2 intron predicted between the F and G strands rather than after the G strand. Soluble forms of CD121a and CDw121b have been identified 6,7. The IL-1R AcP also contains three extracellular IgSF domains 8.
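Sequence identity figures such as the 28% and 78% quoted above are derived from pairwise alignments. As a rough sketch of the arithmetic only, the function below computes percent identity from two sequences that have already been aligned to equal length with "-" as the gap character. The function name and the choice to ignore gapped columns in the denominator are assumptions made for illustration; the source does not state which alignment method or denominator was used, and other conventions give different values.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # gapped columns are skipped (an assumed convention)
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no aligned residue pairs to compare")
    return 100.0 * matches / compared


# Toy example with made-up residues, not taken from the sequences in this entry.
print(percent_identity("ACDE-G", "ACDFHG"))
```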
Ligands and associated molecules Both CD121a and CDw121b bind IL-1α and IL-1β, but with different affinities 2,4. A second subunit of the IL-1R complex has been cloned from mouse 3T3-L1 cells, the widely expressed IL-1R AcP 8. There is approximately 25% overall sequence homology between mouse IL-1R AcP and both CD121a and CDw121b from human, mouse and rat. A complex of CD121a and IL-1R AcP forms a high-affinity IL-1 binding site at the cell surface 8. Alternative splicing of the IL-1R AcP gene may generate a soluble form of the molecule. There is an IL-1 receptor antagonist (IL-1ra) that inhibits the function of IL-1 in vivo and in vitro by binding to both CD121a and CDw121b 9,10. The IL-1ra is unable to stimulate the kinase activity or the internalization of CD121a 11.
Function IL-1 mediates thymocyte and T cell activation, fibroblast proliferation, induction of acute phase proteins and inflammatory reactions through binding to CD121a 12,13. IL-1 bound to CD121a on fibroblasts induces phosphorylation of several proteins, including the EGF receptor and the heat shock protein p27 2,14. CDw121b appears to be dispensable for IL-1 signalling and may act as a decoy receptor 15.
Comments Vaccinia virus contains an open reading frame with strong resemblance to a soluble form of CDw121b 16. The genes encoding human CD121a and CDw121b are linked on the same chromosome, together with the genes encoding IL-1α and IL-1β 4. The ST2 antigen (also termed T1 or Fit-1) has approximately 25% amino acid sequence identity to CD121a and CDw121b 17. The extracellular region of ST2 contains three C2-set IgSF domains and the molecule exists in both a soluble and membrane bound form 17,18. ST2 does not appear to function as a receptor for either IL-1α, IL-1β or IL-1ra, but binds to a cell surface ligand expressed by a range of cell types 19.
Database accession numbers
                   SWISSPROT   EMBL/GENBANK   REFERENCE
Human CD121a       P14778      M27492         2
Human CDw121b      P27930      X59770         4
Mouse CD121a       P13504      M20658         20
Mouse CDw121b      P27931      X59769         4
Mouse IL-1R AcP                X85999         8
Rat CD121a         Q02955      M95578         21
Rat CDw121b        P43303      Z22812         22
PIR                A36187 S17428 A32604
Amino acid sequence of human CD121a
MKVLLRLICF IALLISS                                        -1
LEADKCKERE EKIILVSSAN EIDVRPCPLN PNEHKGTITW YKDDSKTPVS    50
TEQASRIHQH KEKLWFVPAK VEDSGHYYCV VRNSSYCLRI KISAKFVENE   100
PNLCYNAQAI FKQKLPVAGD GGLVCPYMEF FKNENNELPK LQWYKDCKPL   150
LLDNIHFSGV KDRLIVMNVA EKHRGNYTCH ASYTYLGKQY PITRVIEFIT   200
LEENKPTRPV IVSPANETME VDLGSQIQLI CNVTGQLSDI AYWKWNGSVI   250
DEDDPVLGED YYSVENPANK RRSTLITVLN ISEIESRFYK HPFTCFAKNT   300
HGIDAAYIQL IYPVTNFQKH MIGICVTLTV IIVCSVFIYK IFKIDIVLWY   350
RDSCYDFLPI KASDGKTYDA YILYPKTVGE GSTSDCDIFV FKVLPEVLEK   400
QCGYKLFIYG RDDYVGEDIV EVINENVKKS RRLIIILVRE TSGFSWLGGS   450
SEEQIAMYNA LVQDGIKVVL LELEKIQDYE KMPESIKFIK QKHGAIRWSG   500
DFTQGPQSAK TRFWKNVRYH MPVQRRSPSS KHQLLSPATK EKLQREAHVP   550
LG                                                       552
Amino acid sequence of human CDw121b
MLRLYVLVMG VSA                                            -1
FTLQPAAHTG AARSCRFRGR HYKREFRLEG EPVALRCPQV PYWLWASVSP    50
RINLTWHKND SARTVPGEEE TRMWAQDGAL WLLPALQEDS GTYVCTTRNA   100
SYCDKMSIEL RVFENTDAFL PFISYPQILT LSTSGVLVCP DLSEFTRDKT   150
DVKIQWYKDS LLLDKDNEKF LSVRGTTHLL VHDVALEDAG YYRCVLTFAH   200
EGQQYNITRS IELRIKKKKE ETIPVIISPL KTISASLGSR LTIPCKVFLG   250
TGTPLTTMLW WTANDTHIES AYPGGRVTEG PRQEYSENNE NYIEVPLIFD   300
PVTREDLHMD FKCVVHNTLS FQTLRTTVKE ASSTFSWGIV LAPLSLAFLV   350
LGGIWMHRRC KHRTGKADGL TVLWPHHQDF QSYPK                   385
Amino acid sequence of mouse IL-1R AcP
MGLLWYLMSL SFYGILQSHA                                     -1
SERCDDWGLD TMRQIQVFED EPARIKCPLF EHFLKYNYST AHSSGLTLIW    50
YWTRQDRDLE EPINFRLPEN RISKEKDVLW FRPTLLNDTG NYTCMLRNTT   100
YCSKVAFPLE VVQKDSCFNS AMRFPVHKMY IEHGIHKITC PNVDGYFPSS   150
VKPSVTWYKG CTEIVDFHNV LPEGMNLSFF IPLVSNNGNY TCVVTYPENG   200
RLFHLTRTVT VKVVGSPKDA LPPQIYSPND RVVYEKEPGE ELVIPCKVYF   250
SFIMDSHNEV WWTIDGKKPD DVTVDITINE SVSYSSTEDE TRTQILSIKK   300
VTPEDLRRNY VCHARNTKGE AEQAAKVKQK VIPPRYTVEL ACGFGATVFL   350
VVVLIVVYHV YWLEMVLFYR AHFGTDETIL DGKEYDIYVS YARNVEEEEF   400
VLLTLRGVLE NEFGYKLCIF DRDSLPGGIV TDETLSFIQK SRRLLVVLSP   450
NYVLQGTQAL LELKAGLENM ASRGNINVIL VQYKAVKDMK VKELKRAKTV   500
LTVIKWKGEK SKYPQGRFWK QLQVAMPVKK SPRWSSNDKQ GLSYSSLKNV   550
References
1 Sims, J.E. et al. (1995) Cytokine 7, 483-490.
2 Sims, J.E. et al. (1989) Proc. Natl Acad. Sci. USA 86, 8946-8950.
3 Spriggs, M.K. et al. (1990) J. Biol. Chem. 265, 22499-22505.
4 McMahan, C.J. et al. (1991) EMBO J. 10, 2821-2832.
5 Curtis, B.M. et al. (1989) Proc. Natl Acad. Sci. USA 86, 3045-3049.
6 Svenson, M. et al. (1995) Eur. J. Immunol. 25, 2842-2850.
7 Symons, J.A. et al. (1995) Proc. Natl Acad. Sci. USA 92, 1714-1718.
8 Greenfeder, S.A. et al. (1995) J. Biol. Chem. 270, 13757-13765.
9 Eisenberg, S.P. et al. (1990) Nature 343, 341-346.
10 Dinarello, C.A. and Thompson, R.C. (1991) Immunol. Today 12, 404-410.
11 Dripps, D.J. et al. (1991) J. Biol. Chem. 266, 10331-10336.
12 Di Giovine, F.S. and Duff, G.W. (1990) Immunol. Today 11, 13-20.
13 Callard, R.E. and Gearing, A.J.H. (1994) The Cytokine FactsBook. Academic Press, London.
14 Kaur, P. et al. (1989) FEBS Lett. 258, 269-273.
15 Colotta, F. et al. (1993) Science 261, 472-475.
16 Alcami, A. and Smith, G.L. (1995) Immunol. Today 16, 474-478.
17 Tominaga, S. (1989) FEBS Lett. 258, 301-304.
18 Bergers, G. et al. (1994) EMBO J. 13, 1176-1188.
19 Gayle, M.A. et al. (1996) J. Biol. Chem. 271, 5784-5789.
20 Sims, J.E. et al. (1988) Science 241, 585-589.
21 Hart, R.P. et al. (1993) J. Neuroimmunol. 44, 49-56.
22 Bristulf, J. et al. (1994) Eur. Cytokine Network 5, 319-330.
IL-2R
Interleukin 2 receptor
Subunits
CD25 (α chain)
CD122 (β chain)
CD132 (γc chain)
Other names
CD25: Tac antigen, p55
CD122: p75
CD132: cytokine receptor common γ chain, p64
Molecular weights
Polypeptide        CD25    28 447
                   CD122   58 359
                   CD132   39 920
SDS-PAGE
reduced            CD25    55 kDa
                   CD122   70-75 kDa
                   CD132   64 kDa
Carbohydrate
N-linked sites     CD25    2
                   CD122   4
                   CD132   6
O-linked           CD25    abundant +
                   CD122   unknown
                   CD132   unknown
Human gene location and size
CD25: 10p14-p15; >25 kb 1
CD122: 22q11.2-q13; 24.3 kb 2
CD132: Xq13; 4.2 kb 3
[Figure: domain organization and exon boundaries of CD25, CD122 and CD132]
Tissue distribution The IL-2R is expressed on activated cells including T cells, B cells and monocytes. The CD25 subunit of the human IL-2R is also present on a subset of thymocytes, HTLV-I transformed T and B cells, EBV transformed B cells, myeloid precursors and oligodendrocytes. Natural killer (NK) cells, certain B cell lines and a subpopulation of resting T cells constitutively express the CD122 subunit, which is upregulated 5- to 10-fold following T cell activation. IL-2 induces the expression of the CD25 subunit on NK cells 4,5.
Structure The IL-2R exists in three alternative forms made up from the individual components of CD25, CD122 and CD132. CD25 contains two CCP domains, is rich in O-linked carbohydrates, and has a short cytoplasmic tail (13 amino acids) 6. The extracellular region of CD122 consists of a cytokine receptor domain and a fibronectin type III domain that contains the WSXWS motif. Its cytoplasmic tail (286 amino acids) contains Ser-rich, acidic and Pro-rich regions 7. Similarly, the extracellular region of CD132 also contains a cytokine receptor domain and a fibronectin type III domain containing the WSXWS sequence. Its cytoplasmic tail (86 amino acids) has a region of limited homology to SH2 subdomains 4 and 5 of Src-related kinases 8. CD132 is also a component of the IL-4R, IL-7R, IL-9R and IL-15R.
Ligands and associated molecules The functional high-affinity IL-2R (Kd = 10 pM) is composed of a non-covalently associated CD25/CD122/CD132 heterotrimer. The isolated CD25 subunit constitutes a low-affinity IL-2R (Kd = 10 nM), while the CD122/CD132 heterodimer binds IL-2 with intermediate affinity (Kd = 1 nM). Both the high- and intermediate-affinity receptors are important for IL-2 signalling.
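To give a feel for what the three Kd values above imply, the sketch below computes the fraction of receptor occupied at a given free IL-2 concentration. It assumes a simple one-site equilibrium binding model, occupancy = [L] / ([L] + Kd), which is a textbook simplification and not a model taken from this entry; the function name and the example concentration of 100 pM are likewise illustrative assumptions.

```python
def fractional_occupancy(ligand_molar: float, kd_molar: float) -> float:
    """Fraction of receptor bound at equilibrium for a one-site model."""
    if ligand_molar < 0 or kd_molar <= 0:
        raise ValueError("ligand must be >= 0 and Kd must be > 0")
    return ligand_molar / (ligand_molar + kd_molar)


# Kd values as quoted in this entry, converted to mol/l.
KD = {
    "CD25 alone (low affinity)": 10e-9,
    "CD122/CD132 (intermediate affinity)": 1e-9,
    "CD25/CD122/CD132 (high affinity)": 10e-12,
}

il2 = 100e-12  # assumed free IL-2 concentration of 100 pM, for illustration
for receptor, kd in KD.items():
    print(f"{receptor}: {fractional_occupancy(il2, kd):.3f}")
```

By definition, occupancy is one half when the free ligand concentration equals Kd, so a receptor form with a 1000-fold lower Kd reaches half-saturation at a 1000-fold lower IL-2 concentration.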
Function IL-2 induces the activation and proliferation of T cells, thymocytes, NK cells, B cells and macrophages 4,9. The WSXWS motif of CD122 is essential for IL-2 binding, while its cytoplasmic Ser-rich region is necessary for signal transduction 10,11. CD122 associates with the Lck tyrosine kinase through its cytoplasmic acidic region 11. The interaction of IL-2 with its high-affinity receptor induces Tyr phosphorylation of CD122 and several other proteins 11. CD25 is phosphorylated on Ser and Thr residues 4.
Comments Proteolytic cleavage of membrane-bound CD25 generates a soluble form present in human serum 5. Mutations in the CD132 gene are responsible for X-linked severe combined immunodeficiency (SCID) in humans 12.
Database accession numbers
                SWISSPROT   EMBL/GENBANK   REFERENCE
Human CD25      P01589      X01057         6
Human CD122     P14784      M26062         7
Human CD132     P31785      D11086         8
Mouse CD25      P01590      K02891         13
Mouse CD122     P16297      M28052         15
Mouse CD132     P34902      L20048         16
Rat CD25        P26897      M55049         14
Rat CD122       P26896      M55050         14
PIR             A01856 A30342 A42565 A01857 A35052 JN0592 A46535
Amino acid sequence of human CD25
MDSYLLMWGL LTFIMVPGCQ A                                   -1
ELCDDDPPEI PHATFKAMAY KEGTMLNCEC KRGFRRIKSG SLYMLCTGNS    50
SHSSWDNQCQ CTSSATRNTT KQVTPQPEEQ KERKTTEMQS PMQPVDQASL   100
PGHCREPPPW ENEATERIYH FVVGQMVYYQ CVQGYRALHR GPAESVCKMT   150
HGKTRWTQPQ LICTGEMETS QFPGEEKPQA SPEGRPESET SCLVTTTDFQ   200
IQTEMAATME TSIFTTEYQV AVAGCVFLLI SVLLLSGLTW QRRQRKSRRT   250
I                                                        251
Amino acid sequence of human CD122
MAAPALSWRL PLLILLLPLA TSWASA                              -1
AVNGTSQFTC FYNSRANISC VWSQDGALQD TSCQVHAWPD RRRWNQTCEL    50
LPVSQASWAC NLILGAPDSQ KLTTVDIVTL RVLCREGVRW RVMAIQDFKP   100
FENLRLMAPI SLQVVHVETH RCNISWEISQ ASHYFERHLE FEARTLSPGH   150
TWEEAPLLTL KQKQEWICLE TLTPDTQYEF QVRVKPLQGE FTTWSPWSQP   200
LAFRTKPAAL GKDTIPWLGH LLVGLSGAFG FIILVYLLIN CRNTGPWLKK   250
VLKCNTPDPS KFFSQLSSEH GGDVQKWLSS PFPSSSFSPG GLAPEISPLE   300
VLERDKVTQL LLQQDKVPEP ASLSSNHSLT SCFTNQGYFF FHLPDALEIE   350
ACQVYFTYDP YSEEDPDEGV AGAPTGSSPQ PLQPLSGEDD AYCTFPSRDD   400
LLLFSPSLLG GPSPPSTAPG GSGAGEERMP PSLQERVPRD WDPQPLGPPT   450
PGVPDLVDFQ PPPELVLREA GEEVPDAGPR EGVSFPWSRP PGQGEFRALN   500
ARLPLNTDAY LSLQELQGQD PTHLV                              525
Amino acid sequence of human CD132
MLKPSLPFTS LLFLQLPLLG VG                                  -1
LNTTILTPNG NEDTTADFFL TTMPTDSLSV STLPLPEVQC FVFNVEYMNC    50
TWNSSSEPQP TNLTLHYWYK NSDNDKVQKC SHYLFSEEIT SGCQLQKKEI   100
HLYQTFVVQL QDPREPRRQA TQMLKLQNLV IPWAPENLTL HKLSESQLEL   150
NWNNRFLNHC LEHLVQYRTD WDHSWTEQSV DYRHKFSLPS VDGQKRYTFR   200
VRSRFNPLCG SAQHWSEWSH PIHWGSNTSK ENPFLFALEA VVISVGSMGL   250
IISLLCVYFW LERTMPRIPT LKNLEDLVTE YHGNFSAWSG VSKGLAESLQ   300
PDYSERLCLV SEIPPKGGAL GEGPGASPCN QHSPYWAPPC YTLKPET      347
References
1 Leonard, W.J. et al. (1985) Science 230, 633-639.
2 Shibuya, H. et al. (1990) Nucleic Acids Res. 18, 3697-3703.
3 Noguchi, M. et al. (1993) J. Biol. Chem. 268, 13601-13608.
4 Waldmann, T.A. (1989) Annu. Rev. Biochem. 58, 875-911.
5 Waldmann, T.A. (1991) J. Biol. Chem. 266, 2681-2684.
6 Leonard, W.J. et al. (1984) Nature 311, 626-631.
7 Hatakeyama, M. et al. (1989) Science 244, 551-556.
8 Takeshita, T. et al. (1992) Science 257, 379-382.
9 Callard, R.E. and Gearing, A.J.H. (1994) The Cytokine FactsBook. Academic Press, London.
10 Miyazaki, T. (1991) EMBO J. 10, 3191-3197.
11 Hatakeyama, M. et al. (1991) Science 252, 1523-1528.
12 Noguchi, M. et al. (1993) Cell 73, 147-157.
13 Miller, J. et al. (1985) J. Immunol. 134, 4212-4217.
14 Page, T.H. and Dallman, M.J. (1991) Eur. J. Immunol. 21, 2133-2138.
15 Kono, T. et al. (1990) Proc. Natl Acad. Sci. USA 87, 1806-1810.
16 Cao, X. et al. (1993) Proc. Natl Acad. Sci. USA 90, 8464-8468.
Interleukin 3 receptor, multicolony-stimulating factor receptor
IL-3R
Subunits
CDw123 (α chain)
CDw131 (βc chain)
Molecular weights
Polypeptide      CDw123   41 282
                 CDw131   95 707
Carbohydrate
N-linked sites   CDw123   6
                 CDw131   3
O-linked         CDw123   unknown
                 CDw131   unknown
Human gene location and size
CDw123: Xp22.3, Yp13.3; ~40 kb 1
CDw131: 22q12.2-q13.1
Domains
[Figure: domain and exon organization of CDw123 and CDw131]
Tissue distribution The IL-3R is expressed on bone marrow multipotential haematopoietic precursors, on neutrophil, basophil, eosinophil, monocyte and megakaryocyte committed precursors and on the erythroid lineage. It is also present on some myelocytic leukaemias and pre-B lymphomas and leukaemias 2.
Structure The IL-3R is formed by the association of CDw123 and a common β chain (βc chain, CDw131) that is also a component of the receptors for GM-CSF and IL-5 3,4. CDw131 does not bind any of these cytokines in isolation 5. Based on crosslinking experiments, the estimated molecular weights of CDw123 and CDw131 are 70 kDa and 120-140 kDa, respectively 3. CDw123 consists of an N-terminal region of about 100 amino acids, similar in sequence to that present in CDw125 (IL-5R α chain) and CD116 (GM-CSFR α chain), followed by a cytokine receptor domain and a fibronectin type III domain 3,6,7. The extracellular region of CDw131 can be divided into two homologous units, each containing a cytokine receptor domain and a fibronectin type III domain 5,6. The cytoplasmic tail of CDw131 contains Pro- and Ser-rich regions, similar to those found in other cytokine receptor subunits including CD122 (IL-2R β chain), CD124 (IL-4R α chain) and CD114 (G-CSFR) 8.
Ligands and associated molecules CDw123 binds IL-3 with very low affinity (Kd = 80-90 nM), while its association with CDw131 generates a high-affinity receptor for IL-3 (Kd = 110-180 pM) 3. A 110 kDa Ser/Thr kinase that is activated following IL-3 stimulation was shown to be constitutively associated with the IL-3R 9.
Function IL-3 promotes the growth and differentiation of multipotential haematopoietic precursors and of erythroid, neutrophil, basophil, monocyte, eosinophil and megakaryocyte committed precursors. In the mouse, IL-3 also activates mast cells and pre-B cells 10,11. IL-3 induces the Tyr phosphorylation of several proteins and activates Ras 12,13. The signalling mechanisms of the IL-3 receptor are reviewed in ref. 14.
Comments In the mouse, two proteins with 91% amino acid sequence identity have been identified: the AIC2A and the AIC2B proteins 8,15. Both AIC2A and AIC2B can associate with the murine CDw123 homologue to form distinct high-affinity IL-3Rs 16. The AIC2B protein is a common component of the murine IL-3R, IL-5R and GM-CSFR 3,4. The overall sequence of both proteins is similar to that of human CDw131 5,8,15. The human CDw123 gene has been mapped to the pseudoautosomal region of the X and Y chromosomes 17.
Database accession numbers
               PIR      SWISSPROT  EMBL/GENBANK  REFERENCE
Human CDw123   A40266   P26951     M74782        3
Human CDw131   A39255   P32927     M59941        5
Mouse CDw123   S22909   P26952     X64534        16
Mouse AIC2A    A40091   P26954     M29855        8
Mouse AIC2B    A35782   P26955     M34397        15
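Accession numbers in tables like the one above are easy to corrupt in transcription (a letter l for a digit 1, a $ for an S). A minimal sketch of a format check is given below, assuming only the classic six-character Swiss-Prot accession form; the function name and the list of values are illustrative, and the check says nothing about whether an accession actually exists in the database.

```python
import re

# Classic six-character Swiss-Prot accession: O, P or Q, a digit,
# three alphanumerics, then a digit (for example P26951 or Q01344).
# Assumption: only this legacy form is needed for the tables in these entries.
SWISSPROT_RE = re.compile(r"[OPQ][0-9][A-Z0-9]{3}[0-9]")


def is_swissprot_accession(text: str) -> bool:
    """Return True if text has the shape of a legacy Swiss-Prot accession."""
    return SWISSPROT_RE.fullmatch(text) is not None


# SWISSPROT column of the IL-3R table.
accessions = ["P26951", "P32927", "P26952", "P26954", "P26955"]
flagged = [a for a in accessions if not is_swissprot_accession(a)]
print(flagged)
```

Any value that lands in `flagged` would be worth checking against the printed source before use.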
Amino acid sequence of human CDw123
MVLLWLTLLL IALPCLLQ  -1
TKEDPNPPIT NLRMKAKAQQ LTWDLNRNVT DIECVKDADY SMPAVNNSYC  50
QFGAISLCEV TNYTVRVANP PFSTWILFPE NSGKPWAGAE NLTCWIHDVD  100
FLSCSWAVGP GAPADVQYDL YLNVANRRQQ YECLHYKTDA QGTRIGCRFD  150
DISRLSSGSQ SSHILVRGRS AAFGIPCTDK FVVFSQIEIL TPPNMTAKCN  200
KTHSFMHWKM RSHFNRKFRY ELQIQKRMQP VITEQVRDRT SFQLLNPGTY  250
TVQIRARERV YEFLSAWSTP QRFECDQEEG ANTRAWRTSL LIALGTLLAL  300
VCVFVICRRY LVMQRLFPRI PHMKDPIGDS FQNDKLVVWE AGKAGLEECL  350
VTEVQVVQKT  360
Amino acid sequence of human CDw131
MVLAQGLLSM ALLALC  -1
WERSLAGAEE TIPLQTLRCY NDYTSHITCR WADTQDAQRL VNVTLIRRVN  50
EDLLEPVSCD LSDDMPWSAC PHPRCVPRRC VIPCQSFVVT DVDYFSFQPD  100
RPLGTRLTVT LTQHVQPPEP RDLQISTDQD HFLLTWSVAL GSPQSHWLSP  150
GDLEFEVVYK RLQDSWEDAA ILLSNTSQAT LGPEHLMPSS TYVARVRTRL  200
APGSRLSGRP SKWSPEVCWD SQPGDEAQPQ NLECFFDGAA VLSCSWEVRK  250
EVASSVSFGL FYKPSPDAGE EECSPVLREG LGSLHTRHHC QIPVPDPATH  300
GQYIVSVQPR RAEKHIKSSV NIQMAPPSLN VTKDGDSYSL RWETMKMRYE  350
HIDHTFEIQY RKDTATWKDS KTETLQNAHS MALPALEPST RYWARVRVRT  400
SRTGYNGIWS EWSEARSWDT ESVLPMWVLA LIVIFLTTAV LLALRFCGIY  450
GYRLRRKWEE KIPNPSKSHL FQNGSAELWP PGSMSAFTSG SPPHQGPWGS  500
RFPELEGVFP VGFGDSEVSP LTIEDPKHVC DPPSGPDTTP AASDLPTEQP  550
PSPQPGPPAA SHTPEKQASS FDFNGPYLGP PHSRSLPDIL GQPEPPQEGG  600
SQKSPPPGSL EYLCLPAGGQ VQLVPLAQAM GPGQAVEVER RPSQGAAGSP  650
SLESGGGPAP PALGPRVGGQ DQKDSPVAIP MSSGDTEDPG VASGYVSSAD  700
LVFTPNSGAS SVSLVPSLGL PSDQTPSLCP GLASGPPGAP GPVKSGFEGY  750
VELPPIEGRS PRSPRNNPVP PEAKSPVLNP GERPADVSPT SPQPEGLLVL  800
QQVGDYCFLP GLGPGPLSLR SKPSSPGPGP EIKNLDQAFQ VKKPPGQAVP  850
QVPVIQLFKA LKQQDYLSLP PWEVNKPGEV C  881
References
1 Kosugi, H. et al. (1995) Biochem. Biophys. Res. Commun. 208, 360-367.
2 Park, L.S. et al. (1989) J. Biol. Chem. 264, 5420-5427.
3 Kitamura, T. et al. (1991) Cell 66, 1165-1174.
4 Nicola, N.A. and Metcalf, D. (1991) Cell 67, 1-4.
5 Hayashida, K. et al. (1990) Proc. Natl Acad. Sci. USA 87, 9655-9659.
6 Bazan, J.F. (1990) Proc. Natl Acad. Sci. USA 87, 6934-6938.
7 Patthy, L. (1990) Cell 61, 13-14.
8 Itoh, N. et al. (1990) Science 247, 324-327.
9 Liu, L. et al. (1995) J. Biol. Chem. 270, 22422-22427.
10 Arai, K. et al. (1990) Annu. Rev. Biochem. 59, 783-836.
11 Callard, R.E. and Gearing, A.J.H. (1994) The Cytokine FactsBook. Academic Press, London.
12 Isfort, R.J. and Ihle, J.N. (1990) Growth Factors 2, 213-220.
13 Satoh, T. et al. (1991) Proc. Natl Acad. Sci. USA 88, 3314-3318.
14 Vairo, G. and Hamilton, J.A. (1991) Immunol. Today 12, 362-369.
15 Gorman, D.M. et al. (1990) Proc. Natl Acad. Sci. USA 87, 5459-5463.
16 Hara, T. and Miyajima, A. (1992) EMBO J. 11, 1875-1884.
17 Kremer, E. et al. (1993) Blood 82, 22-28.
Interleukin 4 receptor
Subunits
CD124 (α chain)
CD132 (γc chain)
IL-13R α chain
Molecular weights
Polypeptide      CD124            87 067
                 CD132            39 920
                 IL-13R α chain   41 457
Carbohydrate
N-linked sites   CD124            6
                 CD132            6
                 IL-13R α chain   4
O-linked         CD124            unknown
                 CD132            unknown
                 IL-13R α chain   unknown
Human gene location and size
CD124: 16p11.2-p12.1
CD132: Xq13; 4.2 kb 1
Domains
[Figure: domain and exon organization of CD124, CD132 and the IL-13R α chain]
IL-4R
Tissue distribution The IL-4R is expressed on mature B and T cells, haematopoietic precursors, fibroblasts, epithelial and endothelial cells 2.
Structure CD124 is a member of the cytokine receptor superfamily. The extracellular domain consists of a cytokine receptor domain and a fibronectin type III domain containing the WSXWS motif 3,4. The 596 amino acid cytoplasmic tail contains Pro- and Ser-rich regions, similar to those present in the IL-2R β chain (CD122), G-CSFR (CD114) and the GM-CSFR/IL-3R/IL-5R common β chain (CDw131) 4. CD124 associates with CD132 (see IL-2R) to form a heterodimeric receptor complex at the cell surface 5,6. In addition, a second type of IL-4R is formed through the association of CD124 and the IL-13R α chain 7,8. A soluble form of the mouse IL-4R is produced by alternative splicing of CD124 9.
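The WSXWS motif mentioned above (Trp-Ser-any residue-Trp-Ser) can be located in a sequence with a simple pattern search. The sketch below is illustrative only: the function name is hypothetical, and the 20-residue fragment is two adjacent blocks taken from the CD124 sequence in this entry, so the reported position is relative to the fragment rather than to the mature protein.

```python
import re

# W-S-X-W-S, where X is any single residue (one-letter code).
WSXWS = re.compile(r"WS[A-Z]WS")


def find_wsxws(sequence: str):
    """Return (1-based start, matched text) for each WSXWS motif found."""
    seq = sequence.replace(" ", "").upper()  # drop the 10-residue block spacing
    return [(m.start() + 1, m.group()) for m in WSXWS.finditer(seq)]


# Two adjacent blocks from the CD124 sequence listing (assumed fragment).
fragment = "QCYNTTWSEW SPSTKWHNSY"
hits = find_wsxws(fragment)
print(hits)
```

The same function can be run on a full sequence once its blocks are joined in reading order.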
Ligands and associated molecules High-affinity IL-4Rs (Kd = 50-100 pM) may be composed of CD124/CD132 or CD124/IL-13R α chain 4,7. IL-4-induced signal transduction involves IL-4R binding and activation of two Janus family Tyr kinases (Jak1 and Jak3) 7. These Tyr kinases activate signal transducers and activators of transcription (Stat) proteins, denoted IL-4 Stat(s) in the case of IL-4 7,10. IL-4 induces association of PI 3-kinase with the mouse IL-4R 11.
Function IL-4 is a growth factor for pre-activated B cells and T cells. IL-4 enhances IgG1 and IgE production, differentiation of TH2-type CD4+ T cells and the expression of MHC Class II molecules on B cells and macrophages. IL-4 also induces macrophage activation and synergizes with colony-stimulating factors in promoting the growth of haematopoietic cells 12,13. Both IL-4 and IL-13 induce Tyr phosphorylation of CD124 14.
Comments The soluble form of mouse IL-4R binds IL-4 with high affinity, inhibits the biological activity of IL-4 and prevents the degradation of the cytokine 15. A recombinant extracellular domain of the human IL-4R is a powerful antagonist of its specific ligand 16.
Database accession numbers
                              PIR      SWISSPROT  EMBL/GENBANK  REFERENCE
Human CD124                   A60386   P24394     X52425        3
Mouse CD124                   A33380   P16382     M27959        9
Mouse CD124 (secreted form)                       M27960        9
Rat CD124                                         X69903
The accession numbers and amino acid sequence of CD132, which is common to the IL-2R, IL-4R, IL-7R, IL-9R and IL-15R, are given in the IL-2R entry. The IL-13R α chain is detailed in the IL-13R entry.
Amino acid sequence of human CD124
MGWLCSGLLF PVSCLVLLQV ASSGN  -1
MKVLQEPTCV SDYMSISTCE WKMNGPTNCS TELRLLYQLV FLLSEAHTCI  50
PENNGGAGCV CHLLMDDVVS ADNYTLDLWA GQQLLWKGSF KPSEHVKPRA  100
PGNLTVHTNV SDTLLLTWSN PYPPDNYLYN HLTYAVNIWS ENDPADFRIY  150
NVTYLEPSLR IAASTLKSGI SYRARVRAWA QCYNTTWSEW SPSTKWHNSY  200
REPFEQHLLL GVSVSCIVIL AVCLLCYVSI TKIKKEWWDQ IPNPARSRLV  250
AIIIQDAQGS QWEKRSRGQE PAKCPHWKNC LTKLLPCFLE HNMKRDEDPH  300
KAAKEMPFQG SGKSAWCPVE ISKTVLWPES ISVVRCVELF EAPVECEEEE  350
EVEEEKGSFC ASPESSRDDF QEGREGIVAR LTESLFLDLL GEENGGFCQQ  400
DMGESCLLPP SGSTSAHMPW DEFPSAGPKE APPWGKEQPL HLEPSPPASP  450
TQSPDNLTCT ETPLVIAGNP AYRSFSNSLS QSPCPRELGP DPLLARHLEE  500
VEPEMPCVPQ LSEPTTVPQP EPETWEQILR RNVLQHGAAA APVSAPTSGY  550
QEFVHAVEQG GTQASAVVGL GPPGEAGYKA FSSLLASSAV SPEKCGFGAS  600
SGEEGYKPFQ DLIPGCPGDP APVPVPLFTF GLDREPPRSP QSSHLPSSSP  650
EHLGLEPGEK VEDMPKPPLP QEQATDPLVD SLGSGIVYSA LTCHLCGHLK  700
QCHGQEDGGQ TPVMASPCCG CCCGDRSSPP TTPLRAPDPS PGGVPLEASL  750
CPASLAPSGI SEKSKSSSSF HPAPGNAQSS SQTPKIVNFV SVGPTYMRVS  800
References
1 Noguchi, M. et al. (1993) J. Biol. Chem. 268, 13601-13608.
2 Park, L.S. et al. (1987) J. Exp. Med. 166, 476-488.
3 Idzerda, R.L. et al. (1990) J. Exp. Med. 171, 861-873.
4 Galizzi, J-P. et al. (1990) Int. Immunol. 2, 669-675.
5 Kondo, M. et al. (1993) Science 262, 1874-1877.
6 Russell, S.M. et al. (1993) Science 262, 1880-1883.
7 Lin, J-X. et al. (1995) Immunity 2, 331-339.
8 Hilton, D.J. et al. (1996) Proc. Natl Acad. Sci. USA 93, 497-501.
9 Mosley, B. et al. (1989) Cell 59, 335-348.
10 Hou, J. et al. (1994) Science 265, 1701-1706.
11 Izuhara, K. and Harada, N. (1993) J. Biol. Chem. 268, 13097-13102.
12 Arai, K. et al. (1990) Annu. Rev. Biochem. 59, 783-836.
13 Callard, R.E. and Gearing, A.J.H. (1994) The Cytokine FactsBook. Academic Press, London.
14 Smerz-Bertling, C. and Duschl, A. (1995) J. Biol. Chem. 270, 966-970.
15 Fernandez-Botran, R. and Vitetta, E.S. (1991) J. Exp. Med. 174, 673-681.
16 Garrone, P. et al. (1991) Eur. J. Immunol. 21, 1365-1369.
17 Richter, G. et al. (1995) Cytokine 7, 237-241.
Interleukin 5 receptor
Subunits
CDw125 (α chain)
CDw131 (βc chain)
Molecular weights
Polypeptide      CDw125   45 557
                 CDw131   95 707
Carbohydrate
N-linked sites   CDw125   6
                 CDw131   3
O-linked         CDw125   unknown
                 CDw131   unknown
Human gene location
CDw125: 3p26
CDw131: 22q12.2-q13.1
Domains
[Figure: domain organization of CDw125 and CDw131]
Tissue distribution In human and mouse the IL-5R is expressed on eosinophils and basophils. Mouse IL-5R is also expressed on B cells 1,2.
Structure The IL-5R is formed by the association of CDw125 and a common β chain (CDw131) that is also a component of the receptors for IL-3 and GM-CSF (see IL-3R) 1-3. The extracellular region of CDw125 consists of an N-terminal region of about 100 amino acids with sequence similarities to the equivalent regions in CDw123 (IL-3R α chain) and CD116 (GM-CSFR α chain), followed by a cytokine receptor domain and a fibronectin type III domain that includes the WSXWS motif 1,2,4. Both membrane-bound and secreted forms of CDw125 exist, probably arising from alternative splicing 1,4. The structure of CDw131 is described in the IL-3R entry. The molecular weights of CDw125 and CDw131, as determined by chemical crosslinking, are 55-60 kDa and 120-140 kDa, respectively 1,4.
Ligands and associated molecules CDw125 binds IL-5 (Kd = 1 nM) and associates with CDw131, which does not bind IL-5, to generate a functional high-affinity receptor for IL-5 (Kd = 50-250 pM) 2,4.
IL-5R
Function IL-5 promotes the growth and differentiation of eosinophil precursors and activates mature eosinophils 2. Mouse IL-5 is also a B cell growth and differentiation factor 2,5,6. The secreted form of CDw125 has antagonistic properties and is able to inhibit IL-5-induced eosinophil proliferation and differentiation 1.
Comments In vivo administration of mAbs against the IL-5R inhibits eosinophilia in transgenic mice overexpressing IL-5 7.
Database accession numbers
               PIR      SWISSPROT  EMBL/GENBANK             REFERENCE
Human CDw125   A40267   Q01344     X61176-X61178, X62156    4
Mouse CDw125   S12357   P21183     D90205                   8
The accession numbers and amino acid sequence of CDwl31, which is common to the IL-3R, IL-5R and GM-CSFR, are given in the IL-3R entry.
Amino acid sequence of human CDw125
MIIVAHVLLI LLGATEILQA  -1
DLLPDEKISL LPPVNFTIKV TGLAQVLLQW KPNPDQEQRN VNLEYQVKIN  50
APKEDDYETR ITESKCVTIL HKGFSASVRT ILQNDHSLLA SSWASAELHA  100
PPGSPGTSIV NLTCTTNTTE DNYSRLRSYQ VSLHCTWLVG TDAPEDTQYF  150
LYYRYGSWTE ECQEYSKDTL GRNIACWFPR TFILSKGRDW LSVLVNGSSK  200
HSAIRPFDQL FALHAIDQIN PPLNVTAEIE GTRLSIQWEK PVSAFPIHCF  250
DYEVKIHNTR NGYLQIEKLM TNAFISIIDD LSKYDVQVRA AVSSMCREAG  300
LWSEWSQPIY VGNDEHKPLR EWFVIVIMAT ICFILLILSL ICKICHLWIK  350
LFPPIPAPKS NIKDLFVTTN YEKAGSSETE IEVICYIEKP GVETLEDSVF  400
References
1 Tavernier, J. et al. (1991) Cell 66, 1175-1184.
2 Takatsu, K. et al. (1994) Adv. Immunol. 57, 145-190.
3 Nicola, N.A. and Metcalf, D. (1991) Cell 67, 1-4.
4 Murata, Y. et al. (1992) J. Exp. Med. 175, 341-351.
5 Arai, K. et al. (1990) Annu. Rev. Biochem. 59, 783-836.
6 Callard, R.E. and Gearing, A.J.H. (1994) The Cytokine FactsBook. Academic Press, London.
7 Hitoshi, Y. et al. (1991) Int. Immunol. 3, 135-139.
8 Takaki, S. et al. (1990) EMBO J. 9, 4367-4374.
IL-6R
Interleukin 6 receptor
Subunits
CD126 (α chain)
CD130 (β chain; gp130)
Molecular weights
Polypeptide        CD126   49 869
                   CD130   101 041
SDS-PAGE reduced   CD126   80 kDa
                   CD130   130 kDa
Carbohydrate
N-linked sites     CD126   5
                   CD130   10
O-linked           CD126   unknown
                   CD130   unknown
Human gene location
CD126: 1q21
CD130: 5q11
Domains
[Figure: domain organization of CD126 and CD130]
Tissue distribution The IL-6R is expressed at high levels on activated and EBV-transformed B cells, plasma cells and myelomas but it is also present at lower levels on most leucocytes, epithelial cells, fibroblasts, hepatocytes and neural cells 1.
Structure The functional high-affinity receptor for human IL-6 is formed by the noncovalent association of two subunits, CD126 and CD130 2. CD126 contains a C2-set IgSF domain, followed by a cytokine receptor domain and a fibronectin type III domain, which includes the WSXWS motif conserved in many cytokine receptors 3-5. The second chain, CD130, does not bind IL-6 in isolation 6 and consists of a C2-set IgSF domain, a cytokine receptor domain and four fibronectin type III domains, the first of which contains the WSXWS motif 2. CD130 is structurally similar to the G-CSFR (CD114) 7.
Ligands and associated molecules CD126 binds IL-6 with low affinity (Kd = 1 nM) 3. CD130 binding stabilizes the CD126/IL-6 complex, resulting in the formation of a high-affinity receptor (Kd = 10 pM) 2.
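To give a feel for what the two dissociation constants above imply, the sketch below computes fractional receptor occupancy from the standard one-site equilibrium binding relation, occupancy = [L] / ([L] + Kd). This simple model is an assumption made for illustration; it is not taken from the cited studies and ignores the stepwise assembly of the CD126/CD130 complex. The function and constant names are hypothetical.

```python
def fractional_occupancy(ligand_pM: float, kd_pM: float) -> float:
    """One-site equilibrium binding: fraction of receptor bound at a given ligand level."""
    return ligand_pM / (ligand_pM + kd_pM)


KD_LOW_PM = 1000.0   # CD126 alone: Kd = 1 nM, expressed in pM
KD_HIGH_PM = 10.0    # CD126 with CD130: Kd = 10 pM

# Ratio of the two Kd values, i.e. the gain in apparent affinity.
fold = KD_LOW_PM / KD_HIGH_PM

for il6_pM in (1.0, 10.0, 100.0):
    low = fractional_occupancy(il6_pM, KD_LOW_PM)
    high = fractional_occupancy(il6_pM, KD_HIGH_PM)
    print(f"IL-6 {il6_pM:6.1f} pM: low-affinity {low:.3f}, high-affinity {high:.3f}")
```

Under this model a ligand concentration equal to Kd gives half-maximal occupancy, so the high-affinity form is half occupied at 10 pM IL-6 while the low-affinity form is still almost empty.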
Function IL-6 is a growth factor for myelomas, B cell hybridomas, activated and EBV-transformed B cells and T cell lines 1,8. IL-6 also induces differentiation and proliferation of haematopoietic precursors, mediates the acute phase response of hepatocytes and affects differentiation of neural cell lines 1,8. CD130 mediates signal transduction and is Tyr phosphorylated in cells stimulated with IL-6 2,9. Mutational analyses indicate that a membrane-proximal cytoplasmic region is important for the signal transduction activity of CD130 9.
Comments Several other cytokines, including IL-11, leukaemia inhibitory factor (LIF), ciliary neurotrophic factor (CNTF), oncostatin M and cardiotrophin 1 (CT-1) utilize CD130 as a common signal transducer component of their receptors 10.
Database accession numbers
              PIR      SWISSPROT  EMBL/GENBANK  REFERENCE
Human CD126   JU0080   P08887     X12830        3
Human CD130   A36337   P40189     M57230        2
Mouse CD126   JL0145   P22272     X51975        11
Mouse CD130            Q00560     M83336        13
Rat CD126     A37986   P22273     J05668        12
Rat CD130     A44257   P40190     M92340        14
Amino acid sequence of human CD126
MLAVGCALLA ALLAAPGAA  -1
LAPRRCPAQE VARGVLTSLP GDSVTLTCPG VEPEDNATVH WVLRKPAAGS  50
HPSRWAGMGR RLLLRSVQLH DSGNYSCYRA GRPAGTVHLL VDVPPEEPQL  100
SCFRKSPLSN VVCEWGPRST PSLTTKAVLL VRKFQNSPAE DFQEPCQYSQ  150
ESQKFSCQLA VPEGDSSFYI VSMCVASSVG SKFSKTQTFQ GCGILQPDPP  200
ANITVTAVAR NPRWLSVTWQ DPHSWNSSFY RLRFELRYRA ERSKTFTTWM  250
VKDLQHHCVI HDAWSGLRHV VQLRAQEEFG QGEWSEWSPE AMGTPWTESR  300
SPPAENEVST PMQALTTNKD DDNILFRDSA NATSLPVQDS SSVPLPTFLV  350
AGGSLAFGTL LCIAIVLRFK KTWKLRALKE GKTSMHPPYS LGQLVPERPR  400
PTPVLVPLIS PPVSPSSLGS DNTSSHNRPD ARDPRSPYDI SNTDYFFPR  449
Amino acid sequence of human CD130
MLTLQTWVVQ ALFIFLTTES TG  -1
ELLDPCGYIS PESPVVQLHS NFTAVCVLKE KCMDYFHVNA NYIVWKTNHF  50
TIPKEQYTII NRTASSVTFT DIASLNIQLT CNILTFGQLE QNVYGITIIS  100
GLPPEKPKNL SCIVNEGKKM RCEWDGGRET HLETNFTLKS EWATHKFADC  150
KAKRDTPTSC TVDYSTVYFV NIEVWVEAEN ALGKVTSDHI NFDPVYKVKP  200
NPPHNLSVIN SEELSSILKL TWTNPSIKSV IILKYNIQYR TKDASTWSQI  250
PPEDTASTRS SFTVQDLKPF TEYVFRIRCM KEDGKGYWSD WSEEASGITY  300
EDRPSKAPSF WYKIDPSHTQ GYRTVQLVWK TLPPFEANGK ILDYEVTLTR  350
WKSHLQNYTV NATKLTVNLT NDRYLATLTV RNLVGKSDAA VLTIPACDFQ  400
ATHPVMDLKA FPKDNMLWVE WTTPRESVKK YILEWCVLSD KAPCITDWQQ  450
EDGTVHRTYL RGNLAESKCY LITVTPVYAD GPGSPESIKA YLKQAPPSKG  500
PTVRTKKVGK NEAVLEWDQL PVDVQNGFIR NYTIFYRTII GNETAVNVDS  550
SHTEYTLSSL TSDTLYMVRM AAYTDEGGKD GPEFTFTTPK FAQGEIEAIV  600
VPVCLAFLLT TLLGVLFCFN KRDLIKKHIW PNVPDPSKSH IAQWSPHTPP  650
RHNFNSKDQM YSDGNFTDVS VVEIEANDKK PFPEDLKSLD LFKKEKINTE  700
GHSSGIGGSS CMSSSRPSIS SSDENESSQN TSSTVQYSTV VHSGYRHQVP  750
SVQVFSRSES TQPLLDSEER PEDLQLVDHV DGGDGILPRQ QYFKQNCSQH  800
ESSPDISHFE RSKQVSSVNE EDFVRLKQQI SDHISQSCGS GQMKMFQEVS  850
AADAFGPGTE GQVERFETVG MEAATDEGMP KSYLPQTVRQ GGYMPQ  896
References
1 Van Snick, J. (1990) Annu. Rev. Immunol. 8, 253-278.
2 Hibi, M. et al. (1990) Cell 63, 1149-1157.
3 Yamasaki, K. et al. (1988) Science 241, 825-828.
4 Bazan, J.F. (1990) Proc. Natl Acad. Sci. USA 87, 6934-6938.
5 Patthy, L. (1990) Cell 61, 13-14.
6 Taga, T. et al. (1989) Cell 58, 573-581.
7 Larsen, A. et al. (1990) J. Exp. Med. 172, 1559-1570.
8 Callard, R.E. and Gearing, A.J.H. (1994) The Cytokine FactsBook. Academic Press, London.
9 Murakami, M. et al. (1991) Proc. Natl Acad. Sci. USA 88, 11349-11353.
10 Kishimoto, T. et al. (1994) Cell 76, 253-262.
11 Sugita, T. et al. (1990) J. Exp. Med. 171, 2001-2009.
12 Baumann, M. et al. (1990) J. Biol. Chem. 265, 19853-19862.
13 Saito, M. et al. (1992) J. Immunol. 148, 4066-4071.
14 Wang, Y. et al. (1992) Genomics 14, 666-672.
Interleukin 7 receptor
Subunits
CD127 (α chain)
CD132 (γc chain)
Molecular weights
Polypeptide      CD127   49 483
                 CD132   39 920
Carbohydrate
N-linked sites   CD127   6
                 CD132   6
O-linked         CD127   unknown
                 CD132   unknown
Human gene location and size
CD127: 5p13; 19 kb 1
CD132: Xq13; 4.2 kb 2
Domains
[Figure: domain and exon organization of CD127 and CD132]
Tissue distribution The IL-7R is expressed on bone marrow lymphoid precursors, pro-B cells, thymocytes, mature T cells and monocytes 1,3.
Structure The extracellular domain of CD127 consists of an N-terminal region of about 100 amino acids with no clear sequence similarity to other proteins (see Chapter 3) followed by a fibronectin type III domain containing the WSXWS motif found in many cytokine receptors 3,4. CD127 associates with CD132 (see IL-2R) to form a heterodimeric receptor complex at the cell surface 5. A soluble form of human CD127 is produced by alternative splicing of the CD127 gene 1,3.
Ligands and associated molecules Two classes of IL-7R, with low (Kd = 5-10 nM) and high (Kd = 100 pM) affinity, have been described 3. The association of CD127 and CD132 augments both IL-7 binding and internalization 5.
IL-7R
Function IL-7 stimulates the proliferation of pro-B and pre-B cells, thymocytes and mature T cells and induces the activation of monocytes 6-8. IL-7R engagement stimulates Tyr phosphorylation and phosphatidylinositol turnover in B cell precursors and thymocytes 9,10.
Database accession numbers
              PIR             SWISSPROT  EMBL/GENBANK  REFERENCE
Human CD127   A34791-C34791   P16871     M29696        3
Mouse CD127   D34791          P16872     M29697        3
The accession numbers and amino acid sequence of CD132, which is common to the IL-2R, IL-4R, IL-7R, IL-9R and IL-15R, are given in the IL-2R entry.
Amino acid sequence of human CD127
MTILGTTFGM VFSLLQVVSG  -1
ESGYAQNGDL EDAELDDYSF SCYSQLEVNG SQHSLTCAFE DPDVNTTNLE  50
FEICGALVEV KCLNFRKLQE IYFIETKKFL LIGKSNICVK VGEKSLTCKK  100
IDLTTIVKPE APFDLSVIYR EGANDFVVTF NTSHLQKKYV KVLMHDVAYR  150
QEKDENKWTH VNLSSTKLTL LQRKLQPAAM YEIKVRSIPD HYFKGFWSEW  200
SPSYYFRTPE INNSSGEMDP ILLTISILSF FSVALLVILA CVLWKKRIKP  250
IVWPSLPDHK KTLEHLCKKP RKNLNVSFNP ESFLDCQIHR VDDIQARDEV  300
EGFLQDTFPQ QLEESEKQRL GGDVQSPNCP SEDVVVTPES FGRDSSLTCL  350
AGNVSACDAP ILSSSRSLDC RESGKNGPHV YQDLLLSLGT TNSTLPPPFS  400
LQSGILTLNP VAQGQPILTS LGSNQEEAYV TMSSFYQNQ  439
References
1 Pleiman, C.M. et al. (1991) Mol. Cell. Biol. 11, 3052-3059.
2 Noguchi, M. et al. (1993) J. Biol. Chem. 268, 13601-13608.
3 Goodwin, R.G. et al. (1990) Cell 60, 941-951.
4 Patthy, L. (1990) Cell 61, 13-14.
5 Noguchi, M. et al. (1993) Science 262, 1877-1880.
6 Arai, K. et al. (1990) Annu. Rev. Biochem. 59, 783-836.
7 Callard, R.E. and Gearing, A.J.H. (1994) The Cytokine FactsBook. Academic Press, London.
8 Alderson, M.R. et al. (1991) J. Exp. Med. 173, 923-930.
9 Uckun, F.M. et al. (1991) Proc. Natl Acad. Sci. USA 88, 3589-3593.
10 Uckun, F.M. et al. (1991) Proc. Natl Acad. Sci. USA 88, 6323-6327.
IL-8R
Interleukin 8 receptor (IL-8RA/CDw128), IL-8RB
Other names CDw128: CXCR1
Molecular weights
Polypeptide      CDw128   39 806
                 IL-8RB   40 760
Carbohydrate
N-linked sites   CDw128   4
                 IL-8RB   3
O-linked         CDw128   unknown
                 IL-8RB   unknown
Human gene location and size
CDw128: 2q35
IL-8RB: 2q35; ~11 kb 1
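Counts of N-linked sites such as those above are conventionally counts of potential sites, that is, Asn-X-Ser/Thr sequons in which X is any residue except Pro. How the counts in this entry were derived is not stated, so the sketch below is an assumption about the method rather than a reproduction of it. The function name is hypothetical, and the example input is only the first 20 residues of the CDw128 sequence given in this entry.

```python
import re

# N-X-S/T sequon with X != P. The lookahead keeps the match one residue
# wide, so closely spaced or overlapping sequons are all reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def n_linked_sequons(sequence: str):
    """Return 1-based positions of Asn residues in N-X-S/T sequons."""
    seq = sequence.replace(" ", "").upper()  # drop the 10-residue block spacing
    return [m.start() + 1 for m in SEQUON.finditer(seq)]


# First two 10-residue blocks of the CDw128 listing (assumed fragment).
n_terminus = "MSNITDPQMW DFDDLNFTGM"
sites = n_linked_sequons(n_terminus)
print(sites)
```

A sequon is only a candidate: whether a site is actually glycosylated, and whether it lies on the extracellular side of the membrane, has to be established separately.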
Tissue distribution CDw128 and IL-8RB are expressed on neutrophils 2,3. IL-8Rs are also expressed on basophils, eosinophils, a subset of T cells, monocytes, endothelial cells, keratinocytes and melanoma cells 4,5.
Structure Both CDw128 and IL-8RB contain seven hydrophobic transmembrane domains and are members of the G protein-coupled receptor superfamily 2,3,6. CDw128 and IL-8RB have 77% amino acid identity and contain potential Ser and Thr phosphorylation sites near the C-terminus 2,3.
Ligands and associated molecules CDw128 binds IL-8 with high affinity (Kd = 3.6 nM) 2. IL-8RB binds IL-8 with lower affinity and also functions as the receptor for three other IL-8-related CXC chemokines: melanoma growth-stimulating activity (MGSA/GRO), neutrophil-activating peptide 2 (NAP-2) and ENA-78 7,8.
Function IL-8 induces chemotaxis of neutrophils, basophils and T lymphocytes. It activates neutrophils and basophils and increases neutrophil and monocyte adhesion to endothelial cells 4,5. IL-8 binding to both CDw128 and IL-8RB induces a transient increase in intracellular calcium levels 2,3. The activation of phospholipase D and the respiratory burst of neutrophils in response to IL-8 can be blocked with an antibody specific for CDw128, but not with an anti-IL-8RB antibody 9.
Comments The cDNA clone originally reported to encode the rabbit N-formyl peptide receptor is now identified as the rabbit homologue of CDw128 (79% identity) 10,11.
Database accession numbers
                PIR      SWISSPROT  EMBL/GENBANK  REFERENCE
Human CDw128    A39445   P25024     M68932        2
Human IL-8RB    A39446   P25025     M73969        3
Mouse IL-8RB    A53677   P35343     L13239        12
Rat IL-8RB      S42096   P35407     X77797
Rabbit CDw128   A23669   P21109     M58021        10
Rabbit IL-8RB   A53752   P35344     L24445        13
Amino acid sequence of human CDw128
MSNITDPQMW DFDDLNFTGM PPADEDYSPC MLETETLNKY VVIIAYALVF  50
LLSLLGNSLV MLVILYSRVG RSVTDVYLLN LALADLLFAL TLPIWAASKV  100
NGWIFGTFLC KVVSLLKEVN FYSGILLLAC ISVDRYLAIV HATRTLTQKR  150
HLVKFVCLGC WGLSMNLSLP FFLFRQAYHP NNSSPVCYEV LGNDTAKWRM  200
VLRILPHTFG FIVPLFVMLF CYGFTLRTLF KAHMGQKHRA MRVIFAVVLI  250
FLLCWLPYNL VLLADTLMRT QVIQETCERR NNIGRALDAT EILGFLHSCL  300
NPIIYAFIGQ NFRHGFLKIL AMHGLVSKEF LARHRVTSYT SSSVNVSSNL  350
Amino acid sequence of human IL-8RB
MEDFNMESDS rEDFWKGEDLSNYSYSSTLP PFLLDAAPCEVESLE~NKYr VVIIYALVrL L S L L G N S L W L W L Y S R V G R S V T D W L L N L A L A D L L F A L T LPIW~SKW
ATRTLTQKRY GNNTANWRM L RVIFAVVLIF ILGILHSCLN SSSGHTSTTL
GW~FGTFLCK VVSLLKEVNr YSGILLLACI SWRYLAIVH
LVKFICLSIW LRILPQSFGF LLCWLPYNLV PLIYAFIGQK
GLSLLLALPV ;VPLLIMLF C LLADTLMRTQ FRHGLLKILA
LLFRRTVYSS YGFTLRTLFK VIQETCERRN IHGLISKDSL
NVSPACYEDM AHMGQKHRAM HIDRALDATE PKDSRPSFVG
References
1 Sprenger, H. et al. (1994) J. Biol. Chem. 269, 11065-11072.
2 Holmes, W.E. et al. (1991) Science 253, 1278-1280.
3 Murphy, P.M. and Tiffany, H.L. (1991) Science 253, 1280-1283.
4 Oppenheim, J.J. et al. (1991) Annu. Rev. Immunol. 9, 617-648.
5 Callard, R.E. and Gearing, A.J.H. (1994) The Cytokine FactsBook. Academic Press, London.
6 Watson, S. and Arkinstall, S. (1994) The G-protein Linked Receptor FactsBook. Academic Press, London.
7 LaRosa, G.J. et al. (1992) J. Biol. Chem. 267, 25402-25406.
8 Walz, A. et al. (1991) J. Exp. Med. 174, 1355-1362.
9 Jones, S.A. et al. (1996) Proc. Natl Acad. Sci. USA 93, 6682-6686.
50 100 150 200 250 300 350 360
10 Thomas, K.M. et al. (1990) J. Biol. Chem. 265, 20061-20064.
11 Thomas, K.M. et al. (1991) J. Biol. Chem. 266, 14839-14841.
12 Bozic, C.R. et al. (1994) J. Biol. Chem. 269, 29355-29358.
13 Prado, G.N. et al. (1994) J. Biol. Chem. 269, 12391-12394.
IL-9R
Interleukin 9 receptor
Subunits
CD129 (α chain)
CD132 (γc chain)
Molecular weights
Polypeptide      CD129   52 888
                 CD132   39 920
Carbohydrate
N-linked sites   CD129   2
                 CD132   6
O-linked         CD129   unknown
                 CD132   unknown
Human gene location and size
CD129: Xq28, Yq12; ~17 kb 1
CD132: Xq13; 4.2 kb 2
Domains
[Figure: domain and exon organization of CD129 and CD132]
Tissue distribution The IL-9R is expressed on activated T cells, T cell lines, B cells, a megakaryoblastic leukaemia cell line (Mo7E) and both erythroid and myeloid precursors 3-6.
Structure The functional IL-9R consists of a complex formed through the association of CD129 and CD132 (see IL-2R) 7. The extracellular region of CD129 consists of a cytokine receptor domain and a fibronectin type III domain that contains the WSXWS motif 3. The cytoplasmic tail of CD129 (231 amino acids) contains a high percentage of Ser and Pro residues, including a stretch of nine successive Ser residues 3. Crosslinking analysis identified mouse CD129 as a 64 kDa glycoprotein, which is reduced to 54 kDa following N-glycosidase F treatment 8. A soluble form of mouse CD129 has been identified 3.
Ligands and associated molecules The IL-9R binds IL-9 with a Kd of ~100 pM, as demonstrated for the receptor on a mouse T cell clone 8. The complex formed between CD129 and CD132 is essential for IL-9-dependent signal transduction 7.
Function IL-9 promotes the growth of activated T cells and a megakaryoblastic leukaemia cell line (Mo7E), and supports the generation of erythroid and myeloid precursors 4-6. IL-9 also potentiates the IL-4-induced production of IgG and IgE from B cells 9. IL-9 stimulation of Mo7E cells induces Tyr phosphorylation of four unidentified proteins 10.
Database accession numbers
              PIR      SWISSPROT  EMBL/GENBANK  REFERENCE
Human CD129   B45268   Q01113     M84747        3
Mouse CD129   A45268   Q01114     M84746        3
The accession numbers and amino acid sequence of CD132, which is common to the IL-2R, IL-4R, IL-7R, IL-9R and IL-15R, are given in the IL-2R entry.
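The sharing of signalling chains between receptors, as noted above for CD132, can be made explicit with a small lookup. The sketch below is illustrative: the dictionary lists only some of the receptors described in these entries, it records the CD124/CD132 form of the IL-4R and omits the alternative CD124/IL-13R α chain form, and the function name is hypothetical.

```python
from collections import defaultdict

# Receptor -> subunits, limited to receptors covered in these entries.
RECEPTOR_SUBUNITS = {
    "IL-3R": ["CDw123", "CDw131"],
    "IL-4R": ["CD124", "CD132"],
    "IL-5R": ["CDw125", "CDw131"],
    "IL-6R": ["CD126", "CD130"],
    "IL-7R": ["CD127", "CD132"],
    "IL-9R": ["CD129", "CD132"],
}


def shared_subunits(receptors):
    """Map each subunit used by more than one receptor to those receptors."""
    users = defaultdict(list)
    for receptor, chains in receptors.items():
        for chain in chains:
            users[chain].append(receptor)
    return {chain: sorted(names) for chain, names in users.items() if len(names) > 1}


shared = shared_subunits(RECEPTOR_SUBUNITS)
for chain, names in shared.items():
    print(chain, "->", ", ".join(names))
```

Adding further receptors to the dictionary (for example the IL-11R, which also uses CD130) would extend the result without changing the function.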
Amino acid sequence of human CD129
MGLGRCIWEG WTLESEALRR DMGTWLLACI CICTCVCLGV  -1
SVTGEGQGPR SRTFTCLTNN ILRIDCHWSA PELGQGSSPW LLFTSNQAPG  50
GTHKCILRGS ECTVVLPPEA VLVPSDNFTI TFHHCMSGRE QVSLVDPEYL  100
PRRHVKLDPP SDLQSNISSG HCILTWSISP ALEPMTTLLS YELAFKKQEE  150
AWEQAQHRDH IVGVTWLILE AFELDPGFIH EARLRVQMAT LEDDVVEEER  200
YTGQWSEWSQ PVCFQAPQRQ GPLIPPWGWP GNTLVAVSIF LLLTGPTYLL  250
FKLSPRVKRI FYQNVPSPAM FFQPLYSVHN GNFQTWMGAH RAGVLLSQDC  300
AGTPQGALEP CVQEATALLT CGPARPWKSV ALEEEQEGPG TRLPGNLSSE  350
DVLPAGCTEW RVQTLAYLPQ EDWAPTSLTR PAPPDSEGSR SSSSSSSSSN  400
NNNYCALGCY GGWHLSALPG NTQSSGPIPA LACGLSCDHQ GLETQQGVAW  450
VLAGHCQRPG LHEDLQGMLL PSVLSKARSW TF  482
References
1 Kermouni, A. et al. (1995) Genomics 29, 371-382.
2 Noguchi, M. et al. (1993) J. Biol. Chem. 268, 13601-13608.
3 Renauld, J-C. et al. (1992) Proc. Natl Acad. Sci. USA 89, 5690-5694.
4 Renauld, J-C. et al. (1993) Adv. Immunol. 54, 79-97.
5 Donahue, R.E. et al. (1990) Blood 75, 2271-2275.
6 Holbrook, S.T. et al. (1991) Blood 77, 2129-2134.
7 Kimura, Y. et al. (1995) Int. Immunol. 7, 115-120.
8 Druez, C. et al. (1990) J. Immunol. 145, 2494-2499.
9 Dugas, B. et al. (1993) Eur. J. Immunol. 23, 1687-1692.
10 Miyazawa, K. et al. (1992) Blood 80, 1685-1692.
IL-10R
Interleukin 10 receptor
Molecular weight
Polypeptide      60 753
Carbohydrate
N-linked sites   6
O-linked         unknown
Human gene location
11
Domains
[Figure: domain organization of the IL-10R]
Tissue distribution The IL-10R is expressed mainly by haematopoietic cells including T cells, B cells, NK cells, monocytes and macrophages 1,2.
Structure The IL-10R belongs to the class II cytokine receptor family, which includes CD119 (IFNγR), CD118 (IFNα/βR), tissue factor (CD142) and viral CD119 homologues 2. The IL-10R consists of two extracellular fibronectin type III domains, followed by a transmembrane region and a 318 amino acid cytoplasmic tail 2. The WSXWS motif characteristic of Class I cytokine receptors is absent from both extracellular fibronectin type III domains. The human IL-10R has been identified as a 90-110 kDa protein by crosslinking experiments 2.
Ligands and associated molecules The human IL-10R binds IL-10 with high affinity (Kd = 200-250 pM) 2.
Function IL-10 inhibits cytokine synthesis by activated T cells, NK cells, monocytes and macrophages and blocks the accessory cell function of macrophages. IL-10 also co-stimulates human B cell proliferation and differentiation and synergizes with TGFβ to stimulate IgA production 1. Mouse IL-10 also enhances the proliferation of thymocytes, T cells and mast cells, upregulates MHC Class II expression on B cells and sustains the viability of mouse B cells and mast cell lines in vitro 1,3.
Database accession numbers
               PIR   SWISSPROT   EMBL/GENBANK   REFERENCE
Human IL-10R                     U00672         2
Mouse IL-10R                     L12120         3
Amino acid sequence of human IL-10R
MLPCLVVLLA ALLSLRLGSD HGTELPSPPSVWFEAEFFHH ISNCSQTLSY DLTAVTLDLY EVTLTVGSVN LEIHNGFILG RKVPGNFTFT HKKVKHENFS ECISLTRQYF TVTNVIIFFA FKKPSPFIFI SQRPSPETQD GSTKPSLQTE EPQFLLPDPH ICLQEPSLSP STGPTWEQQV LGHHSPPEPE VPGEEDPAAV GLGPKFGRCL VDEAGLHPPA EWSLLALSSC SDLGISDWSF SSLQSSE
A ILHWTPIPNQ HSNGYRARVR KIQLPRPKMA LLTSGEVGEF FVLLLSGALA TIHPLDEEAF PQADRTLGNG GSNSRGQDDS AFQGYLRQTR LAKGYLKQDP AHDLAPLGCV
SESTCYEVAL AVDGSRHSNW PANDTYESIF CVQVKPSVAS YCLALQLYVR LKVSPELKNL EPPVLGDSCS GIDLVQNSEG CAEEKATKTG LEMTLASSGA AAPGGLLGSF
LRYGIESWNS TVTNTRFSVD SHFREYEIAI RSNKGMWSKE RRKKLPSVLL DLHGSTDSGF SGSSNSTDSG RAGDTQGGSA CLEEESPLTD PTGQWNQPTE NSDLVTLPLI
-1 50 100 150 200 250 300 350 400 450 500 550 557
References
1 Moore, K.W. et al. (1993) Annu. Rev. Immunol. 11, 165-190.
2 Liu, Y. et al. (1994) J. Immunol. 152, 1821-1829.
3 Ho, A.S.Y. et al. (1993) Proc. Natl Acad. Sci. USA 90, 11267-11271.
Interleukin 11 receptor
Subunits
IL-11R α chain
CD130 (gp130)
Molecular weights
Polypeptide
α chain 43 131
CD130 101 041
SDS-PAGE reduced
CD130 130 kDa
Carbohydrate
N-linked sites
α chain 2
CD130 10
O-linked
α chain unknown
CD130 unknown
Human gene location
α chain: 9p13
CD130: 5q11
Domains
[Domain diagram, α chain: signal sequence (S), C2-set IgSF domain (C2), cytokine receptor domain (CK), fibronectin type III domain (F3), transmembrane region (TM), cytoplasmic tail (CY)]
[Domain diagram, CD130: signal sequence (S), C2-set IgSF domain (C2), cytokine receptor domain (CK), fibronectin type III domains (F3), transmembrane region (TM), cytoplasmic tail (CY)]
Tissue distribution The IL-11R is expressed on multipotential haematopoietic progenitors, megakaryocytes, B cells, macrophages, hepatocytes, adipocytes, muscle cells and osteoclasts 1,2.
Structure
The functional high-affinity receptor for human IL-11 is formed by the non-covalent association of two subunits, IL-11R α chain (sequence shown below) and CD130. The extracellular region of the IL-11R α chain consists of an N-terminal C2-set IgSF domain, followed by a cytokine receptor domain and a fibronectin type III domain, which includes the WSXWS motif, and is structurally related to CD126 (IL-6R α chain) and the α chain of the receptor for ciliary neurotrophic factor (CNTF). CD130, which does not bind IL-11 in isolation, consists of a C2-set IgSF domain, a cytokine receptor domain and four fibronectin type III domains, the first of which contains the WSXWS motif (see IL-6R). Alternative splicing of the IL-11R α chain gene results in a form of the molecule which lacks the entire cytoplasmic tail 2.
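The WSXWS motif mentioned above (Trp-Ser-X-Trp-Ser, where X is any residue) can be located in a one-letter amino acid sequence with a simple pattern search. The following is a minimal sketch, not part of the original entry: the function name `find_wsxws` is illustrative, and the ten-residue test fragment is one block taken from the human IL-11R α chain sequence listing, so the reported position is relative to that fragment and not to the full-length protein.

```python
import re

# WSXWS: Trp-Ser-any residue-Trp-Ser in one-letter amino acid code.
WSXWS = re.compile(r"WS.WS")


def find_wsxws(sequence):
    """Return (1-based start position, matched residues) for each WSXWS hit.

    Hits are non-overlapping, which is sufficient for a five-residue motif
    that normally occurs once per cytokine receptor domain.
    """
    return [(m.start() + 1, m.group()) for m in WSXWS.finditer(sequence)]


# One ten-residue block from the human IL-11R alpha chain sequence listing.
fragment = "TWSTWSPEAW"
hits = find_wsxws(fragment)
print(hits)
```

The same function applied to a receptor of the class II family, such as the IL-10R, would be expected to return an empty list, since that family lacks the motif.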
Ligands and associated molecules
The mouse IL-11R α chain has a relatively low affinity for IL-11 (Kd = 10 nM) and this interaction fails to transduce a biological signal. The generation of a high-affinity receptor for IL-11 (Kd = 400-800 pM), capable of signal transduction, requires co-expression of the IL-11R α chain and CD130 3.
Function
IL-11 is a growth factor for multipotential haematopoietic progenitors. IL-11 also stimulates erythropoiesis, enhances megakaryocyte and platelet formation, promotes osteoclastogenesis, functions as a macrophage maturation/activation factor and upregulates immunoglobulin secretion from activated B cells 1,2. IL-11 also acts on various non-haematopoietic cell types, resulting in the inhibition of lipoprotein lipase activity in adipocytes, stimulation of acute phase protein synthesis by hepatocytes and the regulation of neuronal differentiation 1.
Database accession numbers PIR
Human IL-11Rα Mouse IL-11Rα
SWISSPROT
S51619
EMBL/GENBANK
REFERENCE
Z38102 U14412
2 3
The accession numbers and amino acid sequence of CD130 (the β chain of the IL-11R), which is common to the IL-6R, LIFR, CNTFR, OSMR and CT-1R, are given in the IL-6R entry.
Amino acid sequence of human IL-11R α chain
MSSSCSGLSR VLVAVATALV SA
SSPCPQAWGP PGVQYGQPGR SVKLCCPGVT AGDPVSWFRD GEPKLLQGPD
SGLGHELVLA ADYENFSCTW DPLGAARCVV GLRVESVPGY PAGLEEVITD EIPAWGQLHT GILSFLGLVA
QADSTDEGTY SPSQISGLPT HGAEFWSQYR PRRLRASWTY AVAGLPHAVR QPEVEPQVDS GALALGLWLR
ICQTLDGALG RYLTSYRKKT INVTEVNPLG PASWPCQPHF VSARDFLDAG PAPPRPSLQP LRRGGKDGSP
GTVTLQLGYP VLGADSQRRS ASTRLLDVSL LLKFRLQYRP TWSTWSPEAW HPRLLDHRDS KPGFLASVIP
PARPVVSCQA PSTGPWPCPQ QSILRPDPPQ AQHPAWSTVE GTPSTGTIPK VEQVAVLASL VDRRPGAPNL
-1 50 100 150 200 250 300 350 400
References
1 Du, X.X. and Williams, D.A. (1994) Blood 83, 2023-2030.
2 Chérel, M. et al. (1995) Blood 86, 2534-2540.
3 Hilton, D.J. et al. (1994) EMBO J. 13, 4765-4775.
IL-12R
Interleukin 12 receptor β chain
Molecular weight
Polypeptide 70 422
Carbohydrate
N-linked sites 6
O-linked unknown
Domains
[Domain diagram: signal sequence (S), five fibronectin type III domains (F3), transmembrane region (TM), cytoplasmic tail (CY)]
Tissue distribution
The IL-12R is expressed on activated CD4+ and CD8+ T cells, IL-2-activated CD56+ NK cells and an IL-2-dependent CD4+ T cell line (Kit225) 1,2.
Structure
The IL-12R is predicted to comprise a heterodimer formed between the designated IL-12R β subunit, cloned in both human and mouse, and an as yet unidentified second chain. The human IL-12R β chain consists of five extracellular fibronectin type III domains, followed by a transmembrane domain and a relatively short (91 amino acids) cytoplasmic tail, and shows a high degree of amino acid sequence similarity to CD130 (gp130), G-CSFR (CD114) and LIFR α chain 3. Crosslinking experiments using unlabelled human IL-12 and 125I-labelled PHA-activated lymphoblasts have identified two proteins of approximately 110 kDa and 85 kDa which may associate to form a functional IL-12R 1.
Ligands and associated molecules
Three classes of IL-12 binding sites have been demonstrated on activated human T cells and the T cell line Kit225: high-affinity (Kd = 5-20 pM), intermediate-affinity (Kd = 50-200 pM) and low-affinity (Kd = 2-6 nM) 3. The IL-12R β chain (shown below) binds both human and mouse IL-12 with an apparent affinity of 2-5 nM 3.
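To give a feel for what the three affinity classes imply, the sketch below estimates the fraction of sites occupied at a given free IL-12 concentration. It assumes a simple single-site equilibrium model, occupancy = [L] / ([L] + Kd), which is a textbook approximation and not something stated in this entry. The free ligand concentration of 100 pM and the choice of the upper bound of each Kd range are illustrative assumptions only.

```python
def fractional_occupancy(ligand_molar, kd_molar):
    """Fraction of binding sites occupied at equilibrium.

    Assumes independent, single-site binding with free ligand in excess:
    occupancy = [L] / ([L] + Kd).
    """
    if ligand_molar < 0 or kd_molar <= 0:
        raise ValueError("ligand must be >= 0 and Kd must be > 0")
    return ligand_molar / (ligand_molar + kd_molar)


PM = 1e-12  # picomolar, in mol/l
NM = 1e-9   # nanomolar, in mol/l

# Hypothetical free IL-12 concentration (an assumption for illustration).
free_il12 = 100 * PM

# Upper bound of each Kd range quoted for the three classes of sites.
site_classes = {
    "high": 20 * PM,
    "intermediate": 200 * PM,
    "low": 6 * NM,
}

occupancy = {
    name: fractional_occupancy(free_il12, kd)
    for name, kd in site_classes.items()
}

for name, value in occupancy.items():
    print(f"{name}-affinity sites: {value:.2f} occupied")
```

Under these assumptions most high-affinity sites are occupied at a concentration where the low-affinity sites are almost entirely free, which is why the classes can be resolved in binding experiments.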
Function
IL-12 has pleiotropic effects on NK cells and T cells, including the stimulation of cell proliferation, induction of IFNγ secretion, enhancement of cytolytic responses and promotion of a Th1-type immune response 4.
Comments
COS-7 cells transfected with the IL-12R β chain cDNA express both monomers and disulfide-linked dimers, or oligomers, of the IL-12R β subunit on their surface 3. Oligomerization of the IL-12R β subunit is independent of IL-12, while the dimers/oligomers but not the monomers bind IL-12 with low affinity (Kd = 2-5 nM).
Database accession numbers
                 PIR   SWISSPROT   EMBL/GENBANK   REFERENCE
Human IL-12R β         P42701      U03187         3
Mouse IL-12R β                     U23922         5
Amino acid sequence of human IL-12R β chain
MEPLVTWVVP RTSECCFQDP HFLRCCLSSG KSPEVTLQLY RTPSSPWKLG WSSPVCVPPE APGTEVTYRL GPGLNQTWHI GQDGGLATCS EKLTLWSTVL PGVLKEYVVR AWLRGVWSQP RHLCPPLPTP KGERTEPLEK
LLFLFLLSRQ PYPDADSGSA RCCYFAAGSA NSVKYEPPLG DCGPQDDDTE NPPQPQVRFS QLHMLSCPCK PADTHTEPVA LTAPQDPDPA STYHFGGNAS CRDEDSKQVS QRFSIEVQVS CASSAIEFPG TELPEGAPEL
GAAC SGPRDLRCYR TRLQFSDQAG DIKVSKLAGQ SCLCPLEMNV VEQLGQDGRR AKATRTLHLG LNISVGTNGT GMATYSWSRE AAGTPHHVSV EHPVQPTETQ DWLIFFASLG GKETWQWINP ALDTELSLED
ISSDRYECSW VSVLYTVTLW LRMEWETPDN AQEFQLRRRQ RLTLKEQPTQ KMPYLSGAAY TMYWPARAQS SGAMGQEKCY KNHSLDSVSV VTLSGLRAGV SFLSILLVGV VDFQEEASLQ GDRCKAKM
QYEGPTAGVS VESWARNQTE QVGAEVQFRH LGSQGSSWSK LELPEGCQGL NVAVISSNQF MTYCIEWQPV YITIFASAHP DWAPSLLSTC AYTVQVRADT LGYLGLNRAA EALVVEMSWD
-1 50 100 150 200 250 300 350 400 450 500 550 600 638
References
1 Chizzonite, R. et al. (1992) J. Immunol. 148, 3117-3124.
2 Desai, B.B. et al. (1992) J. Immunol. 148, 3125-3132.
3 Chua, A.O. et al. (1994) J. Immunol. 153, 128-136.
4 Trinchieri, G. (1994) Blood 84, 4008-4027.
5 Chua, A.O. et al. (1995) J. Immunol. 155, 4286-4294.
Note added in proof
A second component of the functional IL-12R, designated IL-12R β2 subunit, has been characterized through the isolation of human and mouse cDNA clones (Presky, D.H. et al. (1996) Proc. Natl Acad. Sci. USA 93, 14002-14007). These sequences are available using GENBANK accession numbers U64198 and U64199. The IL-12R β chain described in this entry has therefore been renamed IL-12R β1 chain.
Interleukin 13 receptor
Subunits
IL-13R α chain
CD124 (IL-4R α chain)
Molecular weights
Polypeptide
IL-13R α chain 41 457
CD124 87 067
Carbohydrate
N-linked sites
IL-13R α chain 4
CD124 6
O-linked
IL-13R α chain unknown
CD124 unknown
Human gene location
CD124: 16p11.2-p12.1
Domains
[Domain diagrams for the IL-13R α chain and CD124: signal sequence (S), cytokine receptor domain (CK), fibronectin type III domain (F3), transmembrane region (TM), cytoplasmic tail (CY)]
Tissue distribution
The IL-13R is expressed on B cells, monocytes, fibroblasts and endothelial cells 1,2.
Structure
The IL-13R α chain shows structural homology to CDw125 (IL-5R α chain), with 51% amino acid sequence similarity and 27% identity 2. Similar to CDw125, the extracellular region of the IL-13R α chain consists of an N-terminal region of about 100 amino acids with sequence similarity to the equivalent portion of CDw123 (IL-3R α chain) and CD116 (GM-CSFR α chain), followed by a cytokine receptor domain and a fibronectin type III domain that includes the WSXWS motif 2.
Ligands and associated molecules
The human IL-13R α chain expressed in COS-7 cells binds human IL-13 with high affinity (Kd = 220-280 pM), whereas COS-7 cells transfected with the mouse IL-13R α chain cDNA bind mouse IL-13 with a low affinity (Kd = 2-10 nM) 2,3. In both species, the IL-13R α chain also associates with CD124 to form a receptor capable of binding both IL-13 and IL-4 with high affinity and mediating signal transduction events 2-4. This interaction may explain the apparent discrepancy in the ability of human, but not mouse, IL-13R α chain to bind IL-13 with high affinity, since untransfected COS-7 cells express low levels of CD124 which might associate with transfected human, but not mouse, IL-13R α chain to form a functional high-affinity binding site for IL-13. Crosslinking experiments have identified an IL-13 binding protein of approximately 60-70 kDa, which probably corresponds to the IL-13R α chain 2.
Function
IL-13 and IL-4 mediate similar biological functions except that, unlike IL-4, IL-13 does not regulate T cell functions 1,5.
Comments
The published human and mouse IL-13R α chain sequences share the same overall topology, but show only limited amino acid sequence identity 2,3. It remains to be determined whether additional IL-13R subunits exist and whether, in fact, these molecules are true species homologues.
Database accession numbers
                 PIR   SWISSPROT   EMBL/GENBANK   REFERENCE
Human IL-13R α                     X95302         2
The accession numbers and amino acid sequence of CD124 are given in the IL-4R entry.
Amino acid sequence of human IL-13R α chain
MAFVCLAIGC DTEIKVNPPQ GSETWKTIIT TYWISPQGIP GLDHALQCVD YFTFQLQNIV REDDTTLVTA DKQCWEGEDL FCDT
LYTFLISTTF DFEIVDPGYL KNLHYKDGFD ETKVQDMDCV YIKADGQNIG KPLPPVYLTF TVENETYTLK SKKTLLRFWL
GCTSSS GYLYLQWQPP LNKGIEAKIH YYNWQYLLCS CRFPYLEASD TRESSCEIKL TTNETRQLCF PFGFILILVI
LSLDHFKECT TLLPWQCTNG WKPGIGVLLD YKDFYICVNG KWSIPLGPIP VVRSKVNIYC FVTGLLLRKP
VEYELKYRNI SEVQSSWAET TNYNLFYWYE SSENKPIRSS ARCFDYEIEI SDDGIWSEWS NTYPKMIPEF
-1 50 100 150 200 250 300 350 354
References
1 Zurawski, G. and de Vries, J.E. (1994) Immunol. Today 15, 19-26.
2 Caput, D. et al. (1996) J. Biol. Chem. 271, 16921-16926.
3 Hilton, D.J. et al. (1996) Proc. Natl Acad. Sci. USA 93, 497-501.
4 Lin, J-X. et al. (1995) Immunity 2, 331-339.
5 Punnonen, J. et al. (1993) Proc. Natl Acad. Sci. USA 90, 3730-3734.
Interleukin 14 receptor, high molecular weight B cell growth factor receptor
Molecular weight
SDS-PAGE
reduced 48-50 kDa
unreduced 48 kDa and 90 kDa
Tissue distribution The IL-14R is expressed at a low level on resting B cells, but at high levels on activated B cells and B cell leukaemias 1,2.
Structure
IL-14R cDNA clones have not yet been isolated. However, the BA5 monoclonal antibody, believed to recognize the human IL-14R, immunoprecipitates a protein of approximately 90 kDa which may consist of two distinct subunits 1. Crosslinking experiments using 125I-labelled IL-14 also identified an IL-14-binding protein of approximately 90 kDa 1,2.
Ligands and associated molecules
The IL-14R on leukaemic B cells binds IL-14 with low affinity (Kd ~ 20 nM) 2. Bb, the activation fragment of complement Factor B, competes with IL-14 for binding to the IL-14R 3,4.
Function
IL-14 induces proliferation of activated B cells and inhibits immunoglobulin secretion 5,6. In B cells, levels of intracellular cAMP, diacylglycerol and calcium are increased following IL-14 binding to the IL-14R 4.
References
1 Ambrus, J.L. et al. (1988) J. Immunol. 141, 861-869.
2 Uckun, F.M. et al. (1989) J. Clin. Invest. 84, 1595-1608.
3 Peters, M.G. et al. (1988) J. Exp. Med. 168, 1225-1235.
4 Ambrus, J.L. et al. (1991) J. Biol. Chem. 266, 3702-3708.
5 Ambrus, J.L. et al. (1990) J. Immunol. 145, 3949-3955.
6 Ambrus, J.L. et al. (1993) Proc. Natl Acad. Sci. USA 90, 6330-6334.
Interleukin 15 receptor
Subunits
IL-15R α chain
CD122 (IL-2R β chain)
CD132 (γc chain)
Molecular weight
Polypeptide
α chain 24 997
CD122 58 359
CD132 39 920
SDS-PAGE reduced
CD122 70-75 kDa
CD132 64 kDa
Carbohydrate
N-linked sites
α chain 1
CD122 4
CD132 6
O-linked
α chain probable +
CD122 unknown
CD132 unknown
Human gene location and size
α chain: 10p14-p15
CD122: 22q11.2-q13; 24.3 kb 1
CD132: Xq13; 4.2 kb 2
Domains
[Domain diagram with exon boundaries, α chain: signal sequence (S), CCP domain, transmembrane region (TM), cytoplasmic tail (CY)]
[Domain diagram with exon boundaries, CD122: signal sequence (S), cytokine receptor domain (CK), fibronectin type III domain (F3), transmembrane region (TM), cytoplasmic tail (CY)]
[Domain diagram with exon boundaries, CD132: signal sequence (S), cytokine receptor domain (CK), fibronectin type III domain (F3), transmembrane region (TM), cytoplasmic tail (CY)]
Tissue distribution
The IL-15R is expressed on activated T cells, T cell lines, NK cells and activated B cells 3-5.
Structure
The functional IL-15R is a heterotrimeric structure comprising the IL-15R α chain, CD122 (IL-2R β chain) and the common γ chain (CD132) 3-6. The IL-15R α chain contains a single extracellular CCP domain, followed by a region rich in Ser and Thr residues, and has a 41 amino acid cytoplasmic tail that is dispensable for signal transduction 4. The IL-15R α chain is structurally related to CD25 (IL-2R α chain). Three alternatively spliced forms of the human IL-15R α chain have been cloned 4.
Ligands and associated molecules
The IL-15R α chain binds IL-15 with high affinity (Kd ~ 10 pM) 4,5. However, the interaction between the IL-15R α chain and CD122/CD132 is required to form a functional high-affinity IL-15R capable of mediating signal transduction 3-6. High concentrations of IL-15 (450 ng/ml) can bind to, and signal through, a complex of CD122/CD132 in the absence of the IL-15R α chain 4.
Function
IL-15 shares biological activities with IL-2, such as the activation and proliferation of T cells, generation of cytotoxic T cells and lymphokine-activated killer (LAK) cells, and the proliferation of NK cells and B cells 3-5,7,8.
Comments
The IL-15R α chain and CD25 genes have a similar intron/exon organization and are closely linked in both human and mouse 4. IL-15R α chain mRNA is expressed in a wide range of cell types, in contrast to CD25 4,5. Since CD122 and CD132 are essential for IL-15 signalling in haematopoietic cell types and their expression patterns are more restricted than that of the IL-15R α chain, the possibility exists that the IL-15R α chain may associate with alternative signalling subunits in other cell types 4.
Database accession numbers
                 PIR   SWISSPROT   EMBL/GENBANK   REFERENCE
Human IL-15R α                     U31628         4
Mouse IL-15R α                     U22339         5
The accession numbers and amino acid sequences of CD122 and CD132 are given in the IL-2R entry.
Amino acid sequence of human IL-15R α chain
MAPRRARGCR ITCPPPMSVE TNVAHWTTPS ASSPSSNNTA KNWELTASAS RQTPPLASVE
TLGLPALLLL HADIWVKSYS LKCIRDPALV ATTAAIVPGS HQPPGVYPQG MEAMEALPVT
LLLRPPATRG LYSRERYICN HQRPAPPSTV QLMPSKSPST HSDTTVAIST WGTSSRDEDL
SGFKRKAGTS TTAGVTPQPE GTTEISSHES STVLLCGLSA ENCSHHL
SLTECVLNKA SLSPSGKEPA SHGTPSQTTA VSLLACYLKS
-1 50 100 150 200 237
References
1 Shibuya, H. et al. (1990) Nucleic Acids Res. 18, 3697-3703.
2 Noguchi, M. et al. (1993) J. Biol. Chem. 268, 13601-13608.
3 Grabstein, K.H. et al. (1994) Science 264, 965-968.
4 Anderson, D.M. et al. (1995) J. Biol. Chem. 270, 29862-29869.
5 Giri, J.G. et al. (1995) EMBO J. 14, 3654-3663.
6 Giri, J.G. et al. (1994) EMBO J. 13, 2822-2830.
7 Carson, W.E. et al. (1994) J. Exp. Med. 180, 1395-1403.
8 Armitage, R.J. et al. (1995) J. Immunol. 154, 483-490.
Interleukin 17 receptor
Molecular weight
Polypeptide 94 416
SDS-PAGE reduced 120 kDa
Carbohydrate
N-linked sites 8
O-linked unknown
Tissue distribution
Mouse IL-17R mRNA is expressed by a wide range of haematopoietic and non-haematopoietic tissues and cell lines 1.
Structure
The mouse IL-17R is a type I transmembrane protein consisting of a 291 amino acid extracellular domain, a transmembrane domain and a very large (521 amino acids) cytoplasmic tail 1. There are two acidic regions and a Ser-rich region in the cytoplasmic tail, similar to those found in CD122 (IL-2R β chain), CD124 (IL-4R α chain) and the G-CSFR (CD114) 1. The IL-17R sequence is unrelated to previously identified cytokine receptors.
Ligands and associated molecules
The IL-17R binds to IL-17 and HVS13, a herpesvirus saimiri-encoded IL-17 homologue that shows 72% amino acid sequence identity to human IL-17 1-3.
Function
IL-17 has been shown to induce secretion of IL-6 and IL-8 from fibroblasts, upregulate fibroblast expression of CD54 (ICAM-1), activate the transcription factor NF-κB and enhance the proliferative response of purified T cells to suboptimal concentrations of the mitogen PHA 1,3.
Comment
The mouse IL-17R gene maps to chromosome 6 1.
Database accession numbers
               PIR   SWISSPROT   EMBL/GENBANK   REFERENCE
Mouse IL-17R                     U31993         1
Amino acid sequence of mouse IL-17R
MAIRRCWPRV SPRLLDFPAP VSSTQHGELV LSMLQHHRKR VPDCEDSKMK VLLESFSDSE QVQPFFSSCL ILLVGSVIVL VWIVYSADHP SRQKQEMVES FTAAMNMILP FEEVYFRIQD TQCPDWFERE SDGCLVVDVC VVEPLHLPDG TPMMSPDHLQ EEEQRQSVQS KLQRQLFFWE
VPGPALGWLL VCAQEGLSCR PVLHVEWTLQ WRFSFSHFVV MTTSCVSSGS NHSCFDVVKQ NDCLRHAVTV IICMTWRLSG LYVEVVLKFA NSKIIILCSR DFKRPACFGT LEMFEPGRMH NLCLADGQDL VSEEESRMAK SGAAAQLPMT GDAREQLESL DQGYISRSSP LEKNPGWNSL
LLLNVLAPGR VKNSTCLDDS TDASILYLEG DPGQEYEVTV LWDPNITVET IFAPRQEEFH PCPVISNTTV ADQEKHGDDS QFLITACGTE GTQAKWKAIL YVVCYFSGIC HVRELTGDNY PSLDEEVFED LDPQLWPQRE EDSEACPLLG MLSVLQQSLS QPPEWLTEEE EPRRPTPEEQ
A WIHPKNLTPS AELSVLQLNT HHLPKPIPDG LDTQHLRVDF QRANVTFTLS PKPVADYIPL KINGILPVAD VALDLLEEQV GWAEPAVQLR SERDVPDLFN LQSPSGRQLK PLLPPGGGIV LVAHTLQSMV VQRNSILCLP GQPLESWPRP ELELGEPVES NPS
SPKNIYINLS NERLCVKFQF DPNHKSKIIF TLWNESTPYQ KFHWCCHHHV WVYGLITLIA LTPPPLRPRK ISEVGVMTWV CDHWKPAGDL ITSRYPLMDR EAVLRFQEWQ KQQPLVRELP LPAEQVPAAH VDSDDLPLCS EVVLEGCTPS LSPEELRSLR
-1 50 100 150 200 250 300 350 400 450 500 550 600 650 700 750 800 833
References
1 Yao, Z. et al. (1995) Immunity 3, 811-821.
2 Albrecht, J-C. et al. (1992) J. Virol. 66, 5047-5058.
3 Yao, Z. et al. (1995) J. Immunol. 155, 5483-5486.
Integrin β7 subunit
Molecular weights
Polypeptide 84 881
SDS-PAGE
reduced 110 kDa
unreduced 105 kDa
Carbohydrate
N-linked sites 8
O-linked sites unknown
Human gene location and size
12q13.13, ~10 kb 1.
[Diagram: α4 (CD49d)/β7 heterodimer]
Tissue distribution
β7 combines with the α4 subunit (CD49d) to form the α4β7 integrin and with the αE subunit (CD103) to form the αEβ7 integrin (HML-1). Both antigens are expressed on mucosal lymphocytes 2. Adult human lymphocytes have heterogeneous expression of α4β7, whereas expression on newborn lymphocytes is more homogeneous 3,4. α4β7 is also expressed on NK cells and eosinophils 3. αEβ7 is expressed on 95% of intraepithelial lymphocytes but only 1-2% of peripheral blood lymphocytes 2. Its level of expression on lymphocytes can be upregulated by lymphocyte mitogens 5.
Structure
The β7 integrin subunit is most closely related to the β2 integrin subunit (CD18), including the cytoplasmic segment, in which two NPX(Y/F) motifs for potential tyrosine kinase binding are found. Most integrin β subunits have 56 conserved cysteine residues in their extracellular portion; the last two cysteines are missing from the β7 subunit 6,7. The N-terminus has been determined by protein sequencing 2.
Ligands and associated molecules
Like the α4β1 integrin, α4β7 binds CD106 (VCAM-1) and the CS1 region of fibronectin 8,9. In addition, α4β7 binds the mucosal addressin MAdCAM-1 10,11. αEβ7 binds E-cadherin on epithelial cells 12.
Function
The α4β7 integrin mediates the binding of lymphocytes to MAdCAM-1 on the high endothelial venules, thereby directing the homing of lymphocytes into Peyer's patches and the intestinal lamina propria 10,11. The interaction between αEβ7 and E-cadherin may be of importance in the homing and retention of αEβ7-expressing lymphocytes to the intestinal epithelium 12,13.
Database accession numbers
        PIR      SWISSPROT   EMBL/GENBANK   REFERENCE
Human   A40526   P26010      M62880         6
                             M68892         7
Mouse   A46271   P26011      M68903         14
                             M95632         15
Amino acid sequence of human β7 integrin subunit
MVALPMVLVL ELDAKIPSTG TASGEAEARR ATQLAPQRVR RVRQLGHALL LERCQSPFSF LCQEQIGWRN SRSTEFDYPS ELSEDSSNVV GKAEDRGQCN LHTLCDCNCS SPDLESGCRA EGILCGGFGR KCNRCQCLDG CAHTNVTLAL GADHTQAIVL WKQDSNPLYK
LLVLSRGES DATEWRNPHL CARREELLAR VTLRPGEPQQ VRLQEVTHSV HHVLSLTGDA VSRLLVFTSD VGQVAQALSA QLIMDAYNSL HVRINQTVTF DTQPQAPHCS PNGTGPLCSG
CQCGVCHCHA
YYGALCDQCP APILDDGWCK GCVGGIVAVG SAITTTINPR
SMLGSCQPAP GCPLEELEEP LQVRFLRAEG RIGFGSFVDK QAFEREVGRQ DTFHTAGDGK ANIQPIFAVT SSTVTLEHSS WVSLQATHCL DGQGHLQCGV KGHCQCGRCS NRTGRACECS GCKTPCERHR ERTLDNQLFF LGLVLAYRLS FQEADSPTL
SCQKCILSHP RGQQEVLQDQ YPVDLYYLMD TVLPFVSTVP SVSGNLDSPE LGGIFMPSDG SAALPVYQEL LPPGVHISYE PEPHLLRLRA CSCAPGRLGR CSGQSSGHLC GDMDSCISPE DCAECGAFRT FLVEDDARGT VEIYDRREYS
SCAWCKQLNF PLSQGARGEG LSYSMKDDLE SKLRHPCPTR GGFDAILQAA HCHLDSNGLY SKLIPKSAVG SQCEGPEKRE LGFSEELIVE LCECSVAELS ECDDASCERH GGLCSGHGRC GPLATNCSTA VVLRVRPQEK RFEKEQQQLN
-1 50 100 150 200 250 300 350 400 450 500 550 600 650 700 750 779
References
1 Jiang, W.M. et al. (1992) Int. Immunol. 4, 1031-1040.
2 Parker, C.M. et al. (1992) Proc. Natl Acad. Sci. USA 89, 1924-1928.
3 Erle, D.J. et al. (1994) J. Immunol. 153, 517-528.
4 Andrew, D.P. et al. (1996) Eur. J. Immunol. 26, 897-905.
5 Schieferdecker, H.L. et al. (1990) J. Immunol. 144, 2541-2549.
6 Erle, D.J. et al. (1991) J. Biol. Chem. 266, 11009-11016.
7 Yuan, Q. et al. (1990) Int. Immunol. 2, 1097-1118; Corr. (1991) Int. Immunol. 3, 1373-1374.
8 Lobb, R.R. and Hemler, M.E. (1994) J. Clin. Invest. 94, 1722-1728.
9 Kilger, G. et al. (1995) J. Biol. Chem. 270, 5979-5984.
10 Berlin, C. et al. (1993) Cell 74, 185-195.
11 Rott, L.S. et al. (1996) J. Immunol. 156, 3727-3736.
12 Cepek, K.L. et al. (1994) Nature 372, 190-193.
13 Cepek, K.L. et al. (1993) J. Immunol. 150, 3459-3470.
14 Yuan, Q. et al. (1992) J. Biol. Chem. 267, 7352-7358.
15 Hu, M.C.T. (1992) Proc. Natl Acad. Sci. USA 89, 8254-8258.
KIR
Killer cell inhibitory receptor family, CD158
Members
(Modified from refs 1-4; sequences from refs 5-10.) (KIR molecules reactive with the mAbs EB6 and GL183 have been assigned the CD numbers CD158a and CD158b, respectively.)

cDNA clones   Other names a   IgSF      Cytoplasmic   Putative        Reactive      Closely related cDNA clones
                              domains   domain b      ligands c       mAbs
KIR-cl.42     NKAT1, p58.1    2         long          HLA-C group 1   EB6           KIR-cl.47.11 d
KIR-cl.6      NKAT2, p58.2    2         long          HLA-C group 2   GL183         NKAT2a d, NKAT2b d, NKAT2b ΔIg2 e
KIR-cl.43     p58.2           2         long          HLA-C group 2   GL183         NKAT6 d
KIR-cl.49     NKAT5, p50.2    2         short         HLA-C group 2   GL183         NKAT5 ΔIg1/Ig2 e, NKAT5 ΔIg1 e, p183-ActI d
KIR-cl.39     NKAT8           2         short                         5.133
pEB6-ActI     p50.1           2         short         HLA-C group 1   EB6
NKAT7                         2         short
NKAT9                         2         short
KIR-cl.11 f   NKAT3, p70      3         long          HLA-Bw4         DX9, 5.133    NKAT3 ΔIg1 e
KIR-cl.2 f    NKB1, p70       3         long          HLA-Bw4         DX9, 5.133    NKB1B e
NKAT4                         3         long          HLA-A3          5.133         KIR-cl.5 d, NKAT4a d, NKAT4b d
NKAT10                        3         short

a Encode identical protein sequence.
b The additional peptide sequence in KIRs with long cytoplasmic domains contains two copies of the motif V/IXYXXL separated by 24 amino acids. These motifs are involved in recruiting the cytoplasmic tyrosine phosphatases SHP-1 and SHP-2.
c HLA-C group 1 (with Asn77 and Lys80) includes Cw2, Cw4, Cw5 and Cw6 alleles; group 2 (with Ser77, Asn80) includes Cw1, Cw3, Cw7 and Cw8 alleles.
d These clones encode proteins with ≤2 amino acid differences and might, therefore, represent allelic variants or cloning/sequencing errors, rather than transcripts from separate genes.
e These clones are identical except for deletions consistent with alternative splicing.
f KIR-cl.11 and KIR-cl.2 differ by only four amino acid substitutions.
Molecular weights
(KIRs can be grouped into four size groups according to number of IgSF domains and lengths of cytoplasmic domains (see Table). Examples are given from each group.)
Polypeptide
KIR-cl.42 36 253
KIR-cl.49 31 232
KIR-cl.11 46 906
NKAT10 40 573
SDS-PAGE (unreduced and reduced)
KIR-cl.42 (p58) ~58 kDa
KIR-cl.49 (p50) ~50 kDa
KIR-cl.11 (p70) ~70 kDa
NKAT10 unknown
Carbohydrate
N-linked sites
KIR-cl.42 4
KIR-cl.11 4
O-linked
KIR-cl.42 none
KIR-cl.11 none
Human gene location
19q13.4 11
Domains
[Domain diagram, two-domain KIR (KIR-cl.42, p58.1): signal sequence (S), two C2-set IgSF domains (C2), transmembrane region (TM), cytoplasmic tail (CY)]
[Domain diagram, three-domain KIR (KIR-cl.11, p70): signal sequence (S), three C2-set IgSF domains (C2), transmembrane region (TM), cytoplasmic tail (CY)]
Tissue distribution
Individual KIRs are expressed on overlapping subsets of NK cells and T cells 1-3. Expression of specific KIRs varies considerably between individuals. For example, KIR-cl.2 (NKB1) expression on NK cells varies from 0 to >75% between individuals 12. Interestingly, although the expression pattern appears to be genetically determined, it is not correlated with MHC haplotype 12.
Structure
KIRs are a closely related multigene family of transmembrane glycoproteins having either two or three C2-set IgSF domains in the extracellular portion. The leader and extracellular portion of the two-domain KIRs show a high degree of similarity (>74%) with the leader and two membrane-proximal domains of the three-domain KIRs. KIRs either have a longer cytoplasmic domain (76-84 amino acids) containing two V/IXYXXL motifs or a truncated cytoplasmic domain (27-39 amino acids) lacking this motif. Molecules lacking the V/IXYXXL motifs have, within their transmembrane region, a charged residue in the consensus sequence KXPXTI.
Ligands and associated molecules
Individual KIRs bind directly to specific MHC class I alleles 1-3. KIRs have been identified which bind to HLA-A, -B and -C alleles 4. Binding specificity correlates with sequence similarity in the extracellular region of the KIRs, including the number of IgSF domains. Binding involves MHC Class I residues (77-83) in the C-terminal end of the α1-domain α helix, adjacent to the peptide binding groove, but does not require glycosylation of the adjacent N-glycan site 13. Although KIRs do not recognize specific peptides, the nature of the peptide in the groove does affect binding 14. KIRs with two IgSF domains contain a zinc-binding motif (HExxH) at their N-terminus and have been reported to bind zinc affinity columns and to require zinc for target recognition 15. Most KIRs possess V/IXYXXL motifs in their cytoplasmic domain which, when phosphorylated, mediate association with and activation of the cytoplasmic tyrosine phosphatases SHP-1 and SHP-2 16-18. Ligation of KIRs leads to recruitment and activation of SHP-1, and catalytically active SHP-1 is required for inhibition of NK cell killing 16-18.
Function
KIRs on NK cells and cytotoxic T lymphocytes are involved in recognition of MHC Class I molecules on target cells. Ligation of KIRs generally inhibits killing by NK or T cells, thereby protecting target cells which express autologous MHC class I molecules 1-3. Since targets lacking suitable MHC Class I molecules are susceptible to killing, KIRs enable NK cells to recognize the 'absence of self' on cells. NK cells can express multiple, functionally independent KIRs 1-3. The precise role of KIRs on T cells is not clear, but the recent observation that KIRs are preferentially expressed on memory cells is consistent with a role in limiting and/or terminating activation through the T cell receptor 2. Preliminary reports suggest that KIRs lacking the cytoplasmic V/IXYXXL motif might transmit activation signals 1,3.
Database accession numbers for human KIR cDNA clones cDNA
PIR
KIR-cl.42/NKAT1 KIR-cl.6/NKAT2 KIR-cl.43 KIR-cl.49/NKAT5 KIR-cl.39/NKAT8 p183-ActI pEB6-ActI NKAT7 NKAT9 KIR-cl.2/NKB1 KIR-cl.11/NKAT3 NKAT4 NKAT10
SWISSPROT
EMBL/GENBANK
REFERENCE
P43626 P43628 P43627 P43631 P43632
U24076/L41267 U24074/L41268 U24075 U24079/L41347 U24077/L76671 X89893 X89892
5,6 5,6 5 5,6 5,10 9 9
L76672 U31416 U30274/L41269 L41270 L76661
10 7,8 6,8 6 10
L76670
P43629 P43630
Amino acid sequence of KIR-cl.42 (NKAT1)
MSLLVVSMAC HEGVHRKPSL LIGEHHDGVS IIGLYEKPSL PAGPKVNGTF GNPSNSWPSP KNAAVMDQES PKTPPTDIIV
VGFFLLQGAW LAHPGPLVKS KANFSISRMT SAQPGPTVLA QADFPLGPAT TEPSSKTGNP AGNRTANSED YTELPNAESR
P EETVILQCWS QDLAGTYRCY GENVTLSCSS HGGTYRCFGS RHLHILIGTS SDEQDPQEVT SKVVSCP
DVMFEHFLLH GSVTHSPYQV RSSYDMYHLS FHDSPYEWSK VVIILFILLF YTQLNHCVFT
REGMFNDTLR SAPSDPLDIV REGEAHERRL SSDPLLVSVT FLLHRWCSNK QRKITRPSQR
-1 50 100 150 200 250 300 327
Amino acid sequence of KIR-cl.11 (NKAT3)
MSLMVVSMAC HMGGQDKPFL HGRIFQESFN RKPSLLAHPG HDGVSKANFS EKPSLSAQPG VNRTFQADFP SWPSPTEPSS VMDQEPAGNR PTDTILYTEL
VGLFLVQRAG SAWPSAVVPR MSPVTTAHAG PLVKSGERVI IGPMMLALAG PKVQAGESVT LGPATHGGTY KSGNPRHLHI TANSEDSDEQ PNAKPRSKVV
P GGHVTLRCHY NYTCRGSHPH LQCWSDIMFE TYRCYGSVTH LSCSSRSSYD RCFGSFRHSP LIGTSVVIIL DPEEVTYAQL SCP
RHRFNNFMLY SPTGWSAPSN HFFLHKEGIS TPYQLSAPSD MYHLSREGGA YEWSDPSDPL FILLLFFLLH DHCVFTQRKI
KEDRIHIPIF PVVIMVTGNH KDPSRLVGQI PLDIVVTGPY HERRLPAVRK LVSVTGNPSS LWCSNKKNAA TRPSQRPKTP
-1 50 100 150 200 250 300 350 400 423
Comparison of KIR-cl.42 (NKAT1) and KIR-cl.49 (NKAT5, p183-ActI) transmembrane and cytoplasmic domains

KIR-cl.42  ILIGTSVVII LFI-LLFFLL HRWCSNKKNA AVMDQESAGN RTANSEDSDE  30
KIR-cl.49  VLIGTSVVKI PFTILLFFLL HRWCSNKKNA AVMDQEPAGN RTVNSEDSDE  30

KIR-cl.42  QDPQEVTYTQ LNHCVFTQRK ITRPSQRPKT PPTDIIVYTE LPNAESRSKV  80
KIR-cl.49  QDHQEVSYA-                                              39

KIR-cl.42  VSCP                                                    84
KIR-cl.49

Note: Bold, I/VXYXXL motifs; double underlined, KXPXTI motif.
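The bold and underline highlighting referred to in the note does not survive in plain text, so the V/IXYXXL motifs can instead be located programmatically. The sketch below is illustrative only: the function name `find_motifs` is hypothetical, and the two input strings are the cytoplasmic portions of the alignment above (from HRWCSNKKNA onwards), numbered from the first cytoplasmic residue as in the alignment margin.

```python
import re

# V/IXYXXL: Val or Ile, any residue, Tyr, any two residues, Leu.
# A lookahead is used so that overlapping candidates would also be reported.
MOTIF = re.compile(r"(?=([VI].Y..L))")


def find_motifs(sequence):
    """Return (1-based start position, matched residues) for each motif."""
    return [(m.start() + 1, m.group(1)) for m in MOTIF.finditer(sequence)]


# Cytoplasmic domains copied block by block from the alignment above.
kir_cl42 = (
    "HRWCSNKKNA AVMDQESAGN RTANSEDSDE QDPQEVTYTQ LNHCVFTQRK "
    "ITRPSQRPKT PPTDIIVYTE LPNAESRSKV VSCP"
).replace(" ", "")
kir_cl49 = "HRWCSNKKNA AVMDQEPAGN RTVNSEDSDE QDHQEVSYA".replace(" ", "")

long_form = find_motifs(kir_cl42)
short_form = find_motifs(kir_cl49)

# Number of residues between the end of the first motif and the start
# of the second (each motif is six residues long).
spacer = long_form[1][0] - (long_form[0][0] + 6)

print(long_form, spacer)
print(short_form)
```

For these two sequences the search reports two motifs in the long cytoplasmic domain of KIR-cl.42 and none in the truncated domain of KIR-cl.49, and the computed spacer agrees with the 24 residue separation given in footnote b of the Members table.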
References
i
9
1 z 3 4 s 6 7 8 9 lo 11 12 la 14 is 16 17 18
Moretta, A. et al. (1996) Annu. Rev. Immunol. 14, 619-648. Lanier, L.L. and Phillips, J.H. (1996) Immunol. Today 17, 86-91. Colonna, M. (1996) Curr. Opin. Immunol. 8, 101-107. D6hring, C. et al. (1996) J. Immunol. 156, 3098-3101. Wagtmann, N. et al. (1995) Immunity 2, 439-449. Colonna, M. and Samaridis, J. (1995) Science 268, 405-408. D'Andrea, A. et al. (1995) J. Immunol. 155, 2306-2310. Wagtmann, N. et al. (1995) Immunity 3, 801-809. Biassoni, R. et al. (1996)J. Exp. Med. 183, 645-650. D6hring, C. et al. (1996) Immunogenetics 44, 227-230. Baker, E.A. et al. (1995) Chromosome Res. 3, 511-512. Gumperz, J.E. et al. (1996) J. Exp. Med. 183, 1817-1827. Gumperz, J.E. and Parham, P. (1995) Nature 378, 245-248. Peruzzi, M. et al. (1996) J. Exp. Med. 184, 1585-1590. Rajagopalan, S. et al. (1995) J. Immunol. 155, 4143-4146. Burshtyn, D.N. et al. (1996) Immunity 4, 77-85. Campbell, K.S. et al. (1996)J. Exp. Med. 184, 93-100. Fry, A.M. et al. (1996) J. Exp. Med. 184, 295-300.
L1
NCAM L1
Molecular weights
Polypeptide 137 731
SDS-PAGE reduced 200 kDa
Carbohydrate
N-linked sites 20
O-linked sites unknown
Human gene location and size
Xq28; ~16 kb 1
Domains and exon boundaries (diagram): six IgSF C2-set domains followed by five fibronectin type III (F3) domains, a transmembrane region (TM) and a cytoplasmic region (CY).
Tissue distribution
L1 is expressed on cell bodies of post-mitotic neurons and on axons of post-migratory neurons. It is also expressed on Schwann cells 2. L1 is present on lymphoid and granulocyte precursor cells in the bone marrow, mature T cells in the thymus and both B and T cells in the spleen 3. L1 is also expressed on peripheral blood monocytes, B lymphocytes, and CD4+ T lymphocytes, but not CD8+ T lymphocytes 4. Although about 10-20% of the peripheral blood lymphocytes express both L1 and CD31, the majority express only one or the other 4.
Structure
The extracellular portion of L1 is organized into six Ig C2-like domains followed by five fibronectin type III domains 5. Two peptide sequences, YEGHH and RSLE (both shown in bold in the sequence below), are coded for by exons 2 and 27 of the L1 gene and appear to be specifically spliced out in haematopoietic cells 4,6,7.
Ligands and associated molecules
L1 mediates homotypic adhesion 8. L1 has been shown to bind the integrin CD51/CD61 (αVβ3) 4,9. L1 binding to the integrin CD49e/CD29 (α5β1) has been demonstrated in the mouse but not in humans 9,10. One possible explanation is that while there are two RGD motifs (RGDG and RGDS) in the sixth IgSF domain of mouse L1, only one of these (RGDG) is found in the human 4,6,11.
Function
L1 mediates homotypic adhesion between neural cells, which can be synergistically enhanced if L1 is associated with NCAM (CD56) on the membrane 8. It also mediates neurite outgrowth on the ligand axonin-1 12. Activated B lymphocytes can undergo L1-dependent homotypic aggregation 13. Since CD31 and L1 are expressed on different subsets of lymphocytes, it is proposed that CD51/CD61-L1 interaction may serve as an alternative to CD51/CD61-CD31 interaction for lymphocyte arrest and initiation of migration on endothelial cells 4. Mutations in L1 are a feature of several X-linked neurological syndromes including hydrocephalus (HSAS), MASA syndrome, spastic paraparesis (SP1) and corpus callosum agenesis (ACC) 14.
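The RGD motifs discussed under ligands can be located with a simple scan. This is a minimal sketch, assuming only the ten-residue block SITWRGDGRD as printed in the human L1 sequence of this entry; the helper name `rgd_motifs` is hypothetical, and positions are 1-based within whatever fragment is supplied, not within the full-length protein.

```python
import re


def rgd_motifs(sequence: str) -> list[tuple[int, str]]:
    """Return (1-based position, RGD plus the following residue) for each hit."""
    # A lookahead lets overlapping or adjacent motifs all be reported.
    return [(m.start() + 1, m.group(1))
            for m in re.finditer(r"(?=(RGD[A-Z]))", sequence)]


# Ten-residue block taken from the human L1 sequence as printed in this entry.
HUMAN_L1_FRAGMENT = "SITWRGDGRD"

hits = rgd_motifs(HUMAN_L1_FRAGMENT)
for position, motif in hits:
    print(position, motif)
```

Running the same function over a full-length sequence would list every RGD tetrapeptide, so human and mouse sequences could be compared side by side.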
Database accession numbers
        PIR      SWISSPROT   EMBL/GENBANK   REFERENCE
Human   A41060   P32004      X59847         5
Mouse   S05479   P11627      X12875         11
Amino acid sequence of human L1 antigen
MVVVLRYVWP IQIPEEYEGH TRDGVHFKPK LGTAMSHEIR YWMNSKILHI IQKEPIDLRV PTPTIKSVRP SARHAYYVTV GIPVEELAKD YVVQLPAKIL QDERFFPYAN ITQGPRSTIE FIEDGRLVIH LSDLHLVTQS GNQTSTTLKL DVKGEGNETT DPFLVVSNTS GIEILNSSAV HVVVPANTTS HPEALHLECQ LRDPELRTHN FGNISATAGE YNQSSYTQWD EGWFIGFVSA DETFGEYRSL NEDGSFIGQY
LLLCSPCLL HVMEPPVITE EELGVTVYQS LMAEGAPKWP KQDERVTMGQ KATNSMIDRK SGPMPADRVT EAAPYWLHKP QKYRIQRGAL TADNQTYMAV GTLGIRDLQA KKGSRVTFTC SLDYSDQGNY QVRVSWSPAE SPYVHYTFRV NMVITWKPLR TFVPYEIKVQ LVKWRPVDLA VILSGLRPYS SNTSLLLRWQ LTDLSPHLRY NYSVVSWVPK LQPDTDYEIH IILLLLVLLI ESDNEEKAFG SGKKEKEAAG
QSPRRLVVFP PHSGSFTITG KETVKPVEVE NGNLYFANVL PRLLFPTNSS YQNHNKTLQL QSHLYGPGET ILSNVQPSDT QGSTAYLLCK NDTGRYFCLA QASFDPSLQP SCVASTELDV DHNAPIEKYD TAINKYGPGE WMDWNAPQVQ AVNSQGKGPE QVKGHLRGYN SYHLEVQAFN PPLSHNGVLT RFQLQATTKE EGQCNFRFHI LFKERMFRHQ LCFIKRSKGG SSQPSLNGDI GNDSSGATSP
TDDISLKCEA NNSNFAQRFQ EGESVVLPCN TSDNHSDYIC THLVALQGQP LKVGEEDDGE ARLDCQVEGR MVTQCEARNR AFGAPVPSVQ ANDQNNVTIM SITWRGDGRD VESRAQLLVV IEFEDKEMAP PSPVSETVVT YRVQWRPQGT PQVTIGYSGE VTYWREGSQR GRGSGPASEF GYVLSYHPLD GPGEAIVREG LFKALGEEKG MAVKTNGTGR KYSVKDKEDT KPLGSDDSLA INPAVALE
SGKPEVQFRW GIYRCFASNK PPPSAEPLRI HAHFPGIRTI LVLECIAEGF YRCLAENSLG PQPEVTWRIN HGLLLANAYI WLDEDGTTVL ANLKVKDATQ LQELGDSDKY GSPGPVPRLV EKWYSLGKVP PEAAPEKNPV RGPWQEQIVS DYPQAIPELE KHSKRHIHKD TFSTPEGVPG EGGKGQLSFN GTMALSGISD GASLSPQYVS VRLPPAGFAT QVDSEARPMK DYGGSVDVQF
-1 50 100 150 200 250 300 350 400 450 500 550 600 650 700 750 800 850 900 950 1000 1050 1100 1150 1200 1238
References
1 Rosenthal, A. et al. (1994) EMBL/GenBank accession number Z29373.
2 Schachner, M. (1990) Ciba Foundation Symp. 145, 156-172.
3 Kowitz, A. et al. (1992) Eur. J. Immunol. 22, 1199-1205.
4 Ebeling, O. et al. (1996) Eur. J. Immunol. 26, 2508-2516.
5 Kobayashi, M. et al. (1991) Biochim. Biophys. Acta 1090, 238-240.
6 Reid, R.A. and Hemperly, J.J. (1992) J. Mol. Neurosci. 3, 127-135.
7 Jouet, M. et al. (1995) Mol. Brain Res. 30, 378-380.
8 Kadmon, G. et al. (1990) J. Cell Biol. 110, 193-208.
9 Montgomery, A.M.P. et al. (1996) J. Cell Biol. 132, 475-485.
10 Ruppert, M. et al. (1995) J. Cell Biol. 131, 1881-1891.
11 Moos, M. et al. (1988) Nature 334, 701-703.
12 Kuhn, T.B. et al. (1991) J. Cell Biol. 115, 1113-1126.
13 Kowitz, A. et al. (1993) Clin. Exp. Metastasis 11, 419-429.
14 http://dnalab-www.uia.ac.be/dnalab/ll.html
Lymphocyte activation gene 3
Molecular weights
Polypeptide 51 296
SDS-PAGE
reduced 70 kDa
unreduced 70 kDa
Carbohydrate
N-linked sites 4
O-linked unknown
Human gene location and size
12p13.3; 6.6 kb 1
Domains and exon boundaries (diagram)
Tissue distribution
LAG-3 is expressed in activated T and NK cells, but not in activated B cells or monocytes 2. Staining of sections identified scattered cells in germinal centres and T cell areas of lymphoid organs but not in non-lymphoid tissue 2. On cultured IL-2-dependent T cells, expression of LAG-3 was 7-fold higher on CD8+ cells than CD4+ cells 2.
Structure
The cDNA encodes a transmembrane glycoprotein consisting of an IgSF V-set domain followed by three IgSF C2-set domains, a transmembrane sequence and a highly charged cytoplasmic domain. The first IgSF domain contains an extra loop of about 30 amino acids and an unusual disulfide bond is proposed between strands B and G (not F) in this domain 1. Based on sequence similarities and intron/exon organization LAG-3 is likely to have shared an immediate ancestor in evolution with CD4. The CD4 and LAG-3 genes are closely linked (see page 141).
Ligands and associated molecules
LAG-3 binds MHC Class II in a cell rosetting assay 3. Binding was inhibited by LAG-3 mAbs and by LAG-3-Ig 3,4. A 45 kDa protein is co-precipitated with LAG-3 from activated T cells under reducing and non-reducing conditions 4.
Function
A role for LAG-3 in downregulating an antigen-specific response is suggested by the effects of LAG-3 mAbs or LAG-3-Ig in prolonging an antigen-specific immune response by an MHC Class II-restricted CD4+ T cell line 5. However, LAG-3-deficient mice display normal MHC Class II-restricted responses 6. Crossing LAG-3-deficient mice with CD4-deficient mice revealed that LAG-3 does not substitute for CD4 6.
Database accession numbers
        PIR      SWISSPROT   EMBL/GENBANK   REFERENCE
Human   S11246   P18627      X51985         1
Mouse                        X98113
Amino acid sequence of human LAG-3
MWEAQFLGLL FLQPLWVAPV VPVVWAQEGA PAQLPCSPTI PLAPGPHPAAPSSWGPRPRR RGDFSLWLRP ARRADAGEYR LRASDWVILN CSFSRPDRPA PQVSPMDSGP WGCILTYRDG GLPCRLPAGV GTRSFLTAKW GTYTCHIHLQ EQQLNATVTL RFVWSSLDTP SQRSFSGPWL ELSSPGAQRS GRAPGALPAG RPRRFSALEQ GIHPRRLRAR
KPLQPGAE PLQDLSLLRR YTVLSVGPGG AAVHLRDRAL SVHWFRNRGQ FNVSIMYNLT TPPGGGPDLL AIITVTPKSF EAQEAQLLSQ HLLLFLTLGV
AGVTWQHQPD LRSGRLPLQP SCRLRLRLGQ GRVPVRESPH VLGLEPPTPL VTGDNGDFTL GSPGSLGKLL PWQCQLYQGE LSLLLLVTGA
SGPPAAAPGH RVQLDERGRQ ASMTASPPGS HHLAESFLFL TVYAGAGSRV RLEDVSQAQA CEVTPVSGQE RLLGAAVYFT FGFHLWRRQW
References
1 Triebel, F. et al. (1990) J. Exp. Med. 171, 1393-1405.
2 Huard, B. et al. (1994) Immunogenetics 39, 213-217.
3 Baixeras, E. et al. (1992) J. Exp. Med. 176, 327-337.
4 Huard, B. et al. (1995) Eur. J. Immunol. 25, 2718-2721.
5 Huard, B. et al. (1994) Eur. J. Immunol. 24, 3216-3221.
6 Miyazaki, T. et al. (1996) Int. Immunol. 8, 725-729.
-1 50 100 150 200 250 300 350 400 450 470
Low-density lipoprotein receptor
Molecular weights
Polypeptide 93 095
SDS-PAGE reduced 160 kDa
Carbohydrate
N-linked sites 5
O-linked + abundant
Human gene location and size
19p13.2-p13.12; >45 kb 1
Domains and exon boundaries (diagram)
Tissue distribution
The LDLR is found in most tissues with highest levels of expression on hepatocytes and in the adrenal cortex 2,3. It is also present on lymphocytes, monocytes and macrophages 4. The LDLR is regulated by cholesterol levels, such that cells incubated in cholesterol-free media upregulate LDLR expression and addition of LDL or cholesterol can lead to a decrease in receptor expression 4.
Structure
The mature LDLR protein is composed of several functional domains 5. The N-terminal 288 amino acids of the protein contain seven Cys-rich repeats called LDLR domains. This region of the protein was shown to be important for the ligand binding function of the receptor by mutagenesis studies 6. The structure of the first LDLR domain has been determined by NMR 7. The next 350 amino acids show 33% homology to the EGF precursor and include three EGF domains. The intron/exon boundaries for the EGF domains are preserved between the EGF precursor and the LDLR 8. The first EGF domain participates in ligand binding, and it has also been shown that this entire region is important for the dissociation of ligand from receptor in an intracellular acidic compartment 9. The membrane-proximal region contains sites for O-linked carbohydrate attachment 5. The cytoplasmic domain contains a short motif, based around Tyr807, which serves to localize the receptor into coated pits and to mediate internalization 10. This short motif is postulated to form a tight turn, similar to a sequence in the intracellular domain of the transferrin receptor (CD71) 11. The N-terminal protein sequence has been determined for the purified bovine receptor, which is very similar to the human protein 5.
Ligands and associated molecules
The LDLR binds apoB-100- and apoE-containing lipoprotein particles. The receptor-ligand complex is internalized in coated pits, and the ligand dissociates from the receptor in acidic endosomes. The free LDLR is then recycled to the cell surface 2.
Function
The LDLR delivers cholesterol-containing lipoproteins to cells for use in membrane biogenesis, synthesis of bile acids and steroid hormone synthesis 2,3. The ability of receptor levels to be increased on activated macrophages may contribute to foam cell formation and generation of atherosclerotic plaques 12. Transgenic mice overexpressing human LDLR clear intravenously injected LDL 8-10 times more rapidly than normal mice, while their plasma concentrations of apoB-100 and apoE are reduced by greater than 90% 13.
Comments
Naturally occurring mutations which affect human LDLR expression, ligand binding, or internalization lead to familial hypercholesterolaemia, which is characterized by high blood cholesterol levels and myocardial infarction (or atherosclerosis) early in life 2,3. These are reviewed in OMIM entry 143890 (see Chapter 1 for methods to access OMIM).
Database accession numbers
        PIR      SWISSPROT   EMBL/GENBANK    REFERENCE
Human   A01383   P01130      L00336-L00352   1,5
Mouse   JN0461   P35951      Z19521          14
Rat     S03430   P35952      X13722          15
Amino acid sequence of human LDL receptor
MGPWGWKLRW AVGDRCERNE DFSCGGRVNR ISRQFVCDSD DCEDGSDEWP CKDKSDEENC NVTLCEGPNK NGGCSHVCND EGGYKCQCEE LIPNLRNVVA DIQAPDGLAV IVVDPVHGFM SGRLYWVDSK DIINEAIFSA LSNGGCQYLC QETSTVRLKV LGDVAGRGNE NFDNPVYQKT
TVALLLAAAG FQCQDGKCIS CIPQFWRCDG RDCLDGSDEA QRCRGLYVFQ AVATCRPDEF FKCHSGECIT LKIGYECLCP GFQLDPHTKA LDTEVASNRI DWIHSNIYWT YWTDWGTPAK LHSISSIDVN NRLTGSDVNL LPAPQINPHS SSTAVRTQHT KKPSSVRALS TEDEVHICHN
T YKWVCDGSAE QVDCDNGSDE SCPVLTCGPA GDSSPCSAFE QCSDGNCIHG LDKVCNMARD DGFQLVAQRR CKAVGSIAYL YWSDLSQRMI DSVLGTVSVA IKKGGLNGVD GGNRKTILED LAENLLSPED PKFTCACPDG TTRPVPDTSR IVLPIVLLVF QDGYSYPSRQ
CQDGSDESQE QGCPPKTCSQ SFQCNSSTCI FHCLSGECIH SRQCDREYDC CRDWSDEPIK CEDIDECQDP FFTNRHEVRK CSTQLDRAHG DTKGVKRKTL IYSLVTENIQ EKRLAHPFSL MVLFHNLTQP MLLARDMRSC LPGATPGLTT LCLGVFLLWK MVSLEDDVA
TCLSVTCKSG DEFRCHDGKC PQLWACDNDP SSWRCDGGPD KDMSDEVGCV ECGTNECLDN DTCSQLCVNL MTLDRSEYTS VSSYDTVISR FRENGSKPRA WPNGITLDLL AVFEDKVFWT RGVNWCERTT LTEAEAAVAT VEIVTMSHQA NWRLKNINSI
-1 50 100 150 200 250 300 350 400 450 500 550 600 650 700 750 800 839
References
1 Sudhof, T.C. et al. (1985) Science 228, 815-822.
2 Brown, M.S. and Goldstein, J.L. (1986) Science 232, 34-47.
3 Soutar, A.K. and Knight, B.L. (1990) Br. Med. Bull. 46, 891-916.
4 Cuthbert, J.A. et al. (1989) J. Biol. Chem. 264, 1298-1304.
5 Yamamoto, T. et al. (1984) Cell 39, 27-38.
6 Esser, V. et al. (1988) J. Biol. Chem. 263, 13282-13290.
7 Daly, N.L. et al. (1995) Proc. Natl Acad. Sci. USA 92, 6334-6338.
8 Sudhof, T.C. et al. (1985) Science 228, 893-895.
9 Davis, C.G. et al. (1987) Nature 326, 760-765.
10 Davis, C.G. et al. (1987) J. Biol. Chem. 262, 4075-4082.
11 Collawn, J.F. et al. (1991) EMBO J. 10, 3247-3253.
12 Griffith, R.L. et al. (1988) J. Exp. Med. 168, 1041-1059.
13 Hoffman, S.L. et al. (1988) Science 239, 1277-1281.
14 Hoffer, M.J.V. et al. (1993) Biochem. Biophys. Res. Commun. 191, 880-886.
15 Lee, L.Y. et al. (1989) Nucleic Acids Res. 17, 1259-1260.
LPAP
Lymphocyte phosphatase-associated phosphoprotein
Other names
CD45-AP (CD45-associated protein) (mouse)
LSM-1 (mouse)
Molecular weights
Polypeptide 21 196
SDS-PAGE reduced 29, 32 kDa (resting T cells); 30, 31 kDa (activated T cells)
Carbohydrate
N-linked sites nil
O-linked nil
Tissue distribution
The expression of LPAP is restricted to resting and activated B and T lymphocytes and to lymphocyte cell lines 1.
Structure
The LPAP molecule has a short extracellular region, a transmembrane region and a large C-terminal cytoplasmic domain 1. The first 50 amino acids of the cytoplasmic region share sequence similarity to WW domains, which have the potential to be involved in protein interaction 3. LPAP is expressed in at least two different isoforms, LPAP29 and LPAP32, in resting T cells. These are replaced by LPAP30 and LPAP31 isoforms in activated T cells. The four isoforms appear to be generated by differential phosphorylation in vivo 1. The discrepancy between the molecular weight of the polypeptide backbone and the apparent molecular weight determined by SDS-PAGE does not result from the attachment of phosphate groups, as in vitro translated material has a similar Mr 1. The N-terminus has been determined by protein sequencing 1.
Ligands and associated molecules
LPAP associates non-covalently with CD45. The LPAP-CD45 interaction is mediated by the transmembrane regions of the two molecules 2,3. An extracellular ligand for LPAP has not been identified.
Function
LPAP was characterized as a CD45-associated molecule that was detected following in vitro kinase reactions of CD45 immunoprecipitates 1,4. LPAP may function as an adapter molecule that links CD45 with other proteins, possibly through interactions with the putative WW domain. LPAP is unlikely to play a major role in linking CD45 to the protein tyrosine kinase Lck, because the LPAP-Lck interaction is weak 2.
Database accession numbers
        PIR      SWISSPROT   EMBL/GENBANK   REFERENCE
Human   A55412               X81422         1
Mouse   A49957               U03856         4
Amino acid sequence of human LPAP
MALPCTLGLG MLLALPGALG                                    -1
SGGSAEDSVG SSSVTVVLLL LLLLLLATGL ALAWRRLSRD SGGYYHPARL   50
GAALWGRTRR LLWASPPGRW LQARAELGST DNDLERQEDE QDTDYDHVAD  100
GGLQADPGEG EQQCGEASSP EQVPVRAEEA RDSDTEGDLV LGSPGPASAG  150
GSAEALLSDL HAFAGSAAWD DSARAAGGQG LHVTAL                 186
References
1 Schraven, B. et al. (1994) J. Biol. Chem. 269, 29102-29111.
2 McFarland, E.D.C. and Thomas, M.L. (1995) J. Biol. Chem. 270, 28103-28107.
3 Bruyns, E. et al. (1995) J. Biol. Chem. 270, 31372-31376.
4 Takeda, A. et al. (1994) J. Biol. Chem. 269, 2357-2360.
ltk
Leucocyte tyrosine kinase
Molecular weights
Polypeptide 90 005
SDS-PAGE 100 kDa
Carbohydrate
N-linked sites 3
O-linked unknown
Human gene location
15q13-21
Tissue distribution
mRNA for human ltk has been found in haematopoietic cell lines, in a neuroblastoma cell line and in placenta 1,2. In the mouse, ltk message is specific to pre-B cell lines and is highest in adult but not embryonic brain 3. Staining of mouse brain with polyclonal antiserum was specific for cerebral neurons 3. Immunohistochemical staining with peptide mAbs of Hofbauer cells in human placenta has been reported but in general the level of expression of ltk is too low to detect with mAbs 2,4.
Structure
The extracellular domain does not show similarities with any known protein 2. The cytoplasmic tyrosine kinase domain is closely related to that of c-ros 4. Alternatively spliced ltk mRNAs predict the existence of receptors which are soluble or lack a cytoplasmic domain 2. The predicted protein sequence of murine ltk begins at a CTG start codon and does not contain an identifiable signal sequence 3. ltk cDNAs that predicted a protein devoid of a large extracellular domain probably represent short or aberrant cDNA clones 2.
Ligands and associated molecules
In transfection studies in COS cells, ltk associated with PLC-γ, PI 3-kinase, GAP and raf-1 and the association was dependent on Met544 in a putative ATP binding site 4. Analysis of a chimeric receptor revealed that in a ligand-dependent manner, the cytoplasmic domain of ltk associates with Shc through Tyr862 and Tyr485, which are both contained within an Shc binding motif 5.
Function
Ltk probably functions as a receptor for an unidentified growth factor. In an in vitro kinase assay in COS cells, ltk had kinase activity and was tyrosine phosphorylated 4. A chimeric receptor containing the cytoplasmic domain of ltk was autophosphorylated in response to ligand 5.
Database accession numbers
        PIR      SWISSPROT   EMBL/GENBANK   REFERENCE
Human   S17452   P29376      D16105         2
Mouse   S00904   P08923      X07984         3
Amino acid sequence of human ltk
MGCWGQLLVW ILCSSPGSQE SWLFSTCGAS GQYLISAYGA ACPGGSPESQ VFRVRAGELE RGGAAGGGGG GGGGACTAGG TENHGEVEIR DLHKPPGPLV LELSKLRTSA GHGAFGEVYE KFRHQNIVRC MRDLLQLAQD MARDIYRASY FSLGYMPYPG RPSFASILER PPQPQELSPE
FGAAGA TFLRSSPLPL GRHGPTQTQC AGGKGAKNHL LVCLGESRAV PLLVAAGGGG WTSRAPSPQA GGGGYRGGDA RHLNCSHCPL LMVAVVATST IRTAPNPYYC GLVIGLPGDS VGLSLRATPR IAQGCHYLEE YRRGDRALLP RTNQEVLDFV LQYCTQDPDV KLKSWGGSPL
ASPSPQDPKV DGAYAGTSVV SRAHGVFVSA EEHAAMDGSE RAYLRPRDRG GRSLQEGAEG SETDNLWADG RDCQWQAELQ LSLLMVCGVL QVGLGPAQSW SPLQVAIKTL LILLELMSGG NHFIHRDIAA VKWMPPEAFL VGGGRMDPPR LNSLLPMELG GPWLSSGLKP
SAPPSILEPA VTVGAAGQLR IFSLGLGESL GVPGSRRWAG RTQASPEKLE GQGCSEAWAT EDGVSFIHPS LAECLCPEGM ILVKQKKWQG PLPPGVTEVS PELCSPQDEL DMKSFLRHSR RNCLLSCAGP EGIFTSKTDS GCPGPVYRIM PTPEEEGTSG LKSRGLQPQN
SPLNSPGTEG GVQLWRVPGP YILVGQQGED GGGGGGGATY NRSEAPGSGG LGWAAAGGFG SELFLQPLAV ELAVDNVTCM LQEMRLPSPE PANVTLLRAL DFLMEALIIS PHLGQPSPLV SRVAKIGDFG WSFGVLLWEI TQCWQHEPEL LGNRSLECLR LWNPTYRS
-1 50 100 150 200 250 300 350 400 450 500 550 600 650 700 750 800 848
References
1 Krolewski, J.J. and Dalla-Favera, R. (1991) EMBO J. 10, 2911-2919.
2 Toyoshima, H. et al. (1993) Proc. Natl Acad. Sci. USA 90, 5404-5408.
3 Bernards, A. and de la Monte, S.M. (1990) EMBO J. 9, 2279-2287.
4 Kozutsumi, H. et al. (1994) Oncogene 9, 2991-2998.
5 Ueno, H. et al. (1995) J. Biol. Chem. 270, 20135-20142.
Ly-6
Sca-1, TAP, MALA-1
Molecular weights
Ly-6A.2 8649
SDS-PAGE
reduced 14-18 kDa
unreduced 10-14 kDa
Carbohydrate
N-linked sites 0
O-linked sites unknown
Mouse locus location and gene sizes
15 band E; ~3 kb (Ly-6A.2/E.1, Ly-6C.1); ~4 kb (Ly-6F.1, Ly-6G.1) 1-3.
Domains and exon boundaries of Ly-6C.1 (diagram)
The Ly-6 genetic complex
The mouse Ly-6 locus spans approximately 630 kb and contains at least 18 genes 1,2. Hybridization studies show that several of these genes are related to Ly-6. The Ly-6 lymphoid antigens were originally characterized with alloantisera, but subsequent cDNA cloning showed that they were the products of two genes, namely Ly-6A/E and Ly-6C 4,5. Recently two further genes (Ly-6F and Ly-6G) have been identified in the region which are clearly related to Ly-6A/E and Ly-6C 3. The mouse thymocyte B cell antigen (ThB) also has sequence homology (~25%) to the Ly-6A/E and Ly-6C molecules and its gene has been mapped to chromosome 15, raising the possibility that it too may lie within the Ly-6 complex 6. The gene encoding the antigen Ly-6B has yet to be characterized. No human Ly-6 homologues have been identified to date.
Tissue distribution
The four mouse Ly-6 genes have different expression patterns, as do the alleles of the Ly-6A/E gene 3-5. Ly-6A/E is expressed on haematopoietic stem cells 7. The allele Ly-6A.2 is also present on 30% of CD4-/CD8- thymocytes, 50-70% of peripheral lymphocytes, and on all activated B and T cells. In contrast, the allele Ly-6E.1 is expressed on only 10-15% of peripheral lymphocytes and on activated T and B cells. Ly-6C is expressed on bone marrow cells, monocytes, neutrophils and 50% of peripheral CD8+ T cells. Ly-6G.1 is expressed in the bone marrow. Transfection studies suggest that the Ly-6G.1 cDNA does not encode a protein that is recognized by anti-Ly-6B antibodies s. Ly-6F.1 mRNA is not detected in the bone marrow, spleen or thymus, but is detected in several non-lymphoid tissues including testes 3.
Structure
Ly-6A/E and Ly-6C are both GPI-anchored cell surface glycoproteins 4, as are the predicted proteins encoded by the Ly-6F.1 and Ly-6G.1 genes 3. The Ly-6 gene products share ~50% amino acid sequence identity. The N-termini of Ly-6C.2 and Ly-6E.1 have been confirmed by protein sequencing 8,9. The sites of attachment of the GPI anchors are not known, but they are predicted to be Asn76/79 based on known attachment sites of similar GPI-anchored proteins 4,10. The Ly-6 antigens are similar to CD59 and presumably have a similar fold 10.
Function
The expression patterns of Ly-6A/E, Ly-6C and Ly-6G suggest a role in the development and maturation of lymphocytes 3. Anti-Ly-6 antibodies can activate T cells but this generally requires secondary antibody cross-linking and phorbol esters, and is dependent on TCR/CD3 expression 11. This activation requires the GPI anchor 12. B cell activation can be induced by anti-Ly-6 antibodies in the presence of IL-4 and IFN-γ 13. Ly-6 expression is induced in a complex manner by interferons and TNF 5,14.
Database accession numbers
Entries: Mouse A.2, Mouse E.1, Mouse C.1, Mouse C.2, Mouse F.1, Mouse G.1, Rat A, Rat B, Rat C
PIR: A32506 A31935 A35921 A25708 A45835 D45835
SWISSPROT: P05533 P05533 P09568 P09568 P35460 P35461
EMBL/GENBANK: M18184 J03636 M37707 X04653 M21734 M18466 X70922 X70920 M30692 M30689 M30690
REFERENCE: 15 16 17 18 19 8 3 3 20 20 20
Amino acid sequence of mouse Ly-6 antigens
A.2 MDTSHTTKSC E.1
C.I
C.2
LLILLVALLC
--ST-A --ST-A
F. 1 - - S C . . . . . .
AERAQG
-i -i
G V ..... V-
-i
G
-i -i
G. 1 - - S C . . . . . .
V ..... V-
A.2
FETSCPSITC
PYPDGVCVTQ
EAAVIVDSQT
C. i - Q - - E . . . . .
I ..... AV--
RAS--F-IA-
NIEL-E---R--L-TRQ--S
F. 1 . . . . N - L - - S
LGIA-K
E.I
C.2
LECYQCYGVP
-Q--E .....
G. 1 . . . . N - I - - -
I ..... AV--
P .... NTT--
-i
RAS--F-IA-
A--IS-
-FS--F--AL
G
NIEL-E---R
RKVKNNLCLP
--L-TRQ--S
QVEL ..... R ..... K--F-
-IE ..... HR S---S .....
50 50 50 50
50 50
ICPPNIESME
E.I
ILGTKVNVKT
....................
C. 1 F--AGV...P
I
C. 2 F--AGV...P
F. 1 F--A-L-N
G. 1 ---TTLDNTA.2
E.IA
VAVPNGGSTW
-RDPNIRER -KDPNIRER
......
-T-NA
T---N
.....
TMAGVLLFSL
C.2 A---TA
G.I
79
....
S .....
....
K .....
....
S .....
Y--K
.....
SSVLLQTLL
............................
C. 1 A - - - T A
F.I
SCCQEDLCN
APFST
-V .....
.......
A---T---S
TR---LN-
I
G--F .....
V ......
F-
79
76 76
79 79
+29 +29 +29 +29 +29 +29
Dashes represent residues identical to that in the top sequence; dots represent a gap in alignment.
References
1 LeClair, K.P. et al. (1987) Proc. Natl Acad. Sci. USA 84, 1638-1643.
2 Kamiura, S. et al. (1992) Genomics 12, 89-105.
3 Fleming, T.J. et al. (1993) J. Immunol. 150, 5379-5390.
4 Shevach, E.M. and Korty, P.E. (1989) Immunol. Today 10, 195-200.
5 Rock, K.L. et al. (1989) Immunol. Rev. 111, 195-224.
6 Gumley, T.P. et al. (1992) J. Immunol. 149, 2516-2518.
7 van de Rijn, M. et al. (1989) Proc. Natl Acad. Sci. USA 86, 4634-4638.
8 Palfree, R.G. et al. (1988) J. Immunol. 140, 305-310.
9 Reiser, H. et al. (1987) Proc. Natl Acad. Sci. USA 84, 3370-3374.
10 Kieffer, B. et al. (1994) Biochemistry 33, 4471-4482.
11 Sussman, J.J. et al. (1988) J. Immunol. 140, 2520-2526.
12 Su, B. et al. (1991) J. Cell Biol. 112, 377-384.
13 Codias, E.K. and Malek, T.R. (1990) J. Immunol. 144, 2197-2204.
14 Malek, T.R. et al. (1989) J. Immunol. 142, 1929-1936.
15 Palfree, R.G. et al. (1987) Immunogenetics 26, 389-391.
16 Reiser, H. et al. (1988) Proc. Natl Acad. Sci. USA 85, 2255-2259.
17 Khan, K.D. et al. (1990) Mol. Cell. Biol. 10, 5150-5159.
18 LeClair, K.P. et al. (1986) EMBO J. 5, 3227-3234.
19 Bothwell, A.L.M. et al. (1988) J. Immunol. 140, 2815-2820.
20 Friedman, S. et al. (1990) Immunogenetics 31, 104-111.
Ly-9
Molecular weights
Polypeptide 67 278
SDS-PAGE reduced 90-120 kDa
Carbohydrate
N-linked sites 8
O-linked unknown
Domains (diagram): four IgSF domains in the order V, C2, V, C2, followed by a transmembrane region (TM) and a cytoplasmic region (CY).
Tissue distribution
Expressed on mouse thymocytes and mature T and B cells 1.
Structure
Ly-9 is an IgSF domain-containing glycoprotein with structural features which place it within the CD2 family, which includes CD48, CD58, 2B4 and CD150 1,2. Ly-9 differs from the other CD2 family members in having four rather than two IgSF domains. Domains 1 and 3 are very similar to each other, as are domains 2 and 4, suggesting that Ly-9 arose from a progenitor with one V and one C2 domain, such as CD48 1. The Ly-9 locus lies within 1100 kb of the CD48 gene on mouse chromosome 1 3. It is possible that Ly-9 and CD48 arose by gene duplication from a common ancestor with a second duplication step leading to the four domain structure of Ly-9. The gene for a putative human homologue of Ly-9 lies within 410 kb of the CD48 locus in human chromosome 1q22 3,4.
Function Unknown.
Database accession numbers
        PIR      SWISSPROT   EMBL/GENBANK   REFERENCE
Mouse   A46500   Q01965      M84412         1
Human                        L42621         4
Amino acid sequence of mouse Ly-9
MSQQQIFSPI LWIPLLFLIM GLGASG                             -1
KETPPTVISG MLGGSVTFSL NISKDAEIEH IIWNCPPKAL ALVFYKKDIT   50
ILDKGYNGRL KVSEDGYSLY MSNLTKSDSG SYHAQINQKN VILTTNKEFT  100
LHIYEKLQKP QIIVESVTPS DTDSCTFTLI CTVKGTKDSV QYSWTREDTH  150
LNTYDGSHTL RVSQSVCDPD LPYTCKAWNP VSQNSSQPVR IWQFCTGASR  200
RKTAAGKTVV GILGEPVTLP LEFRATRATK NVVWVLNTSV ISQERRGAAT  250
ADSRRKPKGS EERRVRTSDQ DQSLKISQLK MEDAGPYHAY VCSEASRDPS  300
VRHFTLLVYK RLEKPSVTKS PVHMMNGICE VVLTCSVDGG GNNVTYTWMP  350
LQNKAVMSQG KSHLNVSWES GEHLPNFTCT AHNPVSNSSS QFSSGTICSG  400
PERNKRFWLL LLLVLLLLML IGGYFILRKK KQCSSLATRY RQAEVPAEIP  450
EPPTGHGQFS VLSQRYEKLD MSAKTTRHQP TPTSDTSSES SATTEEDDEK  500
TRMHSTANSR NQLYDLVTHQ DIAHALAYEG QVEYEAITPY DKVDGSMDEE  550
DMAYIQVSLN VQGETPLPQK KEDSNTIYCS VQKPKKTAQT PQQDAESPES  600
PYL                                                     603
References
1 Sandrin, M.S. et al. (1992) J. Immunol. 149, 1636-1641.
2 Davis, S.J. and van der Merwe, P.A. (1996) Immunol. Today 17, 177-187.
3 Kingsmore, S.F. et al. (1995) Immunogenetics 42, 59-62.
4 Sandrin, M.S. et al. (1996) Immunogenetics 43, 13-19.
Members
Mouse: Ly-49A (A1, YE1/48), Ly-49B, Ly-49C (5E6), Ly-49D, Ly-49E, Ly-49F, Ly-49G (LGL-1), Ly-49H
Rat: rLy-49.9, rLy-49.12, rLy-49.29
Molecular weights (Ly-49A)
Polypeptide 30 648
SDS-PAGE
reduced 44-50 kDa
unreduced 85-95 kDa
Carbohydrate
N-linked sites 3
O-linked unknown
Gene location and size
Mouse: chromosome 6 1; ~19 kb 2
Rat: chromosome 4 3
Domain and exon boundaries (diagram)
Ly-49 and the NK gene complex
The NK gene complex (NKC) is a chromosomal region containing several genes and multigene families which encode cell surface C-type lectins expressed on NK cells 4,5. It was first identified in the mouse (chromosome 6) 4 and subsequently in humans (chromosome 12) 4 and the rat (chromosome 4) 3. Ly-49, NKR-P1 and CD69 genes have been identified in the mouse NKC. NKR-P1, CD69, CD94 and NKG2 genes have been identified in the human NKC. Ly-49 and NKR-P1 genes have been identified in the rat NKC 3,5. cDNA clones have been obtained corresponding to eight mouse and three rat Ly-49 genes, but no human Ly-49 homologues have been identified 3,4,6. In the mouse, multiple alleles have been identified for some genes (Ly-49A, Ly-49C 7, and Ly-49G 8) and mice heterozygous for Ly-49A and Ly-49C may undergo allelic exclusion at these loci 7. Four probable splice variants of Ly-49G have been identified 9,10, one of which (Ly-49G2) corresponds to LGL-1 8.
Tissue distribution
Ly-49A is expressed at low levels on mouse thymocytes and T cells and at higher levels on 15-20% of NK cells. Ly-49C is expressed on a similar proportion (~26%) of NK cells, with ~5% of cells expressing both Ly-49A and C 10. Ly-49G (LGL-1) is expressed on ~40% of NK cells with 8% expressing both Ly-49G and Ly-49A 8. Because of a lack of suitable reagents, the expression pattern of other mouse and rat Ly-49 molecules is not known.
Structure
Proteins within the rat and mouse Ly-49 gene families share 62-90% 3 and 48-91% 9 amino acid identity, respectively, and there is 46-81% identity between the rat and mouse Ly-49 families 3. Sequence comparisons suggest that all members have a similar overall structure 9,10. They are type II transmembrane proteins with no signal peptide and an extracellular C-type lectin domain. Ly-49A, Ly-49C and Ly-49G (LGL-1) have been shown to be expressed as disulfide-linked homodimers 4,8. The cytoplasmic regions show some heterogeneity which may reflect different signalling functions. For example, ITIM motifs ((I/V)XYXX(L/V)) are found in mouse Ly-49A, C, F, G1, and G4 and rat Ly-49.9 11, and putative Gi-binding motifs are found in mouse Ly-49A and D and rat Ly-49.9 3.
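The ITIM consensus (I/V)XYXX(L/V) translates directly into a regular expression. The sketch below is illustrative only: it scans the first 20 residues of mouse Ly-49A as printed in this entry, the names `ITIM` and `find_itims` are hypothetical, and a pattern match indicates a candidate motif rather than demonstrated function.

```python
import re

# (I/V)XYXX(L/V): X is any residue. The lookahead reports overlapping hits too.
ITIM = re.compile(r"(?=([IV].Y..[LV]))")


def find_itims(sequence: str) -> list[tuple[int, str]]:
    """Return (1-based start position, matched six-residue motif) pairs."""
    return [(m.start() + 1, m.group(1)) for m in ITIM.finditer(sequence)]


# N-terminal (cytoplasmic) residues 1-20 of mouse Ly-49A, as printed in this entry.
LY49A_N_TERMINUS = "MSEQEVTYSMVRFHKSAGLQ"

itims = find_itims(LY49A_N_TERMINUS)
for position, motif in itims:
    print(position, motif)
```

The same function can be applied to any other family member's cytoplasmic region to check for the motif.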
Ligands and associated molecules
Ly-49A has been shown to bind to the MHC Class I alleles H-2Dd and H-2Dk 5, whereas Ly-49C interacts with the H-2D b, d, k and s alleles 10. Preliminary evidence suggests that Ly-49G (LGL-1) interacts with H-2Dd and H-2Ld 8. Ly-49A and Ly-49C have both been shown to interact with carbohydrates, presumably through their C-type lectin domains, and there is suggestive evidence that their interaction with MHC Class I molecules involves carbohydrate determinants 13,14. Peptides comprising the phosphorylated ITIM motif of Ly-49A bind to the cytoplasmic tyrosine phosphatases SHP-1 (PTP1C) and SHP-2 (PTP1D, SYP) 11, suggesting a mechanism for the inhibitory effect of this receptor.
Function
Several members of the Ly-49 family (Ly-49A, Ly-49C, and Ly-49G) have been implicated in the recognition by murine NK cells of MHC Class I molecules on target cells 6,8. Of these, the role of Ly-49A is the best documented 6. In general, ligation of Ly-49 members by MHC Class I molecules inhibits NK cell function, consistent with the notion that Ly-49 molecules are involved in recognition of the 'missing self' on target cells 5. No human Ly-49 homologues have been identified. Instead, CD94/NKG2 receptors and IgSF domain-containing killer inhibitory receptors (KIR) appear to have an equivalent function on human NK cells 5.
Database accession numbers PIR
Mouse LY-49A Mouse LY-49B Mouse LY-49C Mouse LY-49D Mouse LY-49E Mouse LY-49F Mouse LY-49G 1 Mouse LY-49G2 Mouse LY-49G3 Mouse LY-49G4 Mouse LY-49H Rat Ly-49.9 Rat Ly-49.12 Rat Ly-49.29
SWISSPROT
EMBL/GENBANK
REFERENCE
P20937
M25775 U10304 U10305 U10090 U10091 U10092 U10093 U10094 U10095 U12890 U12889 U56863 U56822 U56824
1,16 17 9 9 9 9 9 9 9 10
I49058 I49059
10
3 3 3
Amino acid sequence of mouse Ly-49A
MSEQEVTYSM VRFHKSAGLQ KQVRPEETKG PREAGYRRCS FHWKFIVIAL 50
GIFCFLLLVA VSVLAIKIFQ YDQQKKLQEF LNHHNNCSNM QSDINLKDEM 100
LKNKSMECDL LESLNRDQNR LYNKTKTVLD SLQHTGRGDK VYWFCYGMKC 150
YYFVMDRKTW SGCKQTCQSS SLSLLKIDDE DELKFLQLVV PSDSCWVGLS 200
YDNKKKDWAW IDNRPSKLAL NTRKYNIRDG GCMLLSKTRL DNGNCDQVFI 250
CICGKRLDKF PH 262
References
1 Yokoyama, W.M. et al. (1990) J. Immunol. 145, 2353-2358.
2 Kubo, S. et al. (1993) Gene 136, 329-331.
3 Dissen, E. et al. (1996) J. Exp. Med. 183, 2197-2207.
4 Yokoyama, W.M. and Seaman, W.E. (1993) Annu. Rev. Immunol. 11, 613-635.
5 Gumperz, J.E. and Parham, P. (1995) Nature 378, 245-248.
6 Yokoyama, W.M. et al. (1995) Semin. Immunol. 7, 89-101.
7 Held, W. et al. (1995) Nature 376, 355-358.
8 Mason, L.H. et al. (1995) J. Exp. Med. 182, 293-303.
9 Smith, H.R.C. et al. (1994) J. Immunol. 153, 1068-1079.
10 Brennan, J. et al. (1994) J. Exp. Med. 180, 2287-2295.
11 Olcese, L. et al. (1996) J. Immunol. 156, 4531-4534.
12 Colonna, M. (1996) Curr. Opin. Immunol. 8, 101-107.
13 Brennan, J. et al. (1995) J. Biol. Chem. 270, 9691-9694.
14 Daniels, B.F. et al. (1994) Immunity 1, 785-792.
15 Chan, P.Y. and Takei, F. (1989) J. Immunol. 142, 1727-1736.
16 Yokoyama, W.M. et al. (1989) J. Immunol. 143, 1379-1386.
17 Wong, S. et al. (1991) J. Immunol. 147, 1417-1423.
Mac-2-BP
Other names
Mac-2 binding protein
CyAP (cyclophilin C-associated protein)
L3 antigen
MAMA
Molecular weights
Polypeptide 63 277
SDS-PAGE
reduced 97 kDa
unreduced 97 kDa
Carbohydrate
N-linked sites 7
O-linked unknown
Human gene location 17q25
Domain
[Domain diagram: signal sequence followed by a scavenger receptor cysteine-rich domain (boundary residues GRV and VVC)]
Tissue distribution Human Mac-2-BP is present in extracellular fluids, with a high concentration in breast milk, and is secreted by cultured tumour cells 1,2. Mouse Mac-2-BP is expressed on the surface of activated macrophages, and mRNA is found in many organs but not brain 2. Mouse Mac-2-BP mRNA is strongly increased in response to adherence and moderately increased in response to TNF and IFNγ 2.
Structure Mac-2-BP contains an N-terminal scavenger receptor cysteine-rich domain 1-3. Mac-2-BP is susceptible to cleavage and there is a dibasic cleavage site at residue 417 1. There is no obvious transmembrane sequence or site for the attachment of a GPI anchor. The native molecular size of human Mac-2-BP is in the order of several million kDa 1.
Ligands and associated molecules Mac-2-BP binds to galectin 3 (formerly known as Mac-2) 1 and to cyclophilin C 4. Binding of Mac-2-BP to galectin 3 is lactose dependent 1. Binding of cyclophilin C to Mac-2-BP is inhibited by cyclosporin 4. Mac-2-BP is reported to bind to CD14 in the presence of LPS 5.
Function Unknown.
Database accession numbers
        PIR   SWISSPROT   EMBL/GENBANK   REFERENCE
Human                     L13210         1
Mouse                     L16894         4
Amino acid sequence of human Mac-2-BP
MTPPRLFWVW LLVAGTQG -1
VNDGDMRLAD GGATNQGRVE IFYRGQWGTV CDNLWDLTDA SVVCRALGFE 50
NATQALGRAA FGQGSGPIML DEVQCTGTEA SLADCKSLGW LKSNCRHERD 100
AGVVCTNETR STHTLDLSRE LSEALGQIFD SQRGCDLSIS VNVQGEDALG 150
FCGHTVILTA NLEAQALWKE PGSNVTMSVD AECVPMVRDL LRYFYSRRID 200
ITLSSVKCFH KLASAYGARQ LQGYCASLFA ILLPQDPSFQ MPLDLYAYAV 250
ATGDALLEKL CLQFLAWNFE ALTQAEAWPS VPTDLLQLLL PRSDLAVPSE 300
LALLKAVDTW SWGERASHEE VEGLVEKIRF PMMLPEELFE LQFNLSLYWS 350
HEALFQKKTL QALEFHTVPF QLLARYKGLN LTEDTYKPRI YTSPTWSAFV 400
TDSSWSARKS QLVYQSRRGP LVKYSSDYFQ APSDYRYYPY QSFQTPQHPS 450
FLFQDKRVSW SLVYLPTIQS CWNYGFSCSS DELPVLGLTK SGGSDRTIAY 500
ENKALMLCEG LFVADVTDFE GWKAAIPSAL DTNSSKSTSS FPCPAGHFNG 550
FRTVIRPFYL TNSSGVD 567
References
1 Koths, K. et al. (1993) J. Biol. Chem. 268, 14245-14249.
2 Chicheportiche, Y. and Vassalli, P. (1994) J. Biol. Chem. 269, 5512-5517.
3 Resnick, D. et al. (1994) Trends Biochem. Sci. 19, 5-8.
4 Friedman, J. et al. (1993) Proc. Natl Acad. Sci. USA 90, 6815-6819.
5 Yu, B. and Wright, S.D. (1995) J. Inflammation 45, 115-125.
Macrophage lectin
Other names
Macrophage asialoglycoprotein binding protein (M-ASGP-BP) (rat)
Macrophage galactose/N-acetylgalactosamine-specific lectin (MMGL) (mouse)
Molecular weights
Polypeptide 32 937
SDS-PAGE
reduced 34.5-36.5 kDa
Carbohydrate
N-linked sites 1
O-linked unknown
Domain
[Domain diagram: type II orientation with an intracellular NH2 terminus and a single extracellular C-type lectin domain (boundary residues CQL and DVC)]
Tissue distribution This has not been established by immunohistological techniques; however, RNA analysis suggests that in the mouse the molecule is expressed at low levels by resident peritoneal macrophages and is upregulated on thioglycollate-elicited peritoneal macrophages 1. Similarly, in the rat the antigen appears to be restricted to macrophages 2.
Structure Human macrophage lectin (HML) is a type II transmembrane molecule containing a single extracellular C-type lectin domain 3. It is the homologue of the rat macrophage asialoglycoprotein receptor (M-ASGP-BP) and the mouse macrophage galactose/N-acetylgalactosamine-specific lectin (MMGL), and is closely related to other human C-type lectins (human hepatic lectin 1 (HHL-1), 56% identity; human hepatic lectin 2 (HHL-2), 45% identity). The C-type lectin domains of the lectins have the greatest similarity: 69% between HML and HHL-1, and 63% between HML and HHL-2 3. The cytoplasmic domains of the lectins are more poorly conserved 3. Additionally, HML differs from HHL-1 and HHL-2 through the presence of a 24 amino acid insertion in the extracellular neck region of HML, which is also found in the mouse and rat homologues 1,3,4. In the rat the N-terminus of M-ASGP-BP has been confirmed by protein sequencing 5.
Ligands and associated molecules HML is a lectin that binds galactose and N-acetylgalactosamine carbohydrate groups in a calcium-dependent manner 3. A weaker interaction between HML and fucose is also observed.
Function The presence of a YENF internalization motif in the cytoplasmic domain suggests that the lectin is likely to be involved in receptor-mediated endocytosis. Recombinant HML expressed in E. coli has been shown to strongly bind glycopeptides carrying three consecutive N-acetylgalactosamine-Ser/Thr residues known as Tn antigen, which is a human carcinoma-associated epitope, implicating a role in recognition by tumoricidal macrophages 3.
Database accession numbers
        PIR   SWISSPROT   EMBL/GENBANK   REFERENCE
Human                     D50532         3
Mouse         P49300      S36676         1
Rat           P49301      J05495         4
Amino acid sequence of human macrophage lectin
MTRTYENFQY LENKVKVQGF KNGPLPLQSL LQRLRSGPCH LLLSLGLGLL 50
LLVIICVVGF QNSKFQRDLV TLRTDFSNFT SNTVAEIQAL TSQGSSLEET 100
IASLKAEVEG FKQERQAVHS EMLLRVQQLV QDLKKLTCQV ATLNNNGEEA 150
STEGTCCPVN WVEHQDSCYW FSHSGMSWAE AEKYCQLKNA HLVVINSREE 200
QNFVQKYLGS AYTWMGLSDP EGAWKWVDGT DYATGFQNWK PGQPDDWQGH 250
GLGGGEDCAH FHPDGRWNDD VCQRPYHWVC EAGLGQTSQE SH 292
References
1 Sato et al. (1992) J. Biochem. 111, 331-336.
2 Kawasaki, T. et al. (1986) Carbohydrate Res. 151, 197-206.
3 Suzuki, N. et al. (1996) J. Immunol. 156, 128-135.
4 Ii, M. et al. (1990) J. Biol. Chem. 265, 11295-11298.
5 Ii, M. et al. (1988) Biochem. Biophys. Res. Commun. 155, 720-725.
MAdCAM-1
Mucosal addressin cell adhesion molecule 1
Molecular weights
Polypeptide 40 909
SDS-PAGE
reduced 58-66 kDa
unreduced 54-62 kDa
Carbohydrate
N-linked sites 2
O-linked +
Mouse gene location and size
10c1-2; ~3.5 kb
Domains
[Domain diagram: signal sequence, two C2-set IgSF domains, mucin-like region containing the octamer repeats, transmembrane and cytoplasmic regions]
Tissue distribution MAdCAM-1 is expressed at high levels on high endothelial venules of Peyer's patches and mesenteric lymph nodes and on flat-walled venules within the gut lamina propria 1-5. It is also expressed on vascular endothelium in mammary glands, pancreas and the spleen marginal sinus. Expression is induced in vitro by TNFα and IL-1, and MAdCAM-1 has been detected on blood vessels within areas of chronic inflammation 3.
Structure The N-terminal half of the extracellular region contains two C2-set IgSF domains. These domains are closely related to the two membrane-distal IgSF domains of CD106 (VCAM-1), which binds the integrins α4β1 (CD49d/CD29) and α4β7 4,6. MAdCAM-1 and CD106 are structurally related to three other integrin-binding molecules: CD50, CD54 and CD102. Features shared by these molecules include an atypical disulfide between the B-C and F-G loops and an (I/L)(D/E)(S/T)XL motif (LDTSL in MAdCAM-1) in the C-D loop 6. The IgSF domains are followed by a mucin-like region rich in Ser, Thr and Pro residues, which includes eight copies of the octameric repeat (P/S)PDTTS(Q/P)E 4. The two IgSF domains, but not the mucin-like portions, are reasonably well conserved between species 4.
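The octameric repeat can be counted directly in the printed human sequence. The sketch below is a hedged illustration, not the method used in ref. 4: the input is residues 201-280 (mature numbering) of the human MAdCAM-1 sequence given in this entry, `STRICT` is a literal reading of (P/S)PDTTS(Q/P)E, and `RELAXED` is an assumed, looser pattern that additionally tolerates the Asn and Lys substitutions seen in two of the repeats. The names `count_repeats`, `STRICT` and `RELAXED` are hypothetical.

```python
import re

# Residues 201-280 (mature numbering) of human MAdCAM-1 as printed in this
# entry; this stretch covers the repeat-containing part of the mucin-like region.
MUCIN_REGION = (
    "QAIPVLHSPT" "SPEPPDTTSP" "EPPNTTSPES" "PDTTSPESPD"
    "TTSQEPPDTT" "SQEPPDTTSQ" "EPPDTTSPEP" "PDKTSPEPAP"
)

# Literal reading of the consensus (P/S)PDTTS(Q/P)E.
STRICT = re.compile(r"[PS]PDTTS[QP]E")

# Assumption: also accept Asn at position 3 and Lys at position 4, the two
# substitutions present in the printed sequence.
RELAXED = re.compile(r"[PS]P[DN][TK]TS[QP]E")


def count_repeats(pattern, sequence):
    """Count non-overlapping matches of a compiled pattern in a sequence."""
    return len(pattern.findall(sequence))


if __name__ == "__main__":
    print("strict:", count_repeats(STRICT, MUCIN_REGION))
    print("relaxed:", count_repeats(RELAXED, MUCIN_REGION))
```

With the literal consensus six copies are found; the relaxed pattern recovers eight tandem octamers, which suggests that the figure of eight copies quoted above includes two slightly degenerate repeats.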
Ligands and associated molecules MAdCAM-1 binds the integrin α4β7 through its IgSF portion 7,8. It can also bind CD62L (L-selectin) through poorly characterized sialoglycoconjugates present in the mucin-like portion of a subpopulation of MAdCAM-1 molecules 9. The α4β7 binding site incorporates the LDTSL motif in the C-D loop of domain 1, which also forms part of the binding site on other integrin-binding IgSF domains 6,8.
Function Through its interaction with the lymphocyte adhesion molecules CD62L and α4β7, MAdCAM-1 contributes to the recirculation of naive lymphocytes to Peyer's patches and mesenteric lymph nodes, and the homing of a subpopulation of activated or memory lymphocytes to the lamina propria of the gut mucosa 2,5. MAdCAM-1 is involved in the initial tethering and rolling of lymphocytes on endothelial surfaces (through CD62L and α4β7 binding) as well as the subsequent activation-induced arrest of these cells (through α4β7 binding) prior to extravasation 5.
Database accession numbers
        PIR      SWISSPROT   EMBL/GENBANK   REFERENCE
Human                        U43628         4
Mouse   S33601               L21203         10
Amino acid sequence of human MAdCAM-1
MDFGLALLLA GLLGLLLG -1
QSLQVKPLQV EPPEPVVAVA LGASRQLTCR LACADRGASV QWRGLDTSLG 50
AVQSDTGRSV LTVRNASLSA AGTRVCVGSC GGRTFQHTVQ LLVYAFPDQL 100
TVSPAALVPG DPEVACTAHK VTPVDPNALS FSLLVGGQEL EGAQALGPEV 150
QEEEEEPQGD EDVLFRVTER WRLPPLGTPV PPALYCQATM RLPGLELSHR 200
QAIPVLHSPT SPEPPDTTSP EPPNTTSPES PDTTSPESPD TTSQEPPDTT 250
SQEPPDTTSQ EPPDTTSPEP PDKTSPEPAP QQGSTHTPRS PGSTRTRRPE 300
ISQAGPTQGE VIPTGSSKPA GDQLPAALWT SSAVLGLLLL ALPTYHLWKR 350
CRHLAEDDTH PPASLRLLPQ VSAWAGLRGT GQVGISPS 388
References
1 Sampaio, S.O. et al. (1995) J. Immunol. 155, 2477-2486.
2 Picker, L.J. and Butcher, E.C. (1992) Annu. Rev. Immunol. 10, 561-591.
3 Sikorski, E.E. et al. (1993) J. Immunol. 151, 5239-5250.
4 Shyjan, A.M. et al. (1996) J. Immunol. 156, 2851-2857.
5 Butcher, E.C. and Picker, L.J. (1996) Science 272, 60-66.
6 Jones, E.Y. et al. (1995) Nature 373, 539-544.
7 Berlin, C. et al. (1993) Cell 74, 185-195.
8 Briskin, M.J. et al. (1996) J. Immunol. 156, 719-726.
9 Berg, E.L. et al. (1993) Nature 366, 695-698.
10 Briskin, M.J. et al. (1993) Nature 363, 461-464.
Mannose receptor
Macrophage mannose receptor
Molecular weights
Polypeptide 164 120
SDS-PAGE
reduced 175-190 kDa
Carbohydrate
N-linked sites 8
O-linked unknown
Human gene location and size
10p13; ~70 kb 1
Domains
[Domain diagram: signal sequence, Cys-rich domain, fibronectin type II domain (F2), eight C-type lectin domains (CL), transmembrane and cytoplasmic regions]
Tissue distribution The mannose receptor is expressed on mature tissue macrophages, hepatic sinusoidal cells and in the placenta 2-4. Its expression on macrophages is upregulated in response to IL-4, IL-13 or anti-inflammatory steroids and downregulated by IFNγ or IFNα 5-8.
Structure The mannose receptor is a type I membrane glycoprotein. The N-terminus has a Cys-rich domain that shows sequence similarity to the B subunit of the plant protein ricin D, followed by a fibronectin type II domain, eight C-type lectin domains, and transmembrane and cytoplasmic regions 2,3,9. The overall molecular organization of the mannose receptor is similar to that of the M-type receptor for secretory phospholipases A2, DEC-205 and a fourth, widely expressed, member of this C-type lectin family 10.
Ligands and associated molecules
The mannose receptor binds oligomannose-containing carbohydrates. In addition, the Cys-rich domain of the mouse mannose receptor, fused to the Fc region of human IgG1, binds to macrophage populations in the splenic marginal zone (metallophilic macrophages) and the lymph node subcapsular sinus 11. The ligand(s) for the Cys-rich domain is also expressed in germinal centres of the spleen and follicular areas of lymph nodes in immunized mice 11.
Function
The mannose receptor mediates phagocytosis by macrophages of microorganisms with cell walls containing oligomannose carbohydrates. COS-1 cells transfected with full-length mannose receptor cDNA bind and internalize Candida albicans yeast particles. Cells transfected with a cDNA mutated to delete the cytoplasmic region express the protein, but do not ingest particles 3.
Comment The human mannose receptor gene is divided into 30 exons 1. Because the gene is complex, with no simple correlation between the eight C-type lectin domains and the 26 exons that encode them, the positions of the intron/exon boundaries are not shown on the domain diagram.
Database accession numbers
        PIR      SWISSPROT   EMBL/GENBANK   REFERENCE
Human   A36563   P22897      J05550         2
Mouse   A48925               Z11974         12
Amino acid sequence of human mannose receptor
MRLPLLLVFA SVIPGAVL -1
LLDTRQFLIY NEDHKRCVDA VSPSAVQTAA CNQDAESQKF RWVSESQIMS 50
VAFKLCLGVP SKTDWVAITL YACDSKSEFQ KWECKNDTLL GIKGEDLFFN 100
YGNRQEKNIM LYKGSGLWSR WKIYGTTDNL CSRGYEAMYT LLGNANGATC 150
AFPFKFENKW YADCTSAGRS DGWLWCGTTT DYDTDKLFGY CPLKFEGSES 200
LWNKDPLTSV SYQINSKSAL TWHQARKSCQ QQNAELLSIT EIHEQTYLTG 250
LTSSLTSGLW IGLNSLSFNS GWQWSDRSPF RYLNWLPGSP SAEPGKSCVS 300
LNPGKNAKWE NLECVQKLGY ICKKGNTTLN SFVIPSESDV PTHCPSQWWP 350
YAGHCYKIHR DEKKIQRDAL TTCRKEGGDL TSIHTIEELD FIISQLGYEP 400
NDELWIGLND IKIQMYFEWS DGTPVTFTKW LRGEPSHENN RQEDCVVMKG 450
KDGYWADRGC EWPLGYICKM KSRSQGPEIV EVEKGCRKGW KKHHFYCYMI 500
GHTLSTFAEA NQTCNNENAY LTTIEDRYEQ AFLTSFVGLR PEKYFWTGLS 550
DIQTKGTFQW TIEEEVRFTH WNSDMPGRKP GCVAMRTGIA GGLWDVLKCD 600
EKAKFVCKHW AEGVTHPPKP TTTPEPKCPE DWGASSRTSL CFKLYAKGKH 650
EKKTWFESRD FCRALGGDLA SINNKEEQQT IWRLITASGS YHKLFWLGLT 700
YGSPSEGFTW SDGSPVSYEN WAYGEPNNYQ NVEYCGELKG DPTMSWNDIN 750
CEHLNNWICQ IQKGQTPKPE PTPAPQDNPP VTEDGWVIYK DYQYYFSKEK 800
ETMDNARAFC KRNFGDLVSI QSESEKKFLW KYVNRNDAQS AYFIGLLISL 850
DKKFAWMDGS KVDYVSWATG EPNFANEDEN CVTMYSNSGF WNDINCGYPN 900
AFICQRHNSS INATTVMPTM PSVPSGCKEG WNFYSNKCFK IFGFMEEERK 950
NWQEARKACI GFGGNLVSIQ NEKEQAFLTY HMKDSTFSAW TGLNDVNSEH 1000
TFLWTDGRGV HYTNWGKGYP GGRRSSLSYE DADCVVIIGG ASNEAGKWMD 1050
DTCDSKRGYI CQTRSDPSLT NPPATIQTDG FVKYGKSSYS LMRQKFQWHE 1100
AETYCKLHNS LIASILDPYS NAFAWLQMET SNERVWIALN SNLTDNQYTW 1150
TDKWRVRYTN WAADEPKLKS ACVYLDLDGY WKTAHCNESF YFLCKRSDEI 1200
PATEPPQLPG RCPESDHTAW IPFHGHCYYI ESSYTRNWGQ ASLECLRMGS 1250
SLVSIESAAE SSFLSYRVEP LKSKTNFWIG LFRNVEGTWL WINNSPVSFV 1300
NWNTGDPSGE RNDCVALHAS SGFWSNIHCS SYKGYICKRP KIIDAKPTHE 1350
LLTTKADTRK MDPSKPSSNV AGVVIIVILL ILTGAGLAAY FFYKKRRVHL 1400
PQEGAFENTL YFNSQSSPGT SDMKDLVGNI EQNEHSVI 1438
References
1 Kim, S.J. et al. (1992) Genomics 14, 721-727.
2 Taylor, M.E. et al. (1990) J. Biol. Chem. 265, 12156-12162.
3 Ezekowitz, R.A.B. et al. (1990) J. Exp. Med. 172, 1785-1794.
4 Lennartz, M.R. et al. (1987) J. Biol. Chem. 262, 9942-9944.
5 Mokoena, T. and Gordon, S. (1985) J. Clin. Invest. 75, 624-631.
6 Shepherd, V.L. et al. (1985) J. Biol. Chem. 260, 160-164.
7 Stein, M. et al. (1992) J. Exp. Med. 176, 287-292.
8 Doyle, A.G. et al. (1994) Eur. J. Immunol. 24, 1441-1445.
9 Harris, N. (1994) Biochem. Biophys. Res. Commun. 198, 682-692.
10 Wu, K. et al. (1996) J. Biol. Chem. 271, 21323-21330.
11 Martinez-Pomares, L. et al. (1996) J. Exp. Med. 184, 1927-1937.
12 Harris, N. et al. (1992) Blood 80, 2363-2373.
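MARCO
Molecular weights
Polypeptide 52 730
SDS-PAGE
reduced 80 kDa
unreduced 210 kDa
Carbohydrate
N-linked sites 2
O-linked unknown
Domains
[Domain diagram: trimer with intracellular NH2 termini; cytoplasmic, transmembrane and spacer regions, a collagen-like region (the symbol shown represents one third of it) and a C-terminal scavenger receptor cysteine-rich domain (boundary residues GRA and VEC)]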
Tissue distribution MARCO is expressed by macrophages in the marginal zone of spleen and in the medullary cord of lymph nodes but not in liver or lung 1.
Structure MARCO is a type II glycoprotein and is expressed as a disulfide-linked trimer at the cell surface 1. MARCO has a 49 amino acid N-terminal cytoplasmic sequence, a 25 amino acid transmembrane sequence, a 75 amino acid putative spacer sequence containing two cysteine residues, followed by 270 amino acids predicted to be a collagen-like domain and, at the C-terminus, a scavenger receptor cysteine-rich SF domain 1-3. The predicted structure of MARCO is similar to that of the scavenger receptor but differs in that MARCO has a longer collagenous domain where the scavenger receptor has a coiled-coil region followed by a short collagen-like region 1-3. The scavenger receptor domains of MARCO and scavenger receptor type I show 49% amino acid sequence identity and each contains six cysteines 1-3.
Ligands and associated molecules COS cells expressing MARCO bind acetylated low-density lipoprotein and bacteria, specifically E. coli and S. aureus 1,3.
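The segment lengths quoted in the Structure paragraph can be turned into approximate residue boundaries by simple addition. The sketch below is an illustration under stated assumptions: the segments are taken to be contiguous from residue 1, the total of 518 residues is the length of the mouse sequence printed in this entry, and everything left over at the C-terminus is assigned to the scavenger receptor cysteine-rich domain. The function name `domain_boundaries` is hypothetical, and the boundaries are arithmetic estimates rather than values reported in refs 1-3.

```python
# Segment lengths quoted in the Structure paragraph, N- to C-terminus.
SEGMENTS = [
    ("cytoplasmic", 49),
    ("transmembrane", 25),
    ("spacer", 75),
    ("collagen-like", 270),
]

# Length of the mouse MARCO sequence printed in this entry.
TOTAL_LENGTH = 518


def domain_boundaries(segments, total_length):
    """Return (name, first residue, last residue) for each segment."""
    boundaries = []
    start = 1
    for name, length in segments:
        end = start + length - 1
        boundaries.append((name, start, end))
        start = end + 1
    # Assumption: the remaining C-terminal residues form the SRCR domain.
    boundaries.append(("scavenger receptor cysteine-rich", start, total_length))
    return boundaries


if __name__ == "__main__":
    for name, first, last in domain_boundaries(SEGMENTS, TOTAL_LENGTH):
        print(f"{name}: {first}-{last} ({last - first + 1} residues)")
```

On these assumptions the collagen-like region runs from residue 150 to 419, leaving 99 residues (420-518) for the C-terminal domain.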
Function Unknown.
Database accession numbers
        PIR      SWISSPROT   EMBL/GENBANK   REFERENCE
Mouse   A55840               U18424         1
Amino acid sequence of mouse MARCO
MGSKELLKEE DFLGSTEDRA DFDQAMFPVM ETFEINDPVP KKRNGGTFCM 50
AVMAIHLILL TAGTALLLIQ VLNLQEQLQM LEMCCGNGSL AIEDKPFFSL 100
QWAPKTHLVP RAQGLQALQA QLSWVHTSQE QLRQQFNNLT QNPELFQIKG 150
ERGSPGPKGA PGAPGIPGLP GPAAEKGEKG AAGRDGTPGV QGPQGPPGSK 200
GEAGLQGLTG APGKQGATGA PGPRGEKGSK GDIGLTGPKG EHGTKGDKGD 250
LGLPGNKGDM GMKGDTGPMG SPGAQGGKGD AGKPGLPGLA GSPGVKGDQG 300
KPGVQGVPGP QGAPGLSGAK GEPGRTGLPG PAGPPGIAGN PGIAGVKGSK 350
GDTGIQGQKG TKGESGVPGL VGRKGDTGSP GLAGPKGEPG RVGQKGDPGM 400
KGSSGQQGQK GEKGQKGESF QRVRIMGGTN RGRAEVYYNN EWGTICDDDW 450
DNNDATVFCR MLGYSRGRAL SSYGGGSGNI WLDNVNCRGT ENSLWDCSKN 500
SWGNHNCVHN EDAGVECS 518
References
1 Elomaa, O. et al. (1995) Cell 80, 603-609.
2 Resnick, D. et al. (1994) Trends Biochem. Sci. 19, 5-8.
3 Pearson, A.M. (1996) Curr. Opin. Immunol. 8, 20-28.
Macrophage colony-stimulating factor receptor, CD115
Other names
Colony-stimulating factor (CSF)-1 receptor
c-fms proto-oncogene
Molecular weights
Polypeptide 105 619
SDS-PAGE
reduced 150 kDa
Carbohydrate
N-linked sites 11
O-linked unknown
Human gene location and size
5q33.2-q33.3; ~60 kb 1
Domains
[Domain diagram: signal sequence, five IgSF domains, transmembrane region and a cytoplasmic tyrosine kinase domain (K) interrupted by an insert]
Tissue distribution CD115 is expressed on monocytes and their progenitors, macrophages, placental cells and choriocarcinoma cells 1.
Structure CD115 is encoded by the c-fms proto-oncogene 2. It belongs to subclass III within the family of growth factor receptors with tyrosine kinase activity, which also includes CD117 (c-kit) and the PDGF receptors type A and B (CD140a and CD140b) 1,3. The extracellular domain of CD115 consists of five IgSF domains (four C2-set and one V-set). The cytoplasmic region contains a protein tyrosine kinase domain, interrupted by an insertion of about 70 amino acids 1-3. This insertion is necessary for the association of CD115 with phosphatidylinositol 3-kinase (PI 3-kinase) 4. Lys593 (mature numbering) is predicted to be the site of ATP binding (*) and is preceded by the conserved GXGXXG motif 3 as shown below:
                                *
FGKTLGAGAFGKVVEATAFGLGKEDAVLKVAVKMLKSTAH
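The relationship between the GXGXXG motif and Lys593 can be checked against the printed fragment. The sketch below is a minimal illustration with two assumptions: the final His of the fragment is residue 600 in mature numbering, as read from the sequence listing in this entry, and the motif is searched for with a literal regular expression. The helper names `mature_position`, `residue_at` and `glycine_loop` are hypothetical.

```python
import re

# Kinase-domain fragment shown in this entry.
FRAGMENT = "FGKTLGAGAFGKVVEATAFGLGKEDAVLKVAVKMLKSTAH"

# Assumption: the last residue of the fragment is number 600 (mature
# numbering), taken from the row markers of the printed sequence.
FRAGMENT_END = 600
FRAGMENT_START = FRAGMENT_END - len(FRAGMENT) + 1


def mature_position(index):
    """Convert a 0-based index in FRAGMENT to mature residue numbering."""
    return FRAGMENT_START + index


def residue_at(position):
    """Return the residue found at a given mature position."""
    return FRAGMENT[position - FRAGMENT_START]


def glycine_loop(fragment):
    """Return (mature start position, matched text) of the first GXGXXG."""
    match = re.search(r"G.G..G", fragment)
    if match is None:
        return None
    return mature_position(match.start()), match.group(0)


if __name__ == "__main__":
    print("GXGXXG:", glycine_loop(FRAGMENT))
    print("residue 593:", residue_at(593))
```

Under these assumptions the motif is GAGAFG beginning at residue 566, and residue 593 is the lysine in the VAVK stretch, 27 residues downstream.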
Ligands and associated molecules The binding of M-CSF to CD115 (Kd: 50 pM) induces or stabilizes dimerization of the receptor 5. This results in kinase activation and Tyr phosphorylation of CD115 itself and other cytoplasmic proteins, including PI 3-kinase 1,6. CD115 is also associated with a G protein, stimulates the translocation of protein kinase C to the membrane, and induces phosphatidylcholine hydrolysis and gene expression (reviewed in ref. 6).
Function M-CSF stimulates the survival, proliferation and differentiation of monocytes and macrophages and their bone marrow progenitors 1,7. Human CD115 acquires transforming activity following mutation of Leu278 (equivalent to 301 in the precursor form) to Ser, and this oncogenic activity is increased by further substitutions near the C-terminus 1. The loss of both alleles of CD115 has been demonstrated in patients with myelodysplastic syndrome 8.
Database accession numbers
        PIR      SWISSPROT   EMBL/GENBANK   REFERENCE
Human   A24533   P07333      X03663         2
Mouse   S01880   P09581      X06368         9
Amino acid sequence of human CD115
MGPGVLLLLL VATAWHGQGI PVI -1
EPSVPELVVK PGATVTLRCV GNGSVEWDGP ASPHWTLYSD GSSSILSTNN 50
ATFQNTGTYR CTEPGDPLGG SAAIHLYVKD PARPWNVLAQ EVVVFEDQDA 100
LLPCLLTDPV LEAGVSLVRV RGRPLMRHTN YSFSPWHGFT IHRAKFIQSQ 150
DYQCSALMGG RKVMSISIRL KVQKVIPGPP ALTLVPAELV RIRGEAAQIV 200
CSASSVDVNF DVFLQHNNTK LAIPQQSDFH NNRYQKVLTL NLDQVDFQHA 250
GNYSCVASNV QGKHSTSMFF RVVESAYLNL SSEQNLIQEV TVGEGLNLKV 300
MVEAYPGLQG FNWTYLGPFS DHQPEPKLAN ATTKDTYRHT FTLSLPRLKP 350
SEAGRYSFLA RNPGGWRALT FELTLRYPPE VSVIWTFING SGTLLCAASG 400
YPQPNVTWLQ CSGHTDRCDE AQVLQVWDDP YPEVLSQEPF HKVTVQSLLT 450
VETLEHNQTY ECRAHNSVGS GSWAFIPISA GAHTHPPDEF LFTPVVVACM 500
SIMALLLLLL LLLLYKYKQK PKYQVRWKII ESYEGNSYTF IDPTQLPYNE 550
KWEFPRNNLQ FGKTLGAGAF GKVVEATAFG LGKEDAVLKV AVKMLKSTAH 600
ADEKEALMSE LKIMSHLGQH ENIVNLLGAC THGGPVLVIT EYCCYGDLLN 650
FLRRKAEAML GPSLSPGQDP EGGVDYKNIH LEKKYVRRDS GFSSQGVDTY 700
VEMRPVSTSS NDSFSEQDLD KEDGRPLELR DLLHFSSQVA QGMAFLASKN 750
CIHRDVAARN VLLTNGHVAK IGDFGLARDI MNDSNYIVKG NARLPVKWMA 800
PESIFDCVYT VQSDVWSYGI LLWEIFSLGL NPYPGILVNS KFYKLVKDGY 850
QMAQPAFAPK NIYSIMQACW ALEPTHRPTF QQICSFLQEQ AQEDRRERDY 900
TNLPSSSRSG GSGSSSSELE EESSSEHLTC CEQGDIAQPL LQPNNYQFC 949
References
1 Sherr, C.J. (1990) Blood 75, 1-12.
2 Coussens, L. et al. (1986) Nature 320, 277-280.
3 Ullrich, A. and Schlessinger, J. (1990) Cell 61, 203-212.
4 Ghosh Choudhury, G. et al. (1991) J. Biol. Chem. 266, 8068-8072.
5 Li, W. and Stanley, E.R. (1991) EMBO J. 10, 277-288.
6 Vairo, G. and Hamilton, J.A. (1991) Immunol. Today 12, 362-369.
7 Nicola, N.A. (1989) Annu. Rev. Biochem. 58, 45-77.
8 Boultwood, J. et al. (1991) Proc. Natl Acad. Sci. USA 88, 6176-6180.
9 Rothwell, V.M. and Rohrschneider, L.R. (1987) Oncogene Res. 1, 311-324.
Multidrug resistance protein 1, P-glycoprotein (P-gp)
Other names
P170
Multidrug transporter
Molecular weights
Polypeptide 141 504
SDS-PAGE
reduced 170 kDa
Carbohydrate
N-linked sites 3
O-linked unknown
[Topology diagram: NH2 terminus and two nucleotide binding sites]
Human gene location and size 7q21.1; >100 kb 1
The MDR1 gene contains 28 introns, of which 26 interrupt the protein coding sequence (see ref. 1 for a detailed description of the gene structure).
Tissue distribution The P-glycoprotein (P-gp) product of the MDR1 gene is expressed in small intestine, colon, kidney, liver and adrenal, with very low levels of expression in most other tissues. Normal peripheral blood and bone marrow cells express very low amounts of P-gp, but it is expressed in practically all haematopoietic progenitor cells, with the highest levels in pluripotent stem cells 1.
Structure MDR1 belongs to the ABC superfamily of ATP-binding transport proteins that includes the product of the cystic fibrosis gene (CFTR), the Plasmodium falciparum multidrug resistance protein (pfMDR) and a large number of bacterial periplasmic transport proteins 2. MDR1 cDNA encodes a glycoprotein with 12 transmembrane domains. The protein consists of two halves which share a high degree of sequence similarity. The genomic organization of MDR1 suggests that this gene arose by fusion of two related, but independently evolved, genes rather than by gene duplication 1. Each half of the protein consists of a short hydrophilic N-terminal sequence, a long hydrophobic region containing six transmembrane segments, and a relatively hydrophilic region containing an ATP binding cassette of about 200 amino acids 3.
Function P-gp has been shown to utilize ATP to pump hydrophobic drugs out of cells, thus decreasing their intracellular concentration and hence their toxicity 1. The MDR1 gene is amplified in multidrug-resistant cell lines 1. However, the actual physiological role of P-gp is not clear, although recent evidence suggests two possibilities that are not mutually exclusive. First, P-gp appears to be a flippase which translocates, or "flips", phospholipids from the inner leaflet of the lipid bilayer to the outer leaflet, or vice versa, to maintain the asymmetric distribution of different phospholipids in cell membranes 4. The flippase model would explain how P-gp can export a wide range of drugs but not most normal cellular constituents: hydrophobic drug molecules could initially intercalate into the inner leaflet of the bilayer, then interact with P-gp and be "flipped" from the inner to the outer leaflet and subsequently into the aqueous phase 4. Secondly, P-gp appears to regulate cell volume by modulating the activity of an endogenous chloride channel, rather than being itself an ion channel 5.
Database accession numbers
        PIR      SWISSPROT   EMBL/GENBANK   REFERENCE
Human   A25059   P08183      M14758         6
Rat              P43245      M81855         7
Mouse   A33719   P06795      M14757         8
Amino acid sequence of human MDR1 gene product
MDLEGDRNGG AKKKNFFKLN NKSEKDKKEK KPTVSVFSMF RYSNWLDKLY 50
MVVGTLAAII HGAGLPLMML VFGEMTDIFA NAGNLEDLMS NITNRSDIND 100
TGFFMNLEED MTRYAYYYSG IGAGVLVAAY IQVSFWCLAA GRQIHKIRKQ 150
FFHAIMRQEI GWFDVHDVGE LNTRLTDDVS KINEVIGDKI GMFFQSMATF 200
FTGFIVGFTR GWKLTLVILA ISPVLGLSAA VWAKILSSFT DKELLAYAKA 250
GAVAEEVLAA IRTVIAFGGQ KKELERYNKN LEEAKRIGIK KAITANISIG 300
AAFLLIYASY ALAFWYGTTL VLSGEYSIGQ VLTVFFSVLI GAFSVGQASP 350
SIEAFANARG AAYEIFKIID NKPSIDSYSK SGHKPDNIKG NLEFRNVHFS 400
YPSRKEVKIL KGLNLKVQSG QTVALVGNSG CGKSTTVQLM QRLYDPTEGM 450
VSVDGQDIRT INVRFLREII GVVSQEPVLF ATTIAENIRY GRENVTMDEI 500
EKAVKEANAY DFIMKLPHKF DTLVGERGAQ LSGGQKQRIA IARALVRNPK 550
ILLLDEATSA LDTESEAVVQ VALDKARKGR TTIVIAHRLS TVRNADVIAG 600
FDDGVIVEKG NHDELMKEKG IYFKLVTMQT AGNEVELENA ADESKSEIDA 650
LEMSSNDSRS SLIRKRSTRR SVRGSQAQDR KLSTKEALDE SIPPVSFWRI 700
MKLNLTEWPY FVVGVFCAII NGGLQPAFAI IFSKIIGVFT RIDDPETKRQ 750
NSNLFSLLFL ALGIISFITF FLQGFTFGKA GEILTKRLRY MVFRSMLRQD 800
VSWFDDPKNT TGALTTRLAN DAAQVKGAIG SRLAVITQNI ANLGTGIIIS 850
FIYGWQLTLL LLAIVPIIAI AGVVEMKMLS GQALKDKKEL EGAGKIATEA 900
IENFRTVVSL TQEQKFEHMY AQSLQVPYRN SLRKAHIFGI TFSFTQAMMY 950
FSYAGCFRFG AYLVAHKLMS FEDVLLVFSA VVFGAMAVGQ VSSFAPDYAK 1000
AKISAAHIIM IIEKTPLIDS YSTEGLMPNT LEGNVTFGEV VFNYPTRPDI 1050
PVLQGLSLEV KKGQTLALVG SSGCGKSTVV QLLERFYDPL AGKVLLDGKE 1100
IKRLNVQWLR AHLGIVSQEP ILFDCSIAEN IAYGDNSRVV SQEEIVRAAK 1150
EANIHAFIES LPNKYSTKVG DKGTQLSGGQ KQRIAIARAL VRQPHILLLD 1200
EATSALDTES EKVVQEALDK AREGRTCIVI AHRLSTIQNA DLIVVFQNGR 1250
VKEHGTHQQL LAQKGIYFSM VSVQAGTKRQ 1280
References
1 Gottesman, M.M. and Pastan, I. (1993) Annu. Rev. Biochem. 62, 385-427.
2 Higgins, C.F. (1995) Cell 82, 693-696.
3 Kast, C. et al. (1996) J. Biol. Chem. 271, 9240-9248.
4 van Helvoort, A. et al. (1996) Cell 87, 507-517.
5 Valverde, M.A. et al. (1996) EMBO J. 15, 4460-4468.
6 Chen, C.J. et al. (1986) Cell 47, 381-389.
7 Silverman, J.A. et al. (1991) Gene 106, 229-236.
8 Gros, P. et al. (1986) Cell 47, 371-380.
MHC Class I
Major histocompatibility complex Class I antigen
Other names
HLA-A, -B and -C (human)
H-2K, -D, -L (mouse)
RT1A, RT1C (rat)
Molecular weights
Polypeptide
HLA-A2 α chain 38 412
β2-microglobulin 11 731
SDS-PAGE
reduced α chain 44 kDa
β2-microglobulin 12 kDa
Carbohydrate
N-linked sites α chain 1; β2-microglobulin nil
O-linked α chain nil; β2-microglobulin nil
Human gene location
α chain: 6p21.3
β2-microglobulin: 15q21-q22.2
The organization of genes within the human MHC complex is reviewed in ref. 1.
Domains
[Domain diagram, HLA-A2 α chain: signal sequence, α1, α2, α3 (C1-set), transmembrane and cytoplasmic regions, with exon boundaries]
[Domain diagram, β2-microglobulin: signal sequence and a single C1-set domain, with exon boundaries]
Tissue distribution The "classical" MHC Class I molecules (HLA-A, -B and -C in man) are expressed on most nucleated cells but expression varies on different cell types. Interferons α, β and γ and tumour necrosis factor α increase the expression of MHC Class I molecules 2. Expression can be low on virus-infected or tumour cells 3. "Non-classical" MHC molecules generally have a broad distribution 4. HLA-G is expressed only on cytotrophoblasts 4.
Structure MHC Class I molecules consist of heterodimers of highly polymorphic α chains non-covalently associated with the invariant β2-microglobulin subunit 5-7. β2-Microglobulin and the α3 domain are Ig-related and of the C1-set. The α1 and α2 domains form a platform consisting of a single β pleated sheet topped by α helices 5-7. A groove between the two α helices binds peptides 5-7. The polymorphic amino acids of Class I molecules are concentrated along the peptide binding groove 5-7. Endogenous proteins in the cell are degraded by proteasomes and the resultant peptides are transported by TAP proteins to be assembled in MHC Class I molecules before being expressed stably on the cell surface 2,8. Peptides are usually nine amino acid residues long and are anchored via residues, typically 2 and 9, in pockets in the peptide binding groove 5-7. "Non-classical" MHC molecules have a similar structure to "classical" MHC molecules 4.
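The statement that bound peptides are usually nine residues long with anchors at positions 2 and 9 can be illustrated by enumerating the nonamer windows of a protein and reading off those two positions. This is a sketch only: the function name `nonamers_with_anchors` is hypothetical, the example string is simply the first 11 residues of the β2-microglobulin precursor printed in this entry, chosen as convenient input, and the code makes no prediction about which windows would actually bind or be presented.

```python
def nonamers_with_anchors(protein, length=9, anchors=(2, 9)):
    """List every window of the given length with its anchor residues.

    Anchor positions are 1-based within each peptide window.
    """
    windows = []
    for start in range(len(protein) - length + 1):
        peptide = protein[start:start + length]
        anchor_residues = tuple(peptide[pos - 1] for pos in anchors)
        windows.append((peptide, anchor_residues))
    return windows


# Illustrative input only: first 11 residues of the beta2-microglobulin
# precursor as printed in this entry.
EXAMPLE = "MSRSVALAVLA"

if __name__ == "__main__":
    for peptide, (p2, p9) in nonamers_with_anchors(EXAMPLE):
        print(peptide, "P2 =", p2, "P9 =", p9)
```

An allele-specific screen would then keep only the windows whose anchor residues fit the pockets of the Class I molecule of interest; those preferences are not given in this entry and would have to be supplied separately.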
Ligands and associated molecules Peptide antigen bound to MHC Class I antigens is recognized by the α/β TCR (on CD8+ T cells) and γ/δ TCR heterodimers 9,10. The affinity of the interaction between the TCR and the MHC/peptide complex is in the range 10^-7 to 10^-4 M 11. CD8 interacts with the non-polymorphic α3 domain 12. MHC Class I antigens bind the human and mouse NK receptors, CD158 IgSF molecules and Ly-49 C-type lectins respectively 3,13.
Function
The "classical" MHC Class I molecules present endogenously synthesized peptides to CD8+ lymphocytes, which are usually cytotoxic T cells 2,14. MHC Class I molecules expressed on thymic epithelial cells regulate the positive and negative selection of CD8+ T cells during T cell maturation 14. Expression of Class I molecules depends on the expression of β2-microglobulin, and mice lacking a functional β2-microglobulin gene do not express Class I molecules on the cell surface. These animals lack mature CD4-CD8+ T cells and have defective cell-mediated cytotoxicity 14. Recognition of MHC Class I by NK receptors can protect from lysis by NK cells 3,14.
Comments
1 Allogeneic MHC molecules on transplanted organs can induce potent graft rejection 15.
2 Certain auto-immune diseases are linked to MHC Class I haplotype, e.g. ankylosing spondylitis is linked to HLA-B27 15.
Database accession numbers
                            PIR      SWISSPROT   EMBL/GENBANK   REFERENCE
Human HLA-A2 α              A02191   P01892      M32322         16
Human β2-microglobulin      A02179   P01884      M17986         17
Rat β2-microglobulin        A26842   P07151      Y00441         18
Mouse β2-microglobulin      A02182   P01887      X01838         19
Amino acid sequence of human HLA-A2 α chain
MAVMAPRTLV LLLSGALALT QTWA -1
GSHSMRYFFT SVSRPGRGEP RFIAVGYVDD TQFVRFDSDA ASQRMEPRAP 50
WIEQEGPEYW DGETRKVKAH SQTHRVDLGT LRGYYNQSEA GSHTVQRMYG 100
CDVGSDWRFL RGYHQYAYDG KDYIALKEDL RSWTAADMAA QTTKHKWEAA 150
HVAEQLRAYL EGTCVEWLRR YLENGKETLQ RTDAPKTHMT HHAVSDHEAT 200
LRCWALSFYP AEITLTWQRD GEDQTQDTEL VETRPAGDGT FQKWAAVVVP 250
SGQEQRYTCH VQHEGLPKPL TLRWEPSSQP TIPIVGIIAG LVLFGAVITG 300
AVVAAVMWRR KSSDRKGGSY SQAASSDSAQ GSDVSLTACK V 341
Amino acid sequence of human β2-microglobulin
MSRSVALAVL ALLSLSGLEA -1
IQRTPKIQVY SRHPAENGKS NFLNCYVSGF HPSDIEVDLL KNGERIEKVE 50
HSDLSFSKDW SFYLLYYTEF TPTEKDEYAC RVNHVTLSQP KIVKWDRDM 99
References
1 Trowsdale, J. (1995) Immunogenetics 41, 1-17.
2 York, I.A. and Rock, K.L. (1996) Annu. Rev. Immunol. 14, 369-396.
3 Gumperz, J.E. and Parham, P. (1995) Nature 378, 245-248.
4 Shawar, S. et al. (1994) Annu. Rev. Immunol. 12, 839-880.
5 Stern, L. and Wiley, D.C. (1994) Curr. Biol. 2, 245-251.
6 Wilson, I.A. and Fremont, D.H. (1993) Semin. Immunol. 5, 75-80.
7 Young, A.C.M. et al. (1995) FASEB J. 9, 26-36.
8 Lehner, P.J. and Cresswell, P. (1996) Curr. Opin. Immunol. 8, 59-67.
9 Garcia, K.C. et al. (1996) Science 274, 209-219.
10 Garboczi, D.N. et al. (1996) Nature 384, 134-141.
11 Fremont, D.H. et al. (1996) Curr. Opin. Immunol. 8, 93-100.
12 Zamoyska, R. (1994) Immunity 1, 243-246.
13 Lanier, L.L. and Phillips, J.H. (1996) Immunol. Today 17, 86-91.
14 Raulet, D.H. (1994) Adv. Immunol. 55, 381-421.
15 Vyse, T.J. and Todd, J.A. (1996) Cell 85, 311-318.
16 Bjorkman, P.J. et al. (1987) Nature 329, 506-512.
17 Güssow, D. et al. (1987) J. Immunol. 139, 3133-3138.
18 Mauxion, F. and Kress, M. (1987) Nucleic Acids Res. 15, 7638.
19 Daniel, F. et al. (1983) EMBO J. 2, 1061-1065.
Major histocompatibility complex Class II antigen

Other names
HLA-DP, -DQ and -DR (human)
I-A and I-E (mouse)
RT1B and RT1D (rat)
Molecular weights
SDS-PAGE
  reduced      α chain   33-35 kDa
               β chain   28-30 kDa

Carbohydrate
N-linked sites   α chain   2
                 β chain   1
O-linked         α chain   nil
                 β chain   nil

Human gene location
6p21.3
The organization of genes within the human MHC complex is reviewed in ref. 1.

Domains
[Diagrams of the domain organization and exon boundaries of the MHC Class II α and β chains are not reproduced here.]
Tissue distribution
MHC Class II molecules are expressed on dendritic cells, B cells, monocytes, macrophages, myeloid and erythroid precursors and some epithelial cells. MHC Class II is expressed on activated T cells in human and rat 2. Expression of MHC Class II is regulated by cytokines including interferon γ, which also induces expression on fibroblasts, epithelial and endothelial cells 2.
Structure
MHC Class II molecules are heterodimers of non-covalently associated α and β chains. Both chains are composed of two IgSF domains and have transmembrane sequences and short cytoplasmic tails. Multiple alleles of α and β chains exist and they are highly polymorphic. Because of the extensive polymorphism of MHC Class II molecules, accession numbers and sequence data are not included in this entry. The crystal structure of the MHC Class II antigen HLA-DR1 is similar to that of the MHC Class I antigen HLA-A2 (see page 566) 3. The crystals contained dimers of the αβ heterodimer. Polymorphic residues are positioned at the peptide antigen binding site 3,4. Unlike MHC Class I, length is not critical for peptides bound to MHC Class II and they make several contacts along the length of the peptide binding groove 4. MHC Class II molecules are first associated with CD74 (the invariant or Ii chain) in the endoplasmic reticulum. CD74 is degraded in the endosomal/lysosomal pathway, leaving a CLIP (MHC Class II-associated Ii chain) peptide attached to the MHC Class II molecule. The CLIP peptide is dissociated by binding to Class II-related HLA-DM and replaced by peptides generated from exogenous proteins 5.
Ligands and associated molecules
Peptide antigens bound to MHC Class II bind to the TCR on CD4+ cells. The affinity of the interaction between the TCR and the MHC/peptide complex is in the range of 10 -z- 10-4 M 6. Superantigens as intact proteins bind to MHC Class II differently from peptides 6. CD4 interacts directly with non-polymorphic residues on MHC Class II and both domains 1 and 2 of the β chain are implicated in binding 7,8.

Function
MHC Class II molecules present exogenously derived antigen to CD4+ T lymphocytes, which are usually helper T cells 9. MHC Class II molecules expressed on thymic stromal cells play a key role in the positive and negative selection of CD4+ T cells during thymopoiesis, and genetically engineered mice that do not express MHC Class II antigens lack CD4+ T cells in the periphery 10. Signalling can occur through MHC Class II 11.
Comments Certain HLA Class II molecules are associated with auto-immune diseases such as coeliac disease, insulin-dependent diabetes mellitus, rheumatoid arthritis, myasthenia gravis, multiple sclerosis and pemphigus vulgaris 12.
References
1 Trowsdale, J. (1995) Immunogenetics 41, 1-17.
2 Seddon, B. and Mason, D. (1996) Int. Immunol. 8, 1185-1193.
3 Brown, J.H. et al. (1993) Nature 364, 33-39.
4 Stern, L. and Wiley, D.C. (1994) Structure 2, 245-251.
5 Cresswell, P. (1996) Cell 84, 505-507.
6 Fremont, D.H. et al. (1996) Curr. Opin. Immunol. 8, 93-100.
7 Littman, D.R. (ed.) (1996) The CD4 Molecule. Curr. Top. Microbiol. Immunol. 205, 19-46.
8 Sakihama, T. et al. (1995) Immunol. Today 16, 581-587.
9 Rudensky, A.Y. (1995) Semin. Immunol. 7, 399-409.
10 Cardell, S. et al. (1994) Adv. Immunol. 55, 423-440.
11 Scholl, P.R. and Geha, R.S. (1994) Immunol. Today 15, 418-422.
12 Vyse, T.J. and Todd, J.A. (1996) Cell 85, 311-318.
MS2

Molecular weights
Polypeptide    88 465
SDS-PAGE
  reduced      89 kDa
  unreduced    75 kDa

Carbohydrate
N-linked sites   5
O-linked         unknown
Tissue distribution
MS2 mRNA is expressed in macrophages and macrophage cell lines 1. The level of expression of MS2 transcripts is upregulated following macrophage stimulation with LPS, phorbol ester or lymphokines.

Structure
MS2 is a type I membrane glycoprotein. The extracellular region contains a Cys-rich domain with no clear similarities to other Cys-rich domains. The cytoplasmic region contains a large number of Pro, Arg and Lys residues and encodes tandem repeats with homology to corresponding sequences in CD2 and CD122 (IL-2Rβ chain) 1. Based on amino acid sequence similarity, the MS2 molecule is predicted to be a member of the zinc-dependent metalloprotease family.

Ligands and associated molecules
Unknown.

Function
The function of MS2 is unknown.

Database accession numbers
         PIR      SWISSPROT   EMBL/GENBANK   REFERENCE
Mouse    A60385   Q05910      X13335         1
Amino acid sequence of mouse MS2
MLGLWLLSVL WTPA                                           -1
VAPGPPLPHV KQYEVVWPRR LAASRSRRAL PSHWGQYPES LSYALGTSGH    50
VFTLHLRKNR DLLGSSYTET YSAANGSEVT EQLQEQDHCL YQGHVEGYEG   100
SAASISTCAG LRGFFRVGST VHLIEPLDAD EEGQHAMYQA KHLQQKAGTC   150
GVKDTNLNDL GPRALEIYRA QPRNWLIPRE TRYVELYVVA DSQEFQKLGS   200
REAVRQRVLE VVNHVDKLYQ ELSFRVVLVG LEIWNKDKFY ISRYANVTLE   250
NFLSWREQNL QGQHPHDNVQ LITGVDFIGS TVGLAKVSAL CSRHSGAVNQ   300
DHSKNSIGVA STMAHELGHN LGMSHDEDIP GCYCPEPREG GGCIMTESIG   350
SKFPRIFSRC SKIDLESFVT KPQTGCLTNV PDVNRFVGGP VCGNLFVEHG   400
EQCDCGTPQD CQNPCCNATT CQLVKGAECA SGTCCHECKV KPAGEVCRLS   450
KDKCDLEEFC DGRKPTCPED AFQQNGTPCP GGYCFDGSCP TLAQQCRDLW   500
GPGARVAADS CYTFSIPPGC NGRMYSGRIN RCGALYCEGG QKPLERSFCT   550
FSSNHGVCHA LGTGSNIDTF ELVLQGTKCE EGKVCMDGSC QDLRVYRSEN   600
CSAKCNNHGV CNHKRECHCH KGWAPPNCVQ RLADVSDEQA ASTSLPVSVV   650
VVLVILVAAM VIVAGIVIIR KAPRQIQRRS VAPKPISGLS NPLFYTRDSS   700
LPAKNRPPDP SETVSTNQPP RPIAKPKRPP PAPPGAVSSS PLPVPVYAPK   750
IPNQFRPDPP TKPLPELKPK QVKPTFAPPT PPVKPGTGGT VPGATQGAGG   800
PKVALKVPIQ KR                                            812

Reference
1 Yoshida, S. et al. (1990) Int. Immunol. 2, 585-591.
NKG2 family

Molecular weights
Polypeptide (NKG2-A)   26 270
SDS-PAGE
  unreduced    70 kDa (NKG2A/CD94 heterodimer)
  reduced      43 kDa (NKG2A plus 30 kDa CD94)

Carbohydrate
N-linked sites   3 (NKG2-A)
O-linked         none

Human gene location
12p12.3-p13.1 (all members) 1,2

Domain
[Diagram of the NKG2A domain organization is not reproduced here.]
Tissue distribution
NKG2 transcripts are expressed in NK cell lines and in some T cell clones and lines 3. NKG2 proteins are expressed as disulfide-linked heterodimers with CD94 on the surface of NK cells and T cells 4. Different NK cell clones may express one or more NKG2 glycoproteins.

Structure
NKG2 was originally identified as five closely related cDNAs (NKG2-A, NKG2-B, NKG2-C, NKG2-D and NKG2-E). NKG2-A and -B are alternatively spliced transcripts of the same gene 1,2. NKG2 genes encode type II integral membrane proteins with an extracellular C-type lectin domain 5. Molecules of the NKG2 family are structurally related to several other molecules (Ly49 family, NKR-P1 family, CD69 and CD94) encoded within the mouse and/or human NK gene complexes which contribute to NK cell function (see Ly-49) 5. NKG2 proteins are disulfide-bonded to CD94 proteins and expressed as heterodimers on the surface of NK cells and some T cells 4. Transfection studies suggest that NKG2 glycoproteins may also be expressed as disulfide-linked homodimers 6. The cytoplasmic domain of NKG2A/B has two (I/V)XYXXL motifs but these are absent from other NKG2 molecules.

Ligands and associated molecules
CD94/NKG2 receptors have been implicated in the recognition of HLA-A, -B and -C 4,7, but there is currently no evidence for direct binding. A soluble form of NKG2-C binds to an unidentified ligand on K562 cells, and binding correlated with their susceptibility to NK cell lysis 6. NKG2A has been shown to associate with the cytoplasmic tyrosine phosphatase SHP-1, presumably through its cytoplasmic (I/V)XYXXL motifs 9.
Function
CD94/NKG2 receptors have been implicated in activation or inhibition of NK cell cytotoxicity and cytokine secretion 4,8. Whereas NKG2A is inhibitory, NKG2C activates NK cells 9.

Database accession numbers
         PIR      SWISSPROT   EMBL/GENBANK   REFERENCE
NKG2A    PT0372   P26715      X54867         3
NKG2B    PT0373   P26716      X54868         3
NKG2C    PT0374   P26717      X54860         3
NKG2D    PT0374   P26718      X54870         3
NKG2E                         L14542
Amino acid sequence of human NKG2A
MDNQGVIYSD LNLPPNPKRQ QRKPKGNKSS ILATEQEITY AELNLQKASQ    50
DFQGNDKTYH CKDLPSAPEK LIVGILGIIC LILMASVVTI VVIPSTLIQR   100
HNNSSLNTRT QKARHCGHCP EEWITYSNSC YYIGKERRTW EESLLACTSK   150
NSSLLSIDNE EEMKFLSIIS PSSWIGVFRN SSHHPWVTMN GLAFKHEIKD   200
SDNAELNCAV LQVNRLKSAQ CGSSIIYHCK HKL                     233

Note: The (I/V)XYXXL motifs, printed in bold in the original, are VIYSDL (residues 6-11) and ITYAEL (residues 38-43).
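The motif note can be checked with a short pattern search. The sketch below is illustrative only: it assumes the first 50 residues of the NKG2A sequence above (MDNQGVIYSD ... AELNLQKASQ, read row-wise), and the names `NKG2A_N_TERMINUS`, `MOTIF` and `find_motifs` are hypothetical helpers, not part of any published tool.

```python
import re

# Residues 1-50 of human NKG2A, taken block by block from the printed sequence.
NKG2A_N_TERMINUS = (
    "MDNQGVIYSD" "LNLPPNPKRQ" "QRKPKGNKSS" "ILATEQEITY" "AELNLQKASQ"
)

# (I/V)XYXXL: Ile or Val, any residue, Tyr, any two residues, Leu.
# The lookahead wrapper means overlapping matches would also be reported.
MOTIF = re.compile(r"(?=([IV].Y..L))")


def find_motifs(sequence):
    """Return (1-based start position, matched residues) for each motif."""
    return [(m.start() + 1, m.group(1)) for m in MOTIF.finditer(sequence)]


motifs = find_motifs(NKG2A_N_TERMINUS)
for start, residues in motifs:
    print(start, residues)
# 6 VIYSDL
# 38 ITYAEL
```

Both matches fall in the N-terminal region, which is the cytoplasmic side of a type II membrane protein.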
References
1 Yabe, T. et al. (1993) Immunogenetics 37, 455-460.
2 Plougastel, B. et al. (1996) Immunogenetics 44, 286-291.
3 Houchins, J.P. et al. (1991) J. Exp. Med. 173, 1017-1020.
4 Lazetic, S. et al. (1996) J. Immunol. 157, 4741-4745.
5 Gumperz, J.E. and Parham, P. (1995) Nature 378, 245-248.
6 Düchler, M. et al. (1995) Eur. J. Immunol. 25, 2923-2931.
7 Phillips, J.H. et al. (1996) Immunity 5, 163-172.
8 Pérez-Villar, J.J. et al. (1995) J. Immunol. 154, 5779-5788.
9 Houchins, J.P., L.L. Lanier, E.C. Niemi, J.H. Phillips, and J.C. Ryan, submitted.
OX2

Other names
MRC OX2

Molecular weights
Polypeptide    27 928
SDS-PAGE
  reduced      47 kDa (rat thymocytes)
               41 kDa (rat brain)

Carbohydrate
N-linked sites   6
O-linked         nil

Human gene location and size
3q12-q13; 8 kb. It is probable that a further 5' exon exists in the human gene.

Domains
[Diagram of the domain organization and exon boundaries is not reproduced here.]

Tissue distribution
The OX2 antigen was originally named MRC OX2 after the first antibody. Human OX2 mRNA is expressed in normal brain and in B cell lines, but not in normal liver, T cell lines, a myeloma line or a monocyte cell line 1. In the rat, OX2 is expressed in thymocytes, B cells, follicular dendritic cells, vascular endothelium, trophoblasts, neurons and some smooth muscle as assessed by mAb binding (reviewed in ref. 2). OX2 is also expressed on activated T cells (Seddon, B., unpublished).

Structure
The extracellular region consists of two IgSF domains, followed by a hydrophobic transmembrane domain and a cytoplasmic domain 2. The carbohydrate composition of rat thymus and brain OX2 has been determined 3.

Ligands and associated molecules
Recombinant soluble OX2 antigen binds to an unidentified ligand on rodent peritoneal macrophages (Preston, S., Brown, M.H. and Barclay, A.N., unpublished).

Function
Unknown.

Database accession numbers
         PIR      SWISSPROT   EMBL/GENBANK   REFERENCE
Human    A47639   P41217      M17229         1
Rat      A02114   P04218      X01785         4
Amino acid sequence of human OX2
VIRMPFSHLS TYSLVWVMAA VVLCTA                              -1
QVQVVTQDER EQLYTTASLK CSLQNAQEAL IVTWQKKKAV SPENMVTFSE    50
NHGVVIQPAY KDKINITQLG LQNSTITFWN ITLEDEGCYM CLFNTFGFGK   100
ISGTACLTVY VQPIVSLHYK FSEDHLNITC SATARPAPMV FWKVPRSGIE   150
NSTVTLSHPN GTTSVTSILH IKDPKNQVGK EVICQVLHLG TVTDFKQTVN   200
KGYWFSVPLL LSMFSLVILL VLISILLYWK RHRNQDRGEL SQGVQKMT     248

References
1 McCaughan, G.W. et al. (1987) Immunogenetics 25, 329-335.
2 McCaughan, G.W. et al. (1987) Immunogenetics 25, 133-135.
3 Barclay, A.N. and Ward, H.A. (1982) Eur. J. Biochem. 129, 447-458.
4 Clark, M.J. et al. (1985) EMBO J. 4, 113-118.
CD134L

Molecular weights
Polypeptide    21 051
SDS-PAGE
  reduced      32-34 kDa

Carbohydrate
N-linked sites   4
O-linked         unknown

Human gene location
1q25 1

Domains
[Diagram of the domain organization is not reproduced here.]
Tissue distribution
OX40L is expressed on activated T and B lymphocytes 1-3. It is also expressed on HTLV-1 transformed and activated B lymphoblastoid and monocytic cell lines 2,4 and vascular endothelial cells 5. Messenger RNA for mouse OX40L is widespread 1.

Structure
OX40L is a member of the TNF superfamily 2,6. Like other members of this superfamily, it is a type II membrane protein expressed as a trimer, with the similarity to TNF being in the C-terminal extracellular region 1,2,4.

Ligand and associated molecules
OX40 ligand binds to CD134 and there is no evidence for another ligand 7. The affinity of monomeric CD134 for OX40L is 190 nM and that of trimeric OX40L for CD134 on the surface of activated T cells is 0.2 nM 8. Three CD134 receptors bind one OX40L molecule 8.
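To make the size of the difference between the two affinities explicit, the short calculation below takes their ratio. It is a sketch only: it assumes both quoted values are dissociation constants in nM, and the variable and function names are illustrative.

```python
# Affinities quoted in the text, assumed here to be dissociation constants (nM).
KD_MONOMERIC_CD134_NM = 190.0  # monomeric CD134 binding to OX40L
KD_TRIMERIC_OX40L_NM = 0.2     # trimeric OX40L binding to cell-surface CD134


def fold_difference(kd_weak_nm, kd_strong_nm):
    """Ratio of two dissociation constants; a lower Kd means tighter binding."""
    return kd_weak_nm / kd_strong_nm


fold = fold_difference(KD_MONOMERIC_CD134_NM, KD_TRIMERIC_OX40L_NM)
print(f"Trimeric binding is about {fold:.0f}-fold tighter")
```

On these figures the multivalent interaction is roughly three orders of magnitude tighter than the monomeric one.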
Function
Crosslinking OX40L on activated B cells stimulates proliferation and Ig production 3. Similarly, blocking the OX40L-CD134 interaction reduced IgG production, suggesting a role in differentiation into plasma cells 9. OX40L mAbs block binding of activated T cells to endothelial cell lines 5. OX40L binding to CD134 on T cells co-stimulates proliferation 1,2.

Database accession numbers
         PIR      SWISSPROT   EMBL/GENBANK   REFERENCE
Human    A39680   P23510      X79929         2
Mouse             P43488      U12763         3
Amino acid sequence of human OX40L
MERVQPLEEN VGNAARPRFE RNKLLLVASV IQGLGLLLCF TYICLHFSAL    50
QVSHRYPRIQ SIKVQFTEYK KEKGFILTSQ KEDEIMKVQN NSVIINCDGF   100
YLISLKGYFS QEVNISLHYQ KDEEPLFQLK KVRSVNSLMV ASLTYKDKVY   150
LNVTTDNTSL DDFHVNGGEL ILIHQNPGEF CVL                     183
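The count of four N-linked sites given for this entry can be compared with the sequence itself. The sketch below assumes the usual N-X-S/T sequon convention (X is any residue except proline), which the entry does not state, and reads the 183 residues row-wise from the printed sequence; `OX40L`, `SEQUON` and `find_sequons` are illustrative names.

```python
import re

# Human OX40L, assembled block by block from the printed sequence (183 residues).
OX40L = (
    "MERVQPLEEN" "VGNAARPRFE" "RNKLLLVASV" "IQGLGLLLCF" "TYICLHFSAL"
    "QVSHRYPRIQ" "SIKVQFTEYK" "KEKGFILTSQ" "KEDEIMKVQN" "NSVIINCDGF"
    "YLISLKGYFS" "QEVNISLHYQ" "KDEEPLFQLK" "KVRSVNSLMV" "ASLTYKDKVY"
    "LNVTTDNTSL" "DDFHVNGGEL" "ILIHQNPGEF" "CVL"
)

# Assumed convention: Asn, any residue except Pro, then Ser or Thr.
# The lookahead wrapper lets adjacent, overlapping sequons be reported.
SEQUON = re.compile(r"(?=(N[^P][ST]))")


def find_sequons(sequence):
    """Return (1-based position of the Asn, matched tripeptide) per sequon."""
    return [(m.start() + 1, m.group(1)) for m in SEQUON.finditer(sequence)]


sequons = find_sequons(OX40L)
print(len(OX40L), sequons)
```

Under these assumptions the search returns four candidate sites, at Asn 90, 114, 152 and 157, which agrees with the number listed under Carbohydrate. A sequon only marks a potential site; whether it is glycosylated is not determined by the pattern.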
References
1 Baum, P.R. et al. (1994) EMBO J. 13, 3992-4001.
2 Gruss, H.-J. and Dower, S.K. (1995) Blood 85, 3378-3404.
3 Stuber, E. et al. (1995) Immunity 2, 507-521.
4 Godfrey, W.R. et al. (1994) J. Exp. Med. 180, 757-762.
5 Imura, A. et al. (1996) J. Exp. Med. 183, 2185-2195.
6 Armitage, R.J. (1994) Curr. Opin. Immunol. 6, 407-413.
7 Al-Shamkhani, A. et al. (1996) Eur. J. Immunol. 26, 1695-1699.
8 Al-Shamkhani, A. et al. (1997) J. Biol. Chem. 272, 5275-5282.
9 Stuber, E. and Strober, W. (1996) J. Exp. Med. 183, 979-989.
PC-1
EC 3.1.4.1, EC 3.6.1.9

Molecular weights
Polypeptide    99 930
SDS-PAGE
  reduced      115-120 kDa
  unreduced    220 kDa

Carbohydrate
N-linked sites   9
O-linked sites   unknown

Human gene location
6q22-q23

Domains
[Diagram of the domain organization is not reproduced here.]
Tissue distribution
PC-1 is expressed on plasma cells 1. It is also expressed in non-lymphoid tissues including epithelial cells in testis, salivary gland, kidney, brain capillaries and bone chondrocytes 1.

Structure
PC-1 is a disulfide-linked homodimeric type II membrane protein 2-4. The membrane-proximal region contains two somatomedin-B-like domains, followed by a catalytic domain characteristic of 5' nucleotidases. Two consensus sequences of EF-hand-like divalent cation binding motifs are found between residues 265-294 and 739-767 5.

Function
PC-1 is an ecto-enzyme with alkaline phosphodiesterase I (EC 3.1.4.1) and nucleotide pyrophosphatase (EC 3.6.1.9) activities 6,7. PC-1 was also found to have autophosphorylation activity 8, and autophosphorylation inactivates its other two enzymatic activities 9. It has been suggested that autophosphorylation at low ATP concentration is a regulatory mechanism which prevents depletion of nucleotides when they are scarce 9.
Database accession numbers
         PIR      SWISSPROT   EMBL/GENBANK   REFERENCE
Human    A39216   P22413      M57736         3
                              D12485         4
Mouse    A27410   P06802      J02700         10
Amino acid sequence of human PC-1
MDVGEEPLEK AARARTAKDP NTYKVLSLVL SVCVLTTILG CIFGLKPSCA    50
KEVKSCKGRC FERTFGNCRC DAACVELGNC CLDYQETCIE PEHIWTCNKF   100
RCGEKRLTRS LCACSDDCKD KGDCCINYSS VCQGEKSWVE EPCESINEPQ   150
CPAGFETPPT LLFSLDGFRA EYLHTWGGLL PVISKLKKCG TYTKNMRPVY   200
PTKTFPNHYS IVTGLYPESH GIIDNKMYDP KMNASFSLKS KEKFNPEWYK   250
GEPIWVTAKY QGLKSGTFFW PGSDVEINGI FPDIYKMYNG SVPFEERILA   300
VLQWLQLPKD ERPHFYTLYL EEPDSSGHSY GPVSSEVIKA LQRVDGMVGM   350
LMDGLKELNL HRCLNLILIS DHGMEQGSCK KYIYLNKYLG DVKNIKVIYG   400
PAARLRPSDV PDKYYSFNYE GIARNLSCRE PNQHFKPYLK HFLPKRLHFA   450
KSDRIEPLTF YLDPQWQLAL NPSERKYCGS GFHGSDNVFS NMQALFVGYG   500
PGFKHGIEAD TFENIEVYNL MCDLLNLTPA PNNGTHGSLN HLLKNPVYTP   550
KHPKEVHPLV QCPFTRNPRD NLGCSCNPSI LPIEDFQTQF NLTVAEEKII   600
KHETLPYGRP RVLQKENTIC LLSQHQFMSG YSQDILMPLW TSYTVDRNDS   650
FSTEDFSNCL YQDFRIPLSP VHKCSFYKNN TKVSYGFLSP PQLNKNSSGI   700
YSEALLTTNI VPMYQSFQVI WRYFHDTLLR KYAEERNGVN VVSGPVFDFD   750
YDGRCDSLEN LRQKRRVIRN QEILIPTHFF IVLTSCKDTS QTPLHCENLD   800
TLAFILPHRT DNSESCVHGK HDSSWVEELL MLHRARITDV EHITGLSFYQ   850
QRKEPVSDIL KLKTHLPTFS QED                                873

References
1 Harahap, A.R. and Goding, J.W. (1988) J. Immunol. 141, 2317-2320.
2 Goding, J.W. and Shen, F.W. (1982) J. Immunol. 129, 2636-2640.
3 Buckley, M.F. et al. (1990) J. Biol. Chem. 265, 17506-17511.
4 Funakoshi, I. et al. (1992) Arch. Biochem. Biophys. 295, 180-187.
5 Belli, S.I. et al. (1994) Biochem. J. 304, 75-80.
6 Rebbe, N. et al. (1991) Proc. Natl Acad. Sci. USA 88, 5192-5196.
7 Belli, S.I. and Goding, J.W. (1994) Eur. J. Biochem. 226, 433-443.
8 Belli, S.I. et al. (1995) Eur. J. Biochem. 228, 669-676.
9 Stefan, C. et al. (1996) Eur. J. Biochem. 241, 338-342.
10 van Driel, I.R. and Goding, J.W. (1987) J. Biol. Chem. 262, 4882-4887.
PD-1

Molecular weights
Polypeptide    29 279
SDS-PAGE
  reduced      50-55 kDa

Carbohydrate
N-linked sites   4
O-linked sites   unknown

Human gene location
2q37.3

Domain
[Diagram of the domain organization is not reproduced here.]
Tissue distribution
Mouse PD-1 was isolated by subtractive hybridization from the T cell line 2B4.11 and the haematopoietic progenitor cell line LyD9 under conditions designed to detect genes expressed following the induction of programmed cell death (PCD). Analysis of mRNA indicates that the expression of PD-1 is associated with cells undergoing classical apoptosis, but not with cells undergoing non-apoptotic PCD 1. PD-1 is expressed on 3-5% of normal thymocytes but on ~35% of CD4-/CD8- cells. PD-1-expressing thymocytes can be divided into two distinct populations 2: the high-level PD-1-expressing thymocytes are of the γδ TCR lineage; PD-1 is also expressed at the transition from the CD4-/CD8- to the CD4+/CD8+ stage on thymocytes of the αβ TCR lineage. Expression of PD-1 on thymocytes, as well as on T cells in the spleen and lymph nodes, can be stimulated in vivo with anti-CD3 antibodies 2,3. PD-1 expression is not readily detected in the brain, heart, lung, spleen or kidney 1.

Structure
PD-1 is a type I membrane protein with one IgSF domain in the extracellular region. The cytoplasmic segment contains two ITAM motifs 1,2.
Function
From mRNA analysis it appears that PD-1 expression is strongly associated with apoptotic programmed cell death 1,3. It has been suggested that PD-1 may play a role in clonal selection of lymphocytes 4.

Database accession numbers
         PIR      SWISSPROT   EMBL/GENBANK   REFERENCE
Human                         L27440         4
Mouse    S28029   Q02242      X67914         1
Amino acid sequence of human PD-1
MQIPQAPWPV VWAVLQLGWR                                     -1
PGWFLDSPDR PWNPPTFSPA LLVVTEGDNA TFTCSFSNTS ESFVLNWYRM    50
SPSNQTDKLA AFPEDRSQPG QDCRFRVTQL PNGRDFHMSV VRARRNDSGT   100
YLCGAISLAP KAQIKESLRA ELRVTERRAE VPTAHPSPSP RSAGQFQTLV   150
VGVVGGLLGS LVLLVWVLAV ICSRAARGTI GARRTGQPLK EDPSAVPVFS   200
VDYGELDFQW REKTPEPPVP CVPEQTEYAT IVFPSGMGTS SPARRGSADG   250
PRSAQPLRPE DGHCSWPL                                      268

References
1 Ishida, Y. et al. (1992) EMBO J. 11, 3887-3895.
2 Nishimura, H. et al. (1996) Int. Immunol. 8, 773-780.
3 Agata, Y. et al. (1996) Int. Immunol. 8, 765-772.
4 Shinohara, T. et al. (1994) Genomics 23, 704-706.
RT6

Molecular weights
SDS-PAGE
  reduced      RT6.1   24, 27, 30-35 kDa
               RT6.2   25, 28 kDa
  unreduced    RT6.1   21, 23, 27-32 kDa
               RT6.2   21, 24 kDa

Carbohydrate
N-linked sites   RT6.1   1
                 RT6.2   0
O-linked         RT6.1   unknown
                 RT6.2   nil

Human gene location
11q13
Tissue distribution
RT6 is a specific marker for peripheral T lymphocytes. In the rat, RT6 is expressed on the majority of mature peripheral T cells, but not on thymocytes or any other haematopoietic cells. Recent thymic emigrants are RT6-/Thy-1+ and these mature to RT6+/Thy-1- cells 1. Intestinal intraepithelial T cells express RT6 at particularly high levels 2. At the mRNA level, the expression of mouse Rt-6 is similar to that in the rat 3. The human RT6 gene is transcriptionally inactive 4.

Structure
RT6.1 and RT6.2 are the products of separate alleles of the rat RT6 locus 1. Both are GPI-anchored and a proportion of RT6.1 is N-glycosylated. The cDNA sequences of RT6.1 and RT6.2 differ at 18 positions, leading to 12 amino acid substitutions and the presence of a glycosylation site in the translated sequence of RT6.1 1,5. In the mouse, two closely linked genes encode different RT6 proteins, Rt6-1 and Rt6-2, both of which are polymorphic 3,6. Remarkably, the single copy gene for human RT6 is a pseudogene as a result of three premature in-frame stop codons 4. RT6 shows sequence homology to a family of bacterial toxins that function as mono-ADP-ribosyltransferases 7.
Ligands and associated molecules RT6 co-immunoprecipitates with the Src family tyrosine kinases Fyn and Lck in rat T cells 8.
Function
RT6 has NAD-metabolizing activity and can undergo auto-ADP-ribosylation on arginine residues. RT6 may also ADP-ribosylate other target proteins at the cell surface, thereby modulating their function (reviewed in ref. 7). This may help to explain the susceptibility to autoimmune disease of some experimental animals which have deficient RT6 expression 7. Incubation of mouse cytotoxic T cells with NAD suppresses their ability to lyse target cells, an effect mediated through a GPI-anchored ADP-ribosyltransferase. This has prompted speculation that the transferase is RT6, suggesting a possible link between low RT6 expression and enhanced T cell autoreactivity 7.
Database accession numbers PIR
Human Rat RT6.1 Rat RT6.2 Mouse Rt6-1
S08464 A34866 S12738
SWISSPR OT
P17982 P20974 P17981
EMBL/GENBANK
REFERENCE
X65050 X52082
4 5 1 9
X52991
Amino acid sequences of the allelic forms of rat RT6 RT6.2
-i -i
RT6.1 L T G P L M L D T A P N A F D D Q Y E G CVNKMEEKAP LLLKEDFNKS E K L K V A W E E A RT6.2 Q .... MN A
50 50
RT6.1 KKRWNNIKPS M S Y P K G F N D F H G T A L V A Y T G S I G V D F N R A V R E F K E N P G Q F RT6.2 R A
i00 i00
RT6.1 H Y K A F H Y Y L T R A L Q L L S N G D CHSVYRGTKT R F H Y T G A G S V R F G Q F T S S S L RT6.2
150 150
RT6.1 SKTVAQSPEF FSDDGTLFII KTCLGVYIKE F S F Y P D Q E E V L I P G Y E V Y Q K RT6.2 --K .... Q . . . . . H -R
200 200
RT6.1 V R T Q G Y N E I F L D S P K R K K S N YNCLYSS RT6.2 ---C-
227 227
RT6.1 A G T R E S C V S L F L V V L T S L L V QLLCLAEP RT6.2 --A P
+28 +28
RT6.1 MPSNICK~F~ T~T,IQQVTG
Residues different between RT6.1 and RT6.2 are shown with identical residues indicated by dashes.
References
1 Koch, F. et al. (1990) Proc. Natl Acad. Sci. USA 87, 964-967.
2 Fangmann, J. et al. (1991) Eur. J. Immunol. 21, 753-760.
3 Prochazka, M. et al. (1991) Immunogenetics 33, 152-156.
4 Haag, F. et al. (1994) J. Mol. Biol. 243, 537-546.
5 Haag, F. et al. (1990) Nucleic Acids Res. 18, 1047.
6 Koch-Nolte, F. et al. (1995) Immunogenetics 41, 152-155.
7 Koch-Nolte, F. et al. (1996) Immunol. Today 17, 402-405.
8 Rigby, M.R. et al. (1996) Diabetes 45, 1419-1426.
9 Koch, F. et al. (1990) Nucleic Acids Res. 18, 3636.
Sca-2

Other names
Stem cell antigen 2, thymic shared antigen 1 (TSA-1)

Molecular weights
Polypeptide    8828

Carbohydrate
N-linked sites   1
O-linked         unknown

Domain
[Diagram of the domain organization is not reproduced here.]
Tissue distribution
Sca-2 is expressed on intrathymic lymphoid precursors, immature thymocytes, B220+ bone marrow cells and mature B cells, but is absent from mature thymocytes and peripheral T cells 1,2.

Structure
Sca-2 is a small Cys-rich GPI-linked protein containing a single Ly-6 domain 3.

Ligands and associated molecules
Unknown.

Function
Fetal thymic organ cultures repopulated with very early pre-T cells (CD44-CD25-TCRαβ-CD4-CD8-) and treated with MTS 35 (anti-TSA-1/Sca-2 mAb) show a skewed development towards TCRαβ+CD4-CD8+ cells, and a decreased number of TCRαβ+CD4+CD8- cells 4. This suggests a role for Sca-2 in the positive selection or lineage commitment of thymocytes towards the CD4+ or CD8+ pathways.

Comments
The deduced amino acid sequences of Sca-2 and TSA-1 are identical, except at position -7 within the N-terminal leader sequence, where a Gly residue has been reported for Sca-2 and an Arg residue has been reported for TSA-1 3,4. The mouse Tsa-1 (Sca-2) locus is linked to Ly6 on chromosome 15 4.

Database accession numbers
PIR
Mouse Chicken
SWISSPR OT
148910
EMBL/GENBANK
REFERENCE
U04268 L34554
3
Amino acid sequence of mouse Sca-2
MSATSNMRVF LPVLLAALLG MEQVHS                              -1
LMCFSCTDQK NNINCLWPVS CQEKDHYCIT LSAAAGFGNV NLGYTLNKGC    50
SPICPSENVN LNLGVASVNS YCCQSSFCNF SA                       82
AGLGLRASIP LLGLGLLLSL LALLQLSP                           +28
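The negative numbering used for leader residues in the Comments (position -7) can be resolved against the leader printed above. The sketch below assumes the 26-residue leader MSATSNMRVF LPVLLAALLG MEQVHS, with -1 as its last residue; `LEADER` and `leader_residue` are illustrative names.

```python
# Leader sequence of mouse Sca-2 as printed above (residues -26 to -1).
LEADER = "MSATSNMRVF" "LPVLLAALLG" "MEQVHS"


def leader_residue(leader, position):
    """Residue at a negative leader position, where -1 is the last residue."""
    if not -len(leader) <= position <= -1:
        raise ValueError("position outside the leader sequence")
    # Python's negative indexing counts back from the end in the same way.
    return leader[position]


print(leader_residue(LEADER, -7))
# G
```

The residue found at -7 is Gly, which matches the residue reported for Sca-2 in the Comments.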
References
1 Wu, L. et al. (1991) J. Exp. Med. 174, 1617-1627.
2 Spangrude, G.J. et al. (1988) J. Immunol. 141, 3697-3707.
3 Classon, B.J. and Coverdale, L. (1994) Proc. Natl Acad. Sci. USA 91, 5296-5300.
4 MacNeil, I. et al. (1993) J. Immunol. 151, 6913-6923.
SR-AI and SR-AII

Molecular weights
Polypeptide
  type I       49 762
  type II      39 583
SDS-PAGE
  reduced      77 kDa
  unreduced    220 kDa

Carbohydrate
N-linked sites   7
O-linked         unknown

[Diagram of scavenger receptors I and II, showing the collagen-like and α-helical coiled-coil regions of the trimers, is not reproduced here.]

Human gene location
8 1

Domains
[Diagram of the scavenger receptor I domain organization is not reproduced here.]
Tissue distribution
The major sites of scavenger receptor I and II expression are tissue macrophages 1-3. Inflammatory stimuli upregulate expression 1. Scavenger receptor has been identified in atherosclerotic lesions of human tissues 1,2.

Structure
Both receptors are type II glycoproteins and are present as trimers at the cell surface 1. It is not known if naturally occurring receptors are homotrimers or heterotrimers of type I and II proteins. The bovine sequence was determined first and is 70% identical to the human sequence 1,4. The mature type I receptor has a 50 amino acid N-terminal cytoplasmic sequence, a 26 amino acid transmembrane sequence, a 75 amino acid sequence with no homology to other proteins, a 121 amino acid sequence predicted to have an α-helical coiled-coil structure, a 69 amino acid collagen-like domain, and a C-terminal scavenger receptor cysteine-rich domain. The type II receptor is identical to the type I receptor to the end of the collagen-like sequence, but the C-terminal scavenger receptor domain is replaced by a 17 amino acid segment 5. The two scavenger receptors are generated by alternative mRNA splicing 1. The type I receptor has an overall structure similar to MARCO, both having a C-terminal scavenger receptor cysteine-rich domain 2,4. Scavenger receptor has a shorter collagenous region than MARCO 2.
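The segment lengths given in the Structure paragraph can be checked against the lengths of the two sequences printed for this entry (451 residues for type I, 358 for type II). The sketch below is simple bookkeeping; the dictionary keys and variable names are illustrative, and the 110-residue figure for the cysteine-rich domain is derived here by subtraction, not quoted from the text.

```python
# Segment lengths shared by both receptors, as quoted in the Structure text.
SHARED_SEGMENTS = {
    "cytoplasmic": 50,
    "transmembrane": 26,
    "spacer with no known homology": 75,
    "alpha-helical coiled coil": 121,
    "collagen-like": 69,
}
TYPE_I_LENGTH = 451              # last position in the printed type I sequence
TYPE_II_LENGTH = 358             # last position in the printed type II sequence
TYPE_II_C_TERMINAL_SEGMENT = 17  # replaces the cysteine-rich domain in type II

shared_length = sum(SHARED_SEGMENTS.values())
srcr_length = TYPE_I_LENGTH - shared_length
type_ii_expected = shared_length + TYPE_II_C_TERMINAL_SEGMENT

print(shared_length, srcr_length, type_ii_expected)
# 341 110 358
```

The shared region accounts for 341 residues, so adding the 17-residue segment reproduces the 358-residue length of type II, and the remaining 110 residues of type I correspond to the C-terminal cysteine-rich domain.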
Ligands and associated molecules
Scavenger receptors I and II bind modified low-density lipoproteins 1,2. Both receptors have a similar broad specificity for polyanionic ligands, including cell surface components of gram-positive and gram-negative bacteria 1,2. There is evidence from mutagenesis studies that their ligand binding properties are located in the collagenous region 1. Scavenger receptors I and II bind β-amyloid fibrils 6.
Function Roles in clearance of microbes and damaged or apoptotic cells and in recirculation are postulated 1,3.
Database accession numbers
                  PIR      SWISSPROT   EMBL/GENBANK     REFERENCE
Human type I      A38415   P21757      D90187           7
Human type II     B38415   P21759      D90188           7
Mouse type I      A38860   P30204      M36817, M59445   8
Mouse type II              P30204      M36818, M59446   8

Amino acid sequence of human scavenger receptor I
MEQWDHFHNQ QEDTDSCSES VKFDARSMTA LLPPNPKNSP SLQEKLKSFK    50
AALIALYLLV FAVLIPLIGI VAAQLLKWET KNCSVSSTNA NDITQSLTGK   100
GNDSEEEMRF QEVFMEHMSN MEKRIQHILD MEANLMDTEH FQNFSMTTDQ   150
RFNDILLQLS TLFSSVQGHG NAIDEISKSL ISLNTTLLDL QLNIENLNGK   200
IQENTFKQQE EISKLEERVY NVSAEIMAMK EEQVHLEQEI KGEVKVLNNI   250
TNDLRLKDWE HSQTLRNITL IQGPPGPPGE KGDRGPTGES GPRGFPGPIG   300
PPGLKGDRGA IGFPGSRGLP GYAGRPGNSG PKGQKGEKGS GNTLTPFTKV   350
RLVGGSGPHE GRVEILHSGQ WGTICDDRWE VRVGQVVCRS LGYPGVQAVH   400
KAAHFGQGTG PIWLNEVFCF GRESSIEECK IRQWGTRACS HSEDAGVTCT   450
L                                                        451

Amino acid sequence of human scavenger receptor II
MEQWDHFHNQ QEDTDSCSES VKFDARSMTA LLPPNPKNSP SLQEKLKSFK    50
AALIALYLLV FAVLIPLIGI VAAQLLKWET KNCSVSSTNA NDITQSLTGK   100
GNDSEEEMRF QEVFMEHMSN MEKRIQHILD MEANLMDTEH FQNFSMTTDQ   150
RFNDILLQLS TLFSSVQGHG NAIDEISKSL ISLNTTLLDL QLNIENLNGK   200
IQENTFKQQE EISKLEERVY NVSAEIMAMK EEQVHLEQEI KGEVKVLNNI   250
TNDLRLKDWE HSQTLRNITL IQGPPGPPGE KGDRGPTGES GPRGFPGPIG   300
PPGLKGDRGA IGFPGSRGLP GYAGRPGNSG PKGQKGEKGS GNTLRPVQLT   350
DHIRAGPS                                                 358

References
1 Krieger, M. and Herz, J. (1994) Annu. Rev. Biochem. 63, 601-637.
2 Pearson, A.M. (1996) Curr. Opin. Immunol. 8, 20-28.
3 Hughes, D.A. et al. (1995) Eur. J. Immunol. 25, 466-473.
4 Resnick et al. (1994) Trends Biochem. Sci. 19, 5-8.
5 Ashkenas, J. et al. (1993) J. Lipid Res. 983-1000.
6 Khoury, J. El (1996) Nature 382, 716-719.
7 Matsumoto, A. et al. (1990) Proc. Natl Acad. Sci. USA 87, 9133-9137.
8 Freeman, M. et al. (1990) Proc. Natl Acad. Sci. USA 87, 8810-8814.
Sialoadhesin

Other names
Sheep erythrocyte receptor

Molecular weights
Polypeptide    181 065
SDS-PAGE
  unreduced    175 kDa
  reduced      185 kDa

Carbohydrate
N-linked sites   15
O-linked         nil

Human gene location
20p13 1

Domains
[Diagram of the 17 IgSF domains of sialoadhesin is not reproduced here.]
Tissue distribution Sialoadhesin is expressed on subsets of macrophages in both bone marrow and secondary lymphoid organs 2,3.
Structure
Sialoadhesin is the eponymous member of a structurally related group of IgSF domain-containing sialic acid binding proteins called the sialoadhesin family, which includes CD22, CD33 and myelin-associated glycoprotein (MAG) 4. Members of this family share ~35% identity between their 2-4 membrane-distal IgSF domains. Like other members of the sialoadhesin family, sialoadhesin is predicted to have an unusual disulfide bond between β strands B and E in domain 1 and a disulfide bond between domains 1 and 2 4. Full-length sialoadhesin has 17 IgSF domains in the extracellular region 3. Two splice variants have been identified which encode soluble proteins truncated after the 3rd and 16th IgSF domain, respectively. Domains 4-17 of sialoadhesin are made up of seven homologous tandem repeats, each consisting of a short and a long IgSF domain. The N-terminus of the mature polypeptide has been confirmed by peptide sequencing 3.

Ligands and associated molecules
Sialoadhesin binds to the sialoglycoconjugates NeuAcα2→3Galβ1→3(4)GlcNAc and NeuAcα2→3Galβ1→3GalNAc, which are present on glycoproteins and glycolipids. The sialic acid binding site has been localized to the GFCC'C'' β sheet of the membrane-distal IgSF domain and includes a conserved arginine (residue 97) 5.

Function
Cell-adhesion studies suggest that sialoadhesin can mediate preferential recognition by macrophages of cells of the granulocytic lineage 6. The biological role of such interactions is unknown, but trophic and/or phagocytic functions have been suggested 6. Sialoadhesin can also mediate adhesion to lymphocytes in spleen section overlay assays 7.

Database accession numbers
         PIR      SWISSPROT   EMBL/GENBANK   REFERENCE
Mouse    S50065               Z36293         3
Amino acid sequence of mouse sialoadhesin
MCVLFSLLLL ASVFSLGQT                                      -1
TWGVSSPKNV QGLSGSCLLI PCIFSYPADV PVSNGITAIW YYDYSGKRQV    50
VIHSGDPKLV DKRFRGRAEL MGNMDHKVCN LLLKDLKPED SGTYNFRFEI   100
SDSNRWLDVK GTTVTVTTDP SPPTITIPEE LREGMERNFN CSTPYLCLQE   150
KQVSLQWRGQ DPTHSVTSSF QSLEPTGSYH QTTLHMALSW QDHGRTLLCQ   200
FSLGAHSSRK EVYLQVPHAP KGVEILLSSS GRNILPGDPV TLTCRVNSSY   250
PAVSAVQWAR DGVNLGVTGH VLRLFSAAWN DSGAYTCQAT NDMGSLVSSP   300
LSLHVFMAEV KMNPAGPVLE NETVTLLCST PKEAPQELRY SWYKNHILLE   350
DAHASTLHLP AVTRADTGFY FCEVQNAQGS ERSSPLSVVV RYPPLTPDLT   400
TFLETQAGLV GILHCSVVSE PLATVVLSHG GLTLASNSGE NDFNPRFRIS   450
SAPNSLRLEI RDLQPADSGE YTCLAVNSLG NSTSSLDFYA NVARLLINPS   500
AEVVEGQAVT LSCRSGLSPA PDTRFSWYLN GALLLEGSSS SLLLPAASST   550
DAGSYYCRTQ AGPNTSGPSL PTVLTVFYPP RKPTFTARLD LDTSGVGDGR   600
RGILLCHVDS DPPAQLRLLH KGHVVATSLP SRCGSCSQRT KVSRTSNSLH   650
VEIQKPVLED EGVYLCEASN TLGNSSAAAS FNAKATVLVI TPSNTLREGT   700
EANLTCNGNQ EVAVSPANFS WFRNGVLWTQ GSLETVRLQL LARTDAAVYA   750
CRLLTEDGAQ LSAPVVLSVL YAPDPPKLSA LLDVGQGHMA VFICTVDSYP   800
LAHLSLFRGD HLLATNLEPQ RPSHGRIQAK ATANSLQLEV RELGLVDSGN   850
YHCEATNILG SANSSLFFQV RGAWVRFTIT ELREGQAVVL SCQVPTGVSE   900
GTSYSWYQDG RPLQESTSST LRIAAISLRQ AGAYHCQAQA PDTAIASLAA   950
PVSLHVSYTP RHVTLSALLS TDPERLGHLV CSVQSDPPAQ LQLFHRNRLV  1000
ASTLQGADEL AGSNPRLHVT VLPNELRLQI HFPELEDDGT YTCEASNTLG  1050
QASAAADFDA QAVRVTVWPN ATVQEGQQVN LTCLVWSTHQ DSLSYTWYKG  1100
GQQLLGARSI TLPSVKVLDA TSYRCGVGLP GHAPHLSRPV TLDVLHAPRN  1150
LRLTYLLETQ GRQLALVLCT VDSRPPAQLT LSHGDQLVAS STEASVPNTL  1200
RLELQDPRPS NEGLYSCSAH SPLGKANTSL ELLLEGVRVK MNPSGSVPEG  1250
EPVTVTCEDP AALSSALYAW FHNGHWLQEG PASSLQFLVT TRAHAGAYFC  1300
QVHDTQGTRS SRPASLQILY APRDAVLSSF RDSRTRLMVV IQCTVDSEPP  1350
AEMVLSHNGK VLAASHERHS SASGIGHIQV ARNALRLQVQ DVTLGDGNTY  1400
VCTAQHTLGS ISTTQRLLTE TDIRVTAEPG LDWPEGTALN LSCLLPGGSG  1450
PTGNSSFTWF WNRHRLHSAP VPTLSFTPVV RAQAGLYHCR ADLPTGATTS  1500
APVMLRVLYP PKTPTLIVFV EPQGGHQGIL DCRVDSEPLA ILTLHRGSQL  1550
VASNQLHDAP TKPHIRVTAP PNALRVDIEE LGPSNQGEYV CTASNTLGSA  1600
SASAYFGTRA LHQLQLFQRL LWVLGFLAGF LCLLLGLVAY HTWRKKSSTK  1650
LNEDENSAEM ATKKNTIQEE VVAAL                             1675

References
1 Mucklow, S. et al. (1995) Genomics 28, 344-346.
2 Crocker, P.R. et al. (1991) EMBO J. 10, 1661-1669.
3 Crocker, P.R. et al. (1994) EMBO J. 13, 4490-4503.
4 Crocker, P.R. et al. (1996) Biochem. Soc. Trans. 24, 150-156.
5 Vinson, M. et al. (1996) J. Biol. Chem. 271, 9267-9272.
6 Crocker, P.R. et al. (1995) J. Clin. Invest. 95, 635-643.
7 van den Berg, T.K. et al. (1992) J. Exp. Med. 176, 647-655.
Thrombopoietin receptor
Other names
Mpl (myeloproliferative leukemia)
Molecular weights
Polypeptide          68 555
SDS-PAGE reduced     82-84 kDa
Carbohydrate
N-linked             4
O-linked             unknown
Human gene location and size
1p34; 17 kb 1
Domains
[Figure: domain organization with exon boundaries]
Tissue distribution
Thrombopoietin receptor is restricted to the megakaryocyte lineage and is expressed on megakaryocytes, platelets and, weakly, on CD34+ primitive stem cells 2.
Structure
Thrombopoietin receptor was originally defined as the cellular counterpart of the myeloproliferative leukemia virus oncogene 3,4. Thrombopoietin receptor belongs to the cytokine receptor superfamily 5.
Ligands and associated molecules
Thrombopoietin was revealed as a ligand by virtue of its stimulatory activity on a cell line expressing thrombopoietin receptor 6.
Function
Thrombopoietin binding to its receptor stimulates growth and differentiation of megakaryocyte progenitors 6. Mice lacking thrombopoietin receptor are deficient in megakaryocytes and their precursors 6. Mutations of thrombopoietin receptor, such as the introduction of cysteines into the predicted dimer interface, result in an activated, ligand-independent phenotype with functional consequences similar to those observed in the myeloproliferative disease induced by MPLV in mice 7. Binding of thrombopoietin to thrombopoietin receptor activates the JAK-STAT signal transduction pathway 8.
Database accession numbers
        PIR      SWISSPROT   EMBL/GENBANK   REFERENCE
Human   A45266   P40238      M90102         3
Mouse   S35317   Q08351      Z22649         4
Amino acid sequence of human thrombopoietin receptor
MPSWALFMVT SCLLLAPQNL AQVSS                               -1
QDVSLLASDS EPLKCFSRTF EDLTCFWDEE EAAPSGTYQL LYAYPREKPR    50
ACPLSSQSMP HFGTRYVCQF PDQEEVRLFF PLHLWVKNVF LNQTRTQRVL   100
FVDSVGLPAP PSIIKAMGGS QPGELQISWE EPAPEISDFL RYELRYGPRD   150
PKNSTGPTVI QLIATETCCP ALQRPHSASA LDQSPCAQPT MPWQDGPKQT   200
SPSREASALT AEGGSCLISG LQPGNSYWLQ LRSEPDGISL GGSWGSWSLP   250
VTVDLPGDAV ALGLQCFTLD LKNVTCQWQQ QDHASSQGFF YHSRARCCPR   300
DRYPIWENCE EEEKTNPGLQ TPQFSRCHFK SRNDSIIHIL VEVTTAPGTV   350
HSYLGSPFWI HQAVRLPTPN LHWREISSGH LELEWQHPSS WAAQETCYQL   400
RYTGEGHQDW KVLEPPLGAR GGTLELRPRS RYRLQLRARL NGPTYQGPWS   450
SWSDPTRVET ATETAWISLV TALHLVLGLS AVLGLLLLRW QFPAHYRRLR   500
HALWPSLPDL HRVLGQYLRD TAALSPPKAT VSDTCEEVEP SLLEILPKSS   550
ERTPLPLCSS QAQMDYRRLQ PSCLGTMPLS VCPPMAESGS CCTTHIANHS   600
YLPLSYWQQP                                               610
References
1 Mignotte, V. et al. (1994) Genomics 20, 5-12.
2 Debili, N. et al. (1995) Blood 85, 391-401.
3 Vigon, I. et al. (1992) Proc. Natl Acad. Sci. 89, 5640-5644.
4 Skoda, R.C. et al. (1993) EMBO J. 12, 2645-2653.
5 Sprang, S.R. and Bazan, J.F. (1993) Curr. Opin. Struct. Biol. 3, 815-827.
6 Alexander, W.S. et al. (1996) Blood 87, 2162-2170.
7 Alexander, W.S. et al. (1995) EMBO J. 14, 5569-5578.
8 Pallard, C. et al. (1995) EMBO J. 14, 2847-2856.
WC1 (cattle) antigen
Molecular weights
Polypeptide           154 197
SDS-PAGE reduced      220 kDa
SDS-PAGE unreduced    220 kDa
Carbohydrate
N-linked sites        17
O-linked sites        unknown
Domains
[Figure: domain organization]
Tissue distribution
The WC1 antigens are expressed exclusively on CD4-/CD8- γδ T lymphocytes (>90% of circulating γδ T cells) in cattle 1, sheep (the T19 antigen) 2 and pig 3. WC1+ cells are found in the thymus, lymph nodes, spleen, the dermis of the skin and the lamina propria of the gastrointestinal tract 1. The T19 antigen in the sheep has a similar distribution 2. Multiple WC1 genes (estimated to be in the order of 10 genes within a region of ~1 Mb) are found in cattle 4, which encode heterogeneous WC1 antigens with overlapping epitopes. Using mAbs specific for each of three WC1 antigens (WC1.1, WC1.2 and WC1.3), it was found that the WC1.1 and WC1.2 antigens are expressed on nonoverlapping subsets of peripheral blood γδ T cells, whereas the WC1.3+ subset is contained within the WC1.1+ subset 4. The number of T19 genes in sheep is estimated to be between 50 and 100 5. WC1 genomic sequences have been identified in human and mouse but, in contrast to ruminants, there appear to be only one or two copies of the gene 4.
Structure
WC1.1 is a type I membrane protein 6. The extracellular region consists of 11 SRCR domains. Domains 2-6 and domains 7-11 appear to be coded for by a higher order repeating element with a gap between the second and third domains (i.e. between repeats 3 and 4, and between repeats 8 and 9 of WC1.1). A similar type of repeating element is found in CD163 7. The two other WC1 cDNA clones that have been isolated (WC1.2 and WC1.3) are highly homologous (~85% identity) to WC1.1 4.
Function
The function of WC1 is not known. It has been speculated that WC1 antigens may be the CD4/CD8 homologues of γδ T cells 2,4. Alternatively, they could be involved in the control of γδ T cell homing to various tissues 4.
Database accession numbers
                 PIR      SWISSPROT   EMBL/GENBANK   REFERENCE
Cattle (WC1.1)   A46496   P30205      X63723         6
Amino acid sequence of cattle WC1.1 antigen
MALGRHLSLR GLCVLLLGTM VGG                                 -1
QALELRLKDG VHRCEGRVEV KHQGEWGTVD GYRWTLKDAS VVCRQLGCGA    50
AIGFPGGAYF GPGLGPIWLL YTSCEGTEST VSDCEHSNIK DYRNDGYNHG   100
RDAGVVCSGF VRLAGGDGPC SGRVEVHSGE AWIPVSDGNF TLATAQIICA   150
ELGCGKAVSV LGHELFRESS AQVWAEEFRC EGEEPELWVC PRVPCPGGTC   200
HHSGSAQVVC SAYSEVRLMT NGSSQCEGQV EMNISGQWRA LCASHWSLAN   250
ANVICRQLGC GVAISTPGGP HLVEEGDQIL TARFHCSGAE SFLWSCPVTA   300
LGGPDCSHGN TASVICSGNQ IQVLPQCNDS VSQPTGSAAS EDSAPYCSDS   350
RQLRLVDGGG PCAGRVEILD QGSWGTICDD GWDLDDARVV CRQLGCGEAL   400
NATGSAHFGA GSGPIWLDNL NCTGKESHVW RCPSRGWGQH NCRHKQDAGV   450
ICSEFLALRM VSEDQQCAGW LEVFYNGTWG SVCRNPMEDI TVSTICRQLG   500
CGDSGTLNSS VALREGFRPQ WVDRIQCRKT DTSLWQCPSD PWNYNSCSPK   550
EEAYIWCADS RQIRLVDGGG RCSGRVEILD QGSWGTICDD RWDLDDARVV   600
CKQLGCGEAL DATVSSFFGT GSGPIWLDEV NCRGEESQVW RCPSWGWRQH   650
NCNHQEDAGV ICSGFVRLAG GDGPCSGRVE VHSGEAWTPV SDGNFTLPTA   700
QVICAELGCG KAVSVLGHMP FRESDGQVWA EEFRCDGGEP ELWSCPRVPC   750
PGGTCLHSGA AQVVCSVYTE VQLMKNGTSQ CEGQVEMKIS GRWRALCASH   800
WSLANANVVC RQLGCGVAIS TPRGPHLVEG GDQISTAQFH CSGAESFLWS   850
CPVTALGGPD CSHGNTASVI CSGNHTQVLP QCNDFLSQPA GSAASEESSP   900
YCSDSRQLRL VDGGGPCGGR VEILDQGSWG TICDDDWDLD DARVVCRQLG   950
CGEALNATGS AHFGAGSGPI WLDDLNCTGK ESHVWRCPSR GWGRHDCRHK  1000
EDAGVICSEF LALRMVSEDQ QCAGWLEVFY NGTWGSVCRS PMEDITVSVI  1050
CRQLGCGDSG SLNTSVGLRE GSRPRWVDLI QCRKMDTSLW QCPSGPWKYS  1100
SCSPKEEAYI SCEGRRPKSC PTAAACTDRE KLRLRGGDSE CSGRVEVWHN  1150
GSWGTVCDDS WSLAEAEVVC QQLGCGQALE AVRSAAFGPG NGSIWLDEVQ  1200
CGGRESSLWD CVAEPWGQSD CKHEEDAGVR CSGVRTTLPT TTAGTRTTSN  1250
SLPGIFSLPG VLCLILGSLL FLVLVILVTQ LLRWRAERRA LSSYEDALAE  1300
AVYEELDYLL TQKEGLGSPD QMTDVPDENY DDAEEVPVPG TPSPSQGNEE  1350
EVPPEKEDGV RSSQTGSFLN FSREAANPGE GEESFWLLQG KKGDAGYDDV  1400
ELSALGTSPV TFS                                          1413
References
1 Clevers, H.C. et al. (1990) Eur. J. Immunol. 20, 809-817.
2 Mackay, C.R. et al. (1989) Eur. J. Immunol. 19, 1477-1483.
3 Carr, M.M. et al. (1994) Immunology 81, 36-40.
4 Wijngaard, P.L.J. et al. (1994) J. Immunol. 152, 3476-3482.
5 Walker, I.D. et al. (1994) Immunology 83, 517-523.
6 Wijngaard, P.L.J. et al. (1992) J. Immunol. 149, 3273-3277.
7 Law, S.K.A. (1993) Eur. J. Immunol. 23, 2320-2325.
Index
A
A1 (Ly-49A), 545, 546, 547
114/A10, 431-2
  domain, 6, 46
5A11 (chicken) (CD147), 408-9
A38L (poxvirus) see CD47
ACT35 antigen see CD134
α-actinin, 299
Adenosine deaminase (ADA), 194
ADP-ribosyl cyclase see CD38
Affinity chromatography, 18, 19
Agglutination assay, 20
AIM see CD69
Albumin, 239
ALCAM see CD166
ALIGN program, 35, 40
Alloantigens, mouse, 19-20
Amino acid sequence (protein sequence), 15, 16
  databases, 12-16
  molecular analysis, 23
  superfamily concept, 25-6
  see also specific antigens
Aminopeptidase A [EC 3.4.11.7] (APA; Glutamyl aminopeptidase; BP-1/6C3 (mouse); gp160), 437
  symbol used, 10
Aminopeptidase N see CD13
AMPS Database, 40
β-amyloid fibrils, 586
Ankyrin, 241
Antibodies
  monoclonal, 2, 22-3
  xenogeneic, 20-1, 23
Antigenicity, carbohydrate structures, 107-11
Antithrombin III, 407
APA (Aminopeptidase), 437-8
apCAM (Mollusc) see CD56
Apo-1 see CD95
Architecture, leucocyte surface, 101-23
ART-1, 20
B
B cell antigen receptor (mIg) complex see CD79
B1 see CD20; CD80
B4 see CD19
2B4, 433-4
B7 see CD80
B7-1 see CD80
B7-2 see CD86
B29 see CD79/BCR
B70 see CD86
B220 see CD45
Basigin (mouse) (CD147), 408-9
4-1BB ILA (CDw137), 404-5
4-1BBL (CDw137L), 404, 435-6
B cell antigen receptor (BCR) see CD78
B cell intracellular signalling molecules, 333
BCM1 see CD48
BEN see CD166
Ber-H2 antigen see CD30
Ber-Mac3 (CD163), 427-8
B-G, 439-40
BGP (CD66a), 310, 311, 312
Biliary glycoprotein (CD66a), 310, 311, 312
BLA (CD77), 331
BLAST database, 13, 16, 34
Blast-1 see CD48
Blast-2 see CD23
BL-CAM see CD22
BLITZ programme, 13, 16
BLOSUM 62 matrix, 35, 36
εBP see Galectin 3
BP-1/6C3 (Aminopeptidase A), 437-8
Bp35 see CD20
Burkitt's lymphoma-associated antigen (BLA CD77), 331
C
C2, 41
C2a, 282
C3b, 158, 161, 184, 218, 249, 282
C3d, 184
C3d-receptor see CD21
C4, 42
C4b, 218, 249, 282
C5a, 349
C5a receptor see CD88
C6, 3, 60, 79
C8, 290
C9, 290
C33 see CD82
Cadherin, 32, 37, 115
  see also E-cadherin
Calcium, binding activity, EGF, 43
CALLA see CD10
CAMPATH-1 (CD52), 274-5
Carbohydrate, 3-8, 10
  antigenicity, 107-11
  lectins binding to, in Ca2+-dependent reaction, see Lectin C-type
  leucocyte membrane proteins, 114-5
  structures, 106-12
Carcinoembryonic antigen (CEA) family see CD66
CBP-30 see Galectin 3
CBP-35 see Galectin 3
CCP (Complement control protein), domains 3-8, 32, 34, 38, 39, 41-2, 66
  symbol used, 10
  see also Table 1, Chapter 1
CD antigens monoclonal antibodies in studies of, 22-3 numbering, 2 CD1 (T6), 132-3 monoclonal antibodies, 22 CDla, 132, 133 CDlb, 132, 133 CDlc, 132 CDld, 132 CD2 (T11; Leucocyte function antigen-2; LFA-2), 134-5, 245, 253, 254, 276, 288, 289, 290, 373, 374 amino acid sequence, 26 architecture of the cell surface, 119, 120, 121, 122 integration into cell membrane, 105 protein structure, 26 CD3/TCR (T cell receptor complex), 134, 137-9, 142, 373, 376 ? chain, 137, 138, 139 chain, 13 7, 138, 139 chain, 137, 138 chain, 137, 138, 139 chain, 137, 138, 139 associated enzymes, 114 carbohydrate structures, 106 integration into membrane, 103 monoclonal antibodies, 22, 23 CD4 (T4; L3T4 (mouse); W3/25(rat)), 141-2, 245, 337, 339, 371,376, 568
amino acid sequence, 26 architecture of the cell surface, 119, 120, 121, 122 associated enzymes, 114 carbohydrate structures, 113 integration into cell membrane, 102 monoclonal antibodies, 22, 23 protein structure, 26 CD5 (T1; Leu-1; Ly-1), 135, 143-4, 323 CD6 (T12; Tpl20), 145-6, 429, 430 CD6L s e e CD 166 CD7 (gp40; Tp41 ), 147-8 CD8 (T8; Lyt2/3 (mouse)), 149-50, 337, 339, 371,376, 565 a chain, 149-50 ]/chain, 149-50 architecture of the cell surface, 119, 120 monoclonal antibodies, 22 protein structure, 26 CD9 (MRP-1; DRAP27 (monkey)), 152-3, 304, 337 CD 10 (Common acute lymphoblastic leukaemia antigen (CALLA); Neutral endopeptidase [EC 3.4.24.11] (NEP); Neprilysin; Enkephalinase; gp 100), 154-5 associated enzymes, 113 integration into cell membrane, 102 CD 11/CD 18 complex, 177, 304 CD 1l a (LFA-1 a subunit; Integrin aL subunit), 156-7, 269, 278, 347, 377 architecture of the cell surface, 122 CD1 lb (Mac-1 (Mo-I,CR3)a subunit; Integrin aM subunit), 158-9, 169, 189, 190, 210, 270, 278-9, 347, 348, 3 77 CD 11 c (p 150,95 a subunit; Integrin aX subunit), 161-2, 189, 190, 210, 270, 279 CD 11 d (Integrin aD subunit), 163-4 CDw 12, 165 CD 13 (Aminopeptidase N (EC 3.4.11.2); gpl50; p161 (mouse)), 166-7 associated enzymes, 113 CD 14, 158, 169-70, 548 CD 15 (Lewis x (LeX);3-fucosyl-N-acetyllactosamine 3-FAL, 171,295, 299, 302, 473
CD15s (Sialyl Lewis x (sLeX)), 172 CD 16 (Fc?RIII), 158, 173-5 CDwl 7 (Lactosylceramide (LacCer)), 176 CD 18 (Integrin f12 subunit), 156, 158, 159 161, 163, 169, 177-8, 190, 210, 269, 278, 304, 347, 348, 377, 473 architecture of the cell surface, 122 CD 11/CD 18 complex, 60, 177, 304 CD19 (B4; Leu-12), 179-80, 184, 224, 276, 331,337 architecture of the cell surface, 122 CD20 (B1; Bp35; Ly-44 (mouse)), 181-2, 276, 337, 339 function, 112 integration into cell membrane, 104 CD21 (CR2; EBV-receptor; C3d-receptor; Complement receptor type 2 (CR2); Epstein-Barr virus (EBV) receptor), 179, 180, 183-4, 189, 224, 276, 337 architecture of the cell surface, 122 CD22 (BL-CAM; Leu-14; Lyb8), 186-8, 245, 329 form, 186, 187, 188 fl form, 186, 187 architecture of the cell surface, 122 carbohydrate structures, 111 CD23 (Fc~RII; BLAST-2), 158, 184, 189-91,373 CD24 (Heat stable antigen (HSA); M1/69-JIId (mouse)), 192-3 CD25 a chain (Tac antigen; p55), 486, 487, 488 CD26 (Dipeptidyl peptidase IV (EC 3.4.14.5); Tpl03; Thymocyteactivating molecule (THAM) (mouse)), 194-6, 245 associated enzymes, 113 CD27, 197-8, 318 CD27L s e e CD70 CD28 (Tp44), 199-200, 335, 345 architecture of the cell surface, 120, 122 CD29 (Integrin fll subunit), 152, 201-2, 255, 257-8, 260-1,262-3, 265, 267, 276, 304, 320, 337, 339, 387, 529 CD30 (Ki-1; Ber-H2 antigen), 204-5, 417 CD30L s e e CD 153
CD31 (Platelet endothelial cell adhesion molecule 1 (PECAM-1)), 206-7, 271,272 CD32 (FcTRII; Fc receptor for aggregated IgG), 158, 209-12 architecture of the cell surface, 122 associated enzymes, 114 CD33, 213-4 carbohydrate structures, 111 CD34 (Sgp90), 215-6, 299 CD35 (Complement receptor type 1; CR1), 184, 217-20 architecture of the cell surface, 121 CD36 (Platelet glycoprotein IV; FAT (rat), 221-2 integration into cell membrane, 101 CD37, 224-5, 276, 337, 339 integration into cell membrane, 102, 104 CD38 (T10; ADP-ribosyl cyclase; Cyclic ADP-ribose hydrolase), 226-7 associated enzymes, 113 CD39, 228-9 associated enzymes, 113 CD40, 230-1, 3 73, 419 architecture of the cell surface, 120, 121,122 CD40L s e e CD 154 CD41 (GPIIb of the GPIIb/IIIa complex; Integrin aIIb subunit), 232-3, 251, 293, 294 CD42, domains, 66 CD42a (GPIX), 235-6 CD42b (GP1B), 235-7 chain, 235, 236, 237 fl chain, 235, 236, 237 CD42c, domains, 67 CD43 (Leukosialin; Sialophorin; Ly-48(mouse); W3/13(rat)), 238-9 architecture of cell surface, 119, 120, 121, 122 carbohydrate structures, 106, 107, 111 monoclonal antibodies, 22 CD44 (Phagocytic glycoprotein 1; Pgp- 1; Lymphocyte homing receptor; Hermes antigen; In(Lu)-related p80; Extracellular matrix receptor type III (ECMRIII); Hutch-1; Ly-24 (mouse); p85; HCAM), 9, 240-2 architecture of the cell surface, 119
CD45 (Leucocyte common antigen; L-CA; B220; T200; Ly-5; EC 3.1.3.48), 135, 194, 195, 244-6, 536 architecture of the cell surface, 119, 121, 122 associated enzymes, 113 carbohydrate structures, 106, 107, 111 serology, 20, 21 CD45-AP (CD45-associated protein) (mouse) (LPAP), 536-7 CD46 (Complement membrane cofactor protein (MCP), 248-9 CD47 (Integrin-associated protein (IAP); OV-3; A38L (poxvirus)), 251-2, 271 integration into cell membrane, 104 CD48 (Blast-l; HuLy-m3; BCM1; MEM-102; OX-45), 134-5, 253-4 architecture of the cell surface, 120 CD49, domains, 60 CD49a (al integrin subunit; VLA-I subunit), 201,255-6 CD49b (Integrin a2 subunit; VLA-2 subunit; Ia subunit of platelet GP Ia-IIa), 201,257-8, 529 CD49c (Integrin a3 subunit; VLA-3 subunit), 152, 201,260-1,304, 320, 33 7 CD49d (Integrin a4 subunit; VLA-4 subunit), 152, 201,262-3, 276, 304, 337, 339, 387 CD49e (Integrin a5 subunit; VLA-5 (fibronectin receptor) a subunit; Ic subunit of GPIc-IIa), 152, 201, 265-6 CD49f (Integrin a6 subunit; VLA-6 subunit; Ic subunit of GPIc-IIa), 152, 163, 201,267-8, 304, 337, 382 CD50 (ICAM-3; ICAM-R), 156, 161,163, 269-70, 377 CD51 (~ subunit of vitronectin receptor; integrin aV subunit), 201, 251, 271-2, 293, 294, 529 CD52 (CAMPATH-1), 274-5 CD53 (OX-44 (rat)), 135, 181,224, 276-7, 33 7, 339 integration into cell membrane, 104 CD54 (ICAM-1), 156, 158, 161, 163, 239, 278-9, 377, 520 architecture of the cell surface, 121, 122
CD55 (Complement decay acclerating factor (DAF), 281-2, 367 CD56 (Neural cell adhesion molecule (NCAM) isoform; NKH-1 antigen; Leu-19 antigen; Fasciclin II (Drosophila); apCAM; (Mollusc), 284-6, 529 protein structure, 26 CD57 (HNK-1; Leu-7 antigen), 287 CD58 (LFA-3), 134, 288-9 architecture of the cell surface, 120, 122 CD59 (Complement protectin; MIRL; H19; MACIF; HRF20; P-18), 135, 290-1 protein structure, 26 CD60 (UM4D4), 292 CD61 (Integrin//3 subunit), 206, 232-3, 251,252, 271-2, 293-4, 529 CD62 protein structure, 26 CD62E (ELAM-1), 172, 215, 216, 295-6, 298, 299, 301,390, 425, 452, 453 carbohydrate structures, 111 CD62L, (LECAM-1; LAM-1; Lymph node homing receptor; MEL-14 antigen; Leu-8; TQ1), 172, 215, 216, 295, 298-9, 301-2, 425, 473, 553 architecture of the cell surface, 123 associated enzymes, 113 carbohydrate structures, 111 CD62P (P-selectin; PADGEM; GMP- 140), 172, 192, 287, 295, 298-9, 301-3, 425 architecture of the cell surface, 120, 12l carbohydrate structures, 111 CD63 (ME491; MLA1; PTLGP40; Granulophysin), 152, 304, 337 CD64 (FcTRI; High-affinity Fc7 receptor), 306-8 CD65 (Ceramide dodecasaccharide 4c), 309 CD66 (Carcinoembryonic antigen (CEA) family), 310-3 CD66a (BGP; Biliary glycoprotein; NCA-160), 310, 311,312 CD66b (CGM6; W272; NCA-95), 310, 311,312,313 CD66c (NCA; NCA-90), 310, 311, 312, 313
CD66d (CGM 1), 310, 311, 312, 313 CD66e (CEA), 310, 311,312, 313 CD67 (CD66b), 310, 311, 312, 313 CD68 (Macrosialin (mouse)), 314-5 CD69 (AIM; EA 1; MLR-3; Leu-23), 316-7 CD70 (CD27L), 197, 318 CD71 (Transferrin receptor; T9), 320-1 CD72 (Lyb-2; Ly32.2; Ly-19.2 (mouse)), 323-4 CD73 (Ecto-5'-nucleotidase L-VAP-2 (lymphocyte vascular adhesion protein 2)), 325-6 CD74 (MHC Class II-associated invariant chain (Ii or Iv)), 327-8 CDw75, 329 CDw76, 330 CD77 (Globotriaocylceramide (Gb3); Ceramide trihexoside; pK blood group antigen; Burkitt's lymphoma-associated antigen (BLA)), 331 CDw78 s e e MHC Class II CD79/BCR (B cell antigen receptor (mIg) complex; CD79a; CD79b; B29; mb-1), 187, 332-4 architecture of the cell surface, 122 CD80 (B7; B7-1; B1), 199, 335-6, 415 architecture of the cell surface, 120, 122 CD81 (TAPA-1), 152, 179, 181, 184, 224, 276, 304, 337-8, 339 architecture of the cell surface, 122 carbohydrate structures 106 CD82 (R2; C33; IA4; 4F9; KAI1), 181,224, 276, 337, 339 CD83 (HB15), 341 CDw84, 343 CD85, 344 CD86 (B7-2; B70), 199, 335-6, 345-6, 415 CD87 (Urokinase plasminogen activator receptor (uPAR); Mo3), 158, 347-8 CD88 (C5a receptor), 349-50 integration into cell membrane, 103 CD89 (Fca receptor; IgA receptor), 351-2 CD90 (Thy-1; Theta), 245, 353-4 architecture of the cell surface, 119 carbohydrate structures, 106, 107, 110, 111 integration into cell membrane, 102, 104-5
CD91 (a2-Macroglobulin receptor; LDL receptor-related protein), 355-8 CDw92, 359 CD93, 360 CD94 (Kp43), 361-2, 571 CD95 (Fas; Apo-1), 363-4, 456 CD96 (Tactile), 365-6 CD97, 281,367-8 integration into cell membrane, 104 CD98 (4F2; FRP-1; RL-388(mouse)), 369-70 CD99 (MIC2; E2; 12E7; HuLy-m6; FMC29), 371-2 CD 100, 3 73-4 CD101 (V7; p126), 375-6 CD102 (ICAM-2), 156, 377-8 integration into cell membrane, 102 CD 103 (Integrin aE subunit; HML-1 antigen), 379-80 CD 104 (Integrin//4 subunit), 381-3 CD 105 (Endoglin), 384-5 CD106 (VCAM-1; INCAM-110), 262, 386-8, 522 protein structure, 27 CD 107a (Lysosome-associated membrane protein- 1 (lamp- 1); lgp-120 (rat)), 389-90 CD107b (lamp-2; lgp-110 (rat)), 389-90 CDwl08, 392 CD 109 (Gova/b alloantigen), 393 CD110, 9 CDlll, 9 CD112, 9 CD113, 9 CD 114 (G-CSFR), 467-8 CD 115 s e e M-CSFR CD116, 471 CD117 (c-kit; mast/stem cell growth factor receptor; Steel factor receptor), 394-6, 443 CD118, 9 CD119 ~ chain, 479, 480 CD120, 89, 114, 118 CD120a (TNFRI; TNF-R55), 397-8 CD120b (TNFRII; TNF-R75), 397-9 CD121a, 482, 483, 484 CDwl21b, 482, 483, 484 CD122 (IL-2R fl chain), 517, 518 CD122 fl chain (p75), 486, 487, 488
CDw123, 45 chain, 490, 491,496, 497 CD124 (IL-4R a chain), 494, 514, 515 CD124 ~ chain, 493, 494, 495 CDw125, 496 CD126 ~ chain, 498, 499 CD 127 ~ chain, 501,502 CDw128 s e e IL-8R CD128A, domain, 50, 53 CD128B, domain, 50, 53 CD129, 9 CD129 ~ chain, 506, 507 CD130 (gpl30), 510, 511 /~ chain (gpl30), 498, 499, 500 CDwl31, 470, 471,496 tic chain, 490, 491,492, 496, 497 CD 132 7c chain (cytokine receptor common 7 chain, p64) IL-2R, 486, 487, 488 IL-4R, 493, 494, 495 IL-7R, 501,502 IL-9R, 506, 507 IL-15R, 517, 518, 486, 487, 488 CD133, 9 CD134 (OX40; MRC OX40; ACT35 antigen), 400-1, 575 CD134L s e e OX40L CD135 (STK-1; FLT3; ilk-2), 402-3 CD136, 9 CDw137 (4-1BB ILA), 404-5, 435 CDw137L s e e 4-1BBL CD138 (syndecan-1), 9, 406-7 CD139, 9 CD140, 9, 91, 117 CD141, 9 CD142, 9 CD143, 9 CD144, 9 CDw145, 9 CD146, 9 CD147 (EMMPRIN (human); M6 (human); OX-47 (rat); CE9 (rat); Basigin (mouse); gp42 (mouse); Neurothelin (chicken); HT7 (chicken); 5A11 (chicken), 408-9 CD148 (HTPT~ DEP-1), 410-1 associated enzymes, 113 CDw149, 9 CDw 150 (SLAM), 412-3
CD151 (PETA-3), 414 CD 152 (CTLA-4), 335, 345, 415-6 CD 153 (CD30L), 205, 417-18 CD154 (CD40L), 419-20 architecture of the cell surface, 120, 121, 122 CD155, 9 CD156, 9 CD157, 9 CD 158 s e e K I R CD 158a s e e K I R CD 158b s e e K I R CD159, 9 CD160, 9 CD 161 (NKR-P1 family), 421-2 CD 162 (P-selectin glycoprotein ligand 1; PSGL-1), 296, 299, 302, 424-5 architecture of the cell surface, 121, 123 CD163 (M130 antigen; GHI/61; Ber-Mac3; Ki-M8; SM4), 427-8 CD 164, 9, 230-1 CD165, 9 CD 166 (ALCAM; CD6L; BEN; SC- 1; DM-GRASP; Neurolin; KG-CAM), 145, 429-30 CE9 (rat)(CD147), 408-9 CEA s e e CD66e Ceramide dodecasaccharide 4c (CD65), 309 Ceramide trihexoside (CD77), 331 CGM1 (CD66d), 310, 311, 312, 313 CGM6 (CD66b), 310, 311, 312, 313 Chemokine receptors, 441-2 Chondroitin sulfate, symbol used, 10 Chromatography, affinity, 18, 19 Chromosomal organization of genes, 11 Cloning, 23-5 Clusters of differentiation s e e CD CMRF35 antigen, 445 Collagen, 221,241,255, 257, 260, 407 Colony-stimulating factor receptor (CSF)-I s e e M-CSFR Common acute lymphoblastic leukaemia antigen (CALLA)see CD10 Complement control protein s e e CCP Complement decay acclerating factor s e e CD55
Complement membrane cofactor protein (CD46), 248-9
Complement protectin see CD59
Complement receptor type 1 see CD35
Complement receptor type 2 see CD21
Convergent evolution, 35, 37-8
Coronavirus, 166
CR1 see CD35
CR2 see CD21
CR3 see CD11b
CRAF-1, 230
Cross-hybridization probing for related proteins, 24
CSFR
  granulocyte (G-CSFR), 467-8
  granulocyte-macrophage see GM-CSF
  macrophage see M-CSFR
  multi-, see IL-3R
CTLA-4 see CD152
CXCR1 see IL-8R
Cyanogen bromide coupling method, 19
CyAP (Cyclophilin C-associated protein) see Mac-2-BP
Cyclic ADP-ribose hydrolase see CD38
Cyclophilin C, 548
Cyclophilin C-associated protein see Mac-2-BP
Cysteine-rich FGF receptor (ESL-1), 452-3
Cytokine receptor common γ chain see CD132
Cytokine receptor superfamily, 3-8, 10
Cytoplasm, domains, distribution, 3-8
D
DAF see CD55
Database entries, 12-16
  identifying domains and repeats, 34-5, 36
Death domain, 39, 79, 80
DEC-205, 447-9
Decay accelerating factor see CD55
Deoxycholate, 18-19
DEP-1 see CD148
Detergents, solubilization of antigens with, 18-19
Dipeptidyl peptidase IV (EC 3.4.14.5) see CD26
Divergent evolution, 35, 37-8, 43
DM-GRASP see CD166
DNA
  nucleic acid sequencing, 23-5
  sequence databases, 12-13
DNAM-1 (DNAX accessory molecule 1), 450-1
DNAX accessory molecule 1 (DNAM-1), 450-1
Domains, 3-8, 32
  distribution among leucocyte antigens, 3-8
  divergent and convergent evolution, 35-8
  identifying, 34-5
  nomenclature, 32-3
  organization/position, 11-12
  see also specific domains in Chapter 3
DRAP27 (monkey) see CD9
E
E2 (CD99), 371-2
5E6 (Ly-49C), 545, 546, 547
12E7 (CD99), 371-2
EA1 see CD69
EBV-receptor see CD21
EC 3.1.3.48, 113
  see also CD45
EC 3.1.4.1, 113
  see also PC-1
EC 3.4.11.2, 113
  see also CD13
EC 3.4.11.7, 113
  see also Aminopeptidase A
EC 3.4.14.5, 113
  see also CD26
EC 3.4.24.11, 113
  see also CD10
EC 3.6.1.9, 113
  see also PC-1
E-cadherin, 379
  integrin αE subunit and, interaction, 379
  integrin β7 subunit and, interaction, 522
Ecto-5'-nucleotidase L-VAP-2 (CD73), 325-6
EGF (epidermal growth factor), 3-8, 10
EGF-like growth factor, 152
EGF module-containing mucin-like hormone receptor 1 (EMR1) (human) see F4/80
ELAM-1 see CD62E
Electron microscopy, 27
Electrophoresis, gel, SDS polyacrylamide, antigens resolved on, 11, 13, 18, 21
EMBL database, 12, 13, 14
EMMPRIN (human) (CD147), 408-9
EMR1 see F4/80
ENA-78, 503
Endoglin (CD105), 384-5
Endothelial antigens, 2
Enkephalinase see CD10
ENTREZ database, 13
Enzymatic activity, leucocyte membrane protein, 112, 113-4
Epidermal growth factor see EGF
Epiligrin, 260
Epstein-Barr virus (EBV) see CD21
Escherichia coli, 557
E-selectin, carbohydrate structures, 111
E-selectin ligand 1 (ESL-1), 452-3
ESL-1 (E-selectin ligand 1; MG-160; Cysteine-rich FGF receptor), 296, 452-3
Evolutionary relationships between domains, 35, 37-40
Exon(s)
  of domains, 33, 38-9
  organization/position/boundaries, 11-12
Extracellular domains, distribution, 3-8
Extracellular matrix receptor type III (ECMRIII) see CD44
Ezrin, 241
F
4F2 (CD98), 369-70
F4/80 (EGF module-containing mucin-like hormone receptor 1 (EMR1) (human)), 50, 454-5
  glycosaminoglycans, 9
  integration into cell membrane, 104
4F9 see CD82
F(ab')2 antibodies, studies employing, 20-1
Factor B, domains, 41, 42, 516
Factor H, domains, 41, 42
Factor I, domains, 69, 70, 79
Factor IX, domains, 43, 46
Factor X, 158
Factor XII, domains, 47
3-FAL see CD15
Fas see CD95
Fasciclin II (Drosophila) see CD56
FasL, 363, 456-7
  domain, 85, 89
FASTA search, 16, 34
FAT (rat) see CD36
Fatty acids, 221
Fatty acyl anchors, 106
Fc receptor for aggregated IgG see CD32
Fcα receptor (CD89), 351-2
FcγRI (CD64), 306-8
FcγRII see CD32
FcγRIII see CD16
FcεR, 104
FcεRI (High-affinity receptor for IgE), 458-60
  α chain, 458, 459, 460
  β chain, 458, 459, 460
  γ chain, 174, 458, 459, 460
  ligands and associated molecules, 465, 476
FcεRII see CD23
Fetal liver kinase 2 see FLT3 ligand
Fibrinogen, 158, 232, 271
Fibroblast growth factor, 407, 452
Fibroblast growth factor receptor, 285
Fibronectin, 232, 241, 260, 262, 265, 271, 407, 522
Fibronectin type I, 32
Fibronectin type II (Fn2) domain, 3-8, 10, 39, 41, 47-8
Fibronectin type III (Fn3), 3-8, 10, 32, 37, 38, 39, 40, 41, 43, 45, 48, 49, 54
flk-2 (CD135), 402-3
flk-2 (fetal liver kinase 2) ligand see FLT3 ligand
Flow cytometry, 21, 22
FLT3 (CD135), 402-3
FLT3 ligand (flk-2 (fetal liver kinase 2) ligand), 461-2
  CD135 and interaction, 402
FMC29 (CD99), 371-2
fMLP receptor (fMLPR) see FPR
c-fms proto-oncogene see M-CSFR
Fn2 see Fibronectin type II
Fn3 see Fibronectin type III
N-Formyl peptide receptor see FPR
FPR (N-Formyl peptide receptor; fMLP receptor (fMLPR)), 463-4
  integration into cell membrane, 103
FRP-1 (CD98), 369-70
3-fucosyl-N-acetyl-lactosamine (3-FAL) see CD15
Fyn, 135, 138, 179, 181, 199, 221, 253, 270, 353, 581
G
GAG see Glycosaminoglycans
Galaptin, 390
Galectin, 3-8, 10, 390
Galectin 2, 39, 51
Galectin 3 (Mac-2; εBP; IgEBP; CBP-35; CBP-30; RL29; L29; L31; L34; LBL), 465-6, 548
GAP, 538
Gb3 (CD77), 331
G-CSF, 468
G-CSFR (Granulocyte colony-stimulating factor receptor; CD114), 467-8
Gel electrophoresis, 18
GENBANK database, 12, 13, 14
Genes
  chromosomal organization of, 9
  exon organization, 11-12
  location and size, 11
GENOME database, 14
GHI/61 (CD163), 427-8
Globotriaosylceramide (Gb3) (CD77), 331
β-glucan, 158
Glutamyl aminopeptidase (Aminopeptidase A), 437-8
GlyCAM-1 (Glycosylation-dependent cell adhesion molecule 1; Sgp50), 299, 473-4
  carbohydrate structures, 111
Glycophorin, 18
Glycoprotein(s), 106-7, 111-2
  see also specific glycoproteins and entries under gp; GP
  dimensions, 106
  leucine-rich see Leucine-rich repeats
Glycosaminoglycans (GAG), 9, 10, 206
Glycosylation, 106-7, 120
  cell type specificity, 111-2
  N-linked see N-linked glycosylation
  O-linked see O-linked glycosylation
  sites, 8-9
  degree of, 11
  symbols used, 10
Glycosylation-dependent cell adhesion molecule 1 see GlyCAM-1
Glycosyl-phosphatidylinositol (GPI) anchors, 3-8, 10, 16, 18, 19
  integration into membrane, 101, 102, 104-6
  motif, 33
  protein structure, 26
GM3 ganglioside, 176
GM-CSFR (Granulocyte-macrophage colony-stimulating factor receptor), 470-1
GMP-140 see CD62P
Gov a/b alloantigen (CD109), 393
GPI see Glycosyl-phosphatidylinositol
GPIB see CD42b
GPIIb/IIIa complex see CD41
GPIX see CD42a
gp40 (CD7), 147-8
gp42 (mouse) (CD147), 408-9
gp42 (rat), 475
gp49, 476-7
gp100 see CD10
gp120, 142
gp130 see CD130
gp150 see CD13
gp160 (Aminopeptidase A), 437-8
G-protein, 103, 349, 442, 560
G-protein coupled R, 50, 53, 114
Gram negative bacteria, 586
Gram positive bacteria, 586
Granulocyte(s), neutrophil glycosylation, 111-2
Granulocyte colony-stimulating factor receptor (G-CSFR), 467-8
Granulocyte-macrophage colony-stimulating factor receptor see GM-CSFR
Granulophysin see CD63
GRB-2/SOS, 199
GTP-binding proteins, 101
H
H2, 20
H-2K, -D, -L (mouse) see MHC Class I
H19 see CD59
HB15 (CD83), 341
HCAM see CD44
Hck, 304, 307, 312
Heat stable antigen (CD24), 192-3
Heparin, 158
Heparin sulfate, 302
Hermes antigen see CD44
High-affinity Fcγ receptor (CD64), 306-8
High-affinity receptor for IgE see FcεRI
High molecular weight B cell growth factor receptor (IL-14R), 516
Histocompatibility complex, major see MHC
Historical background to leucocyte antigens, 18-23
HIV-1, 142
HLA, 497
HLA-A, -B, -C (human) see MHC Class I
HLA-DP, -DQ, -DR (human) see MHC Class II
HML-1 antigen (CD103), 379-80
HNK-1 (CD57), 287
Homologue, 2
HRF20 see CD59
HSA (CD24), 192-3
HT7 (chicken) (CD147), 408-9
HTm4, 478
HTPTη see CD148
HuLy-m3 see CD48
HuLy-m6 (CD99), 371-2
Hutch-1 see CD44
Human macrophage lectin (HML), 550
HVS 13, 520
Hyaluronan (HA), 241
Hybrid arrest translation, 24
I
I-A (mouse) see MHC Class II
Ia subunit of platelet GP Ia-IIa see CD49b
IA4 see CD82
IAP (integrin-associated protein) see CD47
Ic subunit of GPIc-IIa see CD49e, CD49f
ICAM-1 see CD54
ICAM-2 see CD102
ICAM-3 see CD50
ICAM-R see CD50
I-E (mouse) see MHC Class II
IFNγ accessory factor 1 (IFNγ AF-1), 479, 480
IFNγR (interferon γ receptor), 479-80
Ig see Immunoglobulin
Iγ (CD74), 327-8
IgSF see Immunoglobulin superfamily
Ii (CD74), 327-8
IL-1R (interleukin 1 receptor), 482-5
  accessory protein (IL-1R AcP), 482, 483, 484
  type I, 482, 483, 484
  type II, 482, 483, 484
IL-2R (interleukin 2 receptor), 486-8
  α chain (CD25), 486, 487, 488
  β chain (CD122), 486, 487, 488
  γc chain (CD132), 486, 487, 488
IL-3R (interleukin 3 receptor; multi-colony-stimulating factor receptor), 490-2
  α chain (CDw123), 490, 491
  βc chain (CDw131), 490, 491, 492
IL-4R (interleukin 4 receptor), 493-5
  α chain (CD124), 493, 494, 495
  γc chain (CD132), 493, 494, 495
  α chain (IL-13R), 493, 494, 495
IL-5R (interleukin 5 receptor), 496-7
  α chain (CDw125), 496, 497
  βc chain (CDw131), 496, 497
IL-6R (interleukin 6 receptor), 498-500, 520
  α chain (CD126), 498, 499
  β chain; gp130 (CD130), 498, 499, 500
IL-7R (interleukin 7 receptor), 501-2
  α chain (CD127), 501, 502
  γc chain (CD132), 501, 502
IL-8R (interleukin 8 receptor (IL-8RA/CDw128); IL-8RB), 503-4, 520
  integration into membrane, 103
IL-9R (interleukin 9 receptor), 506-7
  α chain (CD129), 506, 507
  γc chain (CD132), 506, 507
IL-10R (interleukin 10 receptor), 508-9
IL-11R (interleukin 11 receptor), 510-11
  α chain (IL-11R), 510, 511
  gp130 (CD130), 510, 511
IL-12R (interleukin 12 receptor β chain), 512-3
IL-13R (interleukin 13 receptor), 514-5
    α chain (IL-13R), 494, 514, 515
    IL-4R α chain (CD124), 514, 515
IL-14R (interleukin 14 receptor; High molecular weight B cell growth factor receptor), 516
IL-15R (interleukin 15 receptor), 517-8
    α chain (IL-15R), 517, 518
    β chain (IL-2R; CD132), 517, 518
    γc chain (CD132), 517, 518
IL-17R (interleukin 17 receptor), 520-1
Immunoglobulin
    C1 set domains, symbol used, 10
    C2 set domains, symbol used, 10
    V set domains, symbol used, 10
Immunoglobulin A, 351
Immunoglobulin A receptor (CD89), 351-2
Immunoglobulin D, 333
Immunoglobulin E, 189, 459, 465
    Fc receptor for see FcεRI, FcεRII
Immunoglobulin EBP see Galectin 3
Immunoglobulin FcRII, 57
Immunoglobulin G, 210, 307, 555
Immunoglobulin M, 121, 122, 333
Immunoglobulin (Ig) superfamily (IgSF), 3-8
    amino acid sequence, 26
    integration into cell membrane, 102
In(Lu)-related p80 see CD44
INCAM-110 see CD106
Integrin α1-6 subunit see CD49a-f
Integrin αIIb subunit see CD41
Integrin αD subunit (CD11d), 163-4
Integrin αE subunit (CD103), 379-80
Integrin αL subunit see CD11a
Integrin αM subunit see CD11b
Integrin αV subunit see CD51
Integrin αX subunit see CD11c
Integrin β1 subunit see CD29
Integrin β2 subunit see CD18
Integrin β3 subunit see CD61
Integrin β4 subunit (CD104), 381-3
Integrin β7 subunit, 123, 522-3, 553
Integrin-associated protein (IAP) see CD47
Integrins, 3-8, 40, 41, 60, 61, 62, 63, 65
    nomenclature, 32, 34
    protein structure, 27
Interferon α, 184
Interferon γ receptor (IFNγR), 479-80
Interleukin receptors see IL entries
Intron-exon boundaries of domains, 38-9
Invariant chain (CD74), 327-8
ITAM, 33, 39, 79, 81
    associated enzymes, 114
ITIM, domain, 79, 81
ITK, 199
J
J11d (CD24), 192-3
Jak 1, 468, 494
Jak 2, 468, 471
Jak 3, 494
K
KAI1 see CD82
KG-CAM see CD166
Ki-1 see CD30
Ki-M8 (CD163), 427-8
Killer cell inhibitory receptor family see KIR
KIR (Killer cell inhibitory receptor family; CD158), 524-7, 565
    associated enzymes, 114
KIR-cl.11 (NKAT3), 524, 525, 526
KIR-cl.42 (NKAT1), 524, 525, 526
c-kit see CD117
c-kitL (c-kit ligand; mast/stem cell growth factor; steel factor), 395, 443-4
Kp43 see CD94
Kringle, 32
L
L1 (NCAM L1), 265, 271, 285, 528-9
L3 antigen see Mac-2-BP
L3T4 (mouse) see CD4
L29 see Galectin 3
L31 see Galectin 3
L34 see Galectin 3
Lactoferrin, 356
Lactosylceramide (LacCer) (CDw17), 176
LAG-3 (Lymphocyte activation gene 3), 532-3
LAM-1 see CD62L
Laminin, 241, 257, 260, 267, 271, 287
Lamp-1 (CD107a), 389-90
Lamp-2 (CD107b), 389-90
LBL see Galectin 3
L-CA see CD45
Lck, 135, 142, 143, 150, 181, 187, 199, 241, 253, 270, 312, 404, 581
    integration into cell membrane, 102, 106
LDAD database, 12
LDL receptor-related protein (CD91), 355-8
LDLR (low-density lipoprotein receptor), 3-8, 10, 221, 315, 533-5, 557, 586
Lex see CD15
LECAM-1 see CD62L
Lectin C-type, 3-8, 10
Lectin S-type domain see Galectin
Leu-1 see CD5
Leu-7 antigen (CD57), 287
Leu-8 see CD62L
Leu-12 see CD19
Leu-13, 179, 180, 337
Leu-14 see CD22
Leu-19 antigen see CD56
Leu-23 see CD69
Leucine-rich (glycoprotein) repeats (LGR), 3-8, 10, 33
Leucocyte common antigen see CD45
Leucocyte function antigen-1 see LFA-1
Leucocyte function antigen-2 see CD2
Leucocyte function antigen-3 see CD58
Leucocyte tyrosine kinase see ltk
Leukosialin see CD43
Lewis x (Lex) see CD15
LFA-1, 269-70
    architecture of cell surface, 120, 121
LFA-1 α subunit see CD11a
LFA-2 see CD2
LFA-3 see CD58
LGL-1 (Ly-49G), 545, 546, 547
Lgp-110 (rat) (CD107b), 389-90
Lgp-120 (rat) (CD107a), 389-90
Ligand, 12
Ligands and associated molecules see specific antigens and Chapter 4
Link superfamily, 3-8, 10, 32, 39, 41
Lipopolysaccharide (LPS), 158, 169
Lipopolysaccharide binding protein (LBP), 169
Low-density lipoprotein receptor see LDLR
LPAP (lymphocyte phosphatase-associated phosphoprotein; CD45-AP (CD45-associated protein) (mouse); LSM-1 (mouse)), 536-7
LRG see Leucine-rich (glycoprotein) repeats
L-selectin, 287
LSM-1 (mouse) (LPAP), 536-7
ltk (leucocyte tyrosine kinase), 538-9
    associated enzymes, 113
Ly-1 see CD5
Ly-2, 20
Ly-3, 20
Ly-5 see CD45
Ly-6 (Sca-1; TAP; MALA-1), 3-8, 10, 540-2
Ly-9, 543-4
Ly-19.2 (mouse) see CD72
Ly-24 (mouse) see CD44
Ly-32.2 (CD72), 323-4
Ly-44 (mouse) see CD20
Ly-48 (mouse) see CD43
Ly-49, 545-7, 565
Ly-49A (A1; YE1/48), 545, 546, 547
Ly-49B, 545, 547
Ly-49C (5E6), 545, 546, 547
Ly-49D, 545, 547
Ly-49E, 545, 547
Ly-49F, 545, 546, 547
Ly-49G (LGL-1), 545, 546, 547
Ly-49H, 545, 547
Lyb-2 see CD72
Lyb-8 see CD22
Lymph node homing receptor see CD62L
Lymphocyte activation gene 3 see LAG-3
Lymphocyte homing receptor see CD44
Lymphocyte phosphatase-associated phosphoprotein (LPAP), 245, 536-7
Lymphocyte vascular adhesion protein 2 (CD73), 325-6
Lymphotoxin, 89
Lymphotoxin α (LTα), 398
Lymphotoxin β (LTβ), 398
Lyn, 179, 181, 221, 304, 307, 459
Lysosome-associated membrane protein-1 (CD107a), 389-90
Lyt2/3 (mouse) see CD8
M
M1/69-J11d (mouse) (CD24), 192-3
M6 (human) (CD147), 408-9
M130 antigen (CD163), 427-8
Mac-1 (Mo-1, CR3) α subunit see CD11b
Mac-2 see Galectin 3
Mac-2 binding protein see Mac-2-BP
Mac-2-BP (Mac-2 binding protein; CyAP (Cyclophilin C-associated protein); L3 antigen; MAMA), 465, 548-9
MACIF see CD59
α2-Macroglobulin receptor (CD91), 355-8
Macrophage asialoglycoprotein binding protein (M-ASGP-BP) (rat) (Macrophage lectin), 550-1
Macrophage colony-stimulating factor receptor see M-CSFR
Macrophage galactose/N-acetylgalactosamine-specific lectin (MMGL) (mouse) (Macrophage lectin), 550-1
Macrophage lectin (Macrophage asialoglycoprotein binding protein (M-ASGP-BP) (rat); Macrophage galactose/N-acetylgalactosamine-specific lectin (MMGL) (mouse)), 550-1
Macrophage mannose receptor see Mannose receptor
Macrosialin (mouse) (CD68), 314-5
MAdCAM-1 (Mucosal addressin cell adhesion molecule 1), 262, 299, 522, 552-3
Major histocompatibility complex Class I antigen see MHC Class I
Major histocompatibility complex Class II antigen see MHC Class II
MALA-1 see Ly-6
MAMA see Mac-2-BP
Mannose receptor (Macrophage mannose receptor), 554-6
MARCO, 557-8
M-ASGP-BP (macrophage lectin), 550-1
Mast/stem cell growth factor (c-kitL), 443-4
Mast/stem cell growth factor receptor see CD117
mb-1 see CD79/BCR
MCP (CD46), 248-9
MCP-1, 441
M-CSFR (Macrophage colony-stimulating factor receptor; CD115; Colony-stimulating factor (CSF)-1 receptor; c-fms proto-oncogene), 559-60
    associated enzymes, 113
MDR1 (Multidrug resistance protein 1; P-glycoprotein (P-gp), P170; Multidrug transporter), 562-3
    integration into cell membrane, 104
ME491 see CD63
Measles virus, 249
MEL-14 antigen see CD62L
Melanoma growth-stimulating activity (MGSA/GRO), 503
MEM-102 see CD48
Membrane, cell, integration of proteins into, 101-6
    see also Transmembrane
Membrane cofactor protein (CD46), 248-9
MG-160 (ESL-1), 452-3
MHC, 3-8, 10
    architecture of cell surface, 120
MHC Class I (Major histocompatibility complex Class I antigen; HLA-A, -B, -C (human); H-2K, -D, -L (mouse); RT1A; RT1C (rat)), 150, 181, 276, 337, 339, 361, 497-8, 525, 546, 564-6, 571
    architecture of cell surface, 119, 120, 121
    monoclonal antibodies, 22, 23
    protein structure, 26
MHC Class II (Major histocompatibility complex Class II antigen; HLA-DP, -DQ, -DR (human); I-A, I-E (mouse); RT1B, RT1D (rat)), 142, 181, 190, 224, 276, 337, 339, 532, 533, 567-8
    architecture of cell surface, 119, 120, 122
    monoclonal antibodies, 23
    protein structure, 26
MHC Class II-associated invariant chain (Ii or Iγ) see CD74
MIC2 (CD99), 371-2
β2 Microglobulin, 26, 56
MIP-1α, 441
MIRL see CD59
MLA1 see CD63
MLR-3 see CD69
Mo-1 see CD11b
Mo3 see CD87
Module, 32, 33, 38
Moesin, 241
Molecular analysis of leucocyte antigens, 23-5
    protein structure, 26-7
Molecular weight of processed molecules, 11
Monoclonal antibodies, 2, 22-3
'Mosaic' proteins, 34
Motifs
    nomenclature, 32, 33
Mouse leucocyte antigens, 19-20
    see also specific antigens
Mpl (Thrombopoietin receptor), 591-2
MRC OX2 see OX2
MRC OX40 see CD134
MRP-1 see CD9
MS2, 569-70
Mucosal addressin cell adhesion molecule see MAdCAM-1
Multicolony-stimulating factor receptor see IL-3R
Multidrug resistance protein 1 see MDR1
Multidrug transporter see MDR1
Murine leucocyte antigen see Mouse
Myeloproliferative leukemia (Thrombopoietin receptor), 591-2
Myristoyl, 101, 106
N
NCA (CD66c), 310, 311, 312, 313
NCA-90 (CD66c), 310, 311, 312, 313
NCA-95 (CD66b), 310, 311, 312, 313
NCA-160 (CD66a), 310, 311, 312
NCAM see CD56
NCAM L1 (L1), 528-9
NEP see CD10
Neprilysin see CD10
Nerve growth factor receptor see Tumour necrosis factor superfamily
Neural cell adhesion molecule (NCAM) isoform see CD56
Neurolin see CD166
Neurothelin (chicken) (CD147), 408-9
Neutral endopeptidase [EC 3.4.24.11] (NEP) see CD10
Neutrophil-activating peptide 2 (NAP-2), 503
Neutrophil granulocyte, glycosylation, 111-2
Neutrophil inhibitory factor (NIF), 158
NK gene complex, 545-7
NK1.1 (CD161), 421-2
NKAT1-10 see KIR
NKAT3-10 see KIR
NKG2 family, 361, 571-2
NKH-1 antigen see CD56
NKR-P1 family (CD161), 421-2
NKR-P1A (CD161; NKR-P1 gene 2; mNKR-P1.7 (mouse); 3.2.3; NKR-P1 (rat)), 421-2
NKR-P1B (NKR-P1 gene 34 (mouse)), 421-2
NKR-P1C (NK1.1; NKR-P1 gene 40; mNKR-P1.9 (mouse)), 421-2
N-linked carbohydrate, 107, 108, 110
    symbol used, 10
N-linked glycosylation
    degree of, 11
    sites, 106-7
    symbol used, 10
NMR, 26
Nomenclature, 32-4
Nucleic acid sequencing, 23-5
O
O-linked carbohydrate, 107, 109, 111, 112
    symbol used, 10
O-linked glycosylation
    degree of, 11
    sites, 106-7
    symbol used, 10
OMIM Database, 14
Osteopontin, 241
OV-3 see CD47
OX2 (MRC OX2), 573-4
    architecture of cell surface, 119
OX40 see CD134
OX40L (CD134L), 400-1, 575-6
OX-44 (rat) see CD53
OX-45 see CD48
OX-47 (rat) (CD147), 408-9
P
P-18 see CD59
p50.1, 524
p50.2, 524
p55 see CD25 α chain
p58.1, 524
p58.2, 524
p64 see CD132
p70, 524
p75 see CD122 β chain
p85 see CD44
p126 (CD101), 375-6
p150,95 α subunit see CD11c
p161 (mouse) see CD13
P170 see MDR1
PADGEM see CD62P
PAMS mutation matrix, 35, 36
PC-1 (EC 3.1.4.1; EC 3.6.1.9), 577-8
    associated enzymes, 113
PD-1, 579-80
PECAM-1 see CD31
Perforin, 32
PETA-3 (CD151), 414
P-glycoprotein (P-gp) see MDR1
P-gp see MDR1
Pgp-1 see CD44
Phagocytic glycoprotein 1 (Pgp-1) see CD44
Phosphatidylinositol 3-kinase, 135, 147, 179, 187, 199, 395, 415, 494, 538, 560
Phosphatidylinositol-phospholipase C, 105
Phospholipase C, 187, 395, 538
Phospholipids, 221
PILEUP program, 40
PIR database, 12, 14
pK blood group antigen (CD77), 331
Plasminogen, 356
Plasminogen activator inhibitor type 1, 356
Plasmodium falciparum erythrocyte membrane protein 1 (PfEMP1), 221, 279
Platelet endothelial cell adhesion molecule 1 (PECAM-1) see CD31
Platelet glycoprotein IV see CD36
PMA, 476
Polymerase chain reaction, 23, 24
Pregnancy-specific glycoprotein (PSG) subgroup, 311
Prenyl anchors, 106
PROSITE database, 34, 35
Protease, 356
Protein Data Bank (PDB), 13, 14
Protein Identification Resource (PIR) database, 12, 14
Protein(s)
    architecture of cell surface, 118-23
    carbohydrate structures, 106-12
    domains/repeats/motifs, see Domains; Motifs; Repeats
    function of leucocyte membrane antigens, 112-4
    integration into cell membrane, 101-6
    sequence, see Amino acid sequence
    superfamily, see Superfamily
    see also Glycoprotein(s)
Protein tyrosine kinase, 39, 41
Protein tyrosine phosphatase (PTPase), 3-8, 10, 39, 41, 276, 304, 363
Proteolysis, 37-8
Proto-oncogenes see specific oncogenes
P-selectin see CD62P
P-selectin glycoprotein ligand 1 see CD162
PSGL-1 see CD162
PTLGP40 see CD63
PTPase see Protein tyrosine phosphatase
R
R2 see CD82
Radixin, 241
Raf1, 395, 538
RANTES, 441
Rat leucocyte antigens see specific antigens
Receptor-associated protein (RAP), 356
Receptors see specific receptors
Recombinant DNA, 26
Repeats
    identifying, 34-5
    nomenclature, 32, 33
Rhinovirus, 279
RIP, domain, 79, 80
RL-388 (mouse) (CD98), 369-70
RL29 see Galectin 3
rLy-49.9, 545, 546, 547
rLy-49.12, 545, 547
rLy-49.29, 545, 547
RNA virus, 166
RT1A, RT1C (rat) see MHC Class I
RT1B, RT1D (rat) see MHC Class II
RT6, 581-2
    associated enzymes, 113
S
SC-1 see CD166
Sca-1 see Ly-6
Sca-2 (Stem cell antigen 2; Thymic shared antigen 1 (TSA-1)), 583
Scavenger RI and II (SR-AI and SR-AII), 585-7
Scavenger receptor cysteine-rich (SRCR), 3-8, 10
Scavenger receptor superfamily, domains, 143
SCR see CCP
SDS-PAGE, antigens resolved on, 11, 13, 18, 21
SE6 (Ly-49C), 545, 546, 547
Seed's expression cloning system, 24
Selectins, 111
    see also CD62E, CD62L, CD62P, E-selectin, L-selectin, P-selectin
Semaphorin, 32
    CD100, 373
Serine protease, 32
Serology, 19-21
    quantitative, 20-1
SHIP, 210
SHP-1, 187, 210, 526, 546, 571
SHP-2, 415, 526, 546
Sgp50 see GlyCAM-1
Sgp90 see CD34
SH2, domain, 91
SH3, domain, 91
Sheep erythrocyte receptor see Sialoadhesin
Short consensus repeat (SCR) see CCP
Sialoadhesin (Sheep erythrocyte receptor), 588-90
    carbohydrate structures, 111
Sialoglycoconjugates, 187, 213, 589
Sialophorin see CD43
Sialyl Lewis x (sLex) (CD15s), 172
Size of processed molecules, 11
SLAM (CDw150), 412-3
SM4 (CD163), 427-8
Sm23, domain, 83
Sodium dodecyl sulfate (SDS) polyacrylamide gel electrophoresis (SDS-PAGE), 11, 13, 18, 21
Solubilization of antigens by detergents, 18-19
Somatomedin B, 3-8, 10, 39, 41
SOS, 199
SR-AI and SR-AII (Scavenger RI and II), 585-7
Src, 312, 347
    see also Fyn, Lyn
Staphylococcus aureus, 557
Steel factor see c-kitL
Steel factor receptor see CD117
Stem cell antigen 2 (Sca-2), 583
STK-1 (CD135), 402-3
Streptococcus pyogenes, 249
S-type lectins see Galectin
Sulfatides, 302
Superfamily, concept, 25-6, 32
    nomenclature, 32-4
    protein, 32-9
Sushi (Complement control protein) see CCP
SWISSPROT database, 8, 12, 13, 14, 34
Syk, 187, 459
Syndecan-1 (CD138), 9, 406-7
T
T1 see CD5
T4 see CD4
T6 see CD1
T8 see CD8
T9 see CD71
T10 see CD38
T11 see CD2
T12 see CD6
T200 see CD45
Tac antigen see CD25 α chain
Tactile (CD96), 365-6
TAP see Ly-6
TAPA-1 see CD81
T cells
    activated, glycosylation, 111, 112
    antigen, domains, 52, 54, 5, 56, 60
    resting, glycosylation, 111, 112
    surface architecture, 118
    site numbers of and area covered by antigens on, 119
T cell receptor complex see TCR
TCR (T cell receptor complex), 25, 135, 137-9, 142, 174, 195, 245, 320, 376, 565, 568
    α chain, 137, 138
    β chain, 137
    γ chain, 137
    δ chain, 137
    pTCR α, 137, 138, 139
    architecture of cell surface, 120, 121, 122
    integration into cell membrane, 103, 106
    molecular analysis, 25
    protein structure, 26
Telokin, 43
Tenascin, 407
TGFα, domain, 43
TGFβ, 384-5
Theta see CD90
Thrombin, 236
Thrombopoietin receptor (Mpl (myeloproliferative leukemia)), 591-2
Thrombospondin, 32, 221, 232, 271, 407
Thy-1 see CD90
Thymic shared antigen 1 (Sca-2), 583
Thymocyte-activating molecule (THAM) see CD26
Tissue distribution of antigens, 12
    see also specific antigens
Tla, 20
TM4SF see Transmembrane 4
TNF
    α, 90, 398
    β, 90, 118, 398
    architecture of cell surface, 120
TNF receptor-associated factors (TRAF), 398
TNF receptor-associated kinase (TRAK), 398
TNF receptor-associated proteins (TRAP), 398
TNF-R55 see CD120a
TNF-R75 see CD120b
TNFR
    architecture of cell surface, 120
    protein structure, 27
TNFR1-associated death domain protein (TRADD), 398
    CD120a and CD120b and, interaction, 398
TNFRI see CD120a
TNFRII (CD120b), 397-9
TNFRSF (Tumour necrosis factor receptor superfamily), 3-8, 10, 87, 89-90
TNFSF (Tumour necrosis factor superfamily), 3-8, 10
    architecture of cell surface, 120
Toxins, 356
Tp41 (CD7), 147-8
Tp44 see CD28
Tp103 see CD26
Tp120 see CD6
TQ1 see CD62L
TRADD see TNFR1-associated death domain protein
Transferrin, 320
Transferrin receptor see CD71
Transmembrane 4 (TM4) superfamily, 104
Transmembrane 7 superfamily, 50, 53
Transmembrane attachment, 101-4
    multipass, 102, 103-4
    type I, 102-3
    type II, 103
Transmembrane regions, domains in, distribution, 3-8
Transmissible gastroenteritis virus (TGEV), 166
TRC, 91
TSA-1 (Sca-2), 583
TSG6, domain, 66, 68, 69
Tubulin, 135
Tumour necrosis factor see TNF
Tumour necrosis factor I see CD120a
Tumour necrosis factor II (CD120b), 397-9
Tumour necrosis factor receptor superfamily see TNFRSF
Tumour necrosis factor superfamily see TNFSF
Tyrosine kinase, 3-8, 10, 304
    integration into cell membrane, 101
U
UM4D4 (CD60), 292
Urokinase plasminogen activator receptor (uPAR) see CD87
V
V7 (CD101), 375-6
VASE, 38-9
Vav, 179
VCAM-1 see CD106
Viruses, 356
Vitronectin, 232
Vitronectin receptor α subunit see CD51
VLA, domain, 60
VLA-1 α subunit see CD49a
VLA-2 α subunit see CD49b
VLA-3 α subunit see CD49c
VLA-4 α subunit see CD49d
VLA-5 (fibronectin receptor) α subunit (CD49e), 265-6
VLA-6 α subunit see CD49f
VWF (von Willebrand factor), 60, 62, 232, 236, 271
W
W3/13 (rat) see CD43
W3/25 (rat) see CD4
W272 (CD66b), 310, 311, 312, 313
WC1 (cattle) antigen, 593-5
World Wide Web (WWW), 13, 14
X
Xenogeneic antibodies, 21, 22
X-ray crystallography, 26
Y
YE1/48 (Ly-49A), 545, 546, 547
Yes, 221
Z
ZAP-70, 138