PROTEOMICS OF BIOLOGICAL SYSTEMS
PROTEOMICS OF BIOLOGICAL SYSTEMS Protein Phosphorylation Using Mass Spectrometry Techniques
Bryan M. Ham
A JOHN WILEY & SONS, INC., PUBLICATION
Copyright © 2012 by John Wiley & Sons, Inc. All rights reserved. Published by John Wiley & Sons, Inc., Hoboken, New Jersey. Published simultaneously in Canada. No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as permitted under Section 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 750-4470, or on the web at www.copyright.com. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008, or online at http://www.wiley.com/go/permissions. Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their best efforts in preparing this book, they make no representations or warranties with respect to the accuracy or completeness of the contents of this book and specifically disclaim any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives or written sales materials. The advice and strategies contained herein may not be suitable for your situation. You should consult with a professional where appropriate. Neither the publisher nor author shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages. For general information on our other products and services or for technical support, please contact our Customer Care Department within the United States at (800) 762-2974, outside the United States at (317) 572-3993 or fax (317) 572-4002. 
Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic formats. For more information about Wiley products, visit our web site at www.wiley.com. Library of Congress Cataloging-in-Publication Data: Ham, Bryan M. Proteomics of biological systems : protein phosphorylation using mass spectrometry techniques / Bryan M Ham. p. cm. Includes index. ISBN 978-1-118-02896-4 (cloth) 1. Proteomics–Methodology. 2. Phosphorylation–Research–Methodology. 3. Phosphoproteins–Synthesis. 4. Mass spectrometry. 5. Biological systems–Research–Methodology. I. Title. QP519.9.M3H367 2012 572'.62–dc23 2011019941 oBook ISBN: 9781118137048 ePDF ISBN: 9781118137017 ePub ISBN: 9781118137031 MOBI ISBN: 9781118137024 Printed in the United States of America. 10 9 8 7 6 5 4 3 2 1
This Book Is Dedicated to the Most Important Person in My Life, My Ever Loving Wife.
CONTENTS
Preface
Acknowledgments
About the Author
1 Posttranslational Modification (PTM) of Proteins
1.1 Over 200 Forms of PTM of Proteins
1.2 Three Main Types of PTM Studied by MS
1.3 Overview of Nano-Electrospray/Nanoflow LC-MS
1.3.1 Definition and Description of MS
1.3.2 Basic Design of Mass Analyzer Instrumentation
1.3.3 ESI
1.3.4 Nano-ESI
1.4 Overview of Nucleic Acids
1.5 Proteins and Proteomics
1.5.1 Introduction to Proteomics
1.5.2 Protein Structure and Chemistry
1.5.3 Bottom-Up Proteomics: MS of Peptides
1.5.3.1 History and Strategy
1.5.3.2 Protein Identification through Product Ion Spectra
1.5.3.3 High-Energy Product Ions
1.5.3.4 De Novo Sequencing
1.5.3.5 Electron Capture Dissociation (ECD)
1.5.4 Top-Down Proteomics: MS of Intact Proteins
1.5.4.1 Background
1.5.4.2 GP Basicity and Protein Charging
1.5.4.3 Calculation of Charge State and Molecular Weight
1.5.4.4 Top-Down Protein Sequencing
1.5.5 Systems Biology and Bioinformatics
1.5.6 Biomarkers in Cancer
Reference
2 Glycosylation of Proteins
2.1 Production of a Glycoprotein
2.2 Biological Processes of Protein Glycosylation
2.3 N-Linked and O-Linked Glycosylation
2.4 Carbohydrates
2.4.1 Ionization of Oligosaccharides
2.4.2 Carbohydrate Fragmentation
2.4.3 Complex Oligosaccharide Structural Elucidation
2.5 Three Objectives in Studying Glycoproteins
2.6 Glycosylation Study Approaches
2.6.1 MS of Glycopeptides
2.6.2 Mass Pattern Recognition
2.6.2.1 High Galactose Glycosylation Pattern
2.6.3 Charge State Determination
2.6.4 Diagnostic Fragment Ions
2.6.5 High-Resolution/High-Mass Accuracy Measurement and Identification
2.6.6 Digested Bovine Fetuin
Reference
3 Sulfation of Proteins as Posttranslational Modification
3.1 Glycosaminoglycan Sulfation
3.2 Cellular Processes Involved in Sulfation
3.3 Brief Example of Phosphorylation
3.4 Sulfotransferase Class of Enzymes
3.5 Fragmentation Nomenclature for Carbohydrates
3.6 Sulfated Mucin Oligosaccharides
3.7 Tyrosine Sulfation
3.8 Tyrosylprotein Sulfotransferases TPST1 and TPST2
3.9 O-Sulfated Human Proteins
3.10 Sulfated Peptide Product Ion Spectra
3.11 Use of Higher Energy Collisions
3.12 Electron Capture Dissociation (ECD)
3.13 Sulfation versus Phosphorylation
Reference
4 Eukaryote PTM as Phosphorylation: Normal State Studies
4.1 Mass Spectral Measurement with Examples of HeLa Cell Phosphoproteome
4.1.1 Introduction
4.1.2 Protein Phosphatase and Kinase
4.1.3 Hydroxy-Amino Acid Phosphorylation
4.1.4 Traditional Phosphoproteomic Approaches
4.1.5 Current Approaches
4.1.5.1 Phosphoproteomic Enrichment Techniques
4.1.5.2 IMAC
4.1.5.3 MOAC
4.1.5.4 Methylation of Peptides prior to IMAC or MOAC Enrichment
4.1.6 The Ideal Approach
4.1.7 One-Dimensional (1-D) Sodium Dodecyl Sulfate (SDS) PAGE
4.1.8 Tandem MS Approach
4.1.8.1 pS Loss of Phosphate Group
4.1.8.2 pT Loss of Phosphate Group
4.1.8.3 pY Loss of Phosphate Group
4.1.9 Alternative Methods: Infrared Multiphoton Dissociation (IRMPD) and Electron Capture Dissociation (ECD)
4.1.10 Electron Transfer Dissociation (ETD)
4.2 The HeLa Cell Phosphoproteome
4.2.1 Introduction
4.2.2 Background of Study
4.2.3 What is Covered
4.2.4 Optimized Methods to Use for Phosphoproteomic Studies
4.2.4.1 Cell Culture
4.2.4.2 Extraction of HeLa Cell Proteins
4.2.4.3 Trizol Extraction and Tryptic Digestion
4.2.4.4 Solid-Phase Extraction (SPE) Desalting
4.2.4.5 Converting Peptide Carboxyl Moieties to Methyl Esters
4.2.4.6 Roche Complete Lysis-M, EDTA-Free Extraction
4.2.4.7 1-D SDS-PAGE Cleanup
4.2.4.8 In-Gel Reduction, Alkylation, Digestion, and Extraction of Peptides
4.2.4.9 Phosphopeptide Enrichment Using IMAC
4.2.5 Description of Instrumental Analyses
4.2.5.1 RP/Nano-HPLC Separation
4.2.5.2 MS Analysis
4.2.6 Current Approaches for Peptide Identification and False Discovery Rate (FDR) Determination
4.2.7 Results of the Protein Extraction and Preparation
4.2.7.1 Detergent Lysis, Trizol, and Ultracentrifugation
4.2.7.2 Nucleic Acid Removal with SDS-PAGE
4.2.8 HeLa Cell Phosphoproteome Methodology Comparison
4.2.8.1 Roche In-Solution versus Trizol Extraction
4.2.8.2 In-Solution and In-Gel Digests Phosphoproteome Coverage
4.2.9 Overall Conclusion
4.3 Nonphosphoproteome HeLa Cell Analysis
4.3.1 IMAC Flow Through Peptide Analysis
4.3.2 IMAC NaCl Wash Peptide Analysis
4.3.3 IMAC Flow Through versus NaCl Wash Comparison
4.3.4 Gene Ontology Comparison
4.3.5 IMAC Bed Nonspecific Binding Study
4.4 Reviewing Spectra Using the SpectrumLook Software Package
Reference
5 Eukaryote PTM as Phosphorylation: Perturbed State Studies
5.1 Study of the Phosphoproteome of HeLa Cells under Perturbed Conditions by Nano-High-Performance Liquid Chromatography HPLC Electrospray Ionization (ESI) Linear Ion Trap (LTQ)-FT/Mass Spectrometry (MS)
5.1.1 Introduction
5.1.2 Ataxia Telangiectasia Mutated (ATM) and ATM and Rad3-Related (ATR)
5.1.3 Background of Study
5.1.3.1 PP5
5.1.3.2 Functions of PP5
5.1.3.3 DDR of PP5
5.1.4 Review of Optimized Approach to Study
5.1.4.1 Producing Cell Cultures
5.1.4.2 Protein Extraction
5.1.4.3 Phosphopeptide Enrichment by IMAC
5.1.4.4 Reversed-Phase (RP)/Nano-HPLC Separation
5.1.4.5 LTQ-FT/MS/MS
5.1.4.6 Protein Identification and False Discovery Rate (FDR) Determination
5.1.4.7 Phosphopeptide Quantitative Differential Comparison
5.1.4.8 Data Set Peak Matching and Alignment
5.1.4.9 Phosphopeptide Response Normalization
5.1.5 Phosphoproteome Gene Ontology (GO) Comparison
5.1.5.1 GO Cellular Component
5.1.6 Potential Regulated Target Proteins of PP5
5.1.6.1 Analysis of Variance (ANOVA)
5.1.6.2 Four Potential Target Proteins
5.1.7 GO Differential Comparison
5.1.7.1 GO Cellular Component
5.1.7.2 Influence of Classes or Categories of Proteins
5.1.7.3 Molecular Function Interacting Modules
5.1.8 Conclusion
5.1.9 Reviewing Spectra Using the SpectrumLook Software Package
Reference
6 Prokaryotic Phosphorylation of Serine, Threonine, and Tyrosine
6.1 Introduction
6.1.1 Serine (Ser)/Threonine (Thr)/Tyrosine (Tyr) Phosphorylation
6.1.2 Histidine (His) Phosphorylation
6.1.3 Caulobacter crescentus
6.1.4 Ser/Thr/Tyr Phosphorylation of C. crescentus
6.1.5 Ser/Thr/Tyr Phosphorylation of Bacillus subtilis and Escherichia coli
6.1.6 C. crescentus as Cell Cycle Model
6.1.7 Bacteria Starvation Response
6.1.8 First Coverage of C. crescentus Phosphoproteome
6.2 Optimized Methodology for Phospho Ser/Thr/Tyr Studies
6.2.1 Bacterial Strain and Growth Conditions
6.2.2 C. crescentus Cell Protein Extraction: Phosphoproteomics
6.2.3 Solid-Phase Extraction (SPE) Desalting
6.2.4 In Vitro Methylation of Peptides
6.2.5 Phosphopeptide Enrichment by IMAC
6.2.6 Normal Proteomics
6.2.7 pY Enrichment by IP
6.2.8 RP/Nano-High-Performance Liquid Chromatography (HPLC) Separation
6.2.9 LC-Linear Ion Trap (LTQ)-Orbitrap MS/MS
6.2.10 LTQ-Fourier Transform (FT)/MS/MS
6.2.11 Peptide Identification and False Discovery Rate (FDR) Determination
6.2.12 Peptide Quantitative Comparison
6.3 Identification of the Components of the Ser/Thr/Tyr Phosphoproteome in C. crescentus Grown in the Presence and Absence of Glucose
6.3.1 Total Phosphoprotein Identifications
6.3.2 MSA Spectra
6.3.3 Phosphorylation Sites Identified
6.3.4 Ser/Thr/Tyr Phosphoproteome of C. crescentus
6.3.5 Phosphorylated His and Aspartate
6.3.6 Cell Cycle His Kinase CckA
6.3.7 Phosphoglutamate
6.3.8 Enriched Tyr Phosphoproteome of C. crescentus
6.3.8.1 Sensor His Kinase KdpD
6.3.8.2 TonB-Dependent Receptor Proteins
6.3.9 Carbon Environment-Shared Phosphoproteome
6.3.9.1 Two-Component His Kinases
6.3.9.2 Multiply Phosphorylated Kinases
6.3.9.3 pTPLAALpSAQSRRAR Peptide as Sensor His Kinase
6.3.9.4 Aspartate Phosphorylated Tyr Kinase DivL
6.3.10 Carbon-Rich versus Carbon-Starved Class/Category
6.3.10.1 Localization of Phosphoproteome of C. crescentus
6.3.10.2 Integral Membrane Proteins
6.3.10.3 Function of Phosphoproteome of C. crescentus
6.3.11 Carbon-Rich versus Carbon-Starved Unique Phosphorylated Proteins
6.3.11.1 Carbon-Rich Environment Phosphorylated Proteins
6.3.11.2 Carbon-Starved Environment Phosphorylated Proteins
6.3.11.3 Decreased Normal Activity
6.3.12 Confirmation of Decreased Energy Pathways
6.3.12.1 Carbon-Rich Mitochondrial Localization
6.3.12.2 Normal Proteome Glycolytic Pathway
6.3.12.3 Starvation Survival Response
6.3.13 Phosphopeptide Quantitative Differential Comparison
6.3.13.1 Upregulation in Phosphorylation
6.3.13.2 Adaptive Response with Phosphorylation
6.3.13.3 Upregulation NAD-Dependent GDH
6.3.13.4 Downregulation of Flagellin Protein
6.3.14 Carbon-Rich versus Carbon-Starved Normal Proteome Time Course Study
6.3.14.1 Entire Proteome Localization and Function
6.3.14.2 Regulated Proteins
6.3.14.3 Localization of Regulated Proteins
6.3.14.4 Function of Regulated Proteins
6.3.14.5 Normal Proteome Energy Pathways
6.3.14.6 Overlap of Phosphorylated Proteins and Regulated Normal Proteome
6.3.14.7 Differences of Phosphorylated Proteins
6.3.14.8 Localization of Phosphorylated Proteins
6.3.14.9 Direct Relationships Observed
6.3.15 Conclusions
6.3.16 Supplementary Material
6.3.16.1 Reviewing Spectra Using the SpectrumLook Software Package
Reference
7 Prokaryotic Phosphorylation of Histidine
7.1 Phosphohistidine as Posttranslational Modification (PTM)
7.2 Bacterial Kinases and the Two-Component System
7.3 Measurement of Phosphorylated His (pH)
7.3.1 Stabilities of Phosphorylated Amino Acids
7.3.2 Immobilized Metal Affinity Chromatography (IMAC) and Mass Spectrometry (MS)
7.4 In Vitro and In Vivo Study of pH-Containing Peptides by Nano-ESI Tandem MS
7.4.1 Introduction
7.4.2 Background of Study
7.4.2.1 Bacteria Models of Ser/Thr/Tyr Phosphorylation
7.4.2.2 Prokaryotic Phosphorylation of His
7.4.2.3 C. crescentus
7.4.2.4 Mass Spectral Measurement of Phosphohistidine
7.4.3 Optimized Methodology for Phosphohistidine Studies
7.4.3.1 In Vitro Selective pHis Phosphorylation
7.4.3.2 In Vitro Phosphorylation of Angio II (Sar1Thr8)
7.4.3.3 In Vitro Methylation of Peptides
7.4.3.4 C. crescentus Cell Protein Extraction with V-8 Protease Digestion
7.4.3.5 1-D SDS-Polyacrylamide Gel Electrophoresis (PAGE)
7.4.3.6 Phosphohistidine Enrichment by Cu(II)-Based IMAC
7.4.3.7 Reversed-Phase (RP)/Nano-HPLC Separation
7.4.3.8 Nano-ESI Nano-HPLC MS
7.4.3.9 Peptide Identification and False Discovery Rate (FDR) Determination
7.4.4 C18 RP LC Behavior
7.4.5 Phosphohistidine Loses HPO3 and H3PO4
7.4.5.1 Rational for H3PO4 Loss
7.4.6 Q-TOF/MS/MS Product Ion Spectra
7.4.6.1 pH-Containing Peptide INpHDLR
7.4.6.2 Doubly Charged (2+) Peptide INpHDLR
7.4.6.3 pH-Containing Peptide pHLGLAR
7.4.6.4 Singly Charged (1+) Peptide pHLGLAR
7.4.7 Behavior of Monophosphohistidine and Diphosphohistidine Peptide
7.4.7.1 Peptide Angio I as DRVYIHPFHL
7.4.8 Behavior of Phosphotyrosine and Phosphohistidine Peptide
7.4.8.1 Peptide Angio II as DRVpYIHPF
7.4.8.2 Phosphorylated Angio II as DRVpYIpHPF
7.4.9 Behavior of Phosphotyrosine-, Phosphothreonine-, and Phosphohistidine-Containing Peptide
7.4.9.1 Peptide Angio II (Sar1Thr8)
7.4.10 Validation of Cu(II)-Based IMAC Phosphohistidine Enrichment
7.4.10.1 Fe(III)-Based IMAC versus Cu(II) Based
7.4.10.2 Cu(II)-Based IMAC of Angio I
7.4.10.3 Cu(II)-Based IMAC of Angio II
7.4.11 In Vivo Measurement of Phosphohistidine
7.4.11.1 Time-Based Digestion Study
7.4.11.2 Phosphohistidine-Containing Peptides
7.4.11.3 Phosphohistidine Product Ion Spectra
7.4.12 Gene Ontology of Phosphorylated Proteins
7.4.12.1 Localization of Phosphorylated Proteins
7.4.12.2 Function of Phosphorylated Proteins
7.4.13 Predicted Regulatory Protein Motif Study
7.4.14 Validation of Phosphohistidine-Containing Proteins
7.4.14.1 Phosphorylation Motif Study
7.4.14.2 Phosphohistidine Kinase Motif
7.4.15 The pDpH Motif
7.4.16 Conclusions
7.5 Supplementary Material
7.5.1 Reviewing Spectra Using the SpectrumLook Software Package
Reference
Appendix I Atomic Weights and Isotopic Compositions
Appendix II Periodic Table of the Elements
Appendix III Fundamental Physical Constants
Glossary
Index
PREFACE
This book is a review of posttranslational modification (PTM) of proteins, with a special focus on a collection of mass spectral studies of phosphorylation as a PTM of both eukaryotic and prokaryotic proteomes utilizing the most recent advances and approaches in analytical chemistry. Protein PTMs have now been studied for over 30 years in recognition of their importance in cellular processes. In particular, the study of protein phosphorylation as a PTM has received much attention in the last 5 years. The timing of this book is in accordance with the new advances in protein phosphorylation studies. A major focus of the book is the first-time reporting of the study of prokaryote phosphoproteomes, in particular the extensive study of the phosphorylation of the histidine residue, which is extremely important in prokaryote signaling processes and for which few or no examples have previously been reported. The measurement of the phosphoproteomes is based on state-of-the-art mass spectrometry instrumentation and techniques. Also discussed are specific methodologies for performing PTM phosphoproteome studies, which extend from cell cultures, through important steps during sample preparation, instrumental analysis using the most recent mass spectral approaches, and the handling of the extensive data collected, to recent tools available to the scientific community free of charge on the Internet that are currently being used to help understand and interpret the extensive and complex data collected during these studies. The data reduction approach utilizes the current “systems biology” viewpoint to rationalize the observations. The primary focus of the book is to teach the basic skills and methodologies needed for studies of phosphorylation as a PTM of proteins using real-life examples of actual phosphoproteome studies of both eukaryotic and prokaryotic systems.
The book is a mixture of the fundamentals of sample preparation, nano-liquid chromatographic separation/nano-electrospray ionization, and tandem mass spectrometry instrumental analysis, followed by bioinformatic data interpretation concerning phosphoproteome studies. The book presents a number of first-time observations and studies involving the phosphoproteomes of prokaryotes. This is a subject area that is timely and new with few previous examples in the literature. There is currently an overwhelming need and interest in the scientific community concerning phosphoproteomic studies of prokaryotic systems. This involves both the normal type of phosphoproteomes that are studied in eukaryotic systems and the novel areas of phosphorylation observed in prokaryotes. Researchers are currently applying the phosphoproteomic approaches that have now been well optimized in eukaryotic systems to prokaryotic systems, and examples of these types of studies are presented in the book. However, the tremendously challenging area of phosphorylation of the histidine residue, or “phosphohistidine proteomes,” of prokaryotes has just begun to be studied. The book presents numerous first-time studies of this current topic of interest that have not yet been observed or reported anywhere in the literature. The book describes the in vitro synthesis of phosphohistidine-containing peptides along with the mass spectral characterization of the peptides. Specifically, the book presents optimized methodologies for performing both eukaryotic phosphoproteome studies and prokaryotic phosphoproteome studies. The materials presented in the book are tried-and-tested sample preparation and analysis methods and approaches. While these can be collected from the literature for eukaryotic systems through exhaustive and time-consuming searches, the book compiles the most recent approaches in one place. The methodologies that are presented for prokaryotic systems are, however, novel and not available in the literature in the systematic form presented in the book. In fact, the book reports prokaryotic study approaches that have not previously been reported. This covers a new and quite challenging area of phosphohistidine prokaryote phosphoproteome studies.
The book also presents methodologies for time-based studies of a prokaryote model placed in a food-starved environment. This is the first reported quantitative differential study of a prokaryotic system under perturbed conditions. All studies reported in the book are actual laboratory experiments, with step-by-step sample preparation protocols using laboratory benchtop methodologies and the most recent vendor-optimized kits. For example, an immunoprecipitation study of phosphorylated tyrosine enrichment is described along with the associated mass spectral instrumental analysis results. The work and studies reported in the book can be an invaluable asset to the student and the researcher due to the combination of detailed step-by-step methodology for phosphoproteomic sample preparation, mass spectral instrumental analysis, and data interpretation approaches.
The book also includes the use of some of the most recent systems biology bioinformatic internet tools such as the Blast2GO gene ontology (GO) tool. Also described are the most recent data processing approaches that have been developed by Dr. Richard Smith’s proteomics group at Pacific Northwest National Laboratory. The book is also an extensive reference concerning cell signaling studies associated with phosphorylation of proteins.
Bryan M. Ham
ACKNOWLEDGMENTS
I would like to acknowledge all those whose input, review, and criticisms helped enormously in the early structuring and final content of this book. I would like to include in the acknowledgment Pacific Northwest National Laboratory, where much of the inspiration for this book was instilled within me while I was conducting research in Dr. Richard D. Smith’s proteomics group. Finally, and most important of all, is the acknowledgment of my wife, Dr. Aihui Ma Ham, whose consultations, support, reviewing, and invaluable encouragement saw the entire process of this book through from start to finish, and without whose unending presence the project would most certainly not have been completed to this level.
B. M. H.
ABOUT THE AUTHOR
Bryan M. Ham, PhD, is a member of the American Society for Mass Spectrometry and the American Chemical Society. He has conducted proteomics and lipidomics research at The Ohio State University and Pacific Northwest National Laboratory in Richland, WA. He is currently working for the Department of Homeland Security at the U.S. Customs and Border Protection New York Laboratory. His research interests include the application of mass spectrometry for biomolecular analysis in the areas of proteomics, lipidomics, and metabolomics.
1 Posttranslational Modification (PTM) of Proteins
The study of posttranslational modification (PTM) of proteins using mass spectrometry (MS) approaches has now become a well-matured area of study. There are numerous approaches toward applying chromatography coupled with MS for PTM studies. The liquid chromatography (LC) front-end separation approach of choice is now nanoflow/nano-electrospray, which allows increased sensitivity over previous LC methodology. This book looks at recent developments in PTM studies using MS and proteomic techniques with a focus upon a number of actual studies designed to instruct and highlight modern methodological approaches. A brief overview of nano-electrospray/nanoflow LC-MS is presented in Section 1.3.
1.1 OVER 200 FORMS OF PTM OF PROTEINS
In the genomic sequencing field, the use of robotic gene sequencers allowed large-scale sequencing that was essentially automated. The robotic automation of determining gene sequences is possible because gene sequences involve only four bases (see “Overview of Nucleic Acids” in Section 1.4), and there are no variations induced in the form of postmodification. This has resulted in the well-publicized sequencing of the entire human genome (Human Genome Project, Nature, February 2001). This is not the case with proteins, where there are not only splice variants arising from alternative splicing of the messenger ribonucleic acid (mRNA) but also PTMs that can take place on the amino acids contained within the protein. There are over 200 PTMs that can take place with proteins, as has been described by Wold [1]. As examples, here are 22 different types of PTMs that can take place with proteins: acetylation, amidation, biotinylation, C-mannosylation, deamidation, farnesylation, formylation, flavinylation, gamma-carboxyglutamic acids, geranyl-geranylation, hydroxylation, lipoxylation, myristoylation, methylation, N-acyl diglyceride (tripalmitate), O-GlcNAc, palmitoylation, phosphorylation, phosphopantetheine, pyrrolidone carboxylic acid, pyridoxyl phosphate, and sulfation [2]. There are also artifactual modifications such as oxidation of methionine (Met). A brief overview of proteins and an introduction to proteomics is presented in Section 1.5.
Proteomics of Biological Systems: Protein Phosphorylation Using Mass Spectrometry Techniques, First Edition. Bryan M. Ham. © 2012 John Wiley & Sons, Inc. Published 2012 by John Wiley & Sons, Inc.
1.2 THREE MAIN TYPES OF PTM STUDIED BY MS
Of these, the three types of PTM that are primarily observed and studied using mass spectrometric techniques are glycosylation, sulfation, and phosphorylation. The observance of PTM is increasingly being used in expression studies where a normal state proteome is being compared with a diseased state proteome. However, the PTM of a protein during a biological or physiological change within an organism may take place without any change in the abundance of the protein involved and is often one piece of a complex puzzle. Methods that measure PTM using mass spectrometric methodologies often focus on the degree (increase or decrease, or alternatively, upregulation or downregulation) of PTM for any given protein or proteins. We shall briefly look at glycosylation and sulfation, which are less involved in cellular processes than phosphorylation, a major signaling cascade pathway for the response to a change in cellular condition(s).
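Two of the three PTMs named above, phosphorylation and sulfation, add almost the same mass to a residue, which is one reason the measurement is demanding. The following is a minimal sketch of that arithmetic, not a method taken from this book. The monoisotopic atomic masses are standard reference values rounded to six decimals, the net additions are taken as HPO3 and SO3, and the 1500 Da peptide mass is a purely illustrative assumption.

```python
# Monoisotopic masses (Da) of the most abundant isotopes; standard reference
# values rounded to six decimals, not values quoted from this book.
MONOISOTOPIC = {"H": 1.007825, "O": 15.994915, "P": 30.973762, "S": 31.972071}


def composition_mass(composition):
    """Sum monoisotopic masses for a composition such as {"H": 1, "P": 1, "O": 3}."""
    return sum(MONOISOTOPIC[element] * count for element, count in composition.items())


# Net mass added to a residue by each modification.
PHOSPHO = composition_mass({"H": 1, "P": 1, "O": 3})  # HPO3
SULFO = composition_mass({"S": 1, "O": 3})            # SO3


def ppm_needed(peptide_mass):
    """Mass accuracy (ppm) needed to tell the two shifts apart at a given peptide mass."""
    return abs(PHOSPHO - SULFO) / peptide_mass * 1e6


if __name__ == "__main__":
    print(f"phosphorylation shift: {PHOSPHO:.4f} Da")
    print(f"sulfation shift:       {SULFO:.4f} Da")
    print(f"difference:            {abs(PHOSPHO - SULFO):.4f} Da")
    # 1500 Da is a hypothetical peptide mass chosen only for illustration.
    print(f"accuracy needed at 1500 Da: {ppm_needed(1500.0):.1f} ppm")
```

The two shifts differ by roughly 0.01 Da, so for a peptide of this size they can only be told apart by mass when the measurement is accurate to a few parts per million.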
1.3 OVERVIEW OF NANO-ELECTROSPRAY/NANOFLOW LC-MS
1.3.1 Definition and Description of MS
During the past decade, MS has experienced tremendous growth in its application to complex biological sample analysis. MS is basically the science of the measurement of the mass-to-charge ratio (m/z) of ions in the gas phase (GP). Mass spectrometers are generally composed of three components: (1) an ionization source that ionizes the analyte of interest and effectively transfers it into the GP, (2) a mass analyzer that separates positively or negatively charged ionic species according to their mass-to-charge ratio (m/z), and (3) a detector used to measure the subsequently separated GP ions. Mass spectrometers are computer controlled, which allows the collection of large amounts of data and the ability to perform various and complex experiments with the mass spectral instruments. Applications of MS include unknown compound identification, known compound quantitation, structural determination of molecules, GP thermochemistry studies, ion–ion and ion–molecule studies, and molecule chemical property studies. MS is routinely used to determine elements such as Li+, Na+, Cl−, and Mg2+; inorganic compounds such as Li+(H2O)x or (TiO2)x+; and organic compounds including lipids, proteins, peptides, carbohydrates, polymers, and oligonucleotides (deoxyribonucleic acid [DNA]/RNA).
1.3.2 Basic Design of Mass Analyzer Instrumentation
Typical mass spectrometric instrumentation that is used in laboratories and research institutions is composed of six components: (1) an inlet, (2) an ionization source, (3) a mass analyzer, (4) a detector, (5) a data processing system, and (6) a vacuum system. Figure 1.1 illustrates the interrelationship of the six components that make up the fundamental construction of a mass spectrometer. The inlet is used to introduce a sample into the mass spectrometer and can be a solid probe, a manual syringe or syringe pump system, a gas chromatograph, or a liquid chromatograph. The inlet system can be either at atmospheric pressure, as is shown in Figure 1.1, or at a reduced pressure under vacuum. The ionization source functions to convert neutral molecules into charged analyte ions, thus enabling their mass analysis. The ionization source can also be part of the inlet system. A typical inlet system and ionization source that is used with high-performance liquid chromatography (HPLC) is electrospray ionization (ESI).
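The mass-to-charge ratio introduced in Section 1.3.1 can be made concrete with a short calculation. The sketch below assumes ions formed by gaining or losing protons, as is common for peptides in ESI; the function names and the 1000 Da neutral mass are hypothetical, and the proton mass is a standard reference value rather than a number taken from the text.

```python
PROTON_MASS = 1.007276  # Da; standard reference value for the proton


def mz(neutral_mass, charge):
    """m/z of [M + zH]z+ for positive charge, or [M - |z|H]|z|- for negative charge."""
    if charge == 0:
        raise ValueError("a neutral species has no m/z")
    return (neutral_mass + charge * PROTON_MASS) / abs(charge)


def neutral_mass(mz_value, charge):
    """Invert mz(): recover the neutral mass from an observed m/z and a known charge."""
    if charge == 0:
        raise ValueError("charge must be nonzero")
    return mz_value * abs(charge) - charge * PROTON_MASS


if __name__ == "__main__":
    peptide = 1000.0  # hypothetical neutral monoisotopic mass in Da
    for z in (1, 2, 3, -1):
        print(f"z = {z:+d}: m/z = {mz(peptide, z):.4f}")
```

Because the analyzer separates ions by m/z rather than by mass, the same molecule appears at a different position for each charge state it carries.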
In an HPLC/ESI inlet system and ionization source, the effluent coming from the HPLC column is transferred into the ESI capillary that has a high voltage applied to it inducing the ESI process. In this configuration, the inlet system and ionization source are located at atmospheric pressure outside of the mass spectrometric instrumentation that is under vacuum. The spray that is produced passes through a tiny orifice that separates the internal portion of the mass spectrometer that is under vacuum from its ambient surroundings that are at atmospheric pressure. This orifice is also often called the inlet and/or the source. In the case of the coupling of a gas chromatograph to the mass spectrometer, the capillary column of the gas chromatograph is inserted through a heated transfer capillary directly into the internal portion of the mass spectrometer that is under
vacuum. This is possible because the species eluting from the capillary column are already in the GP, making their introduction into the mass spectrometer more straightforward as compared with the liquid eluent from an HPLC, where analytes must be transferred from the solution phase to the GP.

Figure 1.1. The six components that make up the fundamental configuration of mass spectrometric instrumentation: (1) inlet and ionization system, (2) inlet orifice (source), (3) mass analyzer, (4) detector, (5) vacuum system, and (6) data collection and processing station (PC). Labels in the figure: atmospheric pressure region (760 torr), electrospray ionization source (ESI) and spray, inlet orifice, ion lens, quadrupole mass analyzer, first-stage vacuum (10^−2 to 10^−4 torr), rotary-vane mechanical (rough) pump, dynode converter, electron multiplier detector, second-stage vacuum (10^−5 to 10^−9 torr), turbo molecular pump, and data collection and processing station (PC). [See Wikipedia, turbomolecular pump, http://en.wikipedia.org/w/index.php?title=Turbomolecularpump&oldid=71160479 (as of August 24, 2006, 17:45 GMT)].

An example of an ionization process that takes place under vacuum in the front end of the mass spectrometer is a process called matrix-assisted laser desorption ionization or MALDI. In this ionization technique, a laser pulse is directed toward a MALDI
target that contains a mixture of the neutral analytes and a strongly UV-absorbing molecule, often a low-molecular-weight organic acid such as dihydroxybenzoic acid (DHB). The analytes are lifted off of the MALDI target plate directly into the GP in an ionized state. This is due to transference of the laser energy to the matrix and then to the analyte. The MALDI technique takes place within a compartment that is at the beginning of the mass spectrometer instrument and is under vacuum. The compartment in which this takes place is often called the ionization source, thus combining the inlet system and the ionization source into one compartment.

As illustrated in Figure 1.1, the analyte molecules (small circles), in an ionized state, pass from atmospheric conditions to the first stage of vacuum in the mass spectrometer through an inlet orifice that separates the mass spectrometer, which is under vacuum, from ambient conditions. The analytes are guided through a series of ion lenses into the mass analyzer. The mass analyzer is the heart of the system: a separation device that separates positively or negatively charged ionic species in the GP according to their respective mass-to-charge ratios. The separation of the GP ionic species can be performed by an external field such as an electric field or a magnetic field or by a field-free region such as within a drift tube. For the detection of the GP-separated ionic species, electron multipliers are often used as the detector. Electron multipliers are mass impact detectors that convert the impact of the GP-separated ionic species into a cascade of electrons, thereby multiplying the signal of the impacted ion many fold. The vacuum system ties into the inlet, the source, the mass analyzer, and the detector of the mass spectrometer at different stages of increasing vacuum as movement goes from the inlet to the detector (left to right in Fig. 1.1).
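As a rough illustration of how a field-free drift tube can separate ions by mass-to-charge ratio, the following Python sketch estimates flight times under the assumption that every ion is first accelerated through the same potential and then drifts at constant velocity. The 20 kV accelerating voltage, the 1 m drift length, and the function name flight_time are illustrative assumptions and are not taken from the text.

```python
import math

E_CHARGE = 1.602176634e-19   # elementary charge, C
DALTON = 1.66053906660e-27   # unified atomic mass unit, kg


def flight_time(mz, accel_voltage=20_000.0, drift_length=1.0):
    """Drift time (s) of an ion with the given m/z (Da per elementary charge).

    Assumes the ion gains kinetic energy z*e*V before entering a
    field-free tube of length drift_length (m); both values are hypothetical.
    """
    # z*e*V = (1/2)*m*v^2  ->  v = sqrt(2*e*V / (m/z))
    velocity = math.sqrt(2.0 * E_CHARGE * accel_voltage / (mz * DALTON))
    return drift_length / velocity


for mz in (500.0, 1000.0, 2000.0):
    print(f"m/z {mz:7.1f}: {flight_time(mz) * 1e6:.2f} microseconds")
```

Under these assumptions the drift time scales with the square root of m/z, so ions of lower m/z reach the detector first.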
It is very important for the mass analyzer and detector to be under high vacuum as this removes ambient gas, thereby reducing the number of unwanted collisions between the mass-separated ionic species and gas molecules present. As illustrated in Figure 1.1, ambient, atmospheric conditions are generally at a pressure of 760 torr. The first-stage vacuum is typically at or near 10^−3 torr immediately following the inlet orifice and around the first ion transfer lenses. This stage of vacuum is obtained using two-stage rotary vane mechanical pumps that are able to handle high pressures such as atmospheric and large variations in pressure but are not able to obtain the lower pressures that are required further into the mass spectrometer instrument. The two-stage rotary vane mechanical pump has an internal configuration that utilizes a rotating cylinder that is off-axis within the pump’s hollow body. The off-axis-positioned rotor contains two vanes that are opposed and
directed radially and are spring controlled to make contact with the pump body. As the cylinder rotates, the volume between the pump’s body and the vanes changes; the volume increases behind each vane that passes a specially placed gas inlet port. This will cause the gas to expand behind the passing vane, while the trapped volume between the exhaust port and the forward portion of the vane will decrease. The exhaust gas is forced into a second stage and is then released by passing through the oil that is contained within the pump’s rear oil reservoir. This configuration is conducive to starting up at atmospheric pressure and working toward pressures usually in the range of 10^−3 to 10^−4 torr.

The lower-pressure stages of vacuum are obtained most often using turbo molecular pumps as illustrated in Figure 1.1. Turbo molecular pumps are not as rugged as the mechanical pumps described previously and need to be started in a reduced pressure environment. Typically, a mechanical pump will perform the initial evacuation of an area. When a certain level of vacuum is obtained, the turbo molecular pumps will then turn on and bring the pressure to higher vacuum. Using a mechanical vane pump to provide a suitable forepump pressure for the turbo molecular pump is known as roughing, or “roughing out,” the chamber. Therefore, two-stage rotary vane mechanical pumps are often referred to as rough pumps. As illustrated in Figure 1.1, the turbo molecular pump contains a series of rotor/stator pairs that are mounted in multiple stages. The principle of turbo molecular pumps is to transfer energy from the fast rotating rotor (turbo molecular pumps operate at very high speeds) to the molecules that make up the gas. After colliding with the blades of the rotor, the gas molecules gain momentum and move to the next lower stage of the pump and repeat the process with the next rotor. Eventually, the gas molecules enter the bottom of the pump and exit through an exhaust port.
As gas molecules are removed from the head or beginning of the pump, the pressure before the pump is continually reduced, thus achieving higher and higher levels of vacuum. Turbo molecular pumps can obtain much higher levels of vacuum (pressures as low as 10^−9 torr) as compared with the rotary vane mechanical pumps (pressures as low as 10^−4 torr).

The final component of the mass spectrometer is a data processing system. This is typically a personal computer (PC) allowing the mass spectrometric instrumentation to be software controlled, enabling precise measurements of carefully designed experiments and the collection of large amounts of data. Commercially bought mass spectrometers come with their own software that is used to set the operating parameters of the mass spectrometer and to collect and interpret the data, which are in the form of mass spectra.
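The link between the vacuum stages described above and the reduction in unwanted ion–gas collisions can be sketched with the hard-sphere mean free path from kinetic theory, λ = kT/(√2·π·d²·P). The Python sketch below is only an estimate: the 298 K temperature, the collision diameter of about 3.7 × 10^−10 m (a value often used for N2), and the function name mean_free_path are assumptions that do not come from the text.

```python
import math

K_B = 1.380649e-23          # Boltzmann constant, J/K
PA_PER_TORR = 133.322368    # pascals per torr


def mean_free_path(pressure_torr, temperature_k=298.0, diameter_m=3.7e-10):
    """Hard-sphere mean free path (m) of a gas molecule at the given pressure.

    temperature_k and diameter_m are assumed, illustrative values.
    """
    pressure_pa = pressure_torr * PA_PER_TORR
    return K_B * temperature_k / (
        math.sqrt(2.0) * math.pi * diameter_m**2 * pressure_pa
    )


# Pressures quoted in the text for the successive regions of the instrument.
for p in (760.0, 1e-3, 1e-5, 1e-9):
    print(f"{p:8.1e} torr -> mean free path {mean_free_path(p):.3e} m")
```

With these assumptions the mean free path grows from tens of nanometers at atmospheric pressure to distances longer than a typical analyzer at the second-stage pressures, which is why the mass analyzer and detector are kept under high vacuum.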
1.3.3 ESI

ESI is a process that enables the transfer of compounds in solution phase to the GP in an ionized state, thus allowing their measurement by MS. The use of ESI coupled with MS was pioneered by Whitehouse et al.3 and Fenn4 in 1985 and 1993 by extending the work of Dole et al.5 in 1968, who demonstrated the production of GP ions by spraying macromolecules through an electrically charged steel capillary and subsequently monitoring the ions with an ion-drift spectrometer. The process by which ESI works has received much theorization, study, and debate6–12 in the scientific community, especially the formation of the ions from the Taylor13 cone droplets and offspring droplets. Figure 1.2
shows the general setup for ESI when measuring biomolecules by electrospray MS.

Figure 1.2. General setup for ESI when measuring biomolecules by electrospray mass spectrometry. Labels in the figure: HPLC pump, HPLC column, electrospray capillary, N2 electrospray sleeve (3–5 kV), N2 curtain gas, electrospray Taylor cone at 760 torr, electrospray counter electrode (600 V), mass spectrometer at 10^−5 torr, electron multiplier detector, and data collection and processing station (PC).

Figure 1.3. Electrospray ionization process illustrated in positive ion mode. Labels in the figure: ESI solution, spray needle (2–5 kV), Taylor cone, ESI droplets, excess charge on surface, metal plate (~100 V), mass spectrometer, oxidation, reduction, solvent and neutralized ions, spray current (i), and 2–5 kV power supply. (Reprinted with permission of John Wiley & Sons, Inc. Cech, N.B., and Enke, C.G. Practical implications of some recent studies in electrospray ionization fundamentals. Mass Spectrometry Reviews, 2001, 20, 362–387. Copyright 2001.)

The electrospray process is achieved by placing a potential difference between the capillary and a flat counter electrode. This is illustrated in Figure 1.3 where the “spray needle” is the capillary and the “metal plate” is the flat counter electrode. The generated electric field will penetrate into the liquid meniscus and create an excess abundance of charge at the surface. The meniscus becomes unstable and protrudes out, forming a Taylor cone. At the end of the Taylor cone, a jet will form that emits droplets carrying an excess of charge (each droplet is estimated to carry about 51,250 charges at a radius of 1.5 µm). Pictures of jets of offspring droplets are illustrated in Figure 1.4. As the droplets move toward the counter electrode, a few processes take place. The drop shrinks due to evaporation, thus increasing the surface charge density until coulombic repulsion is great enough that offspring droplets are produced. The point at which this occurs is known as the Rayleigh limit, and the resulting droplet fission is called a coulombic explosion. The produced offspring droplets have 2% of the parent droplets’ mass and 15% of the parent droplets’ charge. This process will continue until the drop contains one molecule of analyte and charges that are associated with basic sites (positive ion mode). This is referred to as the
“charged residue model” that is most important for large molecules such as proteins. This process is illustrated in Figure 1.5.

Figure 1.4. Pictures, in panels (a) and (b), illustrating the jet production of offspring droplets. (Reprinted with permission from Alessandro Gomez, Physics of Fluids, 6, 404 (1994). Copyright 1994, American Institute of Physics.)

As the droplets move toward the counter electrode, a second process also takes place known as the “ion evaporation model.” In this process, the offspring droplet will allow evaporation of an analyte molecule from its surface along with charge when the charge repulsion of the analyte with the solution is great enough to allow it to leave the surface of the drop. This usually takes place for droplets with a radius that is less than 10 nm. This type of ion formation is most important for small molecules.
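The droplet fission described in this section can be explored numerically with the classical Rayleigh relation, q = 8π·√(ε0·γ·r³), which gives the charge at which coulombic repulsion balances surface tension. The Python sketch below is illustrative only: the surface tension of pure water (about 0.072 N/m) is an assumed value, since real ESI solvents are usually mixtures, and the function names are hypothetical. The 2% mass and 15% charge fractions are the ones quoted in the text.

```python
import math

EPS0 = 8.8541878128e-12     # vacuum permittivity, F/m
E_CHARGE = 1.602176634e-19  # elementary charge, C


def rayleigh_limit_charges(radius_m, surface_tension=0.072):
    """Elementary charges a droplet of this radius can hold at the Rayleigh limit.

    surface_tension (N/m) defaults to an assumed value for pure water.
    """
    q = 8.0 * math.pi * math.sqrt(EPS0 * surface_tension * radius_m**3)
    return q / E_CHARGE


def offspring_enrichment(mass_fraction=0.02, charge_fraction=0.15):
    """Charge-to-mass ratio of the offspring relative to the parent before fission."""
    return charge_fraction / mass_fraction


print(f"Rayleigh limit at r = 1.5 um: {rayleigh_limit_charges(1.5e-6):.3g} charges")
print(f"Offspring charge-to-mass enrichment: {offspring_enrichment():.1f}x")
```

Because the limit scales with r^(3/2) while evaporation leaves the charge unchanged, a shrinking droplet approaches the limit, and the offspring droplets leave with a much higher charge-to-mass ratio than the parent.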
Figure 1.5. Gas-phase ion formation process from electrospray droplets. Labels in the figure: Taylor cone “budding,” sample in solution, +HV, solvent evaporation, droplet fission at Rayleigh limit, and formation of desolvated ions, [M+nH]n+, by further droplet fission and/or ion evaporation. (Reprinted from Gaskell, S.J. Electrospray: principles and practice. J. Mass Spectrom. 1997, 32, 677–688. Copyright John Wiley & Sons Limited 1997. Reproduced with permission.)
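The [M+nH]n+ notation in Figure 1.5 implies a simple relation between the neutral mass M of an analyte and the m/z value at which each charge state appears. The following Python sketch, offered only as an illustration, computes that relation for positive ion mode; the 16,950 Da neutral mass and the function name mz_protonated are hypothetical and do not refer to any compound discussed in the text.

```python
PROTON_MASS = 1.007276466  # mass of a proton, Da


def mz_protonated(neutral_mass, n_charges):
    """m/z of the [M+nH]n+ ion for a neutral mass M (Da) and n added protons."""
    if n_charges < 1:
        raise ValueError("n_charges must be a positive integer")
    return (neutral_mass + n_charges * PROTON_MASS) / n_charges


# Hypothetical protein-sized analyte observed in several charge states.
for n in (10, 15, 20):
    print(f"n = {n:2d}: m/z = {mz_protonated(16950.0, n):.3f}")
```

The sketch shows why multiply charged ions of large molecules appear at comparatively low m/z values.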
In the ensuing years since its introduction, electrospray MS has been used for structural elucidation and fragment information14–16 and for noncovalent complex studies,17,18 to name just a few examples of its wide range of applications. Electrospray3,4,7,8 is an ionization method that is now well known to produce intact GP ions with very minimal, if any, fragmentation being produced during the ionization process. In the transfer process of the ions from the condensed phase to the GP, several types of “cooling” processes of the ions are taking place in the source: (1) cooling during the desolvation process through vibrational energy transfer from the ion to the departing solvent molecules, (2) adiabatic expansion of the electrospray as it enters the first vacuum stage, (3) evaporative cooling, and (4) cooling due to low-energy dampening collisions with ambient gas molecules. The combination of these effects, and the fact that
electrospray can effectively transfer a solution phase complex to the GP with minimal interruption of the complex, makes the study of noncovalent complexes from solution by ESI-MS attractive.

1.3.4 Nano-ESI

A major advance in biomolecule analysis using MS has been the ability to introduce liquid flows into the source of the mass spectrometer. This has enabled the coupling of HPLC to MS, where HPLC is used for a wide variety of biomolecule analyses. Normal ESI, introduced in the preceding section, typically has flow rates on the order of microliters per minute (∼1–500 µL/min). Traditional analytical HPLC systems designed with UV/Vis detectors generally employ flow rates in the range of milliliters per minute (∼0.1–1 mL/min). A recent advancement in the ESI technique has been the development of nano-electrospray, where the flows employed are typically in the range of nanoliters per minute (∼1–500 nL/min). Following the progression of the development of electrospray from Dole’s original reporting in 1968 through Fenn’s work reported in 1984 and 1988, a more efficient electrospray process was reported by Wilm et al.19 employing flows in the range of 25 nL/min. This early low flow rate electrospray was initially termed microelectrospray by Wilm et al. but was later renamed nano-electrospray.20 At the same time that Wilm et al.19 reported the microelectrospray, Caprioli et al.21 also reported a miniaturized ion source that they had named microelectrospray. The name “nano-electrospray” for Wilm’s source is actually more descriptive due to the flow rates used in the nanoliter per minute range and the droplet sizes that are produced in the nanometer range. Conventional electrospray sources before the introduction of nano-electrospray produced droplets on the order of 1–2 µm. The nano-electrospray source produces droplets in the size range of 100–200 nm, which is 100–1000 times smaller in volume.
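The droplet sizes quoted above can be turned into volumes, and into an estimate of how many analyte molecules a single droplet carries, with a few lines of Python. This is a sketch under stated assumptions: the quoted sizes are treated as droplet diameters, the droplets are taken as perfect spheres, and the concentration of 1 pmol/µL (equal to 1 × 10^−6 mol/L) is the standard-solution concentration discussed in this section. The function names are hypothetical.

```python
import math

AVOGADRO = 6.02214076e23  # molecules per mole


def droplet_volume_litres(diameter_m):
    """Volume (L) of a spherical droplet with the given diameter (m)."""
    radius = diameter_m / 2.0
    return (4.0 / 3.0) * math.pi * radius**3 * 1000.0  # m^3 -> L


def molecules_per_droplet(diameter_m, conc_mol_per_l):
    """Average number of analyte molecules in one droplet."""
    return conc_mol_per_l * AVOGADRO * droplet_volume_litres(diameter_m)


conventional = 1.5e-6   # m, midpoint of the 1-2 um range (assumed to be a diameter)
nano = 150e-9           # m, midpoint of the 100-200 nm range (assumed to be a diameter)

ratio = droplet_volume_litres(conventional) / droplet_volume_litres(nano)
print(f"Volume ratio, conventional to nano droplet: {ratio:.0f}")
print(f"Molecules per nano droplet at 1 pmol/uL: "
      f"{molecules_per_droplet(nano, 1e-6):.2f}")
```

Because volume scales with the cube of the diameter, a tenfold reduction in droplet size corresponds to a thousandfold reduction in volume.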
When spraying standard solutions at concentrations of 1 pmol/µL, it is estimated that droplets of the nanometer size contain only one analyte molecule per droplet. The original nano-electrospray sources were composed of pulled fused-silica capillary tips 3–5 cm long with orifices of 1–2 µm in diameter. The tips also have thin gold plating that allows current flow. The tips are loaded with 1–5 µL of sample directly using a pipette22 and coupled to the electrospray source, completing the closed circuit required to apply the voltage that generates the electrospray Taylor cone. This is illustrated in Figure 1.6 where in the top portion of the figure a sample is being loaded into the
nanospray tip using a pipette. The tip is then placed into the closed-circuit system for the electrospray to take place.

Figure 1.6. Top of figure illustrates the loading of a nano-electrospray tip. Bottom of figure illustrates the coupling of the nano-electrospray tip to the closed-circuit system. Steps labeled in the figure: (1) load sample directly into nanospray tip, (2) place loaded nanospray tip into electrospray HV source, (3) apply HV to produce electrospray process, and (4) measure m/z values with mass spectrometer. Circuit labels: high-voltage (HV) power supply, current meter, and resistor.

The sample flow rate is very low using the nanospray tips, allowing the measurement of a very small sample size over an extended period of time. It has also been observed that nanospray requires a lower applied voltage for the production of the electrospray, which helps to reduce problems with corona electrical discharges that will interrupt the electrospray. In nano-electrospray, the flow rate is lower than in conventional electrospray, which is thought to have a direct impact on the production of the droplets within the spray and the efficiency of ion production. The lower flow rate produces charged droplets that are reduced in size as compared with conventional electrospray. This has been described in detail by Wilm et al.,19 by Fernandez de la Mora et al.,23 and by Pfeifer and Hendricks.24 There are fewer droplet fission events required with smaller
initial droplets in conjunction with less solvent evaporation taking place before ion release into the GP.25,26 A result of this is that a larger amount of the analyte molecule is transferred into the mass spectrometer for analysis. Though the efficiency of ionization is increased with nano-electrospray, the process is also influenced by the size and shape of the orifice tip.27,28 Pictures of nano-electrospray orifice tips are illustrated in Figure 1.7. Figure 1.8 shows an example of the production and observance of an ESI Taylor cone.

Figure 1.7. Illustration of different nano-electrospray tip orifice diameters. Scanning electron microscopy images of employed nanospray emitters: (a) 1-, (b) 2-, and (c) 5-µm i.d. tip. Images were obtained after 2 hours of use. (Reprinted with permission from Li, Y.; Cole, R.B. Shifts in Peptide and Protein Charge State Distributions with Varying Spray Tip Orifice Diameter in Nano-Electrospray Fourier Transform Ion Cyclotron Resonance Mass Spectrometry. Anal. Chem. 2003, 75, 5739–5746. Copyright 2003 American Chemical Society.)

Figure 1.8. Examples of nanospray tip sizes, panels (a) to (d), and the influence upon the ESI Taylor cone. The cone is not observed in (b) at a diameter of >1 µm. The cone is observed in (d) for a diameter of 12.5 µm. (Reprinted with permission. This article was published in J Am Soc Mass Spectrom, Schmidt, A., Karas, M. Effect of different solution flow rates on analyte ion signals in nano-ESI MS, or: when does ESI turn into nano-ESI?, 2003, 14, 492–500. Copyright Elsevier 2003.)
Figure 1.9. Photograph of nine stable electrosprays generated from the nine-spray emitter array. (Reprinted with permission from Tang, K.; Lin, Y.; Matson, D.W.; Kim, T.; Smith, R.D. Anal. Chem. 2001, 73, 1658–1663. Copyright 2001 American Chemical Society.)
While Figure 1.8d does show a Taylor cone formed, Figure 1.9 gives a good picture of an array of Taylor cones formed from a microelectrospray emitter. In the figure, multiple cones can be seen along with their associated spray produced from the electrospray process.

As mentioned previously, nano-HPLC is increasingly being coupled to nano-electrospray for biomolecule analysis. A nano-HPLC-ESI system is illustrated in Figure 1.10. The flow involved in nano-HPLC-ESI often ranges between 10 and 100 nL/min. The fused-silica capillary columns that are used in nano-HPLC have very small diameters, often around 50 µm. These small-diameter columns can often create high back pressures in the HPLC system. One way to achieve the very low flow rate through the fused-silica nano-HPLC column is to use a flow splitter that is located in-stream between the column and the HPLC pump as illustrated in Figure 1.10. The tubing from the splitter to waste is called a restrictor and is used to regulate the flow through the nanocolumn. A smaller-diameter restrictor will increase the back pressure, forcing more mobile phase through the nanocolumn. If a larger-diameter restrictor is used, the back pressure will be lower, resulting in less flow being directed through the column. The nanocolumns have a nano-ESI tip coupled to them (diameters can range from 1 µm up to 100 µm) to produce the electrospray. Another difference observed here as compared with the atmospheric pressure source is the absence of a nebulizing gas or a drying gas. These are not needed or used in nano-ESI.
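The effect of the restrictor diameter on the flow through the nanocolumn can be sketched by treating the column and the restrictor as two parallel flow paths leaving the splitter, each with a hydraulic resistance. In the Python sketch below the restrictor is modeled as an open tube obeying the Hagen–Poiseuille relation, while the column resistance, the restrictor length, the mobile phase viscosity, the pump flow, and the function names are all hypothetical values chosen only for illustration.

```python
import math


def tube_resistance(length_m, inner_diameter_m, viscosity=1.0e-3):
    """Hagen-Poiseuille resistance (Pa*s/m^3) of an open tube.

    viscosity (Pa*s) defaults to an assumed water-like mobile phase.
    """
    radius = inner_diameter_m / 2.0
    return 8.0 * viscosity * length_m / (math.pi * radius**4)


def column_flow(total_flow, column_resistance, restrictor_resistance):
    """Flow through the column when column and restrictor share one pressure drop."""
    return total_flow * restrictor_resistance / (
        column_resistance + restrictor_resistance
    )


PUMP_FLOW_NL_MIN = 5000.0   # hypothetical pump flow, 5 uL/min
COLUMN_RESISTANCE = 5.0e18  # hypothetical packed nanocolumn, Pa*s/m^3

for id_um in (20.0, 25.0):
    restrictor = tube_resistance(0.5, id_um * 1e-6)  # 0.5 m restrictor, assumed
    flow = column_flow(PUMP_FLOW_NL_MIN, COLUMN_RESISTANCE, restrictor)
    print(f"restrictor i.d. {id_um:.0f} um -> column flow {flow:.0f} nL/min")
```

Because the resistance of an open tube rises with the inverse fourth power of its radius, a slightly narrower restrictor diverts noticeably more mobile phase through the column, in line with the behavior described in the text.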
Figure 1.10. Design of a nano-HPLC nano-ESI system for mass spectrometric analysis of biomolecules. Labels in the figure: flow splitter, restrictor split to waste, and fused-silica (packed) capillary column.
1.4 OVERVIEW OF NUCLEIC ACIDS

Nucleic acids are an important consideration in PTM study; they are present in cellular protein extracts and must be separated. In traditional studies, the Trizol precipitation method was used to isolate a nucleic acid fraction and a protein fraction. This afforded the opportunity to study both from a single extraction. Due to their close relationship and importance, we will look at a brief overview of nucleic acids and their measurement by MS, starting with a background look at the makeup of nucleic acids before looking at the MS. In contrast to polysaccharides, and similar to proteins, nucleic acids are specifically directional in their makeup and contain nonidentical monomers arranged in a distinct sequence, which makes them informational macromolecules. The nucleic acids reside in the nucleus of the cell and are responsible for the storage, expression, and transmission of the genetic information of living species. The two types of nucleic acids are
DNA and RNA. Two distinct parts of their chemical structure and makeup differentiate the two. First, DNA contains the five-carbon sugar deoxyribose, while RNA contains ribose; and second, DNA contains the base thymine (T), while RNA contains uracil (U). The molecules that make up the DNA and RNA structures are illustrated in Figure 1.11. These are the purine bases adenine (A) and guanine (G) and the pyrimidine bases cytosine (C), uracil (U), and thymine (T). Also illustrated in Figure 1.11 are the two sugars D-deoxyribose and D-ribose and, finally, the phosphate group that acts as the backbone of the nucleic acids, linking the nucleotides together. Nucleotides are the monomeric units that make up the nucleic acids. Only four nucleotides make up each of DNA and RNA, a much smaller number than the 20 amino acids found in proteins. Examples of nucleotides found in DNA and RNA are illustrated in Figure 1.12. Figure 1.12a is a DNA nucleotide where the number 2′ carbon in the sugar ring contains a hydrogen atom for D-deoxyribose. One of the bases will be attached to the 1′ carbon of the sugar through an aromatic nitrogen, and the phosphate will be attached to the number 5′ sugar carbon with a phosphoester bond. The RNA nucleotide illustrated in Figure 1.12b has the same types of bonding as illustrated for the DNA nucleotide but to a D-ribose sugar. When the phosphate group is removed from the nucleotide, the remaining base–sugar structure is called a nucleoside. The nucleotides are linked to each other through the phosphate group, forming a linear polymer. The nucleotides undergo a condensation reaction that links the phosphate group on the 5′ carbon to the 3′ carbon of the next nucleotide, forming what is known as a 3′,5′ phosphodiester bond. The resulting polynucleotide therefore has a 5′ hydroxyl group at the start (by convention) and a 3′ hydroxyl group at the end (by convention).
Representative linear nucleotide structures are illustrated for RNA and DNA in Figure 1.13. A naming scheme similar to that used for the fragmentation ions generated by collision-induced dissociation (CID) of peptides was proposed by Glish et al.29 for nucleic acids and is illustrated in Figure 1.14. There are four cleavage sites producing fragmentation along the phosphate backbone from CID. When the product ion contains the 3′-OH portion of the nucleic acid, the naming includes the letters w, x, y, and z, where the numeral subscript is the number of bases from the associated terminal group. When the product ion contains the 5′-OH portion of the nucleic acid, the naming includes the letters a, b, c, and d. Losses are also more complicated than those shown in Figure 1.14 due to the neutral loss of base moieties. Figure 1.15 illustrates an actual structural
cleavage at the w2/a2 site of a 4-mer nucleic acid’s phosphate backbone according to the naming scheme of Figure 1.14. Numerous mechanisms have also been reported for the fragmentation pathways leading to charged base loss and also neutral base loss. These are losses that are observed in product ion spectra in addition to the cleavage along the phosphate backbone that is illustrated in Figures 1.14 and 1.15. Neutral and charged base losses add to the complexity of the product ion spectra but also add information concerning the makeup of the oligonucleotide. Figure 1.16 illustrates a couple of examples of proposed fragmentation pathway mechanisms for neutral and charged base losses. In Figure 1.16a, a simple nucleophilic attack on the C-1′ carbon atom by the phosphodiester group results in the elimination of a charged base.30 Figure 1.16b illustrates a two-step reaction where in the first step there is neutral base loss followed by breakage of the 3′-phosphoester bond.31 A number of other fragmentation pathways have been proposed for the production of the product ions observed in tandem mass spectra of the nucleic acids.

Figure 1.11. Structures of the molecules that make up the nucleic acids DNA and RNA. The purine bases adenine (A) and guanine (G); the pyrimidine bases cytosine (C), uracil (U) (in RNA), and thymine (T) (in DNA); the sugars D-deoxyribose (in DNA) and D-ribose (in RNA); and the phosphate group.

Figure 1.12. (a) DNA nucleotide and (b) RNA nucleotide. The sugar carbons are numbered 1′ to 5′, and the phosphodiester bond is labeled.
Figure 1.13. Linear nucleic acid structures for (a) DNA and (b) RNA.
Figure 1.14. Naming scheme for nucleic acid product ions, shown for a four-base oligonucleotide (bases B1 to B4) running from the 5′-HO end to the 3′-OH end, with cleavage sites labeled a1–d1, a2–d2, and a3–d3 and w1–z1, w2–z2, and w3–z3. When the product ion contains the 3′-OH portion of the nucleic acid, the naming includes the letters w, x, y, and z. When the product ion contains the 5′-OH portion of the nucleic acid, the naming includes the letters a, b, c, and d. In both 3′-OH and 5′-OH containing product ions, the numeral subscript is the number of bases from the associated terminal group.
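The bookkeeping behind the naming scheme of Figure 1.14 can be written out in a few lines of Python. The sketch below lists, for a cleavage between two neighboring nucleotides, the labels of the 5′-containing ions and of the complementary 3′-containing ions. The pairing of a with w, b with x, c with y, and d with z is an assumption based on the order in which the labels appear in the figure, and the function name backbone_fragments is hypothetical.

```python
FIVE_PRIME_LETTERS = ("a", "b", "c", "d")   # ions containing the 5'-OH end
THREE_PRIME_LETTERS = ("w", "x", "y", "z")  # ions containing the 3'-OH end


def backbone_fragments(n_bases, site):
    """Complementary ion labels for cleavage after base number `site`.

    `site` is counted from the 5' end; the 3' ion subscript is the number
    of bases counted from the 3' end, as in the naming scheme of the text.
    """
    if not 1 <= site < n_bases:
        raise ValueError("site must lie between 1 and n_bases - 1")
    remaining = n_bases - site
    return [
        (f"{five}{site}", f"{three}{remaining}")
        for five, three in zip(FIVE_PRIME_LETTERS, THREE_PRIME_LETTERS)
    ]


# Cleavage in the middle of a 4-mer, as in the w2/a2 example of Figure 1.15.
for five_ion, three_ion in backbone_fragments(4, 2):
    print(five_ion, three_ion)
```

For a 4-mer cleaved after the second base, the first pair returned is a2 and w2, which matches the cleavage site used as the example in Figure 1.15.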
1.5 PROTEINS AND PROTEOMICS

1.5.1 Introduction to Proteomics

Proteomics, the study of a biological system’s complement of proteins (e.g., from cell, tissue, or a whole organism) at any given state in time, has become a major area of focus for research and study in many different fields and applications. In proteomic studies, MS can be employed to analyze either the intact, whole protein or the resultant peptides obtained from enzyme-digested proteins. The mass spectrometric analysis of whole, intact proteins is often called top-down proteomics, where the measurement study starts with the analysis of the intact protein in the GP and subsequently investigates its identification and any possible modifications through CID measurements. The mass spectrometric analysis of enzyme-digested proteins that have been converted to peptides is known as bottom-up proteomics. Finally, MS is also used to study PTMs that have taken place with the proteins such as glycosylation, sulfation, and phosphorylation. We shall begin with a look at bottom-up proteomics, the most common approach, followed by top-down proteomics, which is seeing more applications and study lately, and finally, the PTMs of glycosylation, sulfation, and phosphorylation. Bioinformatics has become an important tool used in the interpretation of results obtained from MS studies. In the last part of this chapter, we will briefly look at what bioinformatics is and what it can be used for in relation to MS and proteomic studies. Due to the enormous impact
of proteomics on research into biological processes, organisms, diseased states, tissues, and so on, we will begin this section with a brief overview of proteins including their structure and makeup.

Figure 1.15. Example of structural cleavage at the w2/a2 site of a 4-mer nucleic acid’s phosphate backbone according to the naming scheme of Figure 1.14.

Figure 1.16. Proposed fragmentation pathways associated with the base substituent groups. (a) Nucleophilic attack on the C-1′ carbon atom by the phosphodiester group results in the elimination of a charged base. (b) Two-step reaction mechanism where in the first step there is neutral base loss followed by breakage of the 3′-phosphoester bond.

1.5.2 Protein Structure and Chemistry

Of all biological molecules, proteins are among the most important, second only to the nucleic acids. All living cells contain proteins, and their name is derived from the Greek word proteios, which has the meaning of “first.”32 There are two broad classifications for proteins related to their structure and functionality: water-insoluble fibrous proteins and water-soluble globular proteins. The three-dimensional configuration of a protein is described by its primary, secondary,
tertiary, and quaternary structures. Figure 1.17 is a three-dimensional ribbon representation of the protein RNase.

Figure 1.17. Ribbon structure representation of the RNase protein illustrating substructures of alpha helices and beta sheets.

The primary structures of proteins are made up of a sequence of amino acids forming a polypeptide chain. Typically, if the chain is less than 10,000 Da, the compound is called a polypeptide; if greater than 10,000 Da, the compound is called a protein. There are 20 amino acids that make up the protein chains through carbon to nitrogen peptide bonds. Figure 1.18 illustrates the 20 amino acid structures that make up the polypeptide backbone chain of proteins. Amino acids possess an amino group (NH2) and a carboxyl group (COOH) that are bonded to the same carbon atom that is alpha to both groups; therefore, amino acids are called alpha amino acids (α-amino acids). At physiological pH (∼7.36), the amino acids can be subdivided into four classes according to their structure, polarity, and charge state: (1) negatively charged, composed of aspartic acid (Asp) and glutamic acid (Glu); (2) positively charged, composed of lysine (Lys), arginine (Arg), and histidine (His); (3) polar, composed of serine (Ser), threonine (Thr), tyrosine (Tyr), cysteine (Cys), glutamine (Gln), and asparagine (Asn); and (4) nonpolar, composed of glycine (Gly), leucine (Leu), isoleucine (Ile), alanine (Ala), valine (Val), proline (Pro), methionine (Met), tryptophan (Trp), and phenylalanine (Phe). The carbon to nitrogen peptide bonds are formed through condensation reactions between the carboxyl and amino groups. An example condensation reaction between the amino acids Leu and Tyr is illustrated in Figure 1.19. The peptide C–N bonds are found to be
Figure 1.18. Structures of the 20 amino acids that make up the polypeptide backbone of proteins. Divisions include negatively charged (Asp, Glu), positively charged (Lys, Arg, His), polar (Ser, Thr, Cys, Tyr, Gln, Asn), and nonpolar (Gly, Ala, Val, Leu, Ile, Pro, Met, Trp, Phe).
Figure 1.19. Condensation reaction between the amino acids leucine and tyrosine forming a peptide bond (leucine + tyrosine yields the dipeptide + H2O).
shorter than most amine C–N bonds because of partial double-bond character, which contributes about 40% to the peptide bond.33 This double-bond character lessens the free rotation of the bond, thus affecting the overall structure of the protein.34 The secondary structure of the protein is described by two different configurations and by turns. The two configurations are α-helices (first proposed by Linus Pauling and Robert B. Corey in 1951) and β-sheets (parallel and antiparallel), which are illustrated in Figure 1.20. The α-helix is described as a right-handed spiral that has hydrogen bonding between the oxygen and the hydrogen of the nitrogen atoms of the chain backbone. This hydrogen bonding stabilizes the helical structure. The R-group side chains of the amino acid residues extend outward from the helix. The β-sheet is a flat structure that also has hydrogen bonding between the oxygen and the hydrogen of the nitrogen atoms, but between different β-strands (parallel or antiparallel) that run alongside each other. These hydrogen bonds also work to stabilize the structure. The R-group side chains extend alternately from the two faces of the sheet. The third secondary structure, the turn, changes the direction of the polypeptide strand. The tertiary structure, which includes the disulfide bonds, is composed of the ordering of the secondary structure, which is stabilized through side chain interactions. The quaternary structure is the arrangement of the polypeptide chains into the final working protein. All four structures describe what is actually a folded protein, where the apolar regions of the protein are tucked away inside the structure, away from the aqueous medium they are found in naturally, and the more polar regions are on the surface.
Figure 1.20. (a) α-Helices. β-Sheets: (b) parallel and (c) antiparallel.
1.5.3 Bottom-Up Proteomics: MS of Peptides

1.5.3.1 History and Strategy. The proteomic approach that measures the enzymatic products of protein digestion (after protein extraction from the biological sample), namely the peptides, using MS is known as bottom-up proteomics. In the bottom-up approach using nano-ESI-HPLC/MS, the peptides are chromatographically separated and subjected to collision-induced dissociation in the gas phase (GP). The product ion spectra thus obtained for the separated peptides are then used to identify the proteins present in the biological system being studied. Prior to the use of nano-ESI-HPLC/MS for peptide measurement, Edman degradation was used to sequence unknown proteins. The method of Edman sequencing involves the removal of each amino acid residue one by one from the polypeptide chain starting from the N-terminus of the peptide or protein.35 The method worked well for highly purified protein samples that contained a free amino N-terminus, but the analysis was slow, usually taking a day to analyze the sequence of one protein. MS was first coupled with Edman sequencing in 1980 by Shimonishi et al.,36 where the products of the Edman degradation were measured using field desorption (FD) MS. FD, introduced in 1969 by Beckey, is an ionization technique not commonly in use today. FD consists of depositing the sample, either solid or dissolved in solvent, onto a needle and applying a high voltage. Desorption and ionization occur simultaneously. The analyte ions produced by FD are then introduced into the mass spectrometer for mass analysis. Fast atom bombardment was also used as an ionization technique to measure peptides obtained from the Edman sequencing approach.37 Another early approach to proteomics using MS was the application of MALDI time-of-flight (TOF) MS (MALDI-TOF/MS) to the measurement of peptides obtained from in-gel digestions of proteins separated by gel electrophoresis.
This technique was reported by several groups and was called peptide mass fingerprinting (PMF).38–40 In the PMF approach, proteins are first separated using two-dimensional gel electrophoresis (2-DE), a protein separation technique first introduced in the 1970s.41 The gel used in electrophoresis is a rectangular gel composed of polyacrylamide. The protein sample is loaded onto the gel and the proteins are separated according to their isoelectric point (the pH at which the protein carries zero net charge). This is the first dimension of the separation. The second dimension is a linear separation of the proteins according to their molecular weights. In preparation for sodium dodecylsulfate polyacrylamide gel electrophoresis (SDS-PAGE), the proteins
are first denatured (usually with 8 M urea and boiling) and disulfide bonds are cleaved, effectively unraveling the tertiary and secondary structure of the protein. SDS, which is negatively charged, is then used to coat the protein in a fashion that is proportional to the protein’s molecular weight. The proteins are then separated within a polyacrylamide gel by placing a potential difference across the gel. Under this potential difference, the proteins migrate electrophoretically through the gel and separate according to their molecular weight, with the lower-molecular-weight proteins having a greater mobility through the gel and the higher-molecular-weight proteins having a lower mobility. The resultant 2-DE separation is a collection of spots on the gel that can be up to a few thousand in number. In 2-D SDS-PAGE, the proteins have been essentially separated into single protein spots. This allows the protein within the spot (excised from the gel) to be digested, using a protease with known cleavage specificity, into peptides that are unique to that particular protein. The peptides extracted from the in-gel digested proteins separated by 2-D SDS-PAGE are then measured by MALDI-TOF/MS, creating a spectrum of peaks that represent the molecular weights of the protein’s enzymatically generated peptides. This list of measured peptides can be compared with a theoretical list generated according to the specificity of the enzyme used for digestion. The extensive body of references and search software introduced for the PMF approach to proteomics has been reviewed.42 The 2-D SDS-PAGE and peptide mass fingerprint approach to proteomics is illustrated in Figure 1.21. In bottom-up proteomics, the proteins are generally extracted from the sample of interest, which can include a sample of cultured cells, bacteria, tissue, or a whole organism.
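The comparison of measured peptide masses with a theoretical list, as described above, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the algorithm of any search engine named in this chapter: it assumes trypsin-like specificity (cleavage at the carboxyl side of Lys and Arg, but not before Pro), uses standard monoisotopic residue masses that are not taken from the text, and works on a made-up sequence and peak list. The function names and the 0.2 Da tolerance are hypothetical choices.

```python
# Monoisotopic residue masses in Da (standard values; an assumption, not from the text).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406, "N": 114.04293,
    "D": 115.02694, "Q": 128.05858, "K": 128.09496, "E": 129.04259, "M": 131.04049,
    "H": 137.05891, "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # added once per peptide for the free N- and C-termini
PROTON = 1.007276   # charge carrier of a singly protonated ion


def tryptic_digest(sequence):
    """Cleave after K or R unless the next residue is P (no missed cleavages)."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])  # C-terminal peptide of the protein
    return peptides


def mh_plus(peptide):
    """Theoretical m/z of the singly protonated peptide, [M+H]+."""
    return sum(RESIDUE_MASS[r] for r in peptide) + WATER + PROTON


def match_peaks(observed_mz, peptides, tol_da=0.2):
    """Pair each observed peak with every theoretical peptide within tol_da."""
    return [(mz, pep) for mz in observed_mz for pep in peptides
            if abs(mz - mh_plus(pep)) <= tol_da]


if __name__ == "__main__":
    protein = "GASPVKTLRPCWEK"            # hypothetical sequence
    observed = [558.32, 1032.53, 750.00]  # hypothetical peak list
    for mz, pep in match_peaks(observed, tryptic_digest(protein)):
        print(f"{mz:.2f} matches {pep} (theoretical {mh_plus(pep):.4f})")
```

A real PMF search would also allow for missed cleavages and modifications and would score the match statistically; the sketch only shows the mass bookkeeping.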
A general scheme for the extraction and peptide mass fingerprint mass spectrometric analysis typically followed in early proteomic studies is illustrated in Figure 1.21. The initial sample is lysed and the proteins are extracted and solubilized. The proteins can then be separated using one-dimensional (1-D) or 2-D SDS-PAGE. Proteins can be digested in the gels, or the proteins in solution are digested using a protease such as trypsin. Trypsin is an endopeptidase that cleaves within the polypeptide chain of the protein at the carboxyl side of the basic amino acids Arg and Lys (the trypsin enzyme has optimal activity at a pH range of 7–10 and requires the presence of Ca2+). It has been observed, though, that trypsin does not efficiently cleave between the residues Lys–Pro and Arg–Pro. Tryptic peptides are predominantly observed as doubly or triply charged when using electrospray as the ionization source. This is due to the amino
terminus of each peptide and the basic residue (Arg or Lys) at its carboxyl terminus both being readily protonated; the exception is the C-terminal peptide of the protein, which need not end in a basic residue. There exist a number of proteases that are available to the mass spectrometrist when designing a digestion of proteins into peptides. These can be used to target cleavage at specific amino acid residues within the polypeptide chain. Examples of available proteases and their cleavage specificity are listed in Table 1.1. The enzymes will cleave the proteins into smaller chains of amino acids (typically from five amino acid residues up to 100 or so). These short chains of amino acids are mostly water soluble and can be directly analyzed by MS. Often, however, a lysate or extract from a biological system will constitute a very complex mixture of proteins that requires some form of separation to decrease the complexity prior to mass spectral measurement.

Figure 1.21. General strategy and sample flow involved in proteomics: eukaryote cell, prokaryote, or whole organism; extraction of proteins; 1-D and 2-D SDS-PAGE; in-gel or in-solution digestion; SCX fractionation; IMAC/TiO2; nano-LC-MS; MS/MS spectra; database searching (Swiss-Prot, NCBI); bioinformatics. IMAC, immobilized metal affinity chromatography.

TABLE 1.1. Examples of Proteases Available for Polypeptide Chain Cleavage
Protease                              Polypeptide Cleavage Specificity
Trypsin                               At carboxyl side of arginine and lysine residues
Chymotrypsin                          At carboxyl side of tryptophan, tyrosine, phenylalanine, leucine, and methionine residues
Proteinase K                          At carboxyl side of aromatic, aliphatic, and hydrophobic residues
Factor Xa                             At carboxyl side of Glu-Gly-Arg sequence
Carboxypeptidase Y                    Sequentially cleaves residues from the carboxy (C) terminus
Submaxillary Arg-C protease           At carboxy side of arginine residues
Staphylococcus aureus V-8 protease    At carboxy side of glutamate and aspartate residues
Aminopeptidase M                      Sequentially cleaves residues from the amino (N) terminus
Pepsin                                Nonspecifically cleaves at exposed residues favoring the aromatic residues
Ficin                                 Nonspecifically cleaves at exposed residues favoring the aromatic residues
Papain                                Nonspecifically cleaves at exposed residues

1.5.3.2 Protein Identification through Product Ion Spectra. More recently, nano-ESI-HPLC-MS/MS has been employed using reversed-phase (RP) C18 columns to initially separate the peptides prior to introduction into the mass spectrometer. If a highly complex complement of digested proteins is being analyzed, such as that obtained from eukaryotic cells or tissue, a greater degree of complexity reduction is employed, such as strong cation exchange (SCX) fractionation, which can separate the complex peptide mixture into 25 fractions or more. The coupling of online SCX with nano-ESI C18 RP HPLC-MS/MS has also been employed and is called 2-D HPLC or multidimensional protein identification technology (MudPIT).43 This is a gel-free approach that utilizes multiple HPLC-MS analyses of in-solution digestions of protein fractions. The separated peptides are introduced into the mass spectrometer, and product ion spectra are obtained. The
product ions within the spectra are assigned to amino acid sequences. Complete coverage of the amino acid sequence within a peptide from the product ion spectrum is known as de novo sequencing. This can unambiguously identify a protein (except for a few anomalies that will be covered shortly) according to standard spectra stored in protein databases. Two examples of protein databases are NCBInr, a protein database composed of a combination of most public databases compiled by the National Center for Biotechnology Information (NCBI), and Swiss-Prot, a database that includes an extensive description of proteins including their functions, PTMs, and domain structures. The correlation of peptide product ion spectra with theoretical peptides was introduced by Yates et al. in 1994.44 At the same time, Mann et al.45 proposed partial-sequence, error-tolerant database searching for protein identifications from peptide product ion spectra. There now exists a rather large choice of searching algorithms for protein identification from peptide product ion spectra. A list of identification algorithms and their associated uniform resource locators (URLs) is given in Table 1.2. The final step in the proteomic analysis of a biological system is the interpretation of the identified proteins, which has been called bioinformatics. Bioinformatics attempts to map and decipher interrelationships between observed proteins and the genetic description. Valuable information can be obtained in this way concerning biomarkers for diseased states, the descriptive workings of a biological system, biological interactions, and so on. In the identification of proteins from peptides, collision-induced dissociation is performed in the mass spectrometer to fragment the peptide and identify its amino acid residue sequence.
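The idea of assigning product ions to an amino acid sequence can be illustrated with a short Python sketch that reads residues from the mass differences between consecutive ions of one series. It is an illustration under stated assumptions rather than a published de novo algorithm: the residue masses are standard monoisotopic values not taken from the text, the ladder is generated from a made-up peptide, the ions are assumed to be a complete singly charged b-type series (N-terminal fragments, each the sum of its residue masses plus a proton), and the 0.01 Da tolerance and function names are hypothetical.

```python
from itertools import accumulate

# Monoisotopic residue masses in Da (standard values; an assumption, not from the text).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406, "N": 114.04293,
    "D": 115.02694, "Q": 128.05858, "K": 128.09496, "E": 129.04259, "M": 131.04049,
    "H": 137.05891, "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
PROTON = 1.007276


def b_ion_ladder(peptide):
    """Singly charged N-terminal fragment m/z values for 1..n-1 residues."""
    sums = list(accumulate(RESIDUE_MASS[r] for r in peptide))
    return [s + PROTON for s in sums[:-1]]


def residues_for_gap(delta, tol_da=0.01):
    """All one-letter codes whose residue mass matches delta within tol_da."""
    return sorted(code for code, mass in RESIDUE_MASS.items()
                  if abs(mass - delta) <= tol_da)


def read_ladder(mz_values, tol_da=0.01):
    """Candidate residues for each gap between consecutive ions of one series."""
    ordered = sorted(mz_values)
    return [residues_for_gap(hi - lo, tol_da) for lo, hi in zip(ordered, ordered[1:])]


if __name__ == "__main__":
    ladder = b_ion_ladder("GASLK")  # hypothetical peptide
    for gap, candidates in zip(range(2, len(ladder) + 1), read_ladder(ladder)):
        print(f"residue {gap}: {'/'.join(candidates) or 'no match'}")
```

A gap that matches more than one code (for example 113.08406 Da, shared by Leu and Ile) is returned as a list of candidates, which is the ambiguity that database correlation or additional fragment types must resolve.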
In most mass spectrometers used in proteomic studies, such as the ion trap, the quadrupole TOF, the triple quadrupole, and the Fourier transform ion cyclotron resonance (FTICR) instrument, the collision energy is considered low (5–50 eV) and the product ions are generally formed through cleavages of the peptide bonds. According to the widely accepted nomenclature of Roepstorff and Fohlman,46 when the charge is retained on the N-terminal portion of the fragmented peptide, the ions are depicted as a, b, and c. When the charge is retained on the C-terminal portion, the ions are denoted as x, y, and z. The description of the dissociation associated with the peptide chain backbone and the nomenclature of the produced ions is illustrated in Figure 1.22. The ion subscript, for example, the “2” in y2, indicates the number of residues contained within the ion, two amino acid residues in this case. The weakest bond is the amide bond between the carbonyl carbon and the adjacent nitrogen in the peptide chain. In low-energy collision-induced dissociation of the peptide in MS, the
TABLE 1.2. List of Identification Algorithms

MS identification algorithms and URLs
PMF
  Aldente          http://www.expasy.org/tools/aldente/
  Mascot           http://www.matrixscience.com/search_form_select.html
  MOWSE            http://srs.hgmp.mrc.ac.uk/cgi-bin/mowse
  MS-Fit           http://prospector.ucsf.edu/ucsfhtml4.0/msfit.htm
  PeptIdent        http://www.expasy.org/tools/peptident.html
  ProFound         http://65.219.84.5/service/prowl/profound.html

MS/MS identification algorithms and URLs
PFF
  Phenyx           http://www.phenyx-ms.com/
  Sequest          http://fields.scripps.edu/sequest/index.html
  Mascot           http://www.matrixscience.com/search_form_select.html
  PepFrag          http://prowl.rockefeller.edu/prowl/pepfragch.html
  MS-Tag           http://prospector.ucsf.edu/ucsfhtml4.0/mstagfd.htm
  ProbID           http://projects.systemsbiology.net/probid/
  Sonar            http://65.219.84.5/service/prowl/sonar.html
  TANDEM           http://www.proteome.ca/opensource.html
  SCOPE            N/A
  PEP_PROBE        N/A
  VEMS             http://www.bio.aau.dk/en/biotechnology/vems.htm
  PEDANTA          N/A
De novo sequencing
  SeqMS            http://www.protein.osaka-u.ac.jp/rcsfp/profiling/SeqMS.html
  Lutefisk         http://www.hairyfatguy.com/Lutefisk
  Sherenga         N/A
  PEAKS            http://www.bioinformaticssolutions.com/products/peaksoverview.php
Sequence similarity search
  PeptideSearch    http://www.narrador.embl-heidelberg.de/GroupPages/Homepage.html
  PepSea           http://www.unb.br/cbsp/paginiciais/pepseaseqtag.htm
  MS-Seq           http://prospector.ucsf.edu/ucsfhtml4.0/msseq.htm
  MS-Pattern       http://prospector.ucsf.edu/ucsfhtml4.0/mspattern.htm
  Mascot           http://www.matrixscience.com/search_form_select.html
  FASTS            http://www.hgmp.mrc.ac.uk/Registered/Webapp/fasts/
  MS-Blast         http://dove.embl-heidelberg.de/Blast2/msblast.html
  OpenSea          N/A
  CIDentify        http://ftp.virginia.edu/pub/fasta/CIDentify/
Congruence analysis
  MS-Shotgun       N/A
  MultiTag         N/A
Tag approach
  Popitam          http://www.expasy.org/tools/popitam/
  GutenTag         http://fields.scripps.edu/GutenTag/index.html

Reprinted with permission of John Wiley & Sons, Hernandez, P., Muller, M., Appel, R.D. Automated protein identification by tandem mass spectrometry: Issues and strategies. Mass Spectrom. Rev. 2006, 25, 235–254.
primary breakage will take place at the weakest bond, generally along the peptide backbone chain, and produce a, b, and y fragments. Notice that the c ions and the y ions contain an extra proton that they have abstracted from the precursor peptide ion. There has also been a proposed third structure for the b ion that is formed as a protonated oxazolone, which is suggested to be more stable through cyclization47 (see b2 ion in Fig. 1.23).

Figure 1.22. Dissociation associated with the peptide chain backbone and the nomenclature of the produced ions. Charge retained on the N-terminal portion of the fragmented peptide the ions are depicted as a, b, and c. Charge retained on the C-terminal portion the ions are denoted as x, y, and z. Ion subscript, for example, “2” in y2, indicates the number of residues (two) contained within the ion.

Figure 1.23. Fragmentation pathway leading to the production of the b and y ions from collision-induced dissociation from the polypeptide backbone chain.

The stability of the y ion can be attributed to the
transfer of the proton that is producing the charge state to the terminal nitrogen, thus inducing new bond formation and a lower energy state. The model that describes the dissociation of protonated peptides during low-energy collision-induced excitation is called the “mobile proton” model.48 Peptides fragment primarily through charge-directed reactions, where protonation of the peptide can take place at side chain groups, at the amide oxygen and nitrogen, and at the terminal amino group. On the peptide chain backbone, protonation of the amide nitrogen will lead to a weakening of the amide bond, inducing fragmentation at that point. However, molecular orbital calculations48,49 show that it is thermodynamically more favorable for protonation to take place on the amide oxygen, which also has the effect of strengthening the amide bond. Inspection of peptide product ion spectra has demonstrated, though, that fragmentation proceeds through protonation of the amide nitrogen rather than the amide oxygen, in contrast to the site expected from a thermodynamic point of view. This discrepancy has been explained by the “mobile proton model,” introduced by Wysocki et al.,48,50 which holds that the proton(s) added to a peptide will, upon excitation by CID, migrate to various protonation sites prior to fragmentation, provided they are not sequestered by a basic amino acid side chain. The fragmentation pathway leading to the production of the b and y ions is illustrated in Figure 1.23. The protonation takes place first on the N-terminus of the peptide. The next step is the mobilization of the proton to the amide nitrogen of the peptide chain backbone where cleavage is to take place. The protonated oxazolone derivative is formed from nucleophilic attack by the oxygen of the adjacent amide bond on the carbon center of the protonated amide bond.
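The b and y ion series discussed above can be computed directly from residue masses, which is how theoretical product ion lists are commonly generated. The Python sketch below is a minimal illustration under stated assumptions: singly charged ions only, standard monoisotopic residue masses that are not taken from the text, a b ion taken as the sum of its residue masses plus one proton, and a y ion taken as the sum of its residue masses plus water plus one proton. The peptide and the function name are hypothetical.

```python
# Monoisotopic residue masses in Da (standard values; an assumption, not from the text).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406, "N": 114.04293,
    "D": 115.02694, "Q": 128.05858, "K": 128.09496, "E": 129.04259, "M": 131.04049,
    "H": 137.05891, "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565
PROTON = 1.007276


def fragment_ions(peptide):
    """Return ({i: m/z of b_i}, {i: m/z of y_i}) for singly charged ions.

    The subscript i is the number of residues in the fragment, counted from
    the N-terminus for b ions and from the C-terminus for y ions.
    """
    masses = [RESIDUE_MASS[r] for r in peptide]
    n = len(masses)
    b = {i: sum(masses[:i]) + PROTON for i in range(1, n)}
    y = {i: sum(masses[n - i:]) + WATER + PROTON for i in range(1, n)}
    return b, y


if __name__ == "__main__":
    b, y = fragment_ions("GASPVK")  # hypothetical peptide
    for i in sorted(b):
        print(f"b{i} = {b[i]:9.4f}    y{i} = {y[i]:9.4f}")
```

A useful consistency check is that complementary fragments sum to the neutral peptide mass plus two protons, that is, b_i + y_(n-i) = M + 2(1.007276).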
Depending on where the charge is retained, either a b ion or a y ion will be produced. Besides the amide bond cleavage producing the b and y ions that are observed in low-energy collision product ion spectra, there are also a number of other product ions that are quite useful in peptide sequence determination. Ions that have lost ammonia (−17 Da) in low-energy collision product ion spectra are denoted as a*, b*, and y*. Ions that have lost water (−18 Da) are denoted as a°, b°, and y°. The a ion illustrated in Figure 1.22 is produced through loss of CO from a b ion (−28 Da). Upon careful inspection of the structures in Figure 1.22 for the product ions, it can be seen that the a ion is missing CO as compared with the structure of the b ion. When a difference of 28 is observed in product ion spectra between two m/z values, an a–b ion pair is suggested and can be useful in ion series identification. Internal cleavage
Figure 1.24. Structure of (left) an amino-acylium ion produced through a combination of b- and y-type cleavage and (right) an amino-immonium ion through a combination of a- and y-type internal cleavage.
ions are produced by double backbone cleavage, usually by a combination of b- and y-type cleavage. When a combination of b- and y-type cleavage takes place, an amino-acylium ion is produced. When a combination of a- and y-type internal cleavage takes place, an amino-immonium ion is produced. The structures of an amino-acylium ion and an amino-immonium ion are illustrated in Figure 1.24. These types of product ions that are produced from internal fragmentation are denoted with their one-letter amino acid code. Though not often observed, x-type ions can be produced using photodissociation.

1.5.3.3 High-Energy Product Ions. Thus far, the product ions that have been discussed, the a-, b-, and y-type ions, are produced through low-energy collisions such as those observed in ion traps. The collision-induced activation in ion traps is a slow heating mechanism, produced through multiple collisions with the trap bath gas, which favors lower-energy fragmentation pathways. High-energy collisions in the kiloelectron volt range, such as those produced in MALDI TOF-TOF MS, produce other product ions in addition to the types that have been discussed so far. Side-chain cleavage ions, which are produced by a combination of backbone cleavage and cleavage of a side-chain bond, are observed in high-energy collisions and are denoted as d, v, and w ions. Figure 1.25 contains some illustrative structures of d-, v-, and w-type ions. Immonium ions are produced through a combination of a-type and y-type cleavage that results in an internal fragment that contains a single side chain. These ions are designated by the one-letter code that corresponds to the amino acid. Immonium ions are not generally observed in ion trap product ion mass spectra but are observed in MALDI TOF-TOF product ion mass spectra. The structure of a general immonium ion is illustrated in Figure 1.26. Immonium ions are useful as confirmation of residues suspected to be contained within the peptide backbone.
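The mass relationships mentioned above (loss of CO from a b ion to give an a ion, loss of ammonia or water, and the single-residue immonium ion) reduce to simple arithmetic. The Python sketch below is a hedged illustration: the neutral-loss masses and residue masses are standard monoisotopic values that are not taken from the text, the immonium ion is assumed to be the residue minus CO plus a proton, and the function and key names are hypothetical.

```python
# Monoisotopic residue masses in Da (standard values; an assumption, not from the text).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406, "N": 114.04293,
    "D": 115.02694, "Q": 128.05858, "K": 128.09496, "E": 129.04259, "M": 131.04049,
    "H": 137.05891, "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
CO = 27.994915      # the "28 Da" separating an a ion from its b ion
NH3 = 17.026549     # the "17 Da" ammonia loss
H2O = 18.010565     # the "18 Da" water loss
PROTON = 1.007276


def immonium_mz(residue):
    """m/z of the immonium ion of a single residue (residue - CO + proton)."""
    return RESIDUE_MASS[residue] - CO + PROTON


def related_ions(b_mz):
    """m/z of the a ion and the neutral-loss ions derived from one b ion."""
    return {
        "a": b_mz - CO,
        "b_minus_NH3": b_mz - NH3,
        "b_minus_H2O": b_mz - H2O,
    }


if __name__ == "__main__":
    for code in "FPY":
        print(f"immonium {code}: {immonium_mz(code):.2f}")
    b2 = RESIDUE_MASS["G"] + RESIDUE_MASS["A"] + PROTON  # b2 of a hypothetical Gly-Ala-... peptide
    for name, mz in related_ions(b2).items():
        print(f"{name}: {mz:.4f}")
```

Scanning a spectrum for pairs of peaks separated by the CO mass is one simple way to flag candidate a–b pairs, as the text suggests.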
Table 1.3 is a compilation of the amino acid residue
Figure 1.25. Structures of d-, v-, and w-type ions produced by a combination of backbone cleavage and a side chain bond observed in high-energy collision product ion spectra.
Figure 1.26. Structure of a general immonium ion.
information that is used in MS analysis of peptides. The table includes the amino acid residue’s name, associated codes, residue mass, and immonium ion mass.

1.5.3.4 De Novo Sequencing. An example of de novo sequencing is illustrated in Figure 1.27. The product ion spectrum in Figure 1.27a is for a peptide composed of seven amino acid residues. The peptide product ion spectrum in Figure 1.27b is also composed of seven amino acid residues; however, the Ser residue (Ser, C3H5NO2, 87.0320 amu) in Figure 1.27a has been replaced by a Thr residue (Thr, C4H7NO2, 101.0477 amu) in Figure 1.27b. The product ion spectra are very similar, but a difference can be discerned with the b5 ion and the y3 ions where a shift of 14 Da is observed due to the difference in amino acid residue composition associated with Ser and Thr. Though the sequencing of the amino acids contained within a peptide chain can be discerned by de novo MS as just illustrated, there is a
TABLE 1.3. Amino Acid Residue Names, Codes, Masses, and Immonium Ion m/z Values
Residue          One-Letter Code   Three-Letter Code   Residue Mass   Immonium Ion (m/z)
Alanine          A                 Ala                 71.04          44.05
Arginine         R                 Arg                 156.10         129
Asparagine       N                 Asn                 114.04         87.09
Aspartic acid    D                 Asp                 115.03         88.04
Cysteine         C                 Cys                 103.01         76
Glutamic acid    E                 Glu                 129.04         102.06
Glutamine        Q                 Gln                 128.06         101.11
Glycine          G                 Gly                 57.02          30
Histidine        H                 His                 137.06         110.07
Isoleucine       I                 Ile                 113.08         86.1
Leucine          L                 Leu                 113.08         86.1
Lysine           K                 Lys                 128.09         101.11
Methionine       M                 Met                 131.04         104.05
Phenylalanine    F                 Phe                 147.07         120.08
Proline          P                 Pro                 97.05          70.07
Serine           S                 Ser                 87.03          60.04
Threonine        T                 Thr                 101.05         74.06
Tryptophan       W                 Trp                 186.08         159.09
Tyrosine         Y                 Tyr                 163.06         136.08
Valine           V                 Val                 99.07          72.08
problem associated with isomers and isobars. Isomers are species that have the same molecular formula but differ in their structural arrangement, while isobars are species with different molecular formulas that possess similar (or the same) molecular weights. For example, it is not possible to determine whether a particular peptide contains Leu (Leu, C6H11NO, 113.0841 amu) or its isomer Ile (Ile, C6H11NO, 113.0841 amu), both at a residue mass of 113.0841 amu. Furthermore, even though the remaining 18 amino acid residues each have distinctive elemental compositions and thus distinct molecular masses, some combinations of residues will actually equate to identical elemental compositions. This produces an isobaric situation where different peptides will possess either very similar or identical sequence masses. If every single peptide amide bond cleavage is not represented within the product ion spectrum, then it is not possible to discern some of these possible combinations. High-resolution/high-mass accuracy instrumentation such as the FTICR mass spectrometer or the Orbitrap can be used to
Figure 1.27. Example of de novo sequencing using product ion spectra collected by collision-induced dissociation mass spectrometry. (a) Peptide composed of seven amino acid residues (Cys-Gln-Ile-Ala-Ser-Pro-Cys). (b) Peptide composed of seven amino acid residues with the serine residue (Ser, C3H5NO2, 87.0320 amu) replaced by a threonine residue (Thr, C4H7NO2, 101.0477 amu) (Cys-Gln-Ile-Ala-Thr-Pro-Cys). The product ion spectra are very similar, but a difference can be discerned with the b5 ion and the y3 ions where a shift of 14 Da is observed due to the difference in amino acid residue composition associated with serine and threonine.
help reduce this problem when complete de novo sequencing is not possible. Table 1.4 is a listing of some of the amino acid combinations that can lead to ambiguous sequence determination when complete de novo sequencing is not obtained. For a peptide with a mass of 800 Da, distinguishing, for example, Gln from Lys, which differ by 0.03638 Da, would take a mass accuracy of better than about 45 ppm (0.03638/800 × 10^6). For Arg versus Gly + Val, which differ by 0.01124 Da, it would require a mass accuracy of about 14 ppm. For the FTICR mass spectrometers and the hybrid mass spectrometers such as the linear ion trap-Fourier transform (LTQ-FT)
TABLE 1.4. Examples of Combinations of Amino Acid Residues Where Isobaric Peptides Can Be Observed
Amino Acid Residue       Residue Mass (Da)                          Δ Mass (Da)
Leucine                  113.08406                                  0
Isoleucine               113.08406
Glutamine                128.05858                                  0
Glycine + alanine        128.05858 (57.02146 + 71.03711)
Asparagine               114.04293                                  0
2 × glycine              114.04293 (2 × 57.02146)
Oxidized methionine      147.03540                                  0.03301
Phenylalanine            147.06841
Glutamine                128.05858                                  0.03638
Lysine                   128.09496
Arginine                 156.10111                                  0.01124
Glycine + valine         156.08987 (57.02146 + 99.06841)
Asparagine               114.04293                                  0.03638
Ornithine                114.07931
Leucine/isoleucine       113.08406                                  0.03638
Hydroxyproline           113.04768
2 × valine               198.13682 (2 × 99.06841)                   0.03638
Proline + threonine      198.10044 (97.05276 + 101.04768)
or the LTQ-Orbitrap, this is readily achievable, but for ion traps it often is not.

1.5.3.5 Electron Capture Dissociation (ECD). Other techniques such as ECD51 and electron transfer dissociation (ETD) have also been used to alleviate the problem of isobaric amino acid combinations by giving complementary product ions (such as c and z ions) that help to obtain complete sequence coverage. The technique of ECD tends to promote extensive fragmentation along the polypeptide backbone, producing c- and z-type ions while also preserving modifications such as glycosylation and phosphorylation. The general z-type ion that is shown in Figure 1.22 is different, though, from the z-type ion that is produced in ECD, which is a radical cation. Peptide cation-radicals are produced by exposing the peptides, which are already multiply protonated by ESI, to low-energy electrons. The mixing of the protonated peptides with the low-energy electrons results in exothermic ion–electron recombination. There are a number of dissociations that can take place after the initial peptide cation-radical is formed. These include loss of ammonia, loss of H atoms, loss of side chain fragments, cleavage of disulfide bonds, and, most importantly, peptide backbone cleavages. The c-type ion is produced through homolytic cleavage
at the N–Cα bond of the peptide backbone, with the charge retained on the amino-terminal fragment. The z-type ion is produced when the charge is retained on the carboxy-terminal fragment. The mechanism that has been proposed for the fragmentation begins with electron attachment to a protonated site of the peptide. The cation-radical intermediate that is formed then releases a hydrogen atom. A nearby carbonyl group captures the released hydrogen atom, and the peptide dissociates by cleavage of the adjacent N–Cα bond. The mechanism for the production of an α-amide radical of the peptide C-terminus, a z-type ion, and the enolamine of the N-terminus portion of the peptide, a c-type ion, is illustrated in Figure 1.28.
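The c- and z-type ions of ECD can be related arithmetically to the b and y ions of collision-induced dissociation, which is convenient when predicting where they should appear. The Python sketch below is an illustration under stated assumptions, using a commonly applied convention that is not given in the text: a c ion is taken as the corresponding b ion plus NH3, and the radical z ion as the corresponding y ion minus NH3 plus one hydrogen atom. Only singly charged fragments are considered, the residue masses are standard monoisotopic values, and the peptide and function name are hypothetical.

```python
# Monoisotopic residue masses in Da (standard values; an assumption, not from the text).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406, "N": 114.04293,
    "D": 115.02694, "Q": 128.05858, "K": 128.09496, "E": 129.04259, "M": 131.04049,
    "H": 137.05891, "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565
PROTON = 1.007276
NH3 = 17.026549
H_ATOM = 1.007825


def ecd_ions(peptide):
    """Return ({i: m/z of c_i}, {i: m/z of radical z_i}) for singly charged ions."""
    masses = [RESIDUE_MASS[r] for r in peptide]
    n = len(masses)
    # c_i: N-terminal fragment, i.e. the b_i mass plus NH3 (assumed convention).
    c = {i: sum(masses[:i]) + PROTON + NH3 for i in range(1, n)}
    # Radical z_i: C-terminal fragment, i.e. the y_i mass minus NH3 plus H (assumed convention).
    z_dot = {i: sum(masses[n - i:]) + WATER + PROTON - NH3 + H_ATOM
             for i in range(1, n)}
    return c, z_dot


if __name__ == "__main__":
    c, z_dot = ecd_ions("GASPVK")  # hypothetical peptide
    for i in sorted(c):
        print(f"c{i} = {c[i]:9.4f}    z{i} (radical) = {z_dot[i]:9.4f}")
```

The sketch ignores sequence-specific effects, so it lists every backbone position even where a given fragment may not be observed in practice.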
Figure 1.28. Mechanism for the production of c- and z-type ions observed in ECD. (Scheme steps: electron capture; H loss and recapture (H transfer); backbone cleavage to the c-type ion and the z-type ion.)
1.5.4 Top-Down Proteomics: MS of Intact Proteins

1.5.4.1 Background. Measuring the whole, intact protein in the GP using mass spectrometric methodologies is known as “top-down” proteomics. Top-down proteomics measures the intact protein’s mass and then subjects the whole protein to collision-induced dissociation, breaking it into smaller parts. A vital component of top-down proteomics is the accuracy with which the masses are measured. Often, high-resolution mass spectrometers such as the FTICR mass spectrometer are used to accurately measure the intact protein’s mass and the product ions produced during collision-induced dissociation experiments. In early top-down experiments, though, this was not the case. Mass spectrometers such as the triple quadrupole coupled with electrospray were first used to measure intact proteins in the GP.52,53 However, the triple quadrupole mass spectrometer cannot resolve the isotopic distribution of the product ions generated in the top-down approach. The use of FT-MS/MS was later reported with high enough resolution to resolve isotopic peaks.54,55 An example of these early top-down experiments utilizing FT-MS is illustrated in Figure 1.29. Extensive sample preparation prior to mass analysis, such as cleanup, digestion, desalting, and enrichment, all often incorporated in “bottom-up” proteomics, is not necessarily required in top-down approaches. The dynamic range in top-down proteomics can be limited by the number of analytes that can be present during analysis, but this is usually overcome by using some type of separation prior to introduction into the mass spectrometer. Complex protein mixtures can be separated using techniques such as RP-HPLC, gel electrophoresis, anion exchange chromatography, and capillary electrophoresis. Typically in bottom-up analysis, the digested protein peptides are <3 kDa and the complete description of the original, intact protein is not possible.
With top-down analysis, there is often 100% coverage of the protein being analyzed. This allows the determination of the N- and C-termini, the exact location of modifications to the protein such as phosphorylation, and the confirmation of DNA-predicted sequences.

1.5.4.2 GP Basicity and Protein Charging. It is the process of ESI that allows the measurement of intact proteins with large molecular weight. As a rule of thumb, a protein carries about one charge for each 1000 Da of its mass. For example, a 30 kDa protein will have a charge state of approximately 30+. This brings the measured m/z of the protein down into the range of many mass
Figure 1.29. (Top) ESI mass spectrum of ubiquitin (sum of 10 scans, 64,000 data). (Center) Regions expanded to show the presence of impurities. (Bottom) MS/MS fragment ions from collisionally activated dissociation of [M + 10H]10+ and from placing 200 V between the nozzle and skimmer of the ESI source. (Reprinted with permission from Loo, J.A., Quinn, J.P., Ryu, S.I., Henry, K.D., Senko, M.W., McLafferty, F.W. High-resolution tandem mass spectrometry of large biomolecules (electrospray ionization/polypeptide sequencing), Proc. Natl. Acad. Sci. U.S.A. 1992, 89, 286–289.)
spectrometers that typically scan between m/z 100 and m/z 4000 (m/z = 30 kDa/30+ = 1000 Th). Multiple charging of peptides and proteins is achieved during the ESI process due to the presence of basic amino acid residue sites. There is a limit to the number of charges that can be placed onto a peptide or protein during the ESI process, as was demonstrated by Schnier et al. in 1995.56 In their study, the apparent GP basicity was measured as a function of charge state using cytochrome c. A plot of apparent GP basicity versus charge state is illustrated in Figure 1.30. The curves in Figure 1.30 trend downward as the charge state increases. As a new charge is added (increasing x-axis), the apparent GP basicity decreases
Figure 1.30. Apparent gas-phase basicity (kcal/mol, y-axis) as a function of charge state (x-axis) of cytochrome c ions, measured (•); calculated, linear (εr = 1.0, Δ; best fit εr = 2.0, O); intrinsic; calculated, X-ray crystal structure (εr = 2.0); calculated, α-helix (εr = 4.1). The dashed line indicates gas-phase basicity (GB) of methanol (174.1 kcal/mol) and the dash-dot line indicates GB of water (159.0 kcal/mol). (Reprinted with permission from Schnier, D.S.; Gross, E.R.; Williams, E.R. J. Am. Chem. Soc. 1995, 117, 6747–6757. Copyright 1995 American Chemical Society.)
(y-axis). In the graph, the dashed line represents the GP basicity of methanol, which is included as a reference for a species present during the ESI process that can also accept a charge. The charging of large molecules during the ionization process is thought to follow the charged residue mechanism. In this ionization process, the solvent molecules evaporate from around the protein, leaving charges that associate with basic sites within the protein. The intersection in the graph between the charging curve of cytochrome c and the GP basicity of methanol demonstrates that there is a limit to the number of charges that can be placed on a species during ionization. At some point (the intersection), it becomes thermodynamically more favorable to put the next charge (proton) onto a methanol molecule than onto the protein. Here, a state has been reached where Coulombic repulsion between the charges and a loss of basic sites do not allow further charging. At this point, the maximum charge state has been reached for the protein molecule.

1.5.4.3 Calculation of Charge State and Molecular Weight. Another interesting feature of the electrospray charging of proteins is the ability to calculate the charge state and molecular weight of an unknown
Figure 1.31. Electrospray mass spectrum of horse heart myoglobin at a molecular weight of 16,954 Da. The spectrum illustrates an envelope of peaks of different m/z values and charge states for the protein. (Plot of % abundance versus m/z, 700–1700; labeled peaks: +21, 808.3; +20, 848.7; +19, 893.2; +16, 1060.4; +15, 1131.3; +13, 1305.2; +11.)
protein from its single-stage mass spectrum. Figure 1.31 illustrates the electrospray mass spectrum of horse heart myoglobin at a molecular weight of 16,954 Da. If this were an unknown protein species, we would have only the respective m/z values from the mass spectrum. Using the following two simultaneous equations with two unknowns, where m is the molecular weight, z is the charge state of the higher m/z peak, and 1.0079 Da is the mass added with each charge, the charge states and the molecular weight of the unknown protein can be calculated using only the information obtained within the mass spectrum:
\[ m/z = \frac{m + z(1.0079)}{z} \quad \text{(higher } m/z \text{ peak from spectrum)}, \tag{1.1} \]

\[ m/z = \frac{m + (z + 1)(1.0079)}{z + 1} \quad \text{(lower } m/z \text{ peak from spectrum)}. \tag{1.2} \]
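Equations 1.1 and 1.2 can also be read in the forward direction: given a molecular weight and a charge state, they predict where a peak should fall. The following is a minimal Python sketch of that reading, not a description of any instrument software. The function names `mz_for_charge` and `charge_envelope` are hypothetical, and the per-charge mass of 1.0079 Da is simply the value used in the equations above.

```python
PROTON_MASS = 1.0079  # Da; per-charge mass as used in Equations 1.1 and 1.2


def mz_for_charge(mol_weight, charge):
    """Return the m/z predicted for a protein of mass mol_weight (Da)
    carrying `charge` added protons, following Equation 1.1."""
    if charge <= 0:
        raise ValueError("charge must be a positive integer")
    return (mol_weight + charge * PROTON_MASS) / charge


def charge_envelope(mol_weight, charges):
    """Map each charge state in `charges` to its predicted m/z value."""
    return {z: mz_for_charge(mol_weight, z) for z in charges}


if __name__ == "__main__":
    # Horse heart myoglobin, using the 16,954 Da value quoted in the text.
    for z, mz in charge_envelope(16954.0, range(11, 22)).items():
        print(f"+{z}: m/z {mz:.1f}")
```

For 16,954 Da the predicted values fall close to the peak labels in Figure 1.31 (for example, about m/z 1131.3 for the +15 charge state), which is the relationship the derivation that follows exploits in reverse.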
If we take Equation 1.1 for the higher m/z, lower charge state peak (m/z = 1131.3 Th) and solve for m, we obtain the following:

\[ m = 1131.3z - 1.0079z, \tag{1.3} \]

\[ m = 1130.2921z. \tag{1.4} \]
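If we next substitute this mass into Equation 1.2 for the lower m/z, higher charge state peak (m/z = 1060.4 Th), we can calculate the charge state of the m/z 1131.3 peak:

\[ m/z = \frac{m + (z + 1)(1.0079)}{z + 1}, \]

\[ 1060.4 = \frac{m + 1.0079z + 1.0079}{z + 1}, \tag{1.5} \]

\[ 1060.4 = \frac{1130.2921z + 1.0079z + 1.0079}{z + 1}. \tag{1.6} \]

Rearranging gives 70.9z = 1059.3921, or z ≈ 14.94, which rounds to the nearest integer charge state:

\[ z = 15. \tag{1.7} \]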
Therefore, we have determined that the charge state of the m/z 1131.3 peak is +15. To calculate the molecular weight of the unknown protein, we can substitute the charge state value into Equation 1.1 and solve for m:
\[ m/z = \frac{m + z(1.0079)}{z}, \]

\[ 1131.3 = \frac{m + 15(1.0079)}{15}, \tag{1.8} \]

\[ m = 16954. \tag{1.9} \]
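The hand calculation in Equations 1.3 through 1.9 can be sketched in a few lines of Python. This is an illustrative sketch only, assuming the two peaks are adjacent charge states of the same protein and that each charge adds 1.0079 Da; the function name `charge_and_mass` is hypothetical. Eliminating m between Equations 1.1 and 1.2 gives z = (lower m/z − 1.0079)/(higher m/z − lower m/z), which the code evaluates and rounds to an integer.

```python
from typing import Tuple

PROTON_MASS = 1.0079  # Da; per-charge mass as used in Equations 1.1 and 1.2


def charge_and_mass(higher_mz: float, lower_mz: float) -> Tuple[int, float]:
    """Solve Equations 1.1 and 1.2 for two adjacent peaks of one protein.

    higher_mz is the peak carrying z charges; lower_mz is the neighboring
    peak carrying z + 1 charges. Returns (z, molecular weight in Da).
    """
    if higher_mz <= lower_mz:
        raise ValueError("higher_mz must be greater than lower_mz")
    # Eliminating m between the two equations leaves a single unknown, z.
    z_exact = (lower_mz - PROTON_MASS) / (higher_mz - lower_mz)
    z = round(z_exact)  # charge states are integers
    # Back-substitute into Equation 1.1 (Equations 1.3 and 1.4 in the text).
    mol_weight = z * (higher_mz - PROTON_MASS)
    return z, mol_weight


if __name__ == "__main__":
    z, mw = charge_and_mass(1131.3, 1060.4)
    print(f"charge state: +{z}, molecular weight: {mw:.0f} Da")
```

Applied to the m/z 1131.3 and 1060.4 peaks, the sketch returns a charge state of 15 and a molecular weight of about 16,954 Da, in agreement with Equations 1.7 and 1.9.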
This process is known as deconvolution, where a distribution (mass spectral envelope) of a protein’s m/z values and associated charge states is collapsed down to a single peak representing the molecular weight of the protein. The deconvolution of the horse heart myoglobin spectrum is illustrated in Figure 1.32. The width of the deconvoluted peak indicates the variability in the molecular weight calculated from the individual mass spectral peaks. While it is possible to deconvolute protein peaks by hand by solving two simultaneous equations with two unknowns, mass spectral computer software is typically used to perform this task.

1.5.4.4 Top-Down Protein Sequencing. In bottom-up proteomics, the mass spectral identification of proteins through sequence determination of separated peptides requires the isolation of a single peptide for fragmentation experiments. This is typically done by removing from an ion trap mass spectrometer all m/z species present except for one that
Figure 1.32. Deconvoluted, computer-generated spectrum of horse heart myoglobin at a molecular weight of 16,954 Da. (The m/z envelope of charge states collapses on deconvolution to a single peak at 16,954 Da on a mass axis spanning 16,900–17,000 Da.)
is of interest for fragmentation and subsequent sequencing. It has been demonstrated, though, that multiple proteins can be simultaneously fragmented and identified in top-down proteomics. In a study reported by Patrie et al.,57 the authors used a hybrid mass spectrometer that coupled a quadrupole mass analyzer to an FTICR mass spectrometer with a 9.4 tesla magnet. The instrumental design is illustrated in Figure 1.33. Prior to the FTICR-MS, there is a resolving quadrupole that can act either as a radio frequency (RF)-only ion guide or as a fully functional mass analyzer. Following this in the instrumental design is an accumulation octopole that was used to accumulate and store ions prior to introduction into the FTICR mass spectrometer. Nitrogen or helium gas at a pressure of approximately 1 mTorr was introduced into the accumulation octopole to help improve the accumulation. The FTICR cell located within the 9.4 tesla magnet is an open-ended, capacitively coupled cell that is cylindrical and divided axially into five segments. At the end of the instrument (far right side) is a laser that is used for infrared multiphoton dissociation (IRMPD) experiments. For fragmentation experiments, collision-induced dissociation could be performed in the accumulation octopole, IRMPD could be performed within the ion cyclotron resonance (ICR) cell by irradiating the trapped species with the laser, and ECD could also be performed, as the instrumental design included ECD capabilities. A top-down experiment on a mixture of proteins collected as a fraction eluting from a reversed-phase liquid chromatography (RPLC) separation is illustrated in Figure 1.34. Figure 1.34a illustrates a broadband spectrum of the RPLC fraction where a very low response is observed for the proteins present. The same broadband spectrum is illustrated in Figure
Figure 1.33. Schematic representation of the quadrupole/Fourier transform ion cyclotron resonance hybrid mass spectrometer for versatile MS/MS and improved dynamic range by means of m/z-selective ion accumulation external to the superconducting magnet bore. (Labeled components: ESI emitter, heated metal capillary, tube lens, skimmer, focusing octopole, resolving quadrupole, gas inlet, accumulation octopole, transfer octopoles, trapping electrodes, excite/detect electrodes, dispenser cathode, laser, and conductance limits 1–5; pumping by a 20 L/sec rotary vane pump, 500 L/sec turbo pumps, and a 700 L/sec diffusion pump.) (Reprinted with permission. This article was published in J Am Soc Mass Spectrom, Patrie, S.M., Charlebois, J.P., Whipple, D., Kelleher, N.L., Hendrickson, C.L., Quinn, J.P., Marshall, A.G., Mukhopadhyay, B. Construction of a hybrid quadrupole/Fourier transform ion cyclotron resonance mass spectrometer for versatile MS/MS above 10 kDa, 2004, 15, 1099–1108. Copyright Elsevier 2004.)
1.34b after using the accumulation octopole to increase the amount of sample introduced into the FTICR cell and thus increase the sensitivity of the mass measurement of the seven proteins present. Figure 1.34c, d illustrates the result of an IRMPD fragmentation experiment on the seven proteins that were present. Of the seven proteins present in Figure 1.34b, three were identified by top-down proteomics, listed as X, the 19,431.8 Da protein MJ0543; O, the 20,511.3 Da protein MJ0471; and Z, the 17,263.0 Da protein MJ0472. Notice the large number of amino acid residues contained within the b- and y-type ions in Figure 1.34d and the associated high number of charges (e.g., the O y63 8+ product ion, which contains 63 amino acid residues and eight charges). This is quite different from the peptides normally observed in bottom-up proteomics, where most peptides contain between 7 and 25 amino acid residues and carry mostly two charges, although three or four charges are also observed for the longer peptides.

1.5.5 Systems Biology and Bioinformatics

The application of MS as an analytical tool in the field of biology is readily apparent. The amount of information given from
Figure 1.34. (a) Expansion of a 60 m/z segment of the broadband spectrum from a Methanococcus jannaschii RPLC fraction (3.7-second scan time; 50 scans). (b) The same Δ(m/z) = 60 segments after quadrupole-enhanced ion accumulation (Δ[m/z] = 40 segment, 9.7-second scan time; 10 scans). (c) Δ(m/z) = 60 segments from the same sample with subsequent IRMPD fragmentation of all intact proteins in parallel (shown in [d], 25 scans). Identified proteins are MJ0543 (X, Exptl. 19,431.8-0 Da, Theoret. 19,432.5-0 Da), MJ0471 (O, Exptl. 20,511.3-0 Da, Theoret. 20,511.1-0 Da), and MJ0472 (asterisk, Exptl. 17,263.0-0 Da, Theoret. 17,263.6-0 Da). (Reprinted with permission. This article was published in J Am Soc Mass Spectrom, Patrie, S.M., Charlebois, J.P., Whipple, D., Kelleher, N.L., Hendrickson, C.L., Quinn, J.P., Marshall, A.G., Mukhopadhyay, B. Construction of a hybrid quadrupole/Fourier transform ion cyclotron resonance mass spectrometer for versatile MS/MS above 10 kDa, 2004, 15, 1099–1108. Copyright Elsevier 2004.)
mass spectrometric methodologies, that is, the molecular weight, the structure, and the amount (relative and/or absolute) of a particular biomolecule extracted from a biological matrix, is similar in scope to the traditional biological approach of studying “one gene” or “one protein” at a time. The study of biological systems is now moving toward more encompassing analysis such as the sequencing of an organism’s genome or proteome. The trend now is to study the single components of a system in conjunction with the system’s entire complement of components. Of special interest is how the various components of the system interact with one another under normal conditions and under some type of perturbed condition such as a diseased state or a change in the system’s environment (e.g., lack of oxygen, food, water). Thus, systems biology is the study of the processes and complex biological organizational behavior using information from its molecular constituents.58 This is quite broad in the sense that the biological organization may range from the level of tissue up to a population or even an ecosystem. The hierarchical levels of biological information were summed up by L. Hood at the Institute for Systems Biology (Seattle, WA), showing the progression from DNA to a complex organization as illustrated in Figure 1.35. The idea is to gather as much information as possible about each component of a system to more accurately describe the system as a whole. This encompasses most of the disciplines of biology and incorporates analytical chemistry (MS) through the need to measure and identify individual species on the molecular level.
When attempting to describe the behavior of a biological system, the behavior of the whole system is often greater than what would be predicted from the sum of its parts.59 The study of systems biology incorporates multiple analytical techniques and methodologies, including MS, to study the components of a biological system (e.g., genes, proteins, metabolites) to better model their interactions.60 The experimental workflow involved in systems biology that centers on mass spectrometric analysis is illustrated in Figure 1.36. Specifically, the metabolite analysis and the protein analysis, illustrated in the second step of the flow, are increasingly performed using highly efficient separation methodology such as nano-HPLC (1-D and 2-D) coupled with high-resolution MS such as FTICR-MS and LTQ-Orbitrap-MS. Much development has been done, and is still ongoing, in the processing and data extraction of the information obtained from high-peak capacity nano-HPLC-ESI MS. A tremendous amount of information is obtained from the experiments that needs to be
Figure 1.35. Hierarchical levels of biological information: DNA → mRNA → Protein → Informational Pathways → Informational Networks → Cells → Organs → Individuals → Populations → Ecologies. (Reprinted with permission from Hood, L. J. Proteome Res. 2002, 1, 399–409. Copyright 2002 American Chemical Society.)
processed, validated, and statistically evaluated. Bioinformatics for proteomics has developed with great speed and complexity, and many open-source software packages are available with statistical methods and filtering algorithms for proteomics data validation. Some examples at present are Bioinformatics.org; sourceforge.net; the Open Bioinformatics Foundation, which features toolkits such as BioPerl, BioJava, and BioPython; and BioLinux, an optimized Linux operating system for bioinformaticians. Finally, the visualization of the data obtained from mass spectrometric analysis has also been developed significantly in the past few years. The ability to take different complements of analyses and integrate them with the goal of correlation is quite daunting when hundreds to thousands of biomolecules are involved. Programs such as Cytoscape allow the visualization of molecular interaction networks. Figure 1.37 is a correlation network between data obtained from proteomics,
Figure 1.36. Systems biology workflow. Data are produced by different platforms (transcriptomics, proteomics, and metabolomics) followed by integration into a master data set. Different biostatistical strategies are pursued: clustering, modeling, and correlation analysis. Integration with extensive bioinformatics tools and expert biological knowledge is key to the creation of meaningful knowledge. (Flow: body fluid or tissue samples from system states 1, 2, ..N → metabolite analysis, protein analysis, and transcript analysis → processing, normalization, and integration of data sets → master data set → cluster analysis (biochemical grouping of samples), statistical modeling (biomarkers of system states), and correlation analysis (associations between molecules) → expert mining, drawing on additional phenotype data and bioinformatics data sources → systems knowledge.) (Reprinted with permission from van der Greef, J., Martin, S., Juhasz, P., Adourian, A., Plasterer, T., Verheij, E.R., McBurney, R.N. J. Proteome Res. 2007, 6, 1540–1559. Copyright 2007 American Chemical Society.)
metabolomics, and transcriptomics from liver tissue and plasma. The correlation network found biomarker analytes in the plasma that may be useful in monitoring processes occurring in the organ.

1.5.6 Biomarkers in Cancer

MS is having a substantial impact on systems biology-type studies such as those of biomarker discovery in cancer diagnostics. Figure 1.38 illustrates a scheme for the search for biomarkers in patient-derived samples through mass spectral analyses. The figure lists nine types of patient-derived samples: blood, urine, sputum, saliva, breath, tear fluid, nipple aspirate fluid, cerebrospinal fluid, and tissue. Also described in the figure are the various types of “omic” studies done using mass spectrometric techniques such as proteomics,
Figure 1.37. Correlation network of analytes across blood plasma (top of figure) and liver tissue (bottom of figure). Analytes include proteins, endogenous metabolites, and gene transcripts. Not only is structure evident among analytes profiled from liver tissue, but there are also a number of correlations to analytes profiled in plasma in this case. Such analytes can serve as useful circulating biomarkers for the tissue-based biochemical processes occurring in the organ. (Reprinted with permission from van der Greef, J., Martin, S., Juhasz, P., Adourian, A., Plasterer, T., Verheij, E.R., McBurney, R.N. J. Proteome Res. 2007, 6, 1540–1559. Copyright 2007 American Chemical Society.)
metabonomics, peptidomics, glycomics, phosphoproteomics, and lipidomics. Indicative of the progress or status of a disease, a biomarker is a biologically derived molecule in the body that is measured along with many other species present by the omics methods. Bioinformatics analysis is performed on the mass spectral data to determine the presence of potential biomarkers through expression studies and response differentials. Once identified, the biomarkers can be used for clinical diagnostics such as early detection before the onset of a serious disease such as cancer. This allows medical intervention that may have a substantial influence on the success of an early treatment and subsequent cure. Table 1.5 lists a number of identified potential cancer biomarkers that have been discovered through mass spectrometric analyses, primarily through proteomics and metabonomics.
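One simple way to picture a “response differential” is a log2 fold change between the mean signal of an analyte in a disease group and in a control group. The Python sketch below is only an illustration of that idea under stated assumptions: the function names, the threshold of 1.0 (a twofold change), and the intensity values are all invented for the example and are not taken from the studies cited here. Real biomarker discovery additionally relies on replicate statistics, normalization, and validation, none of which are shown.

```python
import math
from statistics import mean


def log2_fold_change(disease, control):
    """Log2 ratio of the mean disease-group intensity to the mean
    control-group intensity for one analyte."""
    return math.log2(mean(disease) / mean(control))


def flag_candidates(intensities, threshold=1.0):
    """Return analytes whose absolute log2 fold change meets the threshold.

    `intensities` maps an analyte name to a (control values, disease values)
    pair. The default threshold of 1.0 (a twofold change) is an assumption
    made for illustration.
    """
    flagged = {}
    for name, (control, disease) in intensities.items():
        lfc = log2_fold_change(disease, control)
        if abs(lfc) >= threshold:
            flagged[name] = lfc
    return flagged


if __name__ == "__main__":
    # Invented intensities for illustration only; not measured data.
    example = {
        "analyte_A": ([100.0, 110.0, 90.0], [400.0, 410.0, 390.0]),
        "analyte_B": ([200.0, 210.0, 190.0], [205.0, 195.0, 200.0]),
    }
    print(flag_candidates(example))
```

In this made-up example only the first analyte, whose mean signal is fourfold higher in the disease group, would be carried forward as a candidate.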
Figure 1.38. Scheme for mass spectrometry-based “omics” technologies in cancer diagnostics. (Flow: patient → samples (blood, urine, sputum, saliva, breath, tear fluid, nipple aspirate fluid, cerebrospinal fluid, tissue sample) → pretreatment → proteomics, metabonomics, peptidomics, glycomics, phosphoproteomics, and lipidomics → bioinformatics analysis → biomarkers identification → clinical diagnostics.) Proteomics is the large-scale identification and functional characterization of all expressed proteins in a given cell or tissue, including all protein isoforms and modifications. Metabonomics is the quantitative measurement of metabolic responses of multicellular systems to pathophysiological stimuli or genetic modification. Peptidomics is the simultaneous visualization and identification of the whole peptidome of a cell or tissue, that is, all expressed peptides with their posttranslational modifications. Glycomics is the identification and study of all the glycan molecules produced by an organism, encompassing all glycoconjugates (glycolipids, glycoproteins, lipopolysaccharides, peptidoglycans, and proteoglycans). Phosphoproteomics is the characterization of phosphorylation of proteins. Lipidomics is system-level analysis and characterization of lipids and their interacting partners. (Reprinted with permission of John Wiley & Sons, Inc. Zhang, X., Wei, D., Yap, Y., Li, L., Guo, S., and Chen, F. Mass spectrometry based “omics” technologies in cancer diagnostics. Mass Spectrometry Reviews, 2007, 26, 403–431.)
TABLE 1.5. Potential Cancer Biomarkers Identified by Mass Spectrometry-Based “Omics” Technologies
Biomarkers
“Omics” Platforms
MS Methods
Sample Source
Cancer Type
Apolipoprotein A1, inter-α-trypsin inhibitor Haptoglobin-α-subunit Transthyretin Vitamin D-binding protein Stathmin (Op18), GRP 78 14-3-3 Isoforms, transthyretin Protein disulfide isomerase Peroxiredoxin, enolase Protein disulfide isomerase HSP 70, α-1-antitrypsin HSP 27 Annexin I, cofilin, GST Superoxide dismutase Peroxiredoxin, enolase Protein disulfide isomerase Neutrophil peptides 1-3
Proteomics
SLDI-TOF
Serum
Ovarian
Proteomics
SELDI-TOF
Serum
Prostate
Proteomics
ESI-MS
Tissue
Lung
Proteomics
MALDI-TOF, LC-MS
Tissue
Breast
Proteomics Proteomics
MALDI-TOF MALDI-TOF, ESI-MS, Q-TOF
Serum Tissue
Liver Colon
Proteomics
SELDI-TOF
Breast
PCa-24 Alkanes, benzenes Decanes, heptanes Hexanal, heptanal Pseu, m1A, m1I
Proteomics Metabonomics Metabonomics Metabonomics Metabonomics
MALDI-TOF GC-MS GC-MS LC-MS HPLC, LC-MS
Nipple aspirate fluid Tissue Breath Breath Serum Urine
Prostate Lung Breast Lung Liver
SLDI, soft laser desorption/ionization; SELDI, surface-enhanced laser desorption/ionization; GC, gas chromatography. Reprinted with permission of John Wiley & Sons, Zhang, X., Wei, D., Yap, Y., Li, L., Guo, S., Chen, F. Mass spectrometry based “omics” technologies in cancer diagnostics. Mass Spectrom. Rev. 2007, 26, 403–431.
REFERENCES
1. Wold, F. Annu. Rev. Biochem. 1981, 50, 783–814.
2. Wilkins, M.R.; Gasteiger, E.; Gooley, A.A.; Herbert, B.R.; Molloy, M.P.; Binz, P.A.; Ou, K.; Sanchez, J.C.; Bairoch, A.; Williams, K.L.; Hochstrasser, D.F. J. Mol. Biol. 1999, 280, 645–657.
3. Whitehouse, C.M.; Dreyer, R.N.; Yamashita, M.; Fenn, J.B. Anal. Chem. 1985, 57, 675.
4. Fenn, J.B. J. Am. Soc. Mass Spectrom. 1993, 4, 524.
5. Dole, M.; Hines, R.L.; Mack, R.C.; Mobley, R.C.; Ferguson, L.D.; Alice, M.B. J. Chem. Phys. 1968, 49, 2240.
6. Kebarle, P.; Ho, Y. In Electrospray Ionization Mass Spectrometry, Cole, R.B., Ed. New York: Wiley, 1997; p. 17.
7. Cole, R.B. J. Mass Spectrom. 2000, 35, 763–772.
8. Cech, N.B.; Enke, C.G. Mass Spectrom. Rev. 2001, 20, 362–387.
9. Gaskell, S.J. J. Mass Spectrom. 1997, 32, 677–688.
10. Cech, N.B.; Enke, C.G. Anal. Chem. 2000, 72, 2717–2723.
11. Sterner, J.L.; Johnston, M.V.; Nicol, G.R.; Ridge, D.P. J. Mass Spectrom. 2000, 35, 385–391.
12. Cech, N.B.; Enke, C.G. Anal. Chem. 2001, 73, 4632–4639.
13. Taylor, G.I. Proc. R. Soc. Lond. A 1964, 280, 383.
14. Cao, P.; Stults, J.T. Rapid Commun. Mass Spectrom. 2000, 14, 1600–1606.
15. Ho, Y.P.; Huang, P.C.; Deng, K.H. Rapid Commun. Mass Spectrom. 2003, 17, 114–121.
16. Kocher, T.; Allmaier, G.; Wilm, M. J. Mass Spectrom. 2003, 38, 131–137.
17. Lorenz, S.A.; Maziarz, E.P.; Wood, T.D. J. Am. Soc. Mass Spectrom. 2001, 12, 795–804.
18. Daniel, J.M.; Friess, S.D.; Rajagopalan, S.; Wendt, S.; Zenobi, R. Int. J. Mass Spectrom. 2002, 216, 1–27.
19. Wilm, M.S.; Mann, M. Int. J. Mass Spectrom. Ion Process. 1994, 136, 167–180.
20. Wilm, M.; Mann, M. Anal. Chem. 1996, 68, 1–8.
21. Caprioli, R.M.; Emmett, M.E.; Andren, P. Proceedings of the 42nd ASMS Conference on Mass Spectrometry and Allied Topics, Chicago, IL, May 29–June 3, 1994; p 754.
22. Qi, L.; Danielson, N.D. J. Pharm. Biomed. Anal. 2005, 37, 225–230.
23. Fernandez de la Mora, J.; Loscertales, I.G. J. Fluid Mech. 1994, 260, 155–184.
24. Pfiefer, R.J.; Hendricks, C.D., Jr. AIAA J. 1968, 6, 496–502.
25. Juraschek, R.; Dulcks, T.; Karas, M. J. Am. Soc. Mass Spectrom. 1999, 10, 300–308.
26. Schmidt, A.; Karas, M. J. Am. Soc. Mass Spectrom. 2003, 14, 492–500.
27. Li, Y.; Cole, R.B. Anal. Chem. 2003, 75, 5739–5746.
28. El-Faramawy, A.; Siu, K.W.M.; Thomson, B.A. J. Am. Soc. Mass Spectrom. 2005, 16, 1702–1707.
29. McLuckey, S.A.; Van Berkel, G.J.; Glish, G.L. J. Am. Soc. Mass Spectrom. 1992, 3, 60–70.
30. Cerny, R.L.; Gross, M.L.; Grotjahn, L. Anal. Biochem. 1986, 156, 424.
31. McLuckey, S.A.; Habibi-Goudarzi, S. J. Am. Chem. Soc. 1993, 115, 12085.
32. Morrison, R.T.; Boyd, R.N. Organic Chemistry, 5th ed. Boston: Allyn and Bacon, 1987.
33. Schulz, G.E.; Schirmer, R.H. In Principles of Protein Structure, Springer Advanced Texts in Chemistry, Cantor, C.R., Ed. New York: Springer-Verlag, 1990.
34. Franks, F. Biophys. Chem. 2002, 96, 117–127.
35. Edman, P. Acta Chem. Scand. 1950, 4, 283–293.
36. Shimonishi, Y.; Hong, Y.M.; Kitagishi, T.; Matsuo, H.; Katakuse, I. Eur. J. Biochem. 1980, 112, 251–264.
37. Morris, H.R.; Panico, M.; Barber, M.; Bordoli, R.S.; Sedgwick, R.D.; Tyler, A. Biochem. Biophys. Res. Commun. 1981, 101, 623–631.
38. Mann, M.; Hojrup, P.; Roepstorff, P. Biol. Mass Spectrom. 1993, 22(6), 338–345.
39. Pappin, D.D.J.; Hojrup, P.; Bleasby, A.J. Curr. Biol. 1993, 3, 327–332.
40. Henzel, W.J.; Billeci, T.M.; Stults, J.T.; Wong, S.C.; Grimley, C.; Watanabe, C. Proc. Natl. Acad. Sci. U.S.A. 1993, 90(11), 5011–5015.
41. Kenrick, K.G.; Margolis, J. Anal. Biochem. 1970, 33, 204–207.
42. Gras, R.; Muller, M. Curr. Opin. Mol. Ther. 2001, 3(6), 526–532.
43. Wolters, D.A.; Washburn, M.P.; Yates, J.R. Anal. Chem. 2001, 73, 5683–5690.
44. Eng, J.K.; McCormack, A.L.; Yates, J., III J. Am. Soc. Mass Spectrom. 1994, 5, 976–989.
45. Mann, M.; Wilm, M. Anal. Chem. 1994, 66, 4390–4399.
46. Roepstorff, P.; Fohlman, J. Biomed. Mass Spectrom. 1984, 11(11), 601.
47. Yalcin, T.; Csizmadia, I.G.; Peterson, M.R.; Harrison, A.G. J. Am. Soc. Mass Spectrom. 1996, 7, 233–242.
48. Dongre, A.R.; Jones, J.L.; Somogyi, A.; Wysocki, V.H. J. Am. Chem. Soc. 1996, 118, 8365–8374.
49. McCormack, A.L.; Somogyi, A.; Dongre, A.R.; Wysocki, V.H. Anal. Chem. 1993, 65, 2859–2872.
50. Somogyi, A.; Wysocki, V.H.; Mayer, I. J. Am. Soc. Mass Spectrom. 1994, 5, 704–717.
51. Zubarev, R.A.; Kelleher, N.L.; McLafferty, F.W. J. Am. Chem. Soc. 1998, 120, 3265–3266.
52. Loo, J.A.; Edmonds, C.G.; Smith, R.D. Science 1990, 248, 201–204.
53. Feng, R.; Konishi, Y. Anal. Chem. 1993, 65, 645–649.
54. Loo, J.A.; Quinn, J.P.; Ryu, S.I.; Henry, K.D.; Senko, M.W.; McLafferty, F.W. Proc. Natl. Acad. Sci. U.S.A. 1992, 89, 286–289.
55. Senko, M.W.; Beu, S.C.; McLafferty, F.W. Anal. Chem. 1994, 66, 415–417.
56. Schnier, D.S.; Gross, E.R.; Williams, E.R. J. Am. Chem. Soc. 1995, 117, 6747–6757.
57. Patrie, S.M.; Charlebois, J.P.; Whipple, D.; Kelleher, N.L.; Hendrickson, C.L.; Quinn, J.P.; Marshall, A.G.; Mukhopadhyay, B. J. Am. Soc. Mass Spectrom. 2004, 15, 1099–1108.
58. Kirschner, M.W. Cell 2005, 121, 503–504.
59. Srivastava, R.; Varner, J. Biotechnol. Prog. 2007, 23, 24–27.
60. Smith, J.C.; Lambert, J.P.; Elisma, F.; Figeys, D. Anal. Chem. 2007, 79, 4325–4344.
2
Glycosylation of Proteins
2.1 PRODUCTION OF A GLYCOPROTEIN

Glycosylation is the covalent addition of a carbohydrate chain to amino acid side chains of a protein, producing a glycoprotein. The carbohydrate side chains can be anywhere from 1 to 70 sugar units in length, branched or straight-chained, and are most commonly composed of mannose, galactose, N-acetylglucosamine, and sialic acid (SiA). The structures and abbreviations of these four carbohydrates are illustrated in Figure 2.1.

2.2 BIOLOGICAL PROCESSES OF PROTEIN GLYCOSYLATION

Protein glycosylation is the most common type of protein modification found in eukaryotes. The glycosylation of proteins helps to determine the proteins’ structure, is crucial in cell–cell recognition, and may also be involved in cellular signaling events, though it is most likely not as important in signaling as phosphorylation. Glycoproteins are primarily membrane proteins and are abundant in plasma membranes, where they are involved in cell-to-cell recognition processes. The carbohydrate groups of the membrane glycoproteins are positioned so that they extend outward from the cell membrane surface. An example of this involves erythrocytes, where the outward-extending carbohydrate groups of the cell membrane possess a negative charge and thus repel each other, which subsequently reduces the blood’s viscosity.
Proteomics of Biological Systems: Protein Phosphorylation Using Mass Spectrometry Techniques, First Edition. Bryan M. Ham. © 2012 John Wiley & Sons, Inc. Published 2012 by John Wiley & Sons, Inc. 59
[Figure 2.1 panels: β-D-mannose (Man); β-D-galactose (Gal); N-acetyl-β-D-glucosamine (GlcNAc); sialic acid (SiA).]

Figure 2.1. The most common four carbohydrates found in glycoproteins: mannose (Man), galactose (Gal), N-acetylglucosamine (GlcNAc), and sialic acid (SiA).
2.3 N-LINKED AND O-LINKED GLYCOSYLATION

Carbohydrates are linked to amino acids in two ways: (1) through the nitrogen atom of an amino acid side chain, called N-linked glycosylation, and (2) through the oxygen atom of a hydroxyl group, called O-linked glycosylation. In N-linked glycosylation, the carbohydrate is attached to the side chain amide nitrogen of asparagine (Asn). In O-linked glycosylation, the carbohydrate is attached to the side chain hydroxyl group of serine (Ser) or threonine (Thr). N-linked carbohydrates are linked through Asn and N-acetylglucosamine. The amino acid sequence associated with an N-linked carbohydrate contains a Ser or Thr separated from the Asn by one amino acid residue (any one of the 19 residues other than proline [Pro]), with the Ser or Thr located toward the C-terminus of the peptide chain relative to the Asn. The N-linked and O-linked carbohydrates are illustrated in Figure 2.2. A brief look at the structure of carbohydrates and the mass spectrometry (MS) behind their measurement and identification will be helpful in the discussion of glycosylation of proteins as a posttranslational modification.

Figure 2.2. Covalent linking of carbohydrates to the peptide amino acid backbone: N-linked carbohydrate to asparagine, O-linked carbohydrate to serine, and O-linked carbohydrate to threonine. In the N-linked structure, notice the one amino acid residue between the asparagine and the serine (which could also be threonine).

2.4 CARBOHYDRATES

Polysaccharides are biopolymers composed of long chains of repeating sugar units, usually either the same repeating monomer or a pattern of two alternating units. The sugar monomers that make up the
polysaccharide chain are called monosaccharides, from the Greek "mono" meaning "single" (the Greek "poly" means "many") and "saccharide" meaning "sugar." Glycogen and starch are two forms of storage polysaccharides, while cellulose is a structural polysaccharide. All three of these polysaccharides are built from the monosaccharide glucose as a single repeating unit.

Figure 2.3. Structures of α-D-glucose (the unit that repeats in starch and glycogen), where the hydroxyl group on carbon 1 points downward, and β-D-glucose (the unit that repeats in cellulose), where the hydroxyl group on carbon 1 points upward.

It is the bonding between the repeating glucose monosaccharide and the structural design through branching that gives
each one of these its special properties designed for specific functions. Starch is the storage polysaccharide found in plants, while glycogen is the storage form found in animals. Glycogen is primarily found in the liver, where it is used to maintain blood sugar levels, and in muscle tissue, where it is used for energy in muscle contraction. These polysaccharides are composed of α-D-glucose units linked together by α-glycosidic bonds between the one and four positions of the glucose units. Cellulose, found in the cell walls of plants, is a polysaccharide composed of the repeating glucose monomer β-D-glucose with a β-glycosidic bond between the one and four carbons of the sugar. Figure 2.3 illustrates the structures of α-D-glucose and β-D-glucose. At carbon number one, located at the right of the ring as drawn, the hydroxyl group points downward in α-D-glucose and upward in β-D-glucose.

Glucose, also known as the aldohexose D-glucose, is a six-carbon structure with the molecular formula C6H12O6. The sugars belong to the carbohydrate group, with the general formula CnH2nOn, where the carbon to hydrogen to oxygen ratio is 1:2:1. "Carbohydrate" is an archaic name from early chemical studies that described compounds such as sugars as hydrates of carbon, represented as Cn(H2O)n.

There are two types of linkages that connect the monosaccharides forming the polysaccharides: the α-glycosidic bond and the β-glycosidic bond. These two types of bonds are illustrated in Figure 2.4 for the disaccharides maltose and lactose.

Figure 2.4. (a) Maltose, composed of two α-D-glucose units connected by an α-glycosidic bond. (b) Lactose, composed of a β-D-galactose unit connected to a β-D-glucose unit by a β-glycosidic bond. (c) Sucrose, the common table sugar, in which an α-sugar is connected to a β-sugar (α-D-glucose and β-D-fructose) by an α-glycosidic bond.

In the disaccharide maltose structure illustrated in Figure 2.4a, there are two α-D-glucose units connected forming an α-glycosidic bond, while in lactose
(Fig. 2.4b), there is a β-D-galactose unit connected to a β-D-glucose unit forming a β-glycosidic bond. It is also possible to have an α-sugar connected to a β-sugar, as illustrated in Figure 2.4c for α-D-glucose and β-D-fructose, which form the common table sugar sucrose with an α-glycosidic bond.

While glycogen, starch, and cellulose are the most abundant polysaccharides in biological systems, the mass spectrometrist may encounter other polysaccharides that contain sugar units other than glucose. Examples include derivatives of β-glucosamine, a sugar in which an amino group has replaced the hydroxyl group on carbon atom 2, found in the cell walls of bacteria, and chitin, which is found in the exoskeletons of insects. There are also the oligosaccharides, shorter chains of sugar units than the polysaccharides, found as modifications on proteins, which are covered in this chapter.

2.4.1 Ionization of Oligosaccharides

Conventional pneumatically assisted electrospray has been observed to produce a low response for unmodified oligosaccharides analyzed by MS. Nano-electrospray, however, has been shown to give an enhanced response for oligosaccharides that is comparable with that observed for the ionization of peptides.1 This is due to the smaller size of the nano-electrospray droplets, which helps to increase the surface activity of the hydrophilic oligosaccharides. With electrospray, the response tends to decrease with increasing size of the oligosaccharide.2 This is not the case with matrix-assisted laser desorption ionization (MALDI), where the response stays essentially constant with increasing oligosaccharide size. The electrospray process, though, is a softer ionization process than MALDI, which has been observed to promote metastable fragmentation of the oligosaccharides.
The inclusion of metastable fragmentation can increase the complexity of the mass spectra obtained. As with the mass spectrometric analysis of most biomolecules, there are three choices for the ionized species: protonated [M + nH]n+ or metal-adducted [M + Na]+ or [M + Li]+, both measured in positive ion mode, or deprotonated [M − nH]n−, measured in negative ion mode. Product ion mass spectra of protonated oligosaccharides tend to be dominated by Bm- and Yn-type ions. Also, when collision-induced dissociation (CID) is performed on larger oligosaccharides, the product ions are primarily derived through glycosidic bond cleavage, giving sequence information but little information concerning branching. When CID is performed on smaller oligosaccharides, the product ions are produced through both glycosidic bond cleavage and cross-ring cleavage. This is due to the greater number of vibrational degrees of freedom in the larger oligosaccharides, which are able to dissipate the internal energy imparted to them by the collisions through vibrational relaxation. In high-energy CID studies, such as those performed with a sector mass analyzer or a MALDI time-of-flight (TOF)-TOF mass analyzer, protonated species were observed to fragment more efficiently at the glycosidic bonds than metal-adducted oligosaccharides. Negative ion mode analysis of deprotonated species does not give an efficient or high response for neutral oligosaccharides. Hydroxylic hydrogen migration has been suggested as a factor that limits the observation of abundant cross-ring cleavage ions in negative ion mode analysis of deprotonated ions while supporting glycosidic bond cleavages.3 Sialylated and sulfated oligosaccharides can give an enhanced response and informative product ion spectra as deprotonated species analyzed in negative ion mode.

2.4.2 Carbohydrate Fragmentation

Oligosaccharides are often found in nature as very complex mixtures of closely related, often isomeric, compounds. This, and the fact that they contain many labile bonds and can be highly branched, makes the analysis of oligosaccharides by MS difficult. In mass spectrometric analysis of polysaccharides using collision-induced dissociation to produce product ion spectra, the polysaccharides fragment according to specific pathways that have a standardized naming scheme. Figure 2.5 illustrates the types of fragmentation that can take place along the polysaccharide chain (adapted from Domon and Costello4). The sugar ring structure on the far left also shows the numbering scheme for the sugar ring carbon atoms.
Whether it is a five-membered ring or a six-membered ring, the counting starts with the first carbon to the left of the ring oxygen. The naming scheme in Figure 2.5 is illustrated for a three-sugar oligosaccharide with an R-group to the right of the structure; therefore, the naming starts at Y0 from the right. The general naming scheme for a polysaccharide with Rn sugar units would result in a naming of Yn and Zn directly left of the Rn, Yn+1 and Zn+1 for the next set, and so forth. We will take a closer look at the generation of the B-type ions and the Y-type ions in positive ion mode to better understand the generation of the product ions and their naming, using the more general scheme for a polysaccharide based on the scheme in Figure 2.5.

[Figure 2.5: three-sugar oligosaccharide with an R-group at the right, labeled with the glycosidic cleavages B1 to B3, C1 to C3, Y0 to Y2, and Z0 to Z2, the cross-ring A- and X-type cleavages, and the ring carbon numbering 1 to 6.]

Figure 2.5. Fragmentation pathways and naming scheme for polysaccharides.
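The glycosidic part of this naming scheme can be turned into a small calculation. The sketch below is an illustration only, under several assumptions that are not from the text: the chain is linear (unbranched), the reducing end is free (R = H), the precursor is singly protonated, and only glycosidic cleavages (B, C, Y, Z) are considered. The function name `glycosidic_fragments` and the residue dictionary are hypothetical helpers; the residue masses are the monoisotopic values listed in Table 2.1, and the proton mass is a standard value. Values may differ from the oxonium ion column of Table 2.1 in the fourth decimal place because the sketch adds the mass of a proton.

```python
# Monoisotopic residue masses (Da), taken from Table 2.1.
RESIDUE_MASS = {"Hex": 162.0528, "Fuc": 146.0579, "HexNAc": 203.0794, "SiA": 291.0954}
WATER = 18.0106    # H2O, monoisotopic (180.0634 - 162.0528 in Table 2.1)
PROTON = 1.00728   # mass of H+ (standard value, assumed here)


def glycosidic_fragments(residues):
    """Return singly protonated B, C, Y, and Z m/z values for a linear chain.

    `residues` is ordered from the nonreducing end to the reducing end.
    The subscript counts the sugar residues contained in the fragment.
    """
    masses = [RESIDUE_MASS[name] for name in residues]
    n = len(masses)
    ions = {}
    for i in range(1, n):
        left = sum(masses[:i])    # i residues on the nonreducing side
        right = sum(masses[i:])   # n - i residues on the reducing side
        ions[f"B{i}"] = left + PROTON                  # oxonium-type ion
        ions[f"C{i}"] = left + WATER + PROTON          # keeps the glycosidic oxygen
        ions[f"Y{n - i}"] = right + WATER + PROTON     # keeps the glycosidic oxygen
        ions[f"Z{n - i}"] = right + PROTON             # Y-type ion minus water
    return ions


if __name__ == "__main__":
    chain = ["Hex", "Hex", "HexNAc"]  # hypothetical trisaccharide
    for name, mz in sorted(glycosidic_fragments(chain).items()):
        print(f"{name}: {mz:.4f}")
```

A useful internal check on such a ladder is that each complementary pair (Bi with Yn−i, or Ci with Zn−i) sums to the neutral mass of the intact chain plus two protons.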
The generation of the B2-type and Y4-type ions is illustrated in Figure 2.6 for a six-sugar unit polysaccharide, obtained in positive ion mode. The B2-type ion is a charged oxonium ion in which the charge is contained within the sugar ring structure. The Y4-type ion carries a positive charge in the form of protonation. The production of the C2-type and Z4-type ions in positive ion mode is illustrated in Figure 2.7. Here, the C2-type ion carries a positive charge in the form of protonation derived from hydrogen transfer from the reducing side of the saccharide. The positive charge of the Z4-type ion is also from protonation, but the ion has also lost water to the nonreducing carbohydrate portion of the polysaccharide.

In negative ion mode, the fragmentation of the polysaccharides also takes place at the glycosidic bonds; however, ring opening and epoxide formation are observed in the product ions. The production of the B-type and Y-type ions in negative ion mode is illustrated in Figure 2.8. For the B-type ion, the deprotonated precursor dissociates such that the hydroxyl group goes with the reducing portion of the sugar, while the carbohydrate portion undergoes ring opening and epoxide formation.

Figure 2.6. Production of B2-type and Y4-type ions for a six-sugar unit polysaccharide in positive ion mode.

A similar mechanism is observed for the Y-type ion except
there is a proton transfer from the reducing portion of the structure to the carbohydrate portion, leaving the reducing end deprotonated and negatively charged. The production of the C-type and Z-type ions in negative ion mode is illustrated in Figure 2.9. In these pathways, the epoxide forms on the reducing end of the polysaccharide, and no ring opening takes place.

In positive ion mode, cleavage of the ring structure producing the A-type ion is not often observed. In negative ion mode, cleavage of the ring structure producing the A-type ion is observed.

Figure 2.7. Production of C2-type and Z4-type ions for a six-sugar unit polysaccharide in positive ion mode.

In both positive
and negative ion modes, the cleavage of the sugar ring producing the X-type ions is observed and produces a number of different types of product ion structures.

When product ions of oligosaccharides are generated using high-energy CID, the products differ between protonated oligosaccharides and metal ion adducts. Cross-ring cleavages have been observed to be enhanced for the metal ion adducts. In low-energy CID, the amount of glycosidic bond cleavage is low for the metal ion adduct species.

Figure 2.8. Production of B2-type and Y4-type ions for a six-sugar unit polysaccharide in negative ion mode.

This can be
explained by the type of bonding that takes place between the oligosaccharide and the proton or metal ion. In protonation, the proton is associated with the glycosidic oxygen, which is the most basic site in the structure. This destabilizes the glycosidic bond, resulting in charge-localized, charge-driven fragmentation of the glycosidic bond. With the metal ion, however, the bonding is more complex: the charge of the metal ion is not directly associated with the glycosidic bond oxygen but is distributed over a number of local oxygen atoms.

Figure 2.9. Production of C2-type and Z4-type ions for a six-sugar unit polysaccharide in negative ion mode.

This results in a higher activation barrier for the induced fragmentation of
the oligosaccharide. This effect is illustrated in Figure 2.10, where Figure 2.10a shows the low activation barrier for the protonated oligosaccharide and Figure 2.10b shows the high activation barrier for the metal ion adduct.

Figure 2.10. Fragmentation pathway mechanisms for the cleavage of the glycosidic bond through (a) protonation (low activation barrier) and (b) metal ion adduct (high activation barrier) (modified from Cancilla et al.5).

2.4.3 Complex Oligosaccharide Structural Elucidation

It should be noted that it is quite difficult to completely sequence highly branched and/or substituted polysaccharides, which are often many units in length. The number of possible combinations is quite large
and a systematic approach to sequencing complicated polysaccharides does not exist. For neutral carbohydrates, the best approach is to generate product ion mass spectra of metal adducts in positive ion mode with either electrospray ionization (ESI) or MALDI as the ionization source. Acidic oligosaccharides are measured in negative ion mode as deprotonated species using either ESI or MALDI as the ionization source. Typically, one cannot readily interpret and assign the structure of a complicated oligosaccharide from its tandem mass spectrum. The best approach is to combine experience with product ion spectral libraries to solve the structure of complicated, branched, and/or substituted oligosaccharides.

There are some examples where, with careful consideration, a complex oligosaccharide structure can be partially or sometimes fully elucidated by product ion spectral interpretation. High-energy product ion spectra of the hybrid glycan (Man)5(GlcNAc)4 were obtained by Clayton and Bateman6 using a magnetic sector mass spectrometer with a MALDI ionization source, a collision cell, and an orthogonal TOF mass analyzer, which allowed sequencing of the complex oligosaccharide. Figure 2.11 is the product ion spectrum from high-energy collision-induced dissociation of the hybrid glycan (Man)5(GlcNAc)4. The spectrum has been split into two sections to better illustrate the rich abundance of product ions generated. The product ion spectrum contains B-, Y-, and X-type ions that allowed the structural elucidation of the oligosaccharide. Figure 2.12a, b illustrates the assignment of the product ions in the Figure 2.11 spectrum to the structure of the oligosaccharide.
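The choice of ion species discussed in this section (metal adducts in positive ion mode for neutral carbohydrates, deprotonated species in negative ion mode for acidic ones) can be illustrated numerically. The following is a minimal sketch, assuming the monoisotopic residue masses of Table 2.1 and standard cation masses (atom mass minus one electron) that are not given in the text; the function names are hypothetical. For (Man)5(GlcNAc)4 the sodium adduct evaluates to about m/z 1663.58, which is in line with the MNa+ peak labeled at m/z 1663.5 in Figure 2.11.

```python
# Monoisotopic residue masses (Da), taken from Table 2.1.
RESIDUE_MASS = {"Man": 162.0528, "Gal": 162.0528, "Fuc": 146.0579,
                "GlcNAc": 203.0794, "SiA": 291.0954}
WATER = 18.0106  # H2O, monoisotopic

# Cation masses in Da (atom mass minus one electron); standard values,
# assumed here because the text does not list them.
CATION = {"H": 1.00728, "Na": 22.98922, "Li": 7.01545}


def neutral_mass(composition):
    """Neutral mass of a free glycan: sum of residue masses plus one water."""
    return sum(RESIDUE_MASS[name] * count
               for name, count in composition.items()) + WATER


def mz_cation_adduct(mass, cation="H", n=1):
    """m/z of [M + n*cation]n+ measured in positive ion mode."""
    return (mass + n * CATION[cation]) / n


def mz_deprotonated(mass, n=1):
    """m/z of [M - nH]n- measured in negative ion mode."""
    return (mass - n * CATION["H"]) / n


if __name__ == "__main__":
    glycan = {"Man": 5, "GlcNAc": 4}  # the hybrid glycan (Man)5(GlcNAc)4
    m = neutral_mass(glycan)
    print(f"[M + H]+  : {mz_cation_adduct(m, 'H'):.4f}")
    print(f"[M + Na]+ : {mz_cation_adduct(m, 'Na'):.4f}")
    print(f"[M + Li]+ : {mz_cation_adduct(m, 'Li'):.4f}")
    print(f"[M - H]-  : {mz_deprotonated(m):.4f}")
```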
2.5 THREE OBJECTIVES IN STUDYING GLYCOPROTEINS

In applying MS to the characterization of glycosylated proteins, the researcher is attempting to achieve three objectives: (1) to identify the glycosylated peptides and proteins; (2) to accurately determine the sites of glycosylation; and (3) to determine the carbohydrates that make up the glycan and the structure of the glycan.

2.6 GLYCOSYLATION STUDY APPROACHES

One approach that has been employed in the glycosylation characterization of proteins is to cleave the glycans from the proteins, separate them from the proteins, and subsequently measure them by MS.
[Figure 2.11: product ion spectrum in two panels, relative abundance (%) versus m/z (0 to 800 and 900 to 1650, each magnified x12.00), with labeled A-, B-, C-, X-, and Y-type fragment ions; Na+ is labeled at m/z 23.0 and MNa+ at m/z 1663.5.]

Figure 2.11. High-energy (800 eV) MALDI ± CID spectrum of the hybrid glycan (Man)5(GlcNAc)4 recorded from 2,5-dihydroxybenzoic acid (DHB). The type D ion is at m/z 874.4, and the presence of the bisecting GlcNAc residue is indicated by the ion 221 mass units lower at m/z 653.3. (Reprinted with permission. Harvey, D.J., Bateman, R.H., Green, M.R. High-energy collision-induced fragmentation of complex oligosaccharides ionized by matrix-assisted laser desorption/ionization mass spectrometry. J. Mass Spectrom. 1997, 32, 167–187. Copyright 1997. Copyright John Wiley & Sons Inc.)
However, this approach does not give any information as to which glycans belonged to which proteins and glycosylation sites.

2.6.1 MS of Glycopeptides

Increasingly, glycosylated proteins are being digested with an endoprotease to produce glycopeptides, which are then analyzed by MS. The mass spectrometric analysis of glycopeptides is not as well defined as for other modifications because of the heterogeneous nature of the oligosaccharide modifications. With glycosylation, the mass shift associated with the modification is not constant, in contrast to acetylation, oxidation, and phosphorylation (e.g., phosphorylation adds 80 Da to the peptide mass as HPO3).
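Two points made so far can be combined in a short sketch: candidate N-linked sites follow the Asn-X-Ser/Thr motif of Section 2.3 (X being any residue except Pro), and the mass added by a glycan depends on its composition rather than being a single fixed value. The peptide sequence, the glycan compositions, and the function names below are hypothetical and chosen only for illustration; the residue masses are from Table 2.1, and the phosphorylation shift is the monoisotopic mass of HPO3.

```python
import re

# Monoisotopic residue masses (Da), taken from Table 2.1.
RESIDUE_MASS = {"Hex": 162.0528, "Fuc": 146.0579, "HexNAc": 203.0794, "SiA": 291.0954}
PHOSPHO_SHIFT = 79.9663  # HPO3, monoisotopic; a constant shift for comparison

# Asn followed by any residue except Pro, then Ser or Thr.
# The lookahead lets overlapping motifs (e.g., N-N-S-T) all be reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(sequence):
    """Return 1-based positions of Asn residues in N-X-S/T motifs (X != Pro)."""
    return [match.start() + 1 for match in SEQUON.finditer(sequence)]


def glycan_shift(composition):
    """Mass added to a peptide by a glycan of the given residue composition.

    The glycan is attached by condensation, so the shift is the plain sum
    of the residue masses (no extra water).
    """
    return sum(RESIDUE_MASS[name] * count for name, count in composition.items())


if __name__ == "__main__":
    peptide = "AANCSVNPTGNNSTK"  # hypothetical sequence
    print("candidate N-linked sites:", find_sequons(peptide))

    # Hypothetical glycoforms: each gives a different mass shift.
    for glycoform in ({"HexNAc": 2, "Hex": 3},
                      {"HexNAc": 2, "Hex": 5},
                      {"HexNAc": 4, "Hex": 5, "SiA": 2}):
        print(glycoform, f"{glycan_shift(glycoform):.4f} Da")
    print("phosphorylation:", PHOSPHO_SHIFT, "Da")
```

Matching a motif only flags a candidate site; whether the site is actually occupied still has to come from the mass spectrometric data.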
Figure 2.12. Scheme to show the formation of the (a) glycosidic and (b) cross-ring fragment ions for the spectrum shown in Figure 2.11. (Reprinted with permission. Harvey, D.J., Bateman, R.H., Green, M.R. High-energy collision-induced fragmentation of complex oligosaccharides ionized by matrix-assisted laser desorption/ionization mass spectrometry. J. Mass Spectrom. 1997, 32, 167–187. Copyright 1997. Copyright John Wiley & Sons Inc.)
2.6.2 Mass Pattern Recognition

Sometimes, mass pattern recognition can be used in single-stage precursor ion scans to identify glycosylation. Table 2.1 lists the monoisotopic masses of the monosaccharides commonly found in glycosylated peptides along with their associated residue masses formed through water loss. Glycosylation heterogeneity appears in precursor mass spectra as a repeating pattern of peaks. This type of repeating pattern is indicative of the successive addition of a monosaccharide to the glycan chain of the glycosylated peptide.

TABLE 2.1. Common Monosaccharides and Their Associated Masses

Monosaccharide                   Formula     Mass (Da)   [M + H]+ (m/z)   Residue Mass (Da)   Oxonium Ion [M + H]+ (m/z)
Mannose (Man)                    C6H12O6     180.0634    181.0712         162.0528            163.0607
Galactose (Gal)                  C6H12O6     180.0634    181.0712         162.0528            163.0607
Fucose (Fuc)                     C6H12O5     164.0685    165.0763         146.0579            147.0657
Sialic acid (SiA)                C11H19NO9   309.1060    310.1138         291.0954            292.1032
N-acetylglucosamine (GlcNAc)     C8H15NO6    221.0899    222.0978         203.0794            204.0872
N-acetylgalactosamine (GalNAc)   C8H15NO6    221.0899    222.0978         203.0794            204.0872

2.6.2.1 High Galactose Glycosylation Pattern. The repeating pattern for a high galactose glycosylation pattern is illustrated in Figure 2.13, where repeating values of 54.0 and 40.4 Da are observed.

[Figure 2.13: mass spectrum, relative abundance (%) versus m/z (550 to 1150), with peaks labeled at m/z 678.3, 718.7, and 759.1 (spacing 40.4) and at m/z 904.1, 958.1, 1012.1, and 1066.1 (spacing 54).]

Figure 2.13. The repeating mass pattern for a high galactose glycosylation for a glycopeptide with molecular weight of 2709.3 Da. The series associated with a 54.0-Da difference represents the addition of a galactose moiety to the plus 3 (triply) charge state of the glycopeptide as [M + 3H]3+. The series associated with a 40.4-Da difference represents the addition of a galactose moiety to the plus 4 (quadruply) charge state of the glycopeptide as [M + 4H]4+.
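The spacing rule behind Figure 2.13 amounts to dividing the residue mass of the added monosaccharide by the charge state of the series. The sketch below is only an illustration of that arithmetic, assuming the monoisotopic residue masses of Table 2.1; the function names and the `max_charge` limit are hypothetical. For hexose the computed spacings are about 54.0 for the 3+ series and about 40.5 for the 4+ series, the latter being close to the 40.4 read from the rounded peak labels in the figure.

```python
# Monoisotopic residue masses (Da), taken from Table 2.1.
RESIDUE_MASS = {"Hex": 162.0528, "Fuc": 146.0579, "HexNAc": 203.0794, "SiA": 291.0954}


def series_spacing(residue, charge):
    """Expected m/z spacing when one residue is added at a given charge state."""
    return RESIDUE_MASS[residue] / charge


def infer_charge(observed_spacing, residue="Hex", max_charge=6):
    """Charge state whose expected spacing is closest to the observed spacing."""
    return min(range(1, max_charge + 1),
               key=lambda z: abs(series_spacing(residue, z) - observed_spacing))


if __name__ == "__main__":
    # Print the expected spacings for charge states +1 to +5.
    print("Sugar    " + "".join(f"{'+' + str(z):>9}" for z in range(1, 6)))
    for residue in RESIDUE_MASS:
        row = "".join(f"{series_spacing(residue, z):9.1f}" for z in range(1, 6))
        print(f"{residue:<9}{row}")

    # Spacings read from Figure 2.13.
    for spacing in (54.0, 40.4):
        print(f"spacing {spacing} -> charge state +{infer_charge(spacing)}")
```

The nearest-match lookup is a deliberate simplification: in a real spectrum the isotope spacing within a peak cluster (1/z) gives an independent check on the assigned charge.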
The glycopeptide in the mass spectrum has a molecular weight of 2709.3 Da. The series with a spacing of 54.0 Da represents the addition of a galactose moiety to the plus 3 (triply) charge state of the glycopeptide as [M + 3H]3+. The series with a spacing of 40.4 Da represents the addition of a galactose moiety to the plus 4 (quadruply) charge state of the glycopeptide as [M + 4H]4+. Each addition of a monosaccharide to the glycan chain is through a condensation reaction; therefore, the series will have a difference value of the monosaccharide mass minus water.

2.6.3 Charge State Determination

The spacing between the peaks of a series depends on the charge state: it equals the residue mass divided by the charge. Table 2.2 lists the pattern difference for glycopeptide heterogeneity residue addition.

TABLE 2.2. Pattern Difference in Mass for Glycopeptide Heterogeneity Residue Addition

Sugar                     Formula      +1      +2      +3     +4     +5
Hexose                    C6H10O5      162.1   81.0    54.0   40.4   32.4
d-Hexose (deoxyhexose)    C6H10O4      146.1   73      48.7   36.5   29.2
HexNAc                    C8H13NO5     203.2   101.6   67.7   50.8   40.6
SiA                       C11H17NO8    291.3   145.6   97.1   72.8   58.2

2.6.4 Diagnostic Fragment Ions

There are also diagnostic fragment ions, known as oxonium ions, that can be produced during collision-induced dissociation of glycosylated peptides. Hexose (Hex, the generic name for galactose and mannose) has an oxonium ion at m/z 163, fucose (Fuc) at m/z 147, SiA at m/z 292, N-acetylhexosamine (HexNAc) at m/z 204, and N-acetylglucosamine (GlcNAc) at m/z 204. The observation of these oxonium ions in product ion spectra has been used to help identify the glycan modification on a peptide. However, care must be taken in using oxonium ions as diagnostic peaks, such as in precursor ion scanning, because species other than glycans can also generate isobaric product ions.

2.6.5 High-Resolution/High-Mass Accuracy Measurement and Identification

Using a Finnigan LTQ-FT hybrid linear ion trap/Fourier transform ion cyclotron resonance (FTICR) mass spectrometer, Peterman et al. at
Thermo Electron Corporation (Somerset, NJ) reported a study of high-resolution/high-mass accuracy measurement and identification of glycopeptides in bovine fetuin. Figure 2.14 is a full-scan mass spectrum of a high-performance liquid chromatography (HPLC) peak eluting from a reverse phase C18 column (retention time of 17.7 minutes) from a tryptic digest sample of bovine fetuin. The eluting peak was identified as the T54–85 glycopeptide containing two glycoforms: the first as Hex5HexNAc4Neu5Ac3, with peaks at m/z 1176.7 for the +5 charge state and m/z 1470.7 for the +4 charge state, and the second as Hex6HexNAc5Neu5Ac3, with peaks at m/z 1308.0 for the +5 charge state and m/z 1634.7 for the +4 charge state. Mass accuracies of the monoisotopic peak for each ion are labeled in the figure. Also included in the spectrum are the identified structures of the glycan modifications. The open squares are for HexNAc, the filled squares are for Hex, and the open triangles are for Neu5Ac.

[Figure 2.14: full-scan mass spectrum, relative abundance versus m/z (400 to 2000), with labeled peaks at m/z 1176.72107 (z = 5, 1.51 ppm), 1307.96570 (z = 5, 1.10 ppm), 1470.64575 (z = 4, 2.77 ppm), and 1634.70618 (z = 4, 1.43 ppm), and singly charged peaks at m/z 536.16602 and 746.48755. Inset: survey scan over m/z 60 to 240 with peaks labeled at m/z 84.0, 125.9, 138.0, 167.9, and 186.0.]

Figure 2.14. Full-scan mass spectrum for the retention time of 17.6 minutes showing two T54–85 glycoforms with the oligosaccharide structures and mass accuracies listed. The open squares represent HexNAc, filled squares represent Hex, and open triangles represent Neu5Ac. The inset shows the survey scan acquired directly before the high-resolution/high-mass accuracy spectrum acquisition showing the relative intensity of the characteristic fragment ions from HexNAc. (Reprinted with permission. This article was published in J Am Soc Mass Spectrom, Peterman, S.M., Mulholland, J.J. A novel approach for identification and characterization of glycoproteins using a hybrid linear ion trap/FT-ICR mass spectrometer, 2006, 17, 168–179. Copyright Elsevier 2006.)
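The charge assignments in Figure 2.14 can be cross-checked from the peak positions alone. The sketch below assumes that two peaks are protonated forms [M + zH]z+ of the same species that differ by exactly one charge; the function names are hypothetical, the proton mass is a standard value, and the m/z values are those labeled in the figure.

```python
PROTON = 1.00728  # mass of H+ in Da (standard value, assumed here)


def charge_of_higher_peak(mz_high, mz_low):
    """Charge z of the peak at mz_high, given the (z + 1) peak at mz_low.

    From z * (mz_high - p) = (z + 1) * (mz_low - p) it follows that
    z = (mz_low - p) / (mz_high - mz_low).
    """
    return (mz_low - PROTON) / (mz_high - mz_low)


def neutral_mass(mz, z):
    """Neutral mass M of an [M + zH]z+ ion."""
    return z * (mz - PROTON)


if __name__ == "__main__":
    # Peak pairs labeled in Figure 2.14 (higher m/z first).
    for mz_high, mz_low in ((1470.64575, 1176.72107), (1634.70618, 1307.96570)):
        z = round(charge_of_higher_peak(mz_high, mz_low))
        print(f"m/z {mz_high}: z = {z}, M = {neutral_mass(mz_high, z):.3f} Da")
        print(f"m/z {mz_low}: z = {z + 1}, M = {neutral_mass(mz_low, z + 1):.3f} Da")
```

Both pairs give a charge of 4 for the higher m/z peak, and the neutral masses computed from the two charge states of each glycoform agree to within a few hundredths of a dalton.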
The inset illustrates a survey scan that was acquired directly before the full-scan mass spectrum. The survey scan is performed to look for diagnostic HexNAc fragment ions that indicate the presence of a glycopeptide in the eluting peak.

2.6.6 Digested Bovine Fetuin

Data-dependent product ion spectra were also collected for the glycopeptides eluting from the HPLC separation of the digested bovine fetuin. These were used for sequencing the oligosaccharides associated with the T54–85 peptide. Figure 2.15 illustrates the data-dependent MS/MS spectra for (Fig. 2.15a) the m/z 1470 +4 charge state glycopeptide and (Fig. 2.15b) the m/z 1635 +4 charge state glycopeptide.

[Figure 2.15: two product ion spectra, (a) and (b), relative abundance versus m/z (1300 to 2000), with neutral losses of Hex, HexNAc, Hex-HexNAc, and Neu5Ac annotated across the top between the labeled product ions.]

Figure 2.15. Full-scan data-dependent MS/MS spectra for the (a) m/z 1470 and (b) m/z 1635 glycopeptide precursor ions acquired in the linear ion trap. (Reprinted with permission. This article was published in J Am Soc Mass Spectrom, Peterman, S.M., Mulholland, J.J. A novel approach for identification and characterization of glycoproteins using a hybrid linear ion trap/FT-ICR mass spectrometer, 2006, 17, 168–179. Copyright Elsevier 2006.)

Across the top of the spectra, the associated mono- and oligosaccharides for
the difference between the product ions are listed. From the mass spectra, the structures of the glycans, in conjunction with the saccharide residue identifications, can be deduced successfully.

REFERENCES

1. Bahr, U.; Pfenninger, A.; Karas, M.; Stahl, B. Anal. Chem. 1997, 69, 4530–4535.
2. Harvey, D.J. Rapid Commun. Mass Spectrom. 1993, 7, 614–619.
3. Hofmeister, G.E.; Zhou, Z.; Leary, J.A. J. Am. Chem. Soc. 1991, 113, 5964–5970.
4. Domon, B.; Costello, C.E. Glycoconj. J. 1988, 5, 397–409.
5. Cancilla, M.T.; Penn, S.G.; Carroll, J.A.; Lebrilla, C.B. J. Am. Chem. Soc. 1996, 118, 6736–6745.
6. Clayton, E.; Bateman, R.H. Rapid Commun. Mass Spectrom. 1992, 6, 719–720.
3
Sulfation of Proteins as Posttranslational Modification

3.1 GLYCOSAMINOGLYCAN SULFATION

There are two primary types of sulfation involved in protein posttranslational modification (PTM): carbohydrate sulfation of cell surface glycans and sulfonation of protein tyrosine amino acid residues. We will begin with a look at carbohydrate sulfation, an important process that occurs in conjunction with extracellular communication. The glycosaminoglycan modification presented previously in Chapter 2 occurs on the portion of the membrane-bound protein that protrudes from the surface of the cell. The carbohydrate sulfotransferases are the predominant enzymes responsible for adding the sulfation modification to the cell surface proteins and are found in the extracellular matrix. Glycosaminoglycan sulfation is a precise process and has been found to be associated with the activation of cytokines and growth factors and with inflammation site endothelium adhesion.1–4

3.2 CELLULAR PROCESSES INVOLVED IN SULFATION

The sulfation of protein-bound carbohydrates is a process that has been associated with normal cellular processes, arthritis, cystic fibrosis, and pathogenic diseased states.4–8 Sulfation is used for the removal of proteins and other biomolecules, such as metabolic end products, steroids, and neurotransmitters,9 from the extracellular matrix or body due to the increased solubility and reduced bioactivity of the sulfated species. Specific glycoproteins that have
Proteomics of Biological Systems: Protein Phosphorylation Using Mass Spectrometry Techniques, First Edition. Bryan M. Ham. © 2012 John Wiley & Sons, Inc. Published 2012 by John Wiley & Sons, Inc. 81
82â•…â•… SULFATION OF PROTEINS AS POSTTRANSLATIONAL MODIFICATION
been observed to undergo sulfation modification include the gonadotropin hormones follitropin, lutropin, and thyrotropin,10 mucins,11 and erythropoietin.12 3.3 BRIEF EXAMPLE OF PHOSPHORYLATION In Chapter 4, we will see that phosphorylation PTM of peptides predominantly takes place on the free hydroxyl moiety of the serine (Ser), threonine (Thr), and tyrosine amino acid residues by kinase enzymes. The amino acid residue phosphorylation mechanism is through the use of the activated donor molecule adenosine triphosphate (ATP). There are similarities to sulfation as we will see next. 3.4 SULFOTRANSFERASE CLASS OF ENZYMES The eukaryotic cell process of sulfation is through a class of enzymes known as sulfotransferase. The sulfotransferase class of enzymes is composed of two general types that either sulfate small molecules such as steroids and metabolic products, the cytosolic sulfotransferase, or the membrane-bound, Golgi-localized sulfotransferase that will sulfate glycoproteins or the tyrosine residue of proteins (to be covered in the next section). The transfer of the sulfonate (SO−3 ) group to the glycoprotein (or amino acid hydroxyl group) is through the activated donor phosphoadenosine phosphosulfate (PAPS) that is produced through the combination of SO2− 4 and ATP synthesized by the protein PAPS synthase.13 An example of a sulfated glycoprotein is illustrated in Figure 3.1. The glycan has been O-linked to a general Ser/Thr amino acid residue within the protein’s peptide chain. The mechanism for the sulfation of a saccharide is illustrated in Figure 3.2. 3.5 FRAGMENTATION NOMENCLATURE FOR CARBOHYDRATES When glycated peptides are fragmented by tandem mass spectrometry (MS), the product ions produced are classified according to a naming nomenclature that is similar to the peptide nomenclature presented in Figure 3.3. 
The fragmentation nomenclature for carbohydrates is covered extensively in Chapter 2.14 A brief overview is nevertheless presented here to illustrate the product ions formed upon collision-induced dissociation of sulfated and glycosylated mucins. Figure 3.4 illustrates the general types of fragmentation observed in tandem MS of carbohydrates.

Figure 3.1. A sulfated glycoprotein. The glycan has been O-linked to a general Ser/Thr amino acid residue within the protein's peptide chain.

3.6 SULFATED MUCIN OLIGOSACCHARIDES

Glycosylated peptides give an enhanced response in negative ion mode, and the two ion modes (positive and negative) yield different product ions. In positive ion mode, product ions of types B and Y are observed, while in negative ion mode, Y-, B-, C-, and Z-type ions are often observed. The product ion spectra in Figure 3.5 are examples of sulfated mucin oligosaccharides that were chemically released from the mucins and converted to alditols prior to high-performance liquid chromatography (HPLC) separation and mass spectral analysis. The two spectra were obtained from isomers of an m/z 975 oligosaccharide, collected in negative ion mode as the deprotonated species [M − H]−. The spectra show B-, Y-, and Z-type ions produced by fragmentation of the oligosaccharide. Also observed in the spectra is a peak at m/z 97, corresponding to the HSO4− fragment ion, which is indicative of sulfation.
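The m/z 97 sulfate marker discussed in this section lends itself to a simple automated check. The following is a minimal sketch, assuming a centroided negative-ion product spectrum supplied as (m/z, intensity) pairs. The function name, the 0.5 m/z tolerance, the 1% relative-intensity cutoff, and the example peak list are illustrative assumptions, not values taken from the cited study.

```python
# Monoisotopic m/z of the hydrogen sulfate anion, HSO4- (nominal m/z 97).
HSO4_MZ = 96.9601


def has_sulfate_marker(peaks, tol=0.5, min_rel_intensity=1.0):
    """Return True if a peak near m/z 97 exceeds the relative-intensity cutoff.

    peaks: list of (mz, intensity) tuples from a negative-ion product spectrum.
    tol: half-width of the m/z window (hypothetical default for unit-resolution data).
    min_rel_intensity: cutoff in percent of the base peak (hypothetical default).
    """
    if not peaks:
        return False
    base = max(intensity for _, intensity in peaks)
    if base <= 0:
        return False
    for mz, intensity in peaks:
        in_window = abs(mz - HSO4_MZ) <= tol
        if in_window and 100.0 * intensity / base >= min_rel_intensity:
            return True
    return False


# Made-up peak list, loosely patterned on the peak labels of Figure 3.5.
example = [(97.0, 12.0), (282.1, 30.0), (667.2, 100.0), (813.3, 45.0)]
print(has_sulfate_marker(example))
```

A check of this kind only flags candidate spectra. Assigning the position of the sulfate group still requires the B-, Y-, and Z-type ions described above.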
Figure 3.2. Mechanism for biosynthesis of sulfated saccharide. (Scheme labels: sulfotransferase; 3′-phosphoadenosine-5′-phosphosulfate (PAPS); 3′,5′-ADP.)
Examples of fragmentation pathway mechanisms for the production of selected product ions from Figure 3.5 are presented in Figure 3.6.

3.7 TYROSINE SULFATION

As will be discussed in Chapter 5, the tyrosine residue can be phosphorylated, although it is modified far less often than Ser and Thr. Sulfation also takes place on tyrosine, and it is the most common form of PTM that involves tyrosine.
Figure 3.3. Fragmentation pathway leading to the production of the b and y ions by collision-induced dissociation of the polypeptide backbone chain. (Scheme labels: mobile proton; b2 ion; y2 ion.)
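The b and y ions of Figure 3.3 have predictable masses, which is what makes them useful for sequencing. The sketch below is a minimal illustration, assuming an unmodified peptide, singly protonated fragments, and standard monoisotopic residue masses. The function name and the example sequence are hypothetical.

```python
# Monoisotopic residue masses (Da) of the 20 common amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
PROTON = 1.007276   # Da
WATER = 18.010565   # Da


def by_ions(sequence):
    """Return (b, y) lists of singly protonated fragment m/z values.

    b[i] is the b(i+1) ion (N-terminal fragment); y[i] is the y(i+1) ion
    (C-terminal fragment, which keeps the C-terminal OH and an extra H).
    """
    masses = [RESIDUE_MASS[aa] for aa in sequence]
    n = len(masses)
    b = [sum(masses[:i]) + PROTON for i in range(1, n)]
    y = [sum(masses[n - i:]) + WATER + PROTON for i in range(1, n)]
    return b, y


b, y = by_ions("GASP")  # arbitrary example sequence
for i, (b_mz, y_mz) in enumerate(zip(b, y), start=1):
    print(f"b{i} = {b_mz:.4f}   y{i} = {y_mz:.4f}")
```

For a modified peptide, the mass of the modification (for example, about 80 Da for a sulfo or phospho group) would be added to every fragment that contains the modified residue, provided the modification survives fragmentation.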
Figure 3.4. Carbohydrate fragmentation ion types and associated nomenclature.14 (Fragment labels shown: glycosidic cleavages B, C, Y, and Z, and cross-ring cleavages 0,2A1, 2,4A2, 2,5A3, and 1,5X1.)
Figure 3.5. Electrospray ionization (ESI) tandem mass spectra of two isomers with [M − H]− m/z 975 that eluted at 46.5 (975x) and 49.5 minutes (975y), obtained from porcine stomach mucins. The fragmentation nomenclature is according to Figure 3.4. The region marked ×5.0 has been magnified five times. Axes: relative intensity (%) versus m/z. (Reprinted with permission from Thomsson, K.A., Karlsson, H., Hansson, G.C. Anal. Chem. 2000, 72, 4543–4549. Copyright 2000 American Chemical Society.)
Figure 3.6. Fragment annotations applied in this study based on the suggested nomenclature by Domon and Costello14 and our previous report. (Reprinted with permission from Thomsson, K.A., Karlsson, H., Hansson, G.C. Anal. Chem. 2000, 72, 4543–4549. Copyright 2000 American Chemical Society.)
Within the total protein content of an organism, it is estimated that up to 1% of all tyrosine residues may be sulfated.15 The covalent modification of tyrosine by sulfate was first reported in 1954 by Bettelheim.16 Sulfation takes place on transmembrane and secreted proteins. In eukaryotic cells, sulfation occurs in the trans-Golgi, where the membrane-bound enzyme tyrosylprotein sulfotransferase (TPST) catalyzes the sulfation of proteins synthesized in the rough endoplasmic reticulum (RER).

3.8 TYROSYLPROTEIN SULFOTRANSFERASES TPST1 AND TPST2

The mechanism for sulfation of the tyrosine residue, which forms a tyrosine O-sulfate ester (an arylsulfate), is the same as for glycosaminoglycan sulfation: the sulfo group is supplied by the activated donor 3′-phosphoadenosine-5′-phosphosulfate (PAPS). Two TPSTs have been isolated and identified by researchers: TPST1 and TPST2.17,18 Others have observed that N-sulfation and S-sulfation can occur19 and that, on rare occasions, Ser and Thr can be sulfated.20 The biosynthesis pathway for the sulfation of the tyrosine residue by TPSTs, with PAPS as the activated donor, is illustrated in Figure 3.7.
Figure 3.7. Tyrosine sulfation reaction mechanism. (Scheme labels: tyrosylprotein sulfotransferase (TPST); 3′-phosphoadenosine-5′-phosphosulfate (PAPS); 3′,5′-ADP.)
3.9 O-SULFATED HUMAN PROTEINS

Sulfation of the tyrosine residue is directly involved in the recognition processes of protein-to-protein interactions of membrane-bound and secreted proteins. A 2007 article by Woods et al.21 at the National Institutes of Health (NIH) lists proteins that have been observed to be sulfated, including plasma membrane proteins, adhesion proteins, immune components, secretory proteins, and coagulation factors. The O-sulfated human proteins included in the Swiss-Prot database are listed in Table 3.1. At the time of writing, the UniProt database listed 275 proteins that are tyrosine-sulfated (http://www.uniprot.org). No enzymatic mechanism that desulfates the modified proteins has been observed to date, and because sulfation is pH stable, proteins carrying a sulfation modification can be observed in excreted urine.
3.10 SULFATED PEPTIDE PRODUCT ION SPECTRA

Product ion mass spectra of sulfated peptides obtained by collision-induced dissociation (CID) resemble those of phosphorylated peptides in that the major product ion arises from the neutral loss of sulfur trioxide, SO3 (−80 Da), from the precursor ion, often with little other product ion information. This is illustrated in the product ion mass spectrum of Figure 3.8, which shows two predominant peaks: m/z 647 for the doubly protonated, sulfated precursor ion [M + 2H]2+ and m/z 607 for the product of the neutral loss of the sulfo group, [M + 2H − SO3]2+. A loss of 80 Da from the precursor ion is also observed for the neutral loss of HPO3 from phosphorylated tyrosine and histidine residues. The exact mass of HPO3 is 79.9663 Da, while the exact mass of SO3 is 79.9568 Da, a difference of 0.0095 Da, or 119 parts per million (ppm) relative to the mass of the modification. The fragmentation pathway mechanisms for the loss of the phosphate and sulfate modifications are the same and leave the original, unmodified tyrosine residue, in contrast to the dehydroalanine produced by loss of H3PO4 (98 Da). The gas-phase fragmentation pathway mechanism for the neutral loss of the sulfate modification of tyrosine during CID is illustrated in Figure 3.9.
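The arithmetic behind the 80 Da ambiguity can be made explicit. The sketch below is a minimal illustration that uses only the masses quoted in this section and the precursor m/z from Figure 3.8. The function names are hypothetical, and the proton mass is the standard value.

```python
PROTON = 1.007276  # Da

HPO3 = 79.9663  # Da, exact mass quoted in the text
SO3 = 79.9568   # Da, exact mass quoted in the text


def ppm(delta, reference):
    """Express a mass difference in parts per million of a reference mass."""
    return delta / reference * 1e6


def neutral_mass(mz, charge):
    """Neutral mass of an [M + zH]z+ ion from its m/z and charge."""
    return (mz - PROTON) * charge


def loss_shift(loss_mass, charge):
    """Decrease in m/z caused by a neutral loss from an ion of the given charge."""
    return loss_mass / charge


delta = HPO3 - SO3
peptide = neutral_mass(647.72, 2)  # sulfated precursor of Figure 3.8

print(f"HPO3 - SO3 = {delta:.4f} Da")
print(f"relative to the modification mass: {ppm(delta, HPO3):.0f} ppm")
print(f"relative to the peptide mass ({peptide:.2f} Da): {ppm(delta, peptide):.1f} ppm")
print(f"expected m/z shift for SO3 loss from a 2+ ion: {loss_shift(SO3, 2):.2f}")
```

The second figure is the one that matters in practice: measured on the intact peptide rather than on the 80 Da group alone, the same 0.0095 Da difference corresponds to only a few ppm, so distinguishing sulfation from phosphorylation by mass requires correspondingly high mass accuracy. The last line shows why the neutral loss appears as a shift of about 40 m/z units between the two peaks of the doubly charged ion in Figure 3.8.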
TABLE 3.1. Proteins Listed in Swiss-Prot That Are Annotated as Having at Least One O-Sulfated Amino Acid Residue

Swiss-Prot Name | Description | Sulfation Site(s)a | Sequence Length
1A01_HUMAN | HLA class I histocompatibility antigen, A-1 alpha chain precursor (MHC class I antigen A*1) | 83 | 365
7B2_HUMAN | Neuroendocrine protein 7B2 precursor | – | 212
A2AP_HUMAN | Alpha-2-antiplasmin precursor (alpha-2-plasmin inhibitor) (alpha-2-PI) | 484 | 491
AMD_HUMAN | Peptidyl-glycine alpha-amidating monooxygenase precursor | 961 | 973
AMPN_HUMAN | Aminopeptidase N (EC 3.4.11.2) (hAPN) (alanyl aminopeptidase) | 175, 418, 423, 912 | 966
C3AR_HUMAN | C3a anaphylatoxin chemotactic receptor (C3a-R) (C3AR) | 174, 184, 318 | 482
C5AR_HUMAN | C5a anaphylatoxin chemotactic receptor (C5a-R) (C5aR) (CD88 antigen) | 11, 14 | 350
CCKN_HUMAN | Cholecystokinins precursor (CCK) | 111, 113 | 115
CCR2_HUMAN | C-C chemokine receptor type 2 (C-C CKR-2) (CC-CKR-2) (CCR-2) (CCR2) | 26 | 374
CCR5_HUMAN | C-C chemokine receptor type 5 (C-C CKR-5) (CC-CKR-5) (CCR-5) (CCR5) | 3, 10, 14, 15 | 352
CMGA_HUMAN | Chromogranin A precursor (CgA) (pituitary secretory protein I [SP-I]) | – | 457
CO4A_HUMAN | Complement C4-A precursor (acidic complement C4) | 1417, 1420, 1422 | 1744
CO4B_HUMAN | Complement C4-B precursor (basic complement C4) | 1417, 1420, 1422 | 1744
CO5A1_HUMAN | Collagen alpha-1(V) chain precursor | 234, 236, 338, 340, 346, 347, 416, 417, 420, 421, 1601, 1604 | 1838
CXCR4_HUMAN | C-X-C chemokine receptor type 4 (CXC-R4) (CXCR-4) | 21 | 352
DERM_HUMAN | Dermatopontin precursor (tyrosine-rich acidic matrix protein [TRAMP]) | 23, 162, 164, 166, 167, 194 | 201
FA5_HUMAN | Coagulation factor V precursor (activated protein C cofactor) | 693, 724, 726, 1522, 1538, 1543, 1593 | 2224
FA8_HUMAN | Coagulation factor VIII precursor (procoagulant component) | 365, 414, 426, 737, 738, 742, 1683, 1699 | 2351
FA9_HUMAN | Coagulation factor IX precursor (EC 3.4.21.22) (Christmas factor) | 201 | 461
FETA_HUMAN | Alpha-fetoprotein precursor (alpha-fetoglobulin) | – | 609
FIBG_HUMAN | Fibrinogen gamma chain precursor | 444, 448 | 453
FINC_HUMAN | Fibronectin precursor (FN) (cold-insoluble globulin [CIG]) | 876, 881 | 2386
GAST_HUMAN | Gastrin precursor (contains: gastrin 71 [component I]; gastrin 52) | 87 | 101
GP1BA_HUMAN | Platelet glycoprotein Ib alpha chain precursor (glycoprotein Ib alpha) | 292, 294, 295 | 626
HEP2_HUMAN | Heparin cofactor 2 precursor (heparin cofactor II [HC-II]) | 79, 92 | 499
MFAP2_HUMAN | Microfibrillar-associated protein 2 precursor (MFAP-2) | 47, 48, 50 | 183
MGA_HUMAN | Maltase-glucoamylase, intestinal (includes: maltase [EC 3.2.1.20]) | 415, 424, 1281 | 1856
NID1_HUMAN | Nidogen-1 precursor (entactin) | 289, 296 | 1247
OMD_HUMAN | Osteomodulin precursor (osteoadherin) | 25, 31, 39 | 421
OPT_HUMAN | Opticin precursor (oculoglycan) | 71 | 332
ROR2_HUMAN | Tyrosine-protein kinase transmembrane receptor ROR2 precursor | 469, 471 | 943
SCG1_HUMAN | Secretogranin-1 precursor (secretogranin I [SgI]) (chromogranin B) | 173, 341 | 677
SCG2_HUMAN | Secretogranin-2 precursor (secretogranin II [SgII]) (chromogranin C) | 151 | 617
SELPL_HUMAN | P-selectin glycoprotein ligand 1 precursor (PSGL-1) (selectin P ligand) (CD162 antigen) | 46, 48, 51 | 412
SUIS_HUMAN | Sucrase-isomaltase, intestinal (contains: sucrase [EC 3.2.1.48]) | 236, 238, 390, 399, 666, 762, 764 | 1826
THYG_HUMAN | Thyroglobulin precursor | 24 | 2768
VTNC_HUMAN | Vitronectin precursor (serum spreading factor) (S-protein) (V75) | 75, 78, 282 | 478

Reprinted with permission. This article was published in Biochim Biophys Acta, Monigatti, F., Hekking, B., Steen, H. Protein sulfation analysis – a primer, 2006, 1764, 1904–1913. Copyright Elsevier 2006.
a Some proteins are known to be sulfated, but the sites of modification are not known. It can clearly be seen that sulfation preferentially occurs in clusters.
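The footnote's observation that sulfation sites occur in clusters can be checked directly against the site lists. The sketch below is a minimal illustration using four site lists as read from Table 3.1. The grouping function and the 10-residue gap threshold are hypothetical choices made for illustration, not a definition taken from the cited article.

```python
# Sulfation sites for a few entries, as read from Table 3.1.
SITES = {
    "CCR5_HUMAN": [3, 10, 14, 15],
    "CO4A_HUMAN": [1417, 1420, 1422],
    "GP1BA_HUMAN": [292, 294, 295],
    "FA8_HUMAN": [365, 414, 426, 737, 738, 742, 1683, 1699],
}


def clusters(sites, max_gap=10):
    """Group site positions so that neighbours within a group are at most
    max_gap residues apart (max_gap is an arbitrary illustrative threshold)."""
    groups = []
    for pos in sorted(sites):
        if groups and pos - groups[-1][-1] <= max_gap:
            groups[-1].append(pos)
        else:
            groups.append([pos])
    return groups


for name, sites in SITES.items():
    grouped = clusters(sites)
    multi = [g for g in grouped if len(g) > 1]
    print(f"{name}: {grouped} ({len(multi)} cluster(s) with more than one site)")
```

With this threshold, the three shorter site lists each fall into a single cluster, while the factor VIII list contains one tight group and several isolated sites. The outcome depends on the threshold chosen.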
3.11 USE OF HIGHER ENERGY COLLISIONS

Increasing the collision energy generates further product ions from the m/z 647 sulfated species illustrated in Figure 3.8, giving peptide sequence information in the form of b- and y-type ions. This can allow the peptide to be sequenced and the protein from which it originated to be identified. However, the product ions induced by higher energy collisions are generated from the precursor after the loss of the labile sulfo moiety; thus, it is often not possible to identify the modification site. The loss of the labile sulfo group is proton (charge) driven. Conditions that allow or favor charge-remote fragmentation can yield product ions that still retain the sulfo moiety, which gives the opportunity to determine the site of the sulfo modification in the peptide.

Figure 3.8. Product ion mass spectrum of a sulfated peptide at m/z 647 for [M + 2H]2+. The mass spectrum is dominated by the precursor ion and the associated SO3 neutral loss product ion at m/z 607. (Peak labels: [M + 2H]2+ at m/z 647.72; [M + 2H − SO3]2+ at m/z 607.71.)

Figure 3.9. Gas-phase fragmentation pathway mechanism for the neutral loss of the sulfate modification of tyrosine during CID product ion production.

3.12 ELECTRON CAPTURE DISSOCIATION (ECD)

One approach to product ion generation that has demonstrated this capability is ECD in Fourier transform ion cyclotron resonance (FTICR) mass spectrometers. As we saw previously, ECD fragmentation is not based on activation of the precursor through repeated collisions but on the capture of thermal electrons, which tends to promote backbone dissociation while retaining labile modifications (sulfation, phosphorylation, etc.). Figure 3.10 illustrates a product ion spectrum of drosulfakinin, a polypeptide known to have a sulfated tyrosine residue, in the +2 charge state, [M + 2H]2+. The product ion spectrum contains considerable backbone fragmentation, allowing sequence determination together with modification site determination. Again, one drawback of ECD is the required use of an FTICR mass
spectrometer. As was presented previously, electron transfer dissociation, which is used in ion traps, will most likely contribute to studies of sulfation PTM of peptides.

Figure 3.10. ECD product ion mass spectrum of drosulfakinin. (Reprinted with permission from Haselmann, K.F., Budnik, B.A., Olsen, J.V., Nielsen, M.L., Reis, C.A., Clausen, H., Johnsen, A.H., Zubarev, R.A. Anal. Chem. 2001, 73, 2998–3005. Copyright 2001 American Chemical Society.)

3.13 SULFATION VERSUS PHOSPHORYLATION

A comparison of the biology, biochemistry, and MS of sulfation versus phosphorylation is given in Table 3.2. The table gives a brief description of the subcellular protein locations associated with each PTM, the modification stabilities, and behaviors relevant to mass spectral analysis, such as ionization stabilities and the masses corresponding to the two modifications.

TABLE 3.2. Summary of the Relevant Characteristics of Phosphorylation and Sulfation as Protein Posttranslational Modifications

Category | Characteristic | Phosphorylation | Sulfation
Biology | Function | Activation, inactivation, modulation of protein interaction | Modulation of protein interaction
Biology | Location | Intracellular | Extracellular (membrane bound)
Biology | Reversibility | Reversible | Irreversible
Biology | Location of modifying enzyme | Cytosolic and nuclear | Membrane bound in trans-Golgi network
Biochemistry | Radioactive isotopes | 32P and 33P | 35S
Biochemistry | Chemical stability | pY is stable; pS/pT are alkaline labile; pE/pD/pH are acid labile | sY is acid labile
Biochemistry | Edman compatibility | pY: compatibility, limited solubility; pS undergoes β-elimination to dehydroalanine; pT undergoes β-elimination giving rise to many different side products | sY is hydrolyzed during TFA cleavage step
Biochemistry | Removal | Phosphatases; pS/pT: harsh alkaline treatment, many side products | Arylsulfatases of limited use; quantitative hydrolysis: 1 M HCl, 95°C, 5 minutes
Biochemistry | Enrichment | α-pY and antiphosphodomain antibodies available; IMAC, TiO2 | No antibody; no affinity reagents
Mass spectrometry | Property | Acidic (2−) | Acidic (1−)
Mass spectrometry | Mass difference | +80 Da (79.9663 Da) | +80 Da (79.9568 Da)
Mass spectrometry | Stability during ionization | ESI: pS/pT good, pY stable; MALDI: pS/pT partial loss possible, pY stable | ESI: sY easily lost under standard conditions; MALDI: +ve complete loss, −ve partial loss
Mass spectrometry | Stability during CID | pY: stable; pS/pT: sequence-dependent loss of H3PO4 | Complete loss; multiple sYs are more stable
Mass spectrometry | Signature after CID | pS: dehydroalanine (69 Da); pT: dehydroaminobutyric acid (83 Da) | None
Mass spectrometry | Characteristic neutral loss | –H3PO4 (−98 Da); –(H3PO4 + H2O) (−116 Da) | –SO3 (−80 Da)
Mass spectrometry | Characteristic fragment ion | PO3– (−79 Da) | SO3– (−80 Da)

Reprinted with permission. This article was published in Biochim Biophys Acta, Monigatti, F., Hekking, B., Steen, H. Protein sulfation analysis – a primer, 2006, 1764, 1904–1913. Copyright Elsevier 2006.
IMAC, immobilized metal affinity chromatography; ESI, electrospray ionization; MALDI, matrix-assisted laser desorption ionization; TFA, trifluoroacetic acid.

REFERENCES
1. Hooper, L.V.; Manzella, S.M.; Baenziger, J.U. FASEB J. 1996, 10, 1137–1146.
2. Bowman, K.G.; Bertozzi, C.R. Chem. Biol. 1996, 6, R9–R22.
3. Habuchi, O. Biochim. Biophys. Acta 2000, 1474, 115–127.
4. Bowman, K.G.; Cook, B.N.; de Graffenried, C.L.; Bertozzi, C.R. Biochemistry 2001, 40, 5382–5391.
5. Plaas, A.H.K.; West, L.A.; Wong Palms, S.; Nelson, F.R.T. J. Biol. Chem. 1998, 273(20), 12642–12649.
6. Bayliss, M.T.; Osborne, D.; Woodhouse, S.; Davidson, C. J. Biol. Chem. 1999, 274(22), 15892–15900.
7. Desaire, H.; Sirich, T.L.; Leary, J.A. Anal. Chem. 2001, 73(15), 3513–3520.
8. Jiang, H.; Irungu, J.; Desaire, H. J. Am. Soc. Mass Spectrom. 2005, 16, 340–348.
9. Falany, C.N. FASEB J. 1997, 22, 206–216.
10. Green, E.D.; Baenziger, J.U. J. Biol. Chem. 1988, 26, 36–44.
11. Thomsson, K.A.; Karlsson, H.; Hansson, G.C. Anal. Chem. 2000, 72, 4543–4549.
12. Kawasaki, N.; Haishima, Y.; Ohta, M.; Satsuki, I.; Hyuga, M.; Hyuga, S.; Hayakawa, T. Glycobiology 2001, 11, 1043–1049.
13. Hemmerich, S.; Rosen, S.D. Glycobiology 2000, 10, 849–856.
14. Domon, B.; Costello, C.E. Glycoconj. J. 1988, 5, 397–409.
15. Baeuerle, P.A.; Huttner, W.B. J. Biol. Chem. 1985, 260, 6434–6439.
16. Bettelheim, F.R. J. Am. Chem. Soc. 1954, 76, 2838–2839.
17. Beisswanger, R.; Corbeil, D.; Vannier, C.; Thiele, C.; Dohrmann, U.; Kellner, R.; Ashman, K.; Niehrs, C.; Huttner, W.B. Proc. Natl. Acad. Sci. U.S.A. 1998, 95, 11134–11139.
18. Ouyang, Y.B.; Moore, K.L. J. Biol. Chem. 1998, 273, 24770–24774.
19. Huxtable, R.J. Biochemistry of Sulfur. New York: Plenum, 1986.
20. Medzihradszky, K.F.; Darula, Z.; Perlson, E.; Fainzilber, M.; Chalkley, R.J.; Ball, H.; Greenbaum, D.; Bogyo, M.; Tyson, D.R.; Bradshaw, R.A.; Burlingame, A.L. Mol. Cell. Proteomics 2004, 3, 429–440.
21. Woods, A.S.; Wang, H.Y.J.; Jackson, S.N. J. Proteome Res. 2007, 6, 1176–1182.
4
Eukaryote PTM as Phosphorylation: Normal State Studies

4.1 MASS SPECTRAL MEASUREMENT WITH EXAMPLES OF HELA CELL PHOSPHOPROTEOME

4.1.1 Introduction

As stated in the introduction in Chapter 1, the genomic deoxyribonucleic acid (DNA) sequencing that has been accomplished cannot give direct information about posttranslational modifications (PTMs) such as glycosylation (see Chapter 2) and sulfation (see Chapter 3). Another major form of PTM is the phosphorylation of proteins, a significant regulatory mechanism that controls a variety of biological functions in most organisms. Processes regulated by phosphorylation include gene expression, the cell cycle, apoptosis, cytoskeletal regulation, and signal transduction. It has been estimated that up to 30% of all human proteins exist in a phosphorylated form and that 2% of the human genome encodes protein kinases (>2000 genes).1 In eukaryotic cells, protein phosphorylation takes place on serine (Ser), threonine (Thr), and tyrosine (Tyr) residues. Reversible phosphorylation of these residues is an integral part of cellular processes involving signal transduction,2,3 so identifying the phosphorylation sites is important for understanding cellular signal transduction.

4.1.2 Protein Phosphatase and Kinase

Protein phosphatases are enzymes responsible for the removal of phosphate groups from a target (which makes protein phosphorylation reversible), and protein kinases are enzymes responsible for the addition of phosphate groups to a target. These two classes of enzymes work together to control cellular processes and signaling pathways. The literature has given greater attention to signaling pathways that primarily involve protein kinases than to specific types of phosphatase.4–9 However, the importance of studying protein phosphatases and their targets has been demonstrated in recently reported disease state studies in which the abnormal condition was attributed, at least in part, to malfunctioning protein phosphatase enzymes.10–12 Covalent modification such as phosphorylation alters the activity of the modified enzyme: it may be activated, inactivated, or otherwise regulated upward or downward. The most common mechanism for phosphorylation is the transfer of a phosphate group from adenosine triphosphate (ATP) to the hydroxyl group of Ser, Thr, or Tyr within the protein. Figure 4.1 illustrates the general cellular mechanism involving protein kinase phosphorylation and protein phosphatase dephosphorylation. In the top portion of Figure 4.1, a cellular signal enters the kinase/phosphatase cycle, often in the form of a messenger biomolecule such as a lipid (diacylglycerol is shown), and initiates phosphorylation of the target protein by the action of the protein kinase and ATP. Once phosphorylated, the enzyme participates in the signaling cycle. The phosphorylated enzyme usually does not stay permanently in its modified form but undergoes dephosphorylation through interaction with a protein phosphatase. Figure 4.2 illustrates the nonphosphorylated structures of the Ser, Thr, and Tyr residues, each of which has a hydroxyl moiety, together with the phosphorylated forms of the residues.
4.1.3 Hydroxy-Amino Acid Phosphorylation

While the Ser, Thr, and Tyr residues all have a side-chain hydroxyl moiety available for PTM, over 99% of phosphorylation in eukaryotic cells takes place on the Ser and Thr residues.13 Olsen et al.3 have reported a slight variation on the widely cited study of Hunter and Sefton,14 in which the relative abundances of residue phosphorylation were assigned as 0.05% for phosphotyrosine (pY), 10% for phosphothreonine (pT), and 90% for phosphoserine (pS). In Olsen's study, these values were adjusted to 1.8% pY, 11.8% pT, and 86.4% pS; the larger value for pY was attributed to the more sensitive methodology employed, which allowed the characterization of lower abundant phosphorylated proteins. However, the stoichiometry of the phosphorylated proteome (in the form of tryptic peptides) is small (≤1) in relation to the nonphosphorylated proteome (in the form of tryptic peptides), so the sensitivity of the phosphorylated proteome analysis must be as high as possible.

Figure 4.1. Example of the general cellular mechanism involving protein kinase phosphorylation and protein phosphatase dephosphorylation. (Top) A cellular signal is received in the form of the messenger biomolecule diacylglycerol lipid that initiates the phosphorylation of the target protein by the protein kinase–ATP action. The phosphorylated enzyme propagates the signaling cycle and is then recycled through dephosphorylation by a phosphatase. (Scheme labels: incoming signal from messenger; inactive form of enzyme (OFF); protein kinase; adenosine triphosphate; adenosine diphosphate; active form of enzyme (ON); protein phosphatase; signal out.)
Figure 4.2. Structures of the nonphosphorylated (top) and phosphorylated (bottom) states of the amino acids serine, threonine, and tyrosine. (Structure labels: serine (Ser), threonine (Thr), tyrosine (Tyr); phosphorylated serine (pS), phosphorylated threonine (pT), phosphorylated tyrosine (pY).)
4.1.4 Traditional Phosphoproteomic Approaches

Mass spectrometry (MS) is often used to identify the sites of phosphorylation in the protein backbone when studying cellular signaling pathways.15–18 Before mass spectrometric methodology was introduced for phosphorylated protein analysis, researchers studied protein phosphorylation using 32P labeling followed by two-dimensional (2-D) polyacrylamide gel electrophoresis (PAGE) and, finally, Edman sequencing. This methodology was time-consuming and involved the handling of radioactive isotopes. These drawbacks prompted considerable methodology development using mass spectrometric techniques, which are able to measure whole proteomes from complex biological systems rather than the single proteins for which Edman sequencing is best suited. Numerous studies of signaling pathways primarily involving protein kinases are reported in the literature.12,19–23 The study of protein phosphatase enzymes and their targets has also gained importance, as numerous recently reported disease states have been attributed, at least in part, to malfunctioning protein phosphatase enzymes.5,24–26

4.1.5 Current Approaches

The mass spectrometric instrumentation in use today possesses the sensitivity needed for PTM analysis; therefore, the limiting factor in phosphorylated proteome analyses often lies in the sample treatment prior to mass spectral measurement. Immobilized metal affinity chromatography (IMAC) is the methodology of choice for phosphorylated proteome cleanup and enrichment.27–29 Labeling is the most widely used approach for relative quantitation of the phosphorylated proteome when comparing a normal state with a perturbed state. Labeling typically involves stable isotopes, using 16O/18O, deuterated (2D) methanol, stable isotope labeling by amino acids in cell culture (13C-SILAC), or isobaric peptide tags for relative and absolute quantification (iTRAQ) reagent.30–32 However, labeling procedures can increase sample complexity, can be cumbersome or incomplete, and can ultimately result in sample losses.

4.1.5.1 Phosphoproteomic Enrichment Techniques. The two phosphoproteome enrichment methods of choice in proteomics laboratories are IMAC and metal oxide affinity chromatography (MOAC). Both approaches capture and enrich phosphorylated species at the peptide level, after the proteins have been digested by an enzyme such as trypsin. A number of approaches capture and enrich the phosphoproteome at the protein level, such as the Pierce Phosphoprotein Enrichment Kit, which uses affinity spin columns,33 but these are less commonly used when the instrumental analysis is nano-liquid chromatography (LC) nano-electrospray ionization (ESI) MS. A second protein-level approach is immunoprecipitation of Tyr-phosphorylated proteins, but this methodology is highly specific to Tyr phosphorylation studies and does not address the entire phosphoproteome.34,35
Figure 4.3. Schematic diagram of phosphopeptide isolation by immobilized metal affinity chromatography (IMAC) through binding of the phosphate group with a metal–ligand complex such as the iron(III)-iminodiacetate (IDA) or iron(III)-nitrilotriacetate (NTA) complexes, where the Fe(III)-IDA forms a tridentate complex and the Fe(III)-NTA forms a tetradentate complex. (Scheme labels: protein digest; load; rinse; elute; IMAC or TiO2 resin; IDA-Fe(III)-phospho complex; IMAC; MOAC; phosphopeptide; nonphosphopeptide; digest reagent; enriched phosphopeptides.) (Reprinted with permission from Dunn, J.D., Reid, G.E., Bruening, M.L., Mass Spectrometry Reviews 2010, 29, 29–54. Copyright 2009 by Wiley Periodicals, Inc.)
4.1.5.2 IMAC. IMAC enriches phosphorylated peptides through binding of the phosphate group to a metal–ligand complex such as the iron(III)-iminodiacetate (IDA) or iron(III)-nitrilotriacetate (NTA) complex. These complexes are illustrated in Figure 4.3, where Fe(III)-IDA forms a tridentate complex and Fe(III)-NTA forms a tetradentate complex. The middle column in Figure 4.3 illustrates the IMAC binding complex. A typical IMAC enrichment protocol entails washing the IMAC column bed with organic solvents (to remove organics), water (to remove metals and water-soluble compounds), and ethylenediaminetetraacetic acid (EDTA) (to strip all metals from the column); activating the column bed with Fe(III); loading the sample; acidic washing; and a final basic elution of the phosphorylated peptides. To stabilize the peptide phosphate moieties and to prepare them for octadecyl (C18) reversed-phase (RP) LC, the eluant is usually acidified to approximately pH 4 with formic, acetic, or trifluoroacetic acid (TFA). A schematic diagram of an IMAC column and the ensuing enrichment steps is illustrated in Figure 4.3. A detailed step-by-step optimized protocol for IMAC phosphopeptide enrichment can be found in Section 4.2.4.9.

A recent optimization step added to the IMAC phosphopeptide enrichment methodology is the conversion of the acidic peptide residues aspartic acid (Asp) and glutamic acid (Glu) to methyl esters. This is performed in order to decrease the nonspecific binding of nonphosphorylated peptides containing Asp or Glu residues to the IMAC resin. Nonspecifically bound peptides compete for the binding sites in the IMAC column bed, which decreases the enrichment of the phosphorylated peptides, and they also contribute nonphosphorylated peptides to the final analysis of the enriched fraction. Removing nonspecifically binding peptides from the enriched fraction is thought to decrease ionization suppression of the phosphorylated peptides by nonphosphorylated peptides that coelute from the high-performance liquid chromatography (HPLC) column. The most common reagents used for methylating the acidic residues are acetyl chloride and/or thionyl chloride in methanol (methanolic HCl).

Later in the chapter, in Section 4.3, we will look at an IMAC column bed study in which the normal nonphosphorylated proteome of a HeLa cell protein extract is measured by collecting the flow-through and NaCl salt wash of the IMAC column. Included in the study is a description of the influence of peptide molecular weight and charge state (CS), which appear to dictate the nonspecific binding of peptides that do not contain a phosphoryl group.
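As a rough illustration of what methyl esterification does to a measured peptide mass, the sketch below counts the carboxyl groups available for esterification and multiplies by the monoisotopic mass of CH2 (14.01565 Da) gained per methyl ester. It assumes complete conversion of every Asp and Glu side chain plus a free C-terminus, which a real reaction may not reach. The function names are illustrative only, and the peptide is the unmodified form of the sequence shown later in Figure 4.5.

```python
# Net monoisotopic mass gained per methyl ester (COOH -> COOCH3 adds CH2).
CH2 = 14.01565


def methyl_ester_sites(sequence: str) -> int:
    """Count carboxyl groups: Asp (D) and Glu (E) side chains plus the C-terminus."""
    return sequence.count("D") + sequence.count("E") + 1


def methyl_ester_shift(sequence: str) -> float:
    """Total mass shift (Da), assuming every carboxyl group is esterified."""
    return methyl_ester_sites(sequence) * CH2


peptide = "RASVVGTTYWMAPEVVK"  # one Glu, no Asp, one C-terminus
print(methyl_ester_sites(peptide), round(methyl_ester_shift(peptide), 4))
```

Incomplete esterification would show up as a mixture of species separated by multiples of about 14.016 Da, which is one way the added sample complexity mentioned above can arise.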
4.1.5.3 MOAC. MOAC has recently found greater use in a number of proteomic laboratories, which have incorporated different metal oxides including titanium dioxide (TiO2), aluminum oxide (Al2O3), niobium oxide (Nb2O5), and zirconium dioxide (ZrO2). The MOAC materials have been shown to be stable over wide pH ranges and at elevated temperatures. It was first thought that methyl esterification was not necessary for MOAC, given its good recoveries and low nonspecific binding, but some studies have indicated otherwise. For example, in a study by Pinkse et al.36 using TiO2 MOAC, it was demonstrated that the nonmethylated form of a synthetic peptide containing a phosphorylation was recovered at 90% in the presence of its nonphosphorylated form. However, a simple tryptic digestion mixture from cyclic guanosine monophosphate (cGMP)-dependent protein kinase resulted in 11 phosphorylated peptides recovered along with a number of nonphosphorylated peptides. They determined that 98% of the nonmethylated form of [Glu1]fibrinopeptide B was retained, whereas the methylated form was not.

For phosphopeptide enrichment, TiO2 has turned out to be the most widely used metal oxide. Titanium dioxide has a positively charged surface at acidic pH and shows excellent selectivity in adsorbing and enriching phosphorylated peptides. A bridging bidentate complex is formed between the phosphate group and the TiO2, as illustrated in Figure 4.4. The right end column in Figure 4.4 illustrates the MOAC binding complex. Titania has a pKa value of 4.4 and a pKb value of 7.7. Thus, at low pH values the titania is positively charged, and at high pH the titania is negatively charged. With an NH4OH eluent at a pH of ∼10, the phosphorylated peptides dissociate from the TiO2 bed. As a reference, the first and second pKa values of pS and pT are ∼1.7 and 6. The typical TiO2 MOAC loading solutions are 0.1% TFA, with a pH of 1.9, and 0.1 M acetic acid, with a pH of 2.7. These pH values are low enough to effectively protonate the acidic amino acid residues, thus preventing adsorption of nonphosphorylated peptides to the TiO2. Typically, in TiO2 MOAC, the peptides are loaded and washed under acidic conditions and then eluted under basic conditions.

[Figure 4.4 schematic. Recoverable labels: protein digest; load; rinse; elute; IMAC or TiO2 resin; phosphopeptide; nonphosphopeptide; digest reagent; enriched phosphopeptides; TiO2-phospho complex.]
Figure 4.4. Schematic diagram of phosphopeptide isolation by metal oxide affinity chromatography (MOAC). Titanium dioxide has a positively charged surface at acidic pH that has excellent selectivity in adsorbing and enriching phosphorylated peptides. A bridging bidentate complex is formed between the phosphate group and the TiO2. (Reprinted with permission from Dunn, J.D., Reid, G.E., Bruening, M.L., Mass Spectrometry Reviews 2010, 29, 29–54. Copyright 2009 by Wiley Periodicals, Inc.)

4.1.5.4 Methylation of Peptides prior to IMAC or MOAC Enrichment. Thionyl chloride (Sigma-Aldrich, St. Louis, MO) was used for methyl esterification of the peptides.37 The steps involved in methylating the peptides are as follows:

1. Use only dry peptide material that has been extensively dried beforehand in a SpeedVac (Eppendorf, Hamburg, Germany).
2. Carefully add 40 µL of thionyl chloride dropwise to 1 mL of methanol (both anhydrous).
3. Add the thionyl chloride/methanol mixture to the dry peptide at a ratio of 75 µL of thionyl chloride/methanol solution per 100 µg of peptide.
4. Vortex the reaction mixture for 5–10 minutes to ensure dissolution of the dry peptide material.
5. Sonicate the reaction mixture for 10 minutes.
6. Let the mixture react at room temperature for 1 hour.
7. Bring the methylated peptide to dryness in a SpeedVac (Eppendorf) and store at −80°C until further processed.

For the methylated form of peptide, 730 µL of a solution composed of 40 µL of thionyl chloride in 1 mL of anhydrous methanol was added to 1 mg of peptide. The mixture was sonicated for 10 minutes at 37°C and allowed to react at room temperature for 1 hour. The solvent was removed in the SpeedVac, and the peptide was reconstituted in IMAC loading solution composed of 1:1:1 methanol/acetonitrile/0.01% acetic acid at a ratio of 100 µL of solution to 100–200 µg of peptides.
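The reagent quantities in the steps above scale linearly with the amount of peptide, so a short helper can do the arithmetic. This is a sketch only: the function name is hypothetical, and the split into thionyl chloride and methanol assumes the two volumes are simply additive (40 µL + 1000 µL = 1040 µL). Note that the ratio in step 3 corresponds to 750 µL per 1 mg, close to but not identical with the 730 µL quoted in the worked example.

```python
def methylation_volumes(peptide_ug: float) -> dict:
    """Volumes (µL) of esterification reagent for a given peptide amount (µg)."""
    reagent_ul = 75.0 * peptide_ug / 100.0        # step 3: 75 µL per 100 µg peptide
    socl2_fraction = 40.0 / (40.0 + 1000.0)       # step 2, assuming additive volumes
    socl2_ul = reagent_ul * socl2_fraction
    return {
        "reagent_ul": reagent_ul,
        "thionyl_chloride_ul": socl2_ul,
        "methanol_ul": reagent_ul - socl2_ul,
    }


volumes = methylation_volumes(250.0)  # 250 µg is an arbitrary example amount
print({name: round(value, 2) for name, value in volumes.items()})
```

In practice it is simpler to prepare the full 1 mL batch from step 2 and pipette only the reagent volume needed; the component volumes are shown mainly to make the composition explicit.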
4.1.6 The Ideal Approach

The ideal methodology for PTM studies would entail both high sample recovery and sample specificity while avoiding any additional modifications to the proteins being studied. The first critical step in sample preparation is the lysis and solubilization of the sample's complement of proteins. This is followed by additional steps for sample cleanup and treatment, each often accompanied by protein/peptide loss. Another limiting factor is the presence of nucleic acids during the IMAC enrichment step; these are known to poison the IMAC bed or to compete with the phosphorylated peptides for its binding sites. Alternative approaches exist for the removal of ribonucleic acid (RNA) and DNA, such as the addition of RNase and DNase to the whole cell lysate buffer,38 passing the lysate repeatedly through a tuberculin syringe fitted with a 21-gauge needle to shear the RNA and DNA mechanically,39 or using the QIAShredder (QIAGEN Inc., Valencia, CA).40 The use of RNase and DNase is often avoided in an effort to reduce the amount of additives and the production of low-molecular-weight nucleic acid lysis products, which are difficult to remove and also poison the IMAC bedding. Ultracentrifugation may be chosen over mechanical shearing for similar reasons associated with the production of low-molecular-weight nucleic acid species.

4.1.7 One-Dimensional (1-D) Sodium Dodecyl Sulfate (SDS) PAGE

The use of 1-D SDS-PAGE is another approach for cleanup of intact proteins from whole cell lysate that includes removal of the nucleic acids. Current examples include a study by Gygi et al.,5 who used preparatory gels for HeLa cell nuclear protein separation followed by whole gel digestion (cut into 10 regions), in which 967 proteins revealed 2002 phosphorylation sites. A recent study by Mann et al.41 reported that five times more proteins from tear fluid were identified after gel electrophoresis as compared with in-solution digest. Other potential advantages of 1-D SDS-PAGE are the ability to target specific molecular weight ranges, which can be identified and excised from various band regions within the gels, and a more efficient tryptic digest due to the enhanced accessibility of the protein backbone, which is denatured into a linear orientation and locked within the gel. Recoveries from gels can be an added concern, though, as all steps in protein/peptide preparation can contribute to an overall loss of protein.

4.1.8 Tandem MS Approach

Mass spectrometric studies of phosphorylation PTMs typically employ tandem MS to generate product ion spectra of phosphorylated peptides. This usually entails separation of a complex mixture of phosphorylated peptides, prior to mass spectrometry, by RP C18 stationary phase HPLC with an ESI source (RP C18 HPLC ESI-MS/MS). Fragmentation pathways for phosphorylation at the three possible sites in peptides, Ser, Thr, and Tyr, have been studied and characterized using ion trap MS, revealing different mechanisms.

4.1.8.1 pS Loss of Phosphate Group. Peptides that contain pS generally lose the phosphate group to give the predominant product ion in the spectrum. This often results in limited information about the sequence of the peptide, because other peptide backbone fragmentation is not well observed. For the plus one charge state (+1 CS), neutral loss of the phosphate group is observed as an m/z shift of 98.0, [M + H − H3PO4]+; for the plus two CS (+2 CS), the shift is 49.0, [M + 2H − H3PO4]2+; and for the plus three CS (+3 CS), the shift is 32.7, [M + 3H − H3PO4]3+.

4.1.8.1.1 pS Collision-Induced Dissociation (CID) Spectra. Figure 4.5 illustrates product ion spectra for a pS-containing peptide at m/z 987.5 (monoisotopic mass of 1972.98 Da). The spectrum in Figure 4.5a is the MS/MS spectrum of the doubly charged phosphorylated peptide, [M + 2H]2+, at m/z 987.5, where the predominant product ion is observed at m/z 938.7 for the neutral loss of the phosphate group from the doubly charged precursor (an m/z shift of −49.0), [M + 2H − H3PO4]2+. There is coverage of the peptide backbone sequence in the spectrum, as illustrated by the b- and y-type ions; however, their response is quite low, and they are often not observed in product ion spectra of phosphorylated peptides when using tandem ion trap MS. The loss of the phosphate modification from the peptide is the preferred fragmentation pathway and is usually the predominant one observed.
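The charge-state dependence of the observed neutral loss is simple arithmetic: a neutral of fixed mass leaving an ion of charge z shifts the peak by that mass divided by z. The sketch below, with illustrative function names and standard monoisotopic masses for the proton and H3PO4, reproduces the shifts quoted above and the expected position of the [M + 2H − H3PO4]2+ ion for the peptide mass given in the text.

```python
PROTON = 1.007276   # monoisotopic mass of a proton, Da
H3PO4 = 97.976896   # monoisotopic mass of phosphoric acid, Da


def neutral_loss_shift(loss_mass: float, charge: int) -> float:
    """m/z shift produced when a neutral of mass loss_mass leaves an ion of the given charge."""
    return loss_mass / charge


def mz(neutral_mass: float, charge: int) -> float:
    """m/z of the [M + zH]z+ ion for a neutral monoisotopic mass M."""
    return (neutral_mass + charge * PROTON) / charge


shifts = {z: round(neutral_loss_shift(H3PO4, z), 1) for z in (1, 2, 3)}
print(shifts)

precursor = mz(1972.98, 2)                          # peptide mass quoted in the text
product = precursor - neutral_loss_shift(H3PO4, 2)  # [M + 2H - H3PO4]2+
print(round(precursor, 1), round(product, 1))
```

The shifts come out as 98.0, 49.0, and 32.7 for the +1, +2, and +3 charge states. The calculated product ion lies near m/z 938.5, slightly below the 938.7 labeled in Figure 4.5a; the small difference presumably reflects the mass accuracy of the ion trap measurement.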
The phosphorylated peptide's sequence is shown in the spectrum as RApSVVGTTYWMAPEVVK, where the phosphorylation is on the Ser residue that is third from the left. In the collision-induced fragmentation of the phosphorylated peptide, none of the y-type ions observed in Figure 4.5a (y5–y14) contains the phospho group. Only one of the b-type ions, b12, contains the phospho group. This is due to a preferential cleavage taking place on the peptide backbone at the proline residue. All of the other b-type ions include neutral loss of the phospho group from the Ser residue and are denoted bnΔ, which equals bn − H3PO4. The loss of the phosphorylation associated with Ser proceeds through a β (beta)-elimination mechanism producing dehydroalanine. This mechanism is illustrated in Figure 4.6 for pS. The structure shown represents the Ser residue in the peptide that corresponds to the m/z 938.7 product ion in Figure 4.5, [M + 2H − H3PO4]2+.

[Figure 4.5 spectra. Axes: relative abundance versus m/z (400–1800). (a) Sequence RApSVVGTTYWMAPEVVK with labeled b, bΔ, and y ions and the peak labeled +2 −H3PO4 at m/z 938.7; bΔn = bn − H3PO4. (b) Sequence RABVVGTTYWMAPEVVK (B = dehydroalanine) with labeled b and y ions and the peak labeled +2 −H2O at m/z 930.0.]
Figure 4.5. Phosphoserine loses predominantly H3PO4 through β-elimination to produce dehydroalanine in tandem ion trap mass spectrometry. (a) MS2 spectrum of a doubly charged phosphopeptide ion (m/z 987.5). A loss of 98 Da (H3PO4) is observed. The bnΔ label denotes a loss of 98 Da (i.e., bn − H3PO4). (b) MS3 spectrum of the ion arising from loss of 98 Da (m/z 938.7 of the doubly charged ion in [a]). The y14 and y15 fragments have a mass difference of 69 Da that corresponds to the mass of dehydroalanine, identifying the product of phosphoserine after losing 98 Da as dehydroalanine. The "B" label denotes the dehydroalanine residue. (Reprinted with permission. This article was published in J Am Soc Mass Spectrom, DeGnore, J.P., Qin, J. Fragmentation of phosphopeptides in an ion trap mass spectrometer, 1998, 9, 1175–1188. Copyright Elsevier 1998.)

[Figure 4.6 reaction scheme. Recoverable labels: phosphoserine; β-elimination; dehydroalanine; phosphate.]
Figure 4.6. β-Elimination mechanism for the fragmentation pathway producing dehydroalanine for the phosphorylated serine amino acid residue.

Figure 4.5b is an example of the application of MS3 product ion spectral collection for the enhanced fragmentation of phosphorylated peptides. In this approach, the main product ion collected from MS2, m/z 938.7, is isolated and subjected to a third stage of fragmentation. In the MS3 product ion spectrum of Figure 4.5b, there is a predominant peak for water loss from the m/z 938.7 precursor peak. This peak also does not afford much information on the sequence of the peptide; however, the spectrum also contains numerous b- and y-type product ion peaks with a much greater response than that observed in the MS2 product ion spectrum of Figure 4.5a. Notice that in the fragmentation of the phosphorylated peptide in Figure 4.5a, cleavage on both sides directly adjacent to the Ser residue did not take place. This means that a positive identification of the residue that contains the phosphate modification, namely the Ser residue, is not possible from that spectrum. In the third-stage product ion spectrum of Figure 4.5b, by contrast, fragmentation on both sides of the Ser residue did take place. Because the precursor ion subjected to CID in Figure 4.5b was the m/z 938.7 product ion from the second-stage fragmentation in Figure 4.5a, the Ser residue is replaced in the sequence by "B," which stands for dehydroalanine. Because the phosphate modification is absent from the peptide chain, substantial sequence coverage is observed in the product ions of Figure 4.5b, thus allowing specific identification of the peptide sequence and the location of the phosphorylation. Further studies concerning phosphorylated Ser
(pS) have demonstrated that loss of the H3PO4 phosphate group in product ion spectra is not dependent on the CS of the precursor ion. For all three commonly observed CSs (+1, +2, and +3), ESI CID product ion spectra showed loss of the phosphate group as the predominant product ion peak, with other minor losses as shown in Figure 4.5a.

4.1.8.2 pT Loss of Phosphate Group. When the phosphorylation modification of a peptide takes place on the Thr amino acid residue, the predominant product ion peak derived from CID is also observed to arise through β-elimination of the phosphate group. The neutral loss of the phosphate group from the Thr residue, −98 Da as [M + H − H3PO4]+, produces dehydroaminobutyric acid. A product ion spectrum showing the major species produced from a peptide containing a phosphorylated Thr is given in Figure 4.7. The sequence of the phosphorylated peptide is RASVVGTpTYWMAPEVVK, which has a monoisotopic mass of 1972.98 Da. The phosphorylation of the peptide has taken place on the Thr amino acid residue (pT). The CS of the peptide in the product ion spectrum of Figure 4.7 is +2 at m/z 987 as [M + 2H]2+. Loss of the H3PO4 phosphate group is the primary fragmentation pathway, observed in the product ion spectrum at m/z 939.0 as [M + 2H − H3PO4]2+. Among the product ions produced from CID of the m/z 987 species, there are a number of b-type ions that contain the phosphate modification (b8, b9, b10, b11, b12, b14, b15) and a number that do not (b8Δ, b9Δ, b10Δ, b12Δ, b14Δ). Figure 4.8 illustrates the fragmentation pathway mechanism for the production of dehydroaminobutyric acid by neutral loss, through β-elimination, of the phosphate group from the phosphorylated Thr amino acid. One difference in the product ion spectrum of phosphorylated Thr as compared with phosphorylated Ser (Figure 4.5a) is the product ion peak observed at m/z 947.5 for the neutral loss of 80 Da. The product ion at m/z 947.5 represents dephosphorylation through neutral loss of HPO3 from the precursor peak as [M + 2H − HPO3]2+. This particular loss is not observed in the product ion spectrum of phosphorylated Ser. The mechanism for dephosphorylation of the Thr amino acid is illustrated in Figure 4.9. The dephosphorylation results in the structure of the original amino acid residue of Thr.

[Figure 4.7 spectrum. Axes: relative abundance versus m/z (400–1800). Recoverable labels: MS2; the peak labeled +2 −H3PO4 at m/z 939.0; the peak labeled +2 −HPO3 at m/z 947.5; a ×5 magnified region; b, bΔ, and y ions.]
Figure 4.7. Product ion spectrum of phosphorylated threonine illustrating the major species produced. The sequence of the phosphorylated peptide is RASVVGTpTYWMAPEVVK (monoisotopic mass of 1972.98 Da). Charge state of the peptide is +2 at m/z 987 as [M + 2H]2+. Loss of the H3PO4 phosphate group is the primary fragmentation pathway at m/z 939.0 as [M + 2H − H3PO4]2+. (Reprinted with permission. This article was published in J Am Soc Mass Spectrom, DeGnore, J.P., Qin, J. Fragmentation of phosphopeptides in an ion trap mass spectrometer, 1998, 9, 1175–1188. Copyright Elsevier 1998.)

[Figure 4.8 reaction scheme. Recoverable labels: phosphorylated threonine (pT); β-elimination; dehydroaminobutyric acid; phosphate.]
Figure 4.8. Mechanism for the production of the dehydroaminobutyric acid from phosphorylated threonine through neutral loss, β-elimination of the phosphate group.

[Figure 4.9 reaction scheme. Recoverable labels: dephosphorylation of threonine; threonine; HPO3.]
Figure 4.9. Mechanism for dephosphorylation of the threonine amino acid resulting in the structure of the original amino acid residue of threonine.

4.1.8.3 pY Loss of Phosphate Group. Product ion spectra of peptides containing phosphorylated Tyr (pY) also show losses of 80 and 98 Da, which would appear to be similar to the losses observed with phosphorylated Thr peptides. The 80-Da loss is due to dephosphorylation of the Tyr residue, resulting in the original structure of the Tyr residue. The mechanism is illustrated in Figure 4.10. However, due to the structure of Tyr, a β-elimination mechanism for loss of the phosphate group (−98 Da, H3PO4) is unlikely. It has also been proposed that the neutral loss of 98 Da does not occur through a two-step mechanism involving both water (H2O) loss of 18 Da and HPO3 loss of 80 Da, in any order. The neutral loss of 98 Da is more likely associated with some form of rearrangement in the fragmentation pathway.

[Figure 4.10 reaction scheme. Recoverable labels: phosphorylated tyrosine (pY); dephosphorylated tyrosine; HPO3.]
Figure 4.10. Mechanism for the dephosphorylation of tyrosine.
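The 80- and 98-Da losses discussed in the preceding sections can be told apart programmatically by converting an observed m/z shift back to a neutral mass. The following is a minimal sketch, not a published algorithm: the function name and the 0.5-Da tolerance are assumptions chosen to suit ion trap data, and the example m/z values are the peak labels quoted in the text (using m/z 987.5 for the doubly charged precursor).

```python
import math

# Monoisotopic masses (Da) of the neutral losses discussed in the text.
NEUTRAL_LOSSES = {"H3PO4": 97.9769, "HPO3": 79.9663, "H2O": 18.0106}


def assign_neutral_loss(precursor_mz, product_mz, charge, tol_da=0.5):
    """Return the name of the neutral loss matching the m/z shift, or None."""
    observed = (precursor_mz - product_mz) * charge  # shift converted to neutral mass
    for name, mass in NEUTRAL_LOSSES.items():
        if math.isclose(observed, mass, abs_tol=tol_da):
            return name
    return None


print(assign_neutral_loss(987.5, 938.7, 2))  # main product ion in Figure 4.5a
print(assign_neutral_loss(987.5, 947.5, 2))  # additional product ion in Figure 4.7
```

The first call reports H3PO4 and the second HPO3. A match to HPO3 on its own does not identify the residue, since the text notes that both pT and pY peptides can show the 80-Da loss.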
4.1.9 Alternative Methods: Infrared Multiphoton Dissociation (IRMPD) and Electron Capture Dissociation (ECD)

Other mass spectrometric methods have been investigated for phosphorylated peptide analysis to help increase the efficiency of the fragmentation that takes place. These can allow more direct approaches to locating the modification without the need for third-stage fragmentation experiments that require the interpretation of two individual spectra. One example of an alternative mass spectrometric approach has been the use of the Fourier transform ion cyclotron resonance (FTICR) mass spectrometer with IRMPD and ECD. Figure 4.11 illustrates a comparison of these two dissociation approaches. In the top figure, IRMPD was used to excite and dissociate the phosphorylated peptide. The phosphorylation is located on a Ser (pS) residue in the middle of the peptide chain, which has the sequence AKRRRL(pS)SLRASTS. In the product ion spectrum collected using IRMPD, the major product ions observed were all associated with the neutral loss of the phosphate group as [M + 3H − H3PO4]3+ and with phosphate and water loss as [M + 3H − H3PO4 − H2O]3+. As can also be observed in the top product ion spectrum of Figure 4.11, very little information is given concerning the peptide chain's sequence. The bottom product ion spectrum of Figure 4.11 illustrates the effectiveness of using ECD for phosphorylated peptide sequence determinations. In this spectrum, there is an appreciable number of peptide backbone fragment ions in the form of c-type and z-type ions. The major peak in the spectrum is the triply protonated precursor ion, [M + 3H]3+. Also, notice that the phosphorylation was maintained within the structure of the product ions. The two product ion spectra are complementary: the top spectrum is diagnostic for the presence of a phosphorylation on the peptide, while the bottom spectrum gives very good sequence coverage of the peptide.
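Because ECD retains the phosphate on the fragments, the modified residue shows up as a mass step in the c-ion ladder. The sketch below computes singly charged c-ion m/z values for AKRRRL(pS)SLRASTS from standard monoisotopic residue masses; the function name is illustrative, and the calculation assumes the usual definition of a c ion as the N-terminal residues plus NH3 plus one proton. The same ladder logic explains the 69-Da step for dehydroalanine noted in the Figure 4.5 caption.

```python
# Standard monoisotopic residue masses (Da) for the residues in this peptide.
RESIDUE = {"A": 71.03711, "K": 128.09496, "R": 156.10111,
           "L": 113.08406, "S": 87.03203, "T": 101.04768}
HPO3 = 79.96633     # mass added to a residue by phosphorylation
NH3 = 17.02655
PROTON = 1.00728


def c_ions(sequence: str, phospho_index: int) -> list:
    """Singly charged c-ion m/z values c1..c(n-1); phospho_index is 1-based."""
    total = NH3 + PROTON
    ions = []
    for position, residue in enumerate(sequence[:-1], start=1):
        total += RESIDUE[residue]
        if position == phospho_index:
            total += HPO3
        ions.append(total)
    return ions


ions = c_ions("AKRRRLSSLRASTS", phospho_index=7)
print(round(ions[5], 2), round(ions[6], 2))   # c6 and c7
print(round(ions[6] - ions[5], 3))            # step across the modified residue
```

The step from c6 to c7 is about 167.0 Da (Ser plus HPO3) rather than the 87.03 Da of an unmodified Ser, which is why observation of the c7 ion localizes the phosphate to residue 7.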
[Figure 4.11 spectra of AKRRRL(pS)SLRASTS. Axis: m/z. (Top) Off-axis IRMPD; recoverable labels include [M − H3PO4 + 3H]3+, [M − H3PO4 − H2O + 3H]3+, [M + 3H]3+, (b7 − H3PO4)+ and (b7 − H3PO4)2+, (y8 − H3PO4 − 17)+, (y9 − H3PO4)+, and (y13 − H3PO4)3+. (Bottom) ECD; recoverable labels include [M + 3H]3+, [M + 2H]2+ and [M + 3H]2+•, [M + 3H − 17]2+•, and c- and z-type ions.]
Figure 4.11. (Top) Product ion spectrum obtained from off-axis IRMPD FTICR MS/MS of a population of quadrupole- and SWIFT-isolated [AKRRRL(pS)SLRASTS + 3H]3+ phosphopeptide ions. The spectrum is dominated by ions resulting from neutral losses of H3PO4, NH3, and H2O. Five (out of 13) peptide backbone bonds are broken, and the location of the phosphorylation site is identified only by observation of the (y8–H3PO4–NH3) ion (present at very low abundance) and the singly and doubly charged (b7–H3PO4) ions. Irradiation was for 500 ms at ∼36-W laser power, and the data represent a sum of 10 scans. (Bottom) Product ion spectrum obtained from ECD (20-ms irradiation) FTICR MS/MS of the same quadrupole- and SWIFT-isolated phosphopeptide as in the top figure. Twelve out of 13 peptide backbone bonds are cleaved, and the location of the phosphate is readily assigned by observation of the abundant c7 ions. (Reprinted with permission. This article was published in J Chromatogr B, Chalmers, M.J., Kolch, W., Emmett, M.R., Marshall, A.G., Mischak, H. Identification and analysis of phosphopeptides, 2004, 803, 111–120. Copyright Elsevier 2004.)

4.1.10 Electron Transfer Dissociation (ETD)

More recently, the ETD capability has been incorporated into linear ion traps, which also allows significant peptide backbone cleavage similar to the ECD capability usually associated with FTICR mass spectrometers. Figure 4.12 illustrates the extensive fragmentation that can be achieved with ETD for quite a long peptide sequence. The top spectrum is the dissociation spectrum for a 35-residue phosphopeptide that has a molecular weight of 4093 Da. The phosphopeptide is in the plus six CS (+6) at m/z 683.3 for [M + 6H]6+. As can be seen in the figure, for this very long phosphorylated peptide, essentially complete coverage of the peptide chain has been achieved along with the determination of the phosphorylation site. The ETD product ion spectrum illustrated in Figure 4.12b is for a phosphorylated histidine (His) (pH) peptide. Phosphorylation on the His residue is an important type of PTM that is observed in prokaryotic proteomes and will be discussed in the next chapter.

[Figure 4.12 spectra. Axis: m/z (100–2000). (a) Sequence HSGFFHSSKKEEQQNNQATAGEHDASITRSpSLDRK with labeled c and z ions. (b) Sequence ENANSRSSApHMSSNAIQR with labeled c and z ions.]
Figure 4.12. Phosphopeptide mass spectra. ETD mass spectra recorded on [M + 6H]6+ ions at m/z 683.3 for a 35-residue phosphopeptide of molecular weight 4093 (a), and [M + 3H]3+ ions from a phosphorylated His-containing peptide at the C-terminus of the septin protein, Cdc10 (b). Observed c and z· ions are indicated on the peptide sequence by ⎤ and ⎣, respectively. Observed doubly charged c and z· ions are indicated by an additional label, circle and asterisk, respectively. (Reprinted with permission from Chi, A., Huttenhower, C., Geer, L.Y., Coon, J.J., Syka, J.E.P., Bai, D.L., Shabanowitz, J., Burke, D.J., Troyanskaya, O.G., Hunt, D.F. PNAS 2007, 104, 2193–2198. Copyright 2007 National Academy of Sciences, U.S.A.)

The rest of this chapter is an illustrative example of applying mass spectral techniques to the study of phosphorylation as a PTM in a eukaryotic cell system. The studies described utilize HeLa cells, both normal cells and cells that have sustained DNA damage induced with the nonlinear peptide bleomycin, to study the signaling cascade effect of cellular perturbation. The discussion brings the reader through specific methodological approaches utilizing the most recent advances in proteomic sample preparation, phosphopeptide enrichment, mass spectral measurement, and treatment of the collected data sets.
4.2 THE HELA CELL PHOSPHOPROTEOME

4.2.1 Introduction

To improve both coverage and confidence in protein identifications, proteomic methodologies are undergoing continual optimization. To avoid the pitfalls associated with single-point analysis and undersampling, the optimization of sample preparation and the inclusion of technical replicates (repeated instrumental analysis of the same sample) and biological replicates (multiple individual samples) are crucial steps in proteomic studies. The following work illustrates results where phosphopeptides were isolated from HeLa cells and analyzed by nano-RP-LC-MS/MS. A detergent-based protein extraction approach, followed by additional steps for nucleic acid removal, is shown to provide a simple alternative to the broadly used Trizol extraction. The measurement reproducibility from the evaluation of four technical replicates demonstrated low percent variance in peptide responses, at approximately 3%, and additional peptide identifications were made with each added technical replicate. The inclusion of six technical replicates affords the optimal collection of peptide information for moderately complex protein extracts (approximately 4000 uniquely identified peptides per data set).

4.2.2 Background of Study

Reversible phosphorylation of Ser, Thr, and Tyr residues in proteins represents an important mechanism in eukaryotes for regulating cellular processes involving signal transduction, such as in HeLa cells.2 Analytical approaches utilizing MS have been applied extensively to identify protein phosphorylation sites in eukaryotic systems such as HeLa cells.15–18 Because of the often low stoichiometry of the phosphoproteome within a cell (≤1%), an enrichment step is necessary to identify low-abundance phosphopeptides from complex mixtures.
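The two replicate metrics mentioned in Section 4.2.1, the variation in peptide response and the gain in identifications with each added run, can be sketched as follows. This is a hedged illustration only: it interprets "percent variance" as the percent coefficient of variation, which the text does not state explicitly, and the response values and peptide identifiers are made-up placeholders rather than data from the study.

```python
from statistics import mean, stdev


def percent_cv(responses):
    """Percent coefficient of variation of one peptide's response across replicates."""
    return 100.0 * stdev(responses) / mean(responses)


def cumulative_unique(replicates):
    """Running count of unique peptide identifications as replicates are added."""
    seen, counts = set(), []
    for identified in replicates:
        seen |= set(identified)
        counts.append(len(seen))
    return counts


# Hypothetical peak areas for one peptide in four technical replicates.
responses = [1.00e6, 1.03e6, 0.97e6, 1.00e6]
print(round(percent_cv(responses), 2))

# Hypothetical peptide identifications from four technical replicates.
runs = [{"pepA", "pepB"}, {"pepB", "pepC"}, {"pepA", "pepD"}, {"pepC"}]
print(cumulative_unique(runs))
```

When the cumulative count stops rising with additional runs, further technical replicates add little new information, which is the kind of plateau behind a recommendation such as six replicates.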
One current and widely used enrichment technique in phosphoproteomics is immobilized metal affinity chromatography,27–29 which has been optimized over the years for high-specificity enrichment and recovery of phosphopeptides.17,24,30 The presence of nucleic acids in proteomic samples may interfere with IMAC enrichment by competing with phosphopeptides for available binding sites in the IMAC stationary phase. Because of this, sample preparation methods upstream of IMAC enrichment are highly important. Trizol extraction has been the most commonly used protein extraction method in IMAC applications.9,22,24,42 The Trizol approach removes the nucleic acids that reduce phosphopeptide recovery and thereby increases method sensitivity during IMAC enrichment; however, the Trizol method involves multiple steps, including phase separation and precipitation, to remove RNA and DNA and collect protein. As a precipitation-based approach, the Trizol methodology may result in selective protein loss, and minor variations in sample handling during this procedure can undermine both uniform protein recovery among samples and sample quality.41

4.2.3 What is Covered

In the following sections, we will look closely at a study in which samples were prepared using a detergent-based cell lysis method followed by in-solution or in-gel digestion and IMAC enrichment to explore the phosphoprotein coverage of HeLa cell lysates. As a comparison with existing techniques, samples were also prepared using Trizol extraction. It was observed that, compared with Trizol extraction, detergent-based extraction such as the Roche Complete lysis approach is fast, requires a single, easily reproduced step, and gives a good protein yield, but requires additional steps to remove nucleic acids prior to IMAC enrichment. To facilitate nucleic acid removal, several simple steps were taken during phosphopeptide sample preparation.
The inclusion of both technical replicates (repeated instrumental analysis of the same sample) and biological replicates (multiple individual samples), as well as the effect of these sample preparation steps on coverage, was evaluated. We will see that these experimental design parameters are crucial to avoid the pitfalls associated with single-point analysis and undersampling.

4.2.4 Optimized Methods to Use for Phosphoproteomic Studies

4.2.4.1 Cell Culture. HeLa cells are a very common system in proteomic studies. The cultures are often grown in Dulbecco’s modified Eagle medium (DMEM) with high glucose (Invitrogen, Carlsbad, CA) supplemented with 10% fetal bovine serum (FBS) (Clontech, Mountain View, CA), 100 units/mL penicillin, and 100 µg/mL streptomycin (Invitrogen) at 37°C in 5% CO2. This is a generalized approach for preparing cell cultures for proteomic studies. Manufacturer guidelines are also very helpful in preparing a system for study.

4.2.4.2 Extraction of HeLa Cell Proteins. In the example illustrated here, two sets of samples were prepared for analysis. In the first set (biological replicate 1), five nearly confluent 100-mm plates of cells were extracted with Trizol, and five matched plates of cells were solubilized with Roche lysis buffer. The sample solubilized with Roche Complete Lysis-M was then split into two equal portions. One portion was subjected to in-solution tryptic digestion, while the other portion was subjected to SDS-PAGE and in-gel tryptic digestion. For the second set (biological replicate 2), five nearly confluent 100-mm plates of cells were again solubilized with Roche lysis buffer and divided into two portions for in-solution digestion or in-gel digestion.

4.2.4.3 Trizol Extraction and Tryptic Digestion. Protein was extracted with Trizol reagent (Invitrogen) according to the manufacturer’s suggested protocol, with the exception that the initial Trizol volume was doubled (∼2 mL Trizol reagent/5 × 10^6 cells).11 The extracted protein was then processed as follows:
1. The protein pellet was resuspended in 8 M urea.
2. The proteins were then reduced in a 5 mM solution of dithiothreitol (DTT).
3. The proteins were then alkylated in a 15 mM solution of iodoacetamide.
4. The denatured and alkylated proteins were digested with modified trypsin at a 1:20 ratio for 4 hours at 37°C after twofold dilution with 50 mM NH4HCO3 (pH 7.4).
5. After fivefold further dilution, a second trypsin digestion at a 1:20 ratio was performed overnight at 37°C.
6. The digestion was stopped by adding acetic acid to a final pH of ∼3.5–4.

4.2.4.4 Solid-Phase Extraction (SPE) Desalting. A C18 RP peptide SPE cartridge is used to desalt the tryptic digests.
A 1 mL/100 mg tube is usually sufficient for up to 5 mg of protein extract:
1. Condition the SPE column with 3 mL of methanol on an SPE vacuum chamber.
2. Rinse the column with 2 mL of acidified water (0.1% TFA).
3. Slowly pass the protein extract sample through the column using minimal vacuum (∼0.5–1 mL per minute flow rate).
4. Wash the column containing the sample with 4 mL of 95:5 H2O:ACN, 0.1% TFA.
5. Allow the column to go to dryness and wipe the needles below the columns dry.
6. Place appropriate collection tubes under the columns.
7. Close off the tubes from the vacuum and add 1 mL of 80:20 ACN:H2O, 0.1% TFA.
8. Allow the elution buffer to slowly flow through the tube until the column is dry (∼0.5 to 1 mL per minute flow rate).
9. When completed, remove the sample from the SPE vacuum chamber.

4.2.4.5 Converting Peptide Carboxyl Moieties to Methyl Esters. To reduce the possibility of nonspecific binding of peptides to the IMAC column bed, peptide carboxyl moieties are converted to methyl esters. The tryptic peptides are converted to peptide methyl esters according to the general procedure of White et al.,22 except that a second methyl esterification step was performed to ensure complete esterification. The esterified samples are then reconstituted in IMAC loading solution, as given in the final step below:
1. Use only dry peptide material that has been extensively dried in a SpeedVac.
2. Carefully add 40 µL of thionyl chloride dropwise to 1 mL of methanol (both anhydrous).
3. Add the thionyl chloride/methanol mixture to the dry peptide at a ratio of 75 µL of thionyl chloride/methanol solution per 100 µg of peptide.
4. Vortex the reaction mixture for 5–10 minutes to ensure dissolution of the dry peptide material.
5. Sonicate the reaction mixture for 10 minutes.
6. Let the mixture react at room temperature for 1 hour.
7. Bring the methylated peptide to dryness in a SpeedVac (Eppendorf).
8. Reconstitute in IMAC loading solution composed of 1:1:1 methanol/acetonitrile/0.01% acetic acid at a ratio of 100 µL solution to 100–200 µg peptides.
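The reagent ratios in the esterification steps above scale linearly with peptide mass, which can be sketched as a small calculator. This is only an illustrative sketch: the function name `esterification_volumes` is hypothetical, and the split of the reagent into thionyl chloride and methanol assumes that the two volumes are simply additive on mixing.

```python
def esterification_volumes(peptide_ug: float) -> dict:
    """Scale the methyl esterification volumes (in µL) to a peptide mass in µg."""
    # Step 3: 75 µL of thionyl chloride/methanol reagent per 100 µg of peptide.
    reagent_ul = 75.0 * peptide_ug / 100.0
    # Step 2: the reagent is 40 µL thionyl chloride added to 1 mL (1000 µL) methanol.
    # Assumption: volumes are additive, so the stock totals 1040 µL.
    thionyl_fraction = 40.0 / 1040.0
    return {
        "reagent_ul": reagent_ul,
        "thionyl_chloride_ul": reagent_ul * thionyl_fraction,
        "methanol_ul": reagent_ul * (1.0 - thionyl_fraction),
        # Step 8: 100 µL of loading solution per 100-200 µg of peptide.
        "loading_ul_min": 100.0 * peptide_ug / 200.0,
        "loading_ul_max": 100.0 * peptide_ug / 100.0,
    }


if __name__ == "__main__":
    # Example: 1.5 mg (1500 µg) of dried peptide.
    for name, value in esterification_volumes(1500.0).items():
        print(f"{name}: {value:.1f}")
```

For 1.5 mg of peptide, the ratios in the text correspond to 1125 µL of reagent and 750–1500 µL of IMAC loading solution.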
4.2.4.6 Roche Complete Lysis-M, EDTA-Free Extraction. As an alternative to the Trizol extraction approach, eukaryotic systems can be treated with a detergent-based lysis and extraction buffer. For this detergent-based approach, HeLa cells were lysed and extracted using the Roche Complete Lysis-M, EDTA-free kit (Roche Applied Science, Mannheim, Germany) according to the manufacturer’s guidelines. Phosphatase Inhibitor Cocktail Sets I and II (EMD Biosciences, San Diego, CA) were added to the extracts, following the manufacturer’s protocol. For the first portion of the split sample (described earlier), urea was added to the extract to a final concentration of 8 M, and the proteins were reduced, alkylated, and digested as described earlier. Following tryptic digestion, ultracentrifugation (166,000 × g for 30 minutes at 4°C) was used to deplete nucleic acids from the sample prior to SPE desalting. Following SPE desalting, extracts were methyl esterified and the phosphopeptides enriched using IMAC, as described later. Biological replicate 1 was analyzed using a liquid chromatography linear ion trap-Fourier transform mass spectrometer (LC-LTQ-FT MS), while biological replicate 2 was analyzed using LC-LTQ-Orbitrap MS (Thermo Fisher Scientific, Bremen, Germany).

4.2.4.7 1-D SDS-PAGE Cleanup. To remove nucleic acids from the second portion of the split sample, as well as to investigate sample cleanup and recovery, samples of the total cell lysates prepared with the Roche Complete Lysis-M, EDTA-free kit were separated using 1-D SDS-PAGE as a preparatory stage, as described elsewhere.42 Briefly, the separations were performed according to the manufacturer’s guidelines using a Mini-PROTEAN 3 Cell (Bio-Rad, Hercules, CA) and 1-mm-thick Ready Gel Tris-HCl gels with a 4%–20% gradient acrylamide composition (Bio-Rad). Precision Plus Protein Standards (Bio-Rad) covered the range from 10 to 250 kDa.
Prior to gel loading, the protein samples were mixed with a dye solution that contained the reducing agent Bond-Breaker TCEP (Pierce, Rockford, IL) and heated at 95°C for 4 minutes. Approximately 3 mg of extracted protein, as determined by the bicinchoninic acid (BCA) protein assay (Pierce), was subjected to SDS-PAGE on two gels (1.5 mg per gel) at a constant voltage of 200 V. The gels were fixed, stained, destained, and then stored until analyzed.42

4.2.4.8 In-Gel Reduction, Alkylation, Digestion, and Extraction of Peptides. Multiple identical lanes were pooled for each of the two gels, and the resulting two gel samples were digested. Details of in-gel reduction, alkylation, digestion, and peptide extraction have been described elsewhere.42 A C18 RP peptide Macrotrap SPE cartridge (Michrom BioResources, Inc., Auburn, CA) was used to desalt the in-gel tryptic digests. Peptides were converted to methyl esters as described earlier, and the samples were then reconstituted in IMAC loading solution (1:1:1 methanol/acetonitrile/0.01% acetic acid) at a ratio of 100 µL to 100–200 µg of peptide. The first biological replicate Roche extract sample was analyzed using LC-LTQ-FT MS, while the second biological replicate sample was analyzed using LC-LTQ-Orbitrap MS.

4.2.4.9 Phosphopeptide Enrichment Using IMAC. Phosphopeptides are enriched using an IMAC protocol that includes advances and optimizations recently summarized by Ross et al.,37 with the exception of using thionyl chloride during the methyl esterification process. Custom-packed IMAC Macrotrap cartridges with a 50-µL bed volume (Michrom BioResources, Inc.) are also suggested for phosphopeptide enrichment. The procedure consists of the following:
1. The column is stripped with 500 µL of 50 mM EDTA (adjusted to pH 9–10 with ammonium hydroxide) at a flow rate of 50 µL/min.
2. The column is washed with 1000 µL of nanopure water at 100 µL/min.
3. The column is activated with 375 µL of 100 mM FeCl3 at 25 µL/min.
4. The column is washed to remove excess metal ions with 400 µL of 0.1% acetic acid at 50 µL/min.
5. The column is loaded at 4 µL/min with approximately 1.5 mg of sample reconstituted in IMAC loading solution (1:1:1 methanol/acetonitrile/0.01% acetic acid, 100 µL per 100–200 µg of peptide).
6. The column is washed with 400 µL of wash buffer (100 mM NaCl, 1% acetic acid, and 25% acetonitrile) at 25 µL/min.
7. The column is re-equilibrated with 300 µL of 0.01% acetic acid.
8. The column is eluted with 250 µL of 50 mM Na2HPO4 (pH ∼8.5).
9. The eluate is immediately acidified with acetic acid to a pH of ∼4.
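Because each step of the IMAC procedure above is specified as a volume at a fixed flow rate, the time needed per step follows directly from volume divided by flow rate. The sketch below tabulates this for the steps whose flow rates are stated; the names `IMAC_STEPS`, `step_minutes`, and `loading_minutes` are hypothetical, and steps 7 and 8 are omitted because no flow rate is given for them.

```python
# (description, volume in µL, flow rate in µL/min) for steps 1-4 and 6 of the text.
IMAC_STEPS = [
    ("strip with 50 mM EDTA", 500.0, 50.0),
    ("wash with nanopure water", 1000.0, 100.0),
    ("activate with 100 mM FeCl3", 375.0, 25.0),
    ("wash with 0.1% acetic acid", 400.0, 50.0),
    ("wash with NaCl wash buffer", 400.0, 25.0),
]


def step_minutes(volume_ul: float, flow_ul_per_min: float) -> float:
    """Time in minutes to deliver a volume at a constant flow rate."""
    return volume_ul / flow_ul_per_min


def loading_minutes(peptide_ug: float, peptide_ug_per_100ul: float,
                    flow_ul_per_min: float = 4.0) -> float:
    """Loading time for step 5, given how much peptide is dissolved per 100 µL."""
    volume_ul = 100.0 * peptide_ug / peptide_ug_per_100ul
    return step_minutes(volume_ul, flow_ul_per_min)


if __name__ == "__main__":
    for name, volume, flow in IMAC_STEPS:
        print(f"{name}: {step_minutes(volume, flow):.1f} min")
    # 1.5 mg of peptide at the two ends of the 100-200 µg per 100 µL range.
    print(f"load (concentrated): {loading_minutes(1500.0, 200.0):.1f} min")
    print(f"load (dilute): {loading_minutes(1500.0, 100.0):.1f} min")
```

The loading step dominates the run time: at 4 µL/min, a 1.5-mg sample takes roughly 3 to 6 hours to load, depending on how concentrated the loading solution is.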
4.2.5 Description of Instrumental Analyses

4.2.5.1 RP/Nano-HPLC Separation. Peptide mixtures from HeLa cell extracts were separated using an automated dual-column phosphoproteome nano-HPLC platform that was assembled in-house and has since been reported in the literature.43 All portions of the separation system that come in contact with peptide mixtures, including the valve apparatus and transfer lines but with the exception of the autosampler syringe, are nonmetal to minimize the loss of phosphopeptides. The platform includes two 103-mL syringe pumps (Model 100DM, Teledyne Isco, Inc., Lincoln, NE) controlled using a single series D controller and a 1.5-mL mobile phase mixer, which was built in-house. One pump is dedicated to mobile phase A and is operated at 1000 psi, and the other is dedicated to mobile phase B and is operated at 1500 psi. Eight two-position Valco valves (Valco Instruments Co., Houston, TX) are used, including a six-port injection valve with a 10-µL sample loop, two four-port valves for mobile phase and mixer purge selection, and a 10-port and two four-port valves for directing the sample to either of two pairs of SPE and analytical columns. Two four-port valves are used to connect the pumps to either the fluidic system or a pair of refill reservoirs. With the two-column design, samples can be loaded, desalted, and analyzed using one pair of SPE and analytical columns while the other pair is being re-equilibrated, which allows for continuous sample analysis.

The SPE precolumns are prepared from 150-µm i.d., ∼10-cm-long fused-silica capillaries packed in-house with 5-µm octadecylsilane (ODS-AQ) C18 material (YMC Co., Ltd., Kyoto, Japan) to a bed length of 4 cm. The SPE precolumns are double fritted (one Kasil® potassium silicate chemical frit [PQ Corporation, Valley Forge, PA] at each end) because the SPE columns are backwashed directly after sample loading and prior to analytical column separation. The two analytical separation columns are 50-µm i.d., 40-cm-long fused-silica capillaries (Polymicro Technologies Inc., Phoenix, AZ) packed in-house with 5-µm ODS-AQ C18 RP material. The tips coupled to the columns for electrospray are 10-µm i.d. open tubular fused silica that has been etched with hydrofluoric acid (HF) for uniform tip bevel and opening.44 The SPE precolumn and tips are connected to the analytical column using PicoClear unions (New Objective, Inc., Woburn, MA).

An in-house constructed rack assembly supports the valve and column system and was fitted to a PAL autosampler (Leap Technologies, Carrboro, NC) for automated sample loading and analysis. Peptide samples were loaded onto the SPE precolumn and backwashed with 0.1 M acetic acid in nanopure water. A voltage of 2.3 kV is applied at the split “tee” at the head of the column, instead of at the union between the column and the ESI tip, to minimize loss of phosphopeptides. The ESI tips are positioned at the MS inlet using a set of encoding translation stages (Newport, Irvine, CA). All components of the LC system are controlled by custom software that runs on a laptop computer, communicates with the various hardware components via a 16-port USB hub, and triggers MS data acquisition using a contact closure connection.
The HPLC mobile phases were composed of 0.1 M acetic acid in nanopure water (A) and 70% acetonitrile/0.1 M acetic acid in nanopure water (B). The system was equilibrated at 1000 psi for 20 minutes with 100% mobile phase A. Next, an exponential gradient was created by valve switching from pump A to B, which displaced mobile phase A in the mixer with mobile phase B. The gradient was controlled by the split flow (∼9 µL/min) under constant pressure conditions. The final composition of mobile phase B was approximately 70% by the end of the HPLC run (180 minutes).

4.2.5.2 MS Analysis. A linear ion trap/Fourier transform hybrid MS was used for some of the product ion spectral data set collection, where data-dependent analysis (DDA) data sets were collected for the 10 most abundant species after each high-resolution MS scan by the LTQ-FT (100,000 resolution and mass scan range of 400–2000 m/z). A linear ion trap/Orbitrap hybrid MS was also used for some of the product ion spectral data set collection. Data-dependent data sets were collected for the 10 most abundant species after each high-resolution MS scan by the LTQ-Orbitrap (100,000 resolution and mass scan range of 300–2000 m/z). Data sets were also collected with high mass accuracy precursor scans by the LTQ-Orbitrap and data-dependent MS/MS of the top five peptides, followed by MS3 of the neutral loss peak in the MS2 scan that was associated with a precursor peak loss corresponding to phosphate loss (i.e., a neutral loss of 32.7 Da [3+], 49.0 [2+], 65.4, and 98.0 [1+]). To enhance identification of phosphopeptides, data sets were collected following an additional gas-phase (GP) separation45 within the MS, which entails scanning shorter, predefined m/z ranges (300–850 and 750–1575), both with the precursor scan at 100,000 resolution.
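The neutral loss values quoted above for the 1+, 2+, and 3+ charge states follow from dividing the mass of phosphoric acid (H3PO4) by the precursor charge. The sketch below reproduces that arithmetic from standard monoisotopic atomic masses; the names `H3PO4_MASS` and `neutral_loss_mz` are hypothetical, and the 65.4 value, which the text lists without a charge state, is not modeled here.

```python
# Standard monoisotopic atomic masses in Da.
MONOISOTOPIC_MASS = {"H": 1.00782503, "O": 15.99491462, "P": 30.97376163}

# Phosphoric acid, H3PO4: the neutral species lost from phosphoserine and
# phosphothreonine peptides during collisional activation.
H3PO4_MASS = (
    3 * MONOISOTOPIC_MASS["H"]
    + MONOISOTOPIC_MASS["P"]
    + 4 * MONOISOTOPIC_MASS["O"]
)


def neutral_loss_mz(charge: int) -> float:
    """Observed m/z shift for the loss of one H3PO4 from an ion of given charge."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return H3PO4_MASS / charge


if __name__ == "__main__":
    for z in (1, 2, 3):
        print(f"{z}+: {neutral_loss_mz(z):.1f}")
```

Rounded to one decimal place, this gives 98.0, 49.0, and 32.7 for the 1+, 2+, and 3+ precursors, matching the trigger values used for the MS3 scans.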
4.2.6 Current Approaches for Peptide Identification and False Discovery Rate (FDR) Determination

To identify peptides, all data collected from LC-MS/MS analyses (LC-LTQ-FT MS/MS and LC-LTQ-Orbitrap MS/MS) were analyzed using SEQUEST and the following search criteria for phosphorylated peptides: static methyl esterification on D and E residues and on the C-termini of the peptides, in conjunction with dynamic phosphorylation of S, T, and Y residues, all searched as fully tryptic cleavage products. Because the precursor masses were collected with high mass accuracy, the SEQUEST parameter file also contained a search cutoff of ±1.5 Da for the precursor masses. A no-enzyme search was performed for the standard extract. Data were searched against the human International Protein Index (IPI) database (version 3.20 containing 61,225 protein entries; available at www.ebi.ac.uk/IPI). To determine the FDR, a decoy database search was used; that is, the reversed human IPI database was appended to the forward database and included in the SEQUEST search. The FDR was estimated from the forward and reverse (decoy) filtered matches and was calculated as the ratio of two times the number of false positive peptide identifications to the total number of identified peptides.46

For phosphorylated peptide search results (fully tryptic only), the following filtering criteria, given by charge state (CS), were applied for an FDR ≤ 5%: 1+ CS, XCorr ≥ 1.4; 2+ CS, XCorr ≥ 2.4; 3+ and 4+ CS, XCorr ≥ 3.3; all CSs with DelCn2 ≥ 0.13. All phosphopeptide filtering criteria included a mass error cutoff within ±6.5 ppm. For the standard extract, the following filtering criteria were applied for an FDR ≤ 5%: 1+ CS, DelCn2 ≥ 0.1, XCorr ≥ 1.5, both partially and fully tryptic ends; 2+ CS, DelCn2 ≥ 0.1, XCorr ≥ 2.2, fully tryptic ends; 2+ CS, DelCn2 ≥ 0.1, XCorr ≥ 4.0, partially tryptic ends; 3+ CS, DelCn2 ≥ 0.1, XCorr ≥ 2.9, fully tryptic ends; 3+ CS, DelCn2 ≥ 0.1, XCorr ≥ 4.6, partially tryptic ends. High-confidence identifications were obtained using the accurate mass and time tag approach and the in-house developed programs Viper and MultiAlign, which have been described elsewhere.47

4.2.7 Results of the Protein Extraction and Preparation

This section compares three methodological approaches for performing eukaryote phosphoproteomic analyses. Figure 4.13 illustrates an overview of the steps and methodologies used in this study.

4.2.7.1 Detergent Lysis, Trizol, and Ultracentrifugation. In this study, normal HeLa cells (i.e., unperturbed cells) were lysed prior to protein extraction and solubilization.
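Because every identification count reported in these results depends on the decoy-based FDR estimate and the score filters of Section 4.2.6, a minimal sketch of that calculation may be helpful. It is not the SEQUEST or in-house filtering code: all function and constant names are hypothetical, the thresholds are the phosphopeptide values quoted in the text, and the sketch assumes that the "total number of identified peptides" is the count of all filtered matches, forward and reverse together.

```python
# Minimum XCorr per precursor charge state for phosphopeptide matches (from the text).
PHOSPHO_XCORR_MIN = {1: 1.4, 2: 2.4, 3: 3.3, 4: 3.3}
DELCN2_MIN = 0.13      # applied to all charge states
PPM_TOLERANCE = 6.5    # mass error cutoff, in ppm


def ppm_error(observed_mass: float, theoretical_mass: float) -> float:
    """Relative mass error in parts per million."""
    return (observed_mass - theoretical_mass) / theoretical_mass * 1e6


def passes_phospho_filter(charge: int, xcorr: float, delcn2: float,
                          mass_error_ppm: float) -> bool:
    """Apply the XCorr, DelCn2, and mass error cutoffs to one peptide match."""
    xcorr_min = PHOSPHO_XCORR_MIN.get(charge)
    if xcorr_min is None:  # charge states outside 1+ to 4+ are not listed
        return False
    return (xcorr >= xcorr_min
            and delcn2 >= DELCN2_MIN
            and abs(mass_error_ppm) <= PPM_TOLERANCE)


def decoy_fdr(n_reverse_hits: int, n_total_hits: int) -> float:
    """FDR estimate: two times the reverse (decoy) hits over all filtered hits."""
    if n_total_hits == 0:
        return 0.0
    return 2.0 * n_reverse_hits / n_total_hits


if __name__ == "__main__":
    # Hypothetical numbers: 10 reverse hits among 400 filtered matches.
    print(f"FDR = {decoy_fdr(10, 400):.1%}")
```

The factor of two reflects the usual reasoning for a concatenated forward/reverse search: a random match is equally likely to land in either half of the database, so each reverse hit implies roughly one undetected false positive among the forward hits.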
The associated differences in the study include the use of the detergent-based Roche Complete lysis kit versus Trizol lysis and extraction. In addition, 1-D SDS-PAGE was incorporated to separate extracted proteins, and ultracentrifugation was used to facilitate removal of nucleic acids from the protein digest in the detergent extraction approach prior to SPE cleanup (i.e., desalting and further removal of nucleic acids). After centrifugation and decanting, a clear gelatinous pellet, thought to be composed of nucleic acids, was observed on the bottom of the centrifuge tubes. High recovery of peptides (98%) was observed following ultracentrifugation. When ultracentrifugation was performed on the undigested extract, protein loss was greater and ranged from 15% to as high as 48%.

[Figure 4.13, flowchart: normal HeLa cells (biological replicates 1 and 2) → Trizol lysis → in-solution digest; Roche lysis → in-solution digest → ultracentrifugation; Roche lysis → 1-D gel digest; all branches → SPE → methylation → IMAC → phosphopeptides → nano-RP-LC ESI-MS/MS.]
Figure 4.13. Overview of the methodology studied in the analysis of HeLa cell total proteome coverage. The initial step of the study is composed of the lysis of the normal HeLa cells and subsequent protein extraction and solubilization. Key differences include the use of the Roche Complete lysis kit versus Trizol lysis and extraction, and incorporation of 1-D SDS-PAGE separation of extracted proteins. (Reprinted with permission from Ham et al. J. Proteome Res. 2008, 7, 2215–2221. Copyright 2008 American Chemical Society.)

4.2.7.2 Nucleic Acid Removal with SDS-PAGE. Finally, SDS-PAGE was also used as an alternative approach for removing nucleic acids prior to IMAC enrichment of the phosphopeptides.5,41 One advantage of a gel-based approach is the ability to target specific molecular weight ranges of proteins for more comprehensive phosphopeptide identification without additional fractionation prior to digestion and IMAC enrichment. Another is a more efficient tryptic digest due to the enhanced accessibility of the protein backbone, which is denatured into a linear orientation and locked within the gel. Low throughput can be a disadvantage because in-gel digestions are labor intensive and recoveries are generally low, that is, 40.0 ± 16.8% (n = 4). Nevertheless, the overall recovery of this approach is comparable with that of other approaches for comprehensive phosphopeptide identification. Additional losses are usually expected to occur during fractionation steps such as strong cation exchange (SCX) that are often required when applying non-gel-based approaches. As an example, recoveries from SPE used for desalting/detergent cleanup steps were approximately 51.4 ± 15.6% (n = 9, includes data from all three approaches).

4.2.8 HeLa Cell Phosphoproteome Methodology Comparison

The results from the two biological replicates for the Roche Complete in-solution digest method and the Roche Complete in-gel digest method phosphoproteomic analyses are listed in Table 4.1. The complementary nature of the two extraction methodologies is illustrated in these results. For the normal HeLa cell phosphoproteome, a combined total of 651 phosphorylation sites and 597 unique phosphoproteins were identified. Spectra for the 597 phosphopeptides along with SEQUEST identification information are included in the SpectrumLook Software Package (see Section 4.4). Table 4.1 shows the results for the three types of sample processing procedure (Roche Complete in-solution digest, Roche Complete in-gel digest, and Trizol), which allow an assessment of their efficiencies. The Roche Complete in-solution digest approach yielded the greatest number of phosphorylated protein identifications, followed by the in-gel digest approach.
TABLE 4.1. Phosphoproteomic Comparison of Total HeLa Cell Lysate Methodology

                        Biological Replicate 1                        Biological Replicate 2
Number of Unique        Roche         Roche               Rep1        Roche         Roche      Rep2
                        In-Solution   In-Gel    Trizol    Total       In-Solution   In-Gel     Total     Total
Phosphopeptides         172           143       116       302         153           135        248       380
Phosphorylated sites    337           195       222       521         267           179        397       651
Phosphoproteins         311           294       260       498         313           301        459       597
4.2.8.1 Roche In-Solution versus Trizol Extraction. Figure 4.14 shows, in the form of Venn diagrams, the overlap in unique phosphopeptides and phosphoproteins between the two extraction methods (namely, Roche in-solution digest and Trizol extraction) for sample 1. The Roche in-solution digest sample contained approximately 57% of the unique phosphorylated peptides (74% of the phosphoproteins) identified in the Trizol sample. We can conclude from this that the complement of proteins within the two extracts is similar, which is consistent with our observations for other samples prepared using the two methodologies. In contrast to the Trizol extraction, the Roche lysis approach does not require numerous protein precipitation steps that can result in poor recovery of precipitated proteins. We can also conclude that, given this good overlap in proteins, the Roche lysis affords a good alternative to Trizol extraction.

[Figure 4.14, Venn diagrams of Roche in-solution versus Trizol. Unique phosphopeptides: 106 Roche in-solution only, 66 shared, 50 Trizol only (labeled 57% overlap). Unique phosphoproteins: 119 Roche in-solution only, 192 shared, 68 Trizol only (labeled 89% overlap).]
Figure 4.14. Venn diagrams comparing the overlap in unique phosphorylated peptides and proteins for the samples from the Roche in-solution digest and the Trizol extraction. There was a 57% overlap in unique phosphorylated peptides (left) and an 89% overlap in unique phosphorylated proteins between the two extraction methodologies. (Reprinted with permission from Ham et al. J. Proteome Res. 2008, 7, 2215–2221. Copyright 2008 American Chemical Society.)

4.2.8.2 In-Solution and In-Gel Digests Phosphoproteome Coverage. To compare the reproducibility of phosphorylated protein identifications as a function of the method used for extraction and digestion, the sample workup and analysis of biological replicates were performed.

4.2.8.2.1 Biological Replicates. The extent of overlap in unique phosphopeptide and phosphorylated protein identifications measured between the two biological replicates is illustrated in a set of Venn diagrams in Figure 4.15. For the in-solution digest samples (Fig. 4.15a), an overlap of 53% in phosphopeptides and 64% in phosphoproteins identified between biological replicates is seen in the results. The overlap between biological replicates for the in-gel digest is similar, at 55% for phosphopeptides and 70% for phosphorylated proteins (Fig. 4.15b). However, the overlap decreases to 28% for unique phosphopeptides and 40% for phosphorylated proteins when the in-solution digest of one biological replicate is compared with the in-gel digest of the second replicate, as illustrated in Figure 4.15c. When the unique phosphorylated peptides and proteins identified in the in-solution and the in-gel digestion samples are combined, a 64% overlap in phosphopeptides and a 72% overlap in phosphorylated proteins between the two biological replicates are obtained (Fig. 4.15d). This increase in overlap indicates that the two sample preparation methods provide complementary coverage and, when combined, offer more comprehensive coverage of the HeLa cell phosphoproteome.

4.2.8.2.2 The Effect of Performing Technical Replicates on Phosphoproteome Coverage. In the following analyses, four technical replicates were obtained by repeatedly collecting nano-RP-LC-MS/MS data sets for each extraction methodology (the Trizol sample was measured and reported for biological replicate 1 only) to investigate the influence of technical replicates on the phosphoproteome coverage. The total number of unique phosphopeptide identifications was determined from the individual data set identifications (FDR < 5%) after they were sequentially added. New unique phosphorylated peptides were observed with each technical replicate. This demonstrates that new information can be obtained with each analysis.
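Both the overlap percentages and the sequential addition of data sets described above reduce to simple set arithmetic, sketched below. The function names are hypothetical, and the peptide identifiers in the example are invented placeholders. The overlap calculation reads the Figure 4.14 Venn counts as 66 shared and 50 Trizol-only phosphopeptides, and 192 shared and 68 Trizol-only phosphoproteins, an interpretation that agrees with the column totals in Table 4.1.

```python
def overlap_percent(shared: int, only_reference: int) -> float:
    """Percent of the reference sample's identifications also seen in the other sample."""
    return 100.0 * shared / (shared + only_reference)


def cumulative_unique(datasets) -> list:
    """Running count of unique identifications as data sets are added in order."""
    seen = set()
    totals = []
    for identifications in datasets:
        seen |= set(identifications)
        totals.append(len(seen))
    return totals


if __name__ == "__main__":
    # Trizol sample taken as the reference, Roche in-solution as the comparison.
    print(f"phosphopeptides: {overlap_percent(66, 50):.0f}%")
    print(f"phosphoproteins: {overlap_percent(192, 68):.0f}%")

    # Hypothetical peptide identifiers from four technical replicates.
    replicates = [{"A", "B", "C"}, {"B", "C", "D"}, {"A", "D", "E"}, {"E"}]
    print(cumulative_unique(replicates))
```

With these counts, the overlap relative to the Trizol sample comes to about 57% for phosphopeptides and about 74% for phosphoproteins, the values quoted in the text of Section 4.2.8.1.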
To illustrate this observation, the cumulative number of unique phosphopeptides determined from the biological replicate 1 nano-RP-LC-MS/MS analyses was 96 (data set 1), 110 (data sets 1–2), 161 (data sets 1–3), and 172 (data sets 1–4).

[Figure 4.15, Venn diagrams of biological replicate 1 versus biological replicate 2, giving the overlap in unique phosphopeptides (left) and phosphoproteins (right): (a) in-solution versus in-solution, 53% and 64%; (b) in-gel versus in-gel, 55% and 70%; (c) in-solution versus in-gel, 28% and 40%; (d) in-solution plus in-gel combined, 64% and 72%.]
Figure 4.15. Venn diagrams of the overlap in unique phosphorylated peptides (left) and in unique phosphorylated protein identifications (right) between the two biological replicates. The combination of the two sample preparation approaches, Roche in-solution digest and Roche in-gel digest, gives a more comprehensive coverage of the HeLa cell phosphoproteome. (Reprinted with permission from Ham et al. J. Proteome Res. 2008, 7, 2215–2221. Copyright 2008 American Chemical Society.)

4.2.8.2.3 Multiple (10) Technical Replicate Study. To investigate the contribution of successive technical replicates, quality control samples (peptides from a standard Shewanella extract) that had been analyzed with at least 10 technical replicates were examined by plotting the number of unique peptide identifications versus the number of data sets (Fig. 4.16). The observed trends for the Shewanella extract analyzed on an LTQ-Orbitrap mass spectrometer (Fig. 4.16a) and an 11T-FTICR mass spectrometer48 (Fig. 4.16b) showed that up to six technical replicates are required for optimal identifications at the sample complexity analyzed (∼4000 unique peptides per data set identified). It can be gathered from this that a more complex sample will most probably require additional technical replicates for optimal identification.

[Figure 4.16, plots of unique peptides identified (y-axis) versus data set count (x-axis). (a) LTQ-Orbitrap, Shewanella: fitted curve y = 8.959x^3 − 246.21x^2 + 2227.5x + 1640.5, R^2 = 0.9968. (b) 11T-FTICR, Shewanella: fitted curve y = 15.632x^3 − 338.01x^2 + 2454.9x + 2588.2, R^2 = 0.99.]
Figure 4.16. (a) Ten technical replicates for a Shewanella extract analyzed on an LTQ-Orbitrap mass spectrometer (∼4000 unique peptides per data set). (b) Ten technical replicates on an 11T-FTICR mass spectrometer (∼4000 unique peptides per data set). (Reprinted with permission from Ham et al. J. Proteome Res. 2008, 7, 2215–2221. Copyright 2008 American Chemical Society.)
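The diminishing return from additional technical replicates can be made explicit by evaluating the cubic trendlines reported in Figure 4.16 and taking the difference between successive data set counts. The sketch below does this; the names `FITS`, `fitted_unique`, and `marginal_gains` are hypothetical, and the polynomials are empirical fits over roughly 1 to 10 data sets, so they should not be extrapolated beyond that range.

```python
# Cubic trendline coefficients (a, b, c, d) for y = a*x^3 + b*x^2 + c*x + d,
# where x is the data set count and y the cumulative unique peptides (Figure 4.16).
FITS = {
    "LTQ-Orbitrap": (8.959, -246.21, 2227.5, 1640.5),
    "11T-FTICR": (15.632, -338.01, 2454.9, 2588.2),
}


def fitted_unique(coeffs, n_datasets: int) -> float:
    """Evaluate the cubic fit at a given number of data sets (Horner's rule)."""
    a, b, c, d = coeffs
    return ((a * n_datasets + b) * n_datasets + c) * n_datasets + d


def marginal_gains(coeffs, max_datasets: int = 10) -> list:
    """Fitted number of new unique peptides contributed by data sets 2..max_datasets."""
    return [
        fitted_unique(coeffs, n) - fitted_unique(coeffs, n - 1)
        for n in range(2, max_datasets + 1)
    ]


if __name__ == "__main__":
    for instrument, coeffs in FITS.items():
        gains = marginal_gains(coeffs)
        print(instrument, [round(g) for g in gains])
```

For both instruments, the fitted gain per added data set falls steadily over the first several replicates, which is the behavior behind the observation that about six technical replicates capture most of the identifications at this sample complexity.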
[Figure 4.17, replicate tree.
HeLa cell phosphoproteome: 142 classes and 92 unique phosphoproteins identified; overall CV = 6.9% (unique phosphopeptide responses)
  Biological replicate 1: CV = 5.3%; 89% coverage classes; 80% coverage unique
    In-solution (technical replicates 1A–1D): CV = 3.1%; 56% classes; 57% unique
    In-gel (technical replicates 2A–2D): CV = 2.9%; 54% classes; 39% unique
    Trizol (technical replicates 3A–3D): CV = 4.0%; 43% classes; 42% unique
  Biological replicate 2: CV = 4.0%; 77% coverage classes; 71% coverage unique
    In-solution (technical replicates 1A–1D): CV = 2.2%; 54% classes; 53% unique
    In-gel (technical replicates 2A–2D): CV = 2.0%; 51% classes; 39% unique]
Figure 4.17. Influence of technical replicates on the different tiers of the study. The bottom tier lists the coefficient of variance (CV) for each of the sets of four technical replicates that were collected by nano-RP-LC-MS/MS. The middle tier lists the percent variance on the biological replicate level at 5.3% and 4.0% for replicates 1 and 2, respectively. The top tier lists the overall variance of the study at 6.9%. Coverage values were derived by denoting the 597 phosphoproteins identified by the study as the true population and calculating a percentage from the identification values listed in Table 4.1. (Reprinted with permission from Ham et al. J. Proteome Res. 2008, 7, 2215–2221. Copyright 2008 American Chemical Society.)
4.2.8.2.4 Illustration of Technical Replicate Influence. Lastly, to illustrate the influence of technical replicates at various stages of the different methods, a tree structure of the types of replicates is shown in Figure 4.17. The coefficient of variance (CV) for each set of four technical replicates collected for each of the three extraction and digest methodologies can be found at the bottom of the figure. The aligned and normalized peptide intensities (log based) of the four technical replicates within a subset were used to determine the CV value, keeping only unique phosphorylated peptides (FDR < 5%) identified with an incidence of ≥2 observations. To derive the coverage values, the 597 phosphoproteins identified by the entire study were assigned as being the true population of the phosphoproteome. The percent coverage was then calculated using the identification values listed in Table 4.1. We can see that the coverage, as well as the CV values, is similar across the 20 technical replicates.
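The two quantities described above, a CV from replicate intensities and a percent coverage against the 597-protein reference population, can be sketched as follows. This is an illustration under stated assumptions rather than the study's own calculation: the function names are hypothetical, the intensities are invented placeholder values, the sample standard deviation is assumed, and averaging per-peptide CVs is only one plausible way to summarize a replicate set. Figure 4.17 reports coverage separately for "classes" and "unique" identifications, so the simple ratio here is not expected to reproduce its percentages exactly.

```python
import statistics


def coefficient_of_variation(values) -> float:
    """CV in percent: sample standard deviation divided by the mean."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)


def mean_cv(intensities_by_peptide: dict) -> float:
    """Average per-peptide CV over peptides observed in at least two replicates."""
    cvs = [
        coefficient_of_variation(values)
        for values in intensities_by_peptide.values()
        if len(values) >= 2
    ]
    return statistics.mean(cvs)


def coverage_percent(n_identified: int, n_population: int = 597) -> float:
    """Identifications as a percentage of the study-wide reference population."""
    return 100.0 * n_identified / n_population


if __name__ == "__main__":
    # Hypothetical log-scale intensities for three peptides across technical replicates.
    example = {
        "peptide_1": [6.10, 6.25, 5.98, 6.17],
        "peptide_2": [5.40, 5.52, 5.47],
        "peptide_3": [7.01],  # seen once only, so excluded from the CV
    }
    print(f"mean CV = {mean_cv(example):.1f}%")
    # Phosphoprotein counts from Table 4.1: rep 1 in-solution and rep 1 total.
    print(f"coverage = {coverage_percent(311):.1f}% and {coverage_percent(498):.1f}%")
```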
4.2.8.2.5 Technical Replicate Influence Conclusion. From the results, it can be gathered that the low CV values demonstrate reproducibility within the replicated nano-RP-LC-MS/MS analyses. At the biological replicate level (level 2 of Fig. 4.17), the percent variance is also low at 5.3% and 4.0% for biological replicates 1 and 2, respectively. The final overall variance of the study, which includes the treatment of 20 technical replicates, was a CV of 6.9%. For comparison, in a protein expression study using iTRAQ-labeled peptides followed by capillary HPLC-ESI-qQ-time-of-flight (TOF)-MS/MS, Wright et al.49 observed an average CV of ±11% at the technical replicate level (triplicate injections).

4.2.9 Overall Conclusion

Cell lysis and protein extraction and solubilization methods influence both the amount and types of proteins that are identified by MS. Of the three methods investigated in this study (Roche Complete in-solution digest, Roche Complete in-gel digest, and Trizol), solubilization using the Roche Complete lysis kit in conjunction with either in-solution digestion or in-gel digestion gave the highest yield of phosphorylated protein identifications along with good reproducibility between biological replicates. Among the phosphorylated proteins identified from the Roche in-solution digest sample, ∼80% overlapped with those identified from the Trizol sample. The Roche in-solution method also yielded nearly 1.8-fold more nonoverlapping phosphoprotein identifications, which supports this method as a good alternative to the Trizol extraction approach. The combined number of phosphoprotein identifications from in-solution digestion and in-gel digestion of the protein extract obtained with the Roche Complete lysis kit illustrates the advantage of combining different sample preparation strategies to obtain greater coverage of a cellular proteome.
The inclusion of biological replicates can confirm observations made in protein identifications, which increases confidence in observed proteome expression studies. Careful choice and execution of sample preparation methodologies allow acceptable overlaps of the identified proteome to be attained in the biological replicates. While fractionation techniques can be applied to further reduce the sample complexity and increase the phosphoprotein coverage, technical replicates might still be required for comprehensive identifications. The inclusion of four technical replicates in this study, measured with CV < 5% in peptide responses, demonstrated that information is added with each additional technical replicate. For a moderately complex protein extract such as the Shewanella standard, six technical replicates appear optimal for information collection; for complex extracts, more technical replicates may be required.

4.3 NONPHOSPHOPROTEOME HELA CELL ANALYSIS

The flow through and salt wash collected during IMAC sample loading onto the column represent the nonphosphorylated proteome of the HeLa whole cell lysate. During IMAC enrichment of the second sample set, the column flow through, as well as the NaCl and weak acid washes, was collected, and peptides were measured using BCA analysis. Table 4.2 shows the fraction of starting material in each of these nonphosphorylated peptide fractions for each of the three extraction conditions (Roche in-solution digest, Roche in-gel digest, and Trizol in-solution digest). Because of the low concentration of peptides in the weak acid wash, this fraction was not further processed for protein identification. In general, for the three methods compared, the IMAC flow through and salt wash fractions constituted approximately 80% of the original starting material loaded onto the column. The remaining 20% represents irreversible binding of peptides to the IMAC column.

4.3.1 IMAC Flow Through Peptide Analysis

The IMAC flow through fraction contained the bulk of the nonphosphorylated peptide material (see Table 4.2), so it was subjected to SCX fractionation, and each of the 15 separate fractions was analyzed by nano-RP-LC-MS/MS for protein identification. SEQUEST results obtained for each set of the 15 fractions were combined and filtered

TABLE 4.2. Eluant Amount Recovered from IMAC Flow Through and Wash Fractions
IMAC Fraction          Roche In-Solution    Trizol In-Solution    Roche In-Gel
                       Digest (mg)          Digest (mg)           Digest (mg)a
Flow through           0.713                0.749                 0.213
NaCl wash              0.107                0.123                 0.051
0.01% HOAc wash        0.006                0.003                 0.002
Total flow through     0.826                0.874                 0.267
% Recoveredb           78                   81                    78

a Determined from one gel.
b Based on starting material prior to IMAC enrichment and postmethylation (Roche in-solution, 1.059 mg; Trizol in-solution, 1.079 mg; Roche in-gel, 0.343 mg) as determined by BCA analysis.
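The "% Recovered" row of Table 4.2 can be checked with a short calculation. The sketch below is a worked example, not part of the original analysis: the numbers are transcribed from the table and its footnote b, while the dictionary layout and function name are hypothetical. Summing the three printed fractions gives totals that differ from the printed "Total flow through" values in the last digit for two columns (0.875 versus 0.874 and 0.266 versus 0.267), presumably because the printed entries were rounded.

```python
# Amounts in mg, transcribed from Table 4.2; "loaded" is the starting
# material from footnote b. The dictionary layout is hypothetical.
TABLE_4_2 = {
    "Roche in-solution": {"flow_through": 0.713, "nacl_wash": 0.107,
                          "hoac_wash": 0.006, "loaded": 1.059},
    "Trizol in-solution": {"flow_through": 0.749, "nacl_wash": 0.123,
                           "hoac_wash": 0.003, "loaded": 1.079},
    "Roche in-gel": {"flow_through": 0.213, "nacl_wash": 0.051,
                     "hoac_wash": 0.002, "loaded": 0.343},
}


def percent_recovered(entry):
    """Percent of loaded peptide found in flow through plus both washes."""
    recovered = entry["flow_through"] + entry["nacl_wash"] + entry["hoac_wash"]
    return 100.0 * recovered / entry["loaded"]


# Rounded to whole percent, as in the table.
recovery = {method: round(percent_recovered(entry))
            for method, entry in TABLE_4_2.items()}
```

Rounded to whole numbers, the three methods give 78%, 81%, and 78%, which matches the table and the statement that roughly 80% of the loaded material is recovered.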
TABLE 4.3. Peptide Counts and Percentage Contained within Different Charge States for the IMAC NaCl Wash and Flow Through Fractions

                               Roche In-Solution          Roche In-Gel               Trizol Digest
                               NaCl Wash   Flow Through   NaCl Wash   Flow Through   NaCl Wash   Flow Through
Total peptide count            4434        1609           3760        6869           1443        716
Unique peptide count           1505        965            1312        3351           539         437
Unique protein ID              650         558            583         1416           279         282
Method total unique protein    886                        1559                       426
+1 CS (%)a                     1           21             2           15             <0.1        17
+2 CS (%)a                     81          74             84          77             70          75
+3 CS (%)a                     18          4              14          8              30          9

a Based on total peptide measured for comprehensive CS coverage. CS, charge state.
for comparison and FDR determination. The results listed in Table 4.3 show that 558 unique proteins were identified from the Roche in-solution digest IMAC flow through fraction, 1416 unique proteins from the Roche in-gel digest flow through fraction, and 282 unique proteins from the Trizol in-solution digest flow through fraction (all with FDR < 5%).

4.3.2 IMAC NaCl Wash Peptide Analysis

Peptides in the IMAC salt wash buffer (composed of 100 mM NaCl, 1% acetic acid, and 25% acetonitrile), which is routinely used to remove weakly binding nonphosphorylated peptides from the IMAC bed, also represent part of the nonphosphorylated proteome of the HeLa whole cell. Table 4.2 illustrates that for the Roche in-solution digestion and the Trizol digestion, the amount of peptide recovered from the IMAC NaCl wash was ∼0.1 mg, whereas ∼0.05 mg was recovered for the extracted gel. The NaCl wash fractions were analyzed by nano-RP-LC-MS/MS, which included a DDA for each of the three conditions in which the top 10 eluting peptides were isolated and subjected to CID for product ion spectral collection and subsequent peptide identification. A single full range of m/z 300–2000 was used in MS survey scans for the top 10 DDA. A gas-phase
Figure 4.18. Venn diagram comparison of single nano-RP-LC-MS/MS top 10 data-dependent analysis versus gas-phase separation analysis. Left side (a, b, and c) compares the three sample preparation methodologies when analyzed by a single nano-RP-LC-MS/MS top 10 data-dependent analysis. Right side (d, e, and f) represents the results of the gas-phase separation analysis and subsequent SEQUEST identifications by combining the four data set results.
fractionation (GPF)16 was also performed on the IMAC NaCl wash fractions, in which the sample was repeatedly analyzed by DDA using several narrow m/z ranges as survey scans. Figure 4.18 shows a comparison of the SEQUEST results for the top 10 DDA runs and those obtained using GPF. The three sample preparation methods analyzed by a single nano-RP-LC-MS/MS top 10 DDA are compared on the left side of Figure 4.18 (a, b, and c). Note that the Venn diagram for the Roche lysis in-solution digest in Figure 4.18a shows that 384 proteins were identified in a single top 10 DDA run. The Trizol in-solution digest yielded 195 identified proteins, and the overlap of 143 proteins equates to 73% of the Trizol identifications. The results from the gas-phase separation analysis and subsequent SEQUEST identifications (obtained from four data sets combined) are represented on the right side of Figure 4.18 (d, e, and f). The identifications for the Roche in-solution digest increased by 49%, from 384 to 572 identified proteins (Fig. 4.18a, e). There is an overlap of 80% in identified proteins between the single top 10 DDA run and the combined GPF runs for the Roche in-solution digest. Combining the two sets results in 650 unique protein identifications. In general, a significant increase in the number of unique proteins (from 14% to 49%) was observed using the GPF approach compared with using a single top 10 DDA.

4.3.3 IMAC Flow Through versus NaCl Wash Comparison

A total of 886 and 426 unique proteins were identified from the IMAC flow through and NaCl wash for the Roche and Trizol in-solution digests, respectively (see Table 4.3). For the Roche in-gel digest, a total of 1559 unique proteins were identified from the IMAC flow through and NaCl wash (see Table 4.3). As expected, more proteins were identified in the flow through fraction than in the NaCl wash fraction for the in-gel digestion, since the bulk of peptide material was present in the flow through (see Table 4.2). However, fewer proteins than expected were identified from the flow through fractions for the in-solution digestion approach, most likely due to the presence of detergent (primarily composed of 3-[(3-cholamidopropyl)dimethylammonio]-1-propanesulfonate hydrate (CHAPS)) in the in-solution digest carried through each step. As a result, the flow through was directly collected, concentrated, and subjected to SCX without any further cleanup. Of the three approaches, the Roche in-gel digest provided the greatest number of unique protein identifications, while the Trizol digest provided the lowest.

4.3.4 Gene Ontology Comparison

The Babelomics50 bioinformatics tool was used to compare the subcellular locations and functions of the proteins that were identified from the IMAC flow through and NaCl wash fractions.
The protein identifications from the two IMAC fractions were combined for each extraction methodology and compared (i.e., Roche in-solution vs. Roche in-gel vs. Trizol). Nine statistically significant differences were observed in the distribution of protein biological processes, all related primarily to nuclear processes, with an increased nuclear process association for the Trizol extraction. Figure 4.19 is a bioinformatics comparison of the cellular component localization of the HeLa cell nonphosphorylated proteome for the Roche Complete in-solution digest (top bars in figure)
[Figure 4.19: horizontal bar chart titled "Cellular component. Level: 3". Categories: non-membrane-bound organelle, membrane-bound organelle, vesicle, extracellular region part, virion part, receptor complex, neuromuscular junction, unlocalized protein complex, cell part, synapse part, organelle part, and extracellular matrix part. Horizontal axis: 0 to 100%. Legend: cluster query, cluster reference. Unadjusted and FDR-adjusted P-values are listed per category.]

Figure 4.19. Babelomics50 bioinformatics comparison of the cellular component localization of the HeLa cell nonphosphorylated proteome for the Roche Complete in-gel digest (top bars in figure) versus the Trizol extraction (bottom bars in figure).
versus the Trizol extraction (bottom bars in figure). The comparison indicates that the Trizol method is biased toward nuclear proteins during cell lysis and extraction. No significant differences were observed between the Roche Complete in-solution digest and the SDS-PAGE cleanup fractions. Interestingly, there were no significant differences observed among subcellular locations and protein function when comparing the two fractions from the same extraction methodology. The protein similarity for the two fractions is further supported by a 72% overlap observed between the NaCl wash fraction (891 unique proteins identified, all three methods combined) and the flow through fraction (1639 unique proteins identified, all three methods combined). In summary, a total of 1888 unique proteins were identified from the IMAC flow through fractions and the IMAC NaCl wash fractions combined (Table 4.1). This result is comparable with the previously reported 1200 proteins identified by Fountoulakis et al.,51 who used 2-D gel electrophoresis followed by peptide mass fingerprint analysis using matrix-assisted laser desorption ionization time-of-flight (MALDI-TOF)/MS.

4.3.5 IMAC Bed Nonspecific Binding Study

In contrast to the 72% overlap in observed proteins between the IMAC flow through and NaCl wash fractions, there is only a 17% overlap in the associated peptides. The two fractions encompass the same complement of proteins, as demonstrated by the 72% overlap in identifications; however, they contain quite different sets of peptides, as demonstrated by the 17% overlap. This indicates that the nonspecific binding to the IMAC bed is directly peptide dependent. The observed CSs of the peptides from the two IMAC fractions are presented in Table 4.3. Note that the flow through fractions for all approaches contain a larger percentage of the +1 CS than the NaCl wash fraction. Also, a greater fraction of the +3 CS is retained by the IMAC resin, as demonstrated by the increased percentage of the +3 CS in the NaCl wash fraction. This type of nonspecific binding can potentially be attributed to the following: (1) a weak chelating metal complex being formed between the iron(III) and amino acid residues with low hydrophobicity and relatively high gas-phase basicity; (2) a weak, noncovalent electrostatic interaction between the peptides and the IMAC bed; (3) a hydrophilic–hydrophilic attraction between the amino acids and the iminodiacetic acid resin of the IMAC column; or (4) a hydrophobic interaction between the peptides and the IMAC bead matrix. The graphs in Figure 4.20 illustrate peptide mass [M + H] as a function of CS and the total number of peptides observed for that respective CS. Figure 4.20a represents the NaCl wash fraction of the Roche in-solution digest approach, while Figure 4.20b represents the flow through fraction. This graphically illustrates the increased observance of the +3 CS in the NaCl wash fraction (Fig. 4.20a) and the +1 CS in the flow through fraction (Fig. 4.20b).
This would indicate that the nonspecific binding is a property (chemical and/or physical) that is associated with the amino acid residues that constitute the peptide chain. Similar behavior is observed across all three sample treatments with respect to the relationship between nonspecific binding and CS (data not shown), demonstrating the reproducibility of the observed effect. Graphs of peptide mass [M + H] versus unique peptide count for the IMAC flow through and NaCl wash fractions are illustrated in Figure 4.21. In all three extraction methodology comparisons (Fig. 4.21a Roche in-gel digest, Fig. 4.21b Roche in-solution digest, and Fig. 4.21c Trizol),
[Figure 4.20: two line graphs, panels (a) and (b). Horizontal axis: peptide mass [M + H], 400 to 3900. Vertical axis: peptide count. Traces: CS+1, CS+2, and CS+3.]

Figure 4.20. Graphs of peptide mass [M + H] as a function of charge state and the total number of peptides observed for that respective charge state for (a) NaCl wash fraction of the Roche in-solution digest approach and (b) flow through fraction. Similar behavior is observed for all three charge states with respect to the relationship between nonspecific binding and charge state, demonstrating reproducibility of the observed effect.
[Figure 4.21: three line graphs, panels (a), (b), and (c). Horizontal axis: peptide mass [M + H], 400 to 3400. Vertical axis: peptide count. Traces: IMAC flow through fraction and IMAC NaCl wash fraction.]

Figure 4.21. Graphs of peptide mass [M + H] versus unique peptide count for the IMAC flow through and NaCl wash fractions for (a) Roche in-gel digest, (b) Roche in-solution digest, and (c) Trizol in-solution digest. The IMAC flow through encompasses a wider molecular weight distribution range with the apex shifted to a higher molecular weight as compared with the IMAC NaCl wash fraction, indicating that the nonspecific binding is less of a reversed-phase interaction and more of a property that is dictated by the amino acid residues that comprise the peptide chain.
the IMAC flow through encompasses a wider molecular weight distribution range with the apex shifted to a higher molecular weight as compared with the IMAC NaCl wash fraction. This would indicate that the nonspecific binding is less of an RP interaction than of a property that is dictated by the amino acid residues that compose the peptide chain. This is in agreement with the CS correlation observed for the nonspecific binding.
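The overlap percentages quoted in Sections 4.3.2 through 4.3.5 can be reproduced from the reported counts by inclusion-exclusion. The sketch below is a worked check under one assumption: the percentage is taken relative to the smaller of the two sets, which is consistent with the two cases where all counts are given (143 of 195 Trizol identifications, and the NaCl wash versus flow through comparison). The function names are hypothetical, and no peptide-level counts are given in the text, so the 17% peptide overlap is not recomputed here.

```python
def shared_count(n_a, n_b, n_union):
    """Inclusion-exclusion: |A and B| = |A| + |B| - |A or B|."""
    return n_a + n_b - n_union


def percent_overlap(n_shared, n_reference):
    """Shared identifications as a percentage of a reference set size."""
    return 100.0 * n_shared / n_reference


def percent_overlap_from_sets(set_a, set_b):
    """Same measure from identifier sets, relative to the smaller set
    (the choice of denominator is an assumption of this sketch)."""
    return percent_overlap(len(set_a & set_b), min(len(set_a), len(set_b)))


# Proteins: 891 in the NaCl wash, 1639 in the flow through, 1888 combined.
shared_proteins = shared_count(891, 1639, 1888)
protein_overlap = percent_overlap(shared_proteins, 891)

# Figure 4.18a: 143 shared proteins out of 195 Trizol identifications.
trizol_overlap = percent_overlap(143, 195)

# Roche in-solution NaCl wash: 384 (single DDA), 572 (GPF), 650 combined.
gpf_shared = shared_count(384, 572, 650)
gpf_gain = 100.0 * (572 - 384) / 384
```

With these inputs, 642 proteins are shared between the two IMAC fractions, which is 72% of the 891 NaCl wash proteins. The same arithmetic gives 73% for Figure 4.18a, 306 shared proteins (about 80% of 384) between the single DDA and GPF runs, and a 49% gain from GPF. Applied to sets of peptide sequences instead of protein identifiers, percent_overlap_from_sets gives the peptide-level comparison.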
4.4 REVIEWING SPECTRA USING THE SPECTRUMLOOK SOFTWARE PACKAGE

A software package called SpectrumLook is available that allows readers to inspect the fragmentation (MS/MS) spectra for the phosphopeptides identified in this study. Using this software, readers can visually browse the MS/MS spectra that led to the phosphopeptide identifications, including viewing annotations for the identified b and y ions, and neutral loss ions where appropriate. The software runs on the Microsoft Windows platform. There are six files included with the SpectrumLook package, which can be accessed at http://ncrr.pnl.gov/data/#HomoSapiensData, Supplemental data to HeLa Cell Phosphoproteome publication (registration not required). Note: To access the file, right click on the aforementioned link and select "Open Weblink in Browser."

1. SpectrumLook_Installer.msi: the installer. To install, double-click on the file and follow the installation prompts. During installation, a shortcut to run the SpectrumLook program is placed at Start → Programs → PAST Toolkit → SpectrumLook. Alternatively, navigate to the C:\Program Files\SpectrumLook\ folder and double-click the file "SpectrumLook.exe."
2. MT_Human_PP5_grouped.mzXML: the phosphopeptide spectra in mzXML format.
3. MT_Human_PP5_grouped_syn.txt: a summary of the identifications determined by SEQUEST. See the Readme.txt file for a description of the columns in this file.
4. MT_Human_PP5_grouped.ini: a parameter file that specifies the appropriate parameters for these data when browsing them with SpectrumLook.
5. Readme.txt and RevisionHistory.txt: text files that describe the SpectrumLook software.
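For readers who want to check an annotated spectrum by hand, the b ion, y ion, and neutral loss m/z values can be computed from the peptide sequence. The sketch below is a generic calculation, not a description of how SpectrumLook works: it uses standard monoisotopic residue masses, assumes singly charged fragments, marks a phosphorylated residue with a lowercase s, t, or y (a notation invented for this example), and ignores every other modification, including the methyl esterification applied before IMAC in this study.

```python
# Monoisotopic residue masses (Da) for the 20 common amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
PHOSPHO = 79.96633   # HPO3 added by phosphorylation
H3PO4 = 97.97690     # neutral loss from phosphoserine/phosphothreonine
WATER = 18.010565
PROTON = 1.007276


def residue_masses(peptide):
    """Residue masses; lowercase s, t, or y marks a phosphosite
    (hypothetical notation used only in this sketch)."""
    masses = []
    for aa in peptide:
        if aa in "sty":
            masses.append(RESIDUE_MASS[aa.upper()] + PHOSPHO)
        elif aa in RESIDUE_MASS:
            masses.append(RESIDUE_MASS[aa])
        else:
            raise ValueError(f"unsupported residue: {aa!r}")
    return masses


def fragment_ions(peptide):
    """Return (b_ions, y_ions) as lists of singly charged m/z values."""
    masses = residue_masses(peptide)
    n = len(masses)
    b_ions = [sum(masses[:i]) + PROTON for i in range(1, n)]
    y_ions = [sum(masses[n - i:]) + WATER + PROTON for i in range(1, n)]
    return b_ions, y_ions


def precursor_mz(peptide, charge):
    """m/z of the intact peptide carrying `charge` protons."""
    neutral = sum(residue_masses(peptide)) + WATER
    return (neutral + charge * PROTON) / charge


def neutral_loss_mz(mz, charge=1):
    """m/z after loss of H3PO4 from an ion of the given charge."""
    return mz - H3PO4 / charge
```

For a doubly charged phosphopeptide, the neutral loss peak sits about 49 m/z units below the precursor (97.9769 divided by 2), which is the kind of signature the neutral loss annotations point to.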
REFERENCES

1. Manning, G.; Whyte, D.B.; Martinez, R.; Hunter, T.; Sudarsanam, S. Science 2002, 298, 1912–1934.
2. Chalmers, M.J.; Kolch, W.; Emmett, M.R.; Marshall, A.G.; Mischak, H. J. Chromatogr. B 2004, 803, 111–120.
3. Olsen, J.V.; Blagoev, B.; Gnad, F.; Macek, B.; Kumar, C.; Mortensen, P.; Mann, M. Cell 2006, 127, 635–648.
4. Zhou, H.; Watts, J.D.; Aebersold, R. Nat. Biotechnol. 2001, 19, 375–378.
5. Beausoleil, S.A.; Jedrychowski, M.; Schwartz, D.; Elias, J.E.; Villen, J.; Li, J.; Cohn, M.A.; Cantley, L.C.; Gygi, S.P. Proc. Natl. Acad. Sci. U.S.A. 2004, 101, 12130–12135.
6. Amanchy, R.; Kalume, D.E.; Iwahori, A.; Zhong, J.; Pandey, A. J. Proteome Res. 2005, 4, 1661–1671.
7. Ballif, B.A.; Roux, P.P.; Gerber, S.A.; MacKeigan, J.P.; Blenis, J.; Gygi, S.P. Proc. Natl. Acad. Sci. U.S.A. 2005, 102, 667–672.
8. Hoffert, J.D.; Pisitkun, T.; Wang, G.; Shen, R.F.; Knepper, M.A. Proc. Natl. Acad. Sci. U.S.A. 2006, 103, 7159–7164.
9. Yang, F.; Stenoien, D.L.; Strittmatter, E.F.; Wang, J.H.; Ding, L.H.; Lipton, M.S.; Monroe, M.E.; Nicora, C.D.; Gritsenko, M.A.; Tang, K.Q.; Fang, R.H.; Adkins, J.N.; Camp, D.G.; Chen, D.J.; Smith, R.D. J. Proteome Res. 2006, 5, 1252–1260.
10. Arroyo, J.D.; Hahn, W.C. Oncogene 2005, 24, 7746–7755.
11. Brady, M.J.; Saltiel, A.R. Recent Prog. Horm. Res. 2001, 56, 157–173.
12. Oliver, C.J.; Shenolikar, S. Front. Biosci. 1998, 3, D961–D972.
13. Brown, L.; Borthwick, E.B.; Cohen, P.T. Biochim. Biophys. Acta 2000, 1492, 470–476.
14. Hunter, T.; Sefton, B.M. Proc. Natl. Acad. Sci. U.S.A. 1980, 77, 1311–1315.
15. McLachlin, D.T.; Chait, B.T. Curr. Opin. Chem. Biol. 2001, 5, 591–602.
16. Qian, W.J.; Goshe, M.B.; Camp, D.G., II; Yu, L.R.; Tang, K.; Smith, R.D. Anal. Chem. 2003, 75, 5441–5450.
17. Garcia, B.A.; Shabanowitz, J.; Hunt, D.F. Methods 2005, 35, 256–264.
18. Salih, E. Mass Spectrom. Rev. 2005, 24, 828–846.
19. Smith, R.D.; Anderson, G.A.; Lipton, M.S.; Pasa-Tolic, L.; Shen, Y.; Conrads, T.P.; Veenstra, T.D.; Udseth, H.R. Proteomics 2002, 2, 513–523.
20. Janssens, V.; Goris, J. Biochem. J. 2001, 353, 417–439.
21. Ceulemans, H.; Bollen, M. Physiol. Rev. 2004, 84, 1–39.
22. Ficarro, S.B.; McCleland, M.L.; Stukenberg, P.T.; Burke, D.J.; Ross, M.M.; Shabanowitz, J.; Hunt, D.F.; White, F.M. Nat. Biotechnol. 2002, 20, 301–305.
23. Kim, J.E.; Tannenbaum, S.R.; White, F.M. J. Proteome Res. 2005, 4, 1339–1346.
24. Moser, K.; White, F.M. J. Proteome Res. 2006, 5, 98–104.
25. Ballif, B.A.; Villen, J.; Beausoleil, S.A.; Schwartz, D.; Gygi, S.P. Mol. Cell. Proteomics 2004, 3, 1093–1101.
26. Gentile, S.; Darden, T.; Erxleben, C.; Romeo, C.; Russo, A.; Martin, N.; Rossie, S.; Armstrong, D.L. Proc. Natl. Acad. Sci. U.S.A. 2006, 103, 5202–5206.
27. Posewitz, M.C.; Tempst, P. Anal. Chem. 1999, 71, 2883–2892.
28. Cao, P.; Stults, J.T. Rapid Commun. Mass Spectrom. 2000, 14, 1600–1606.
29. Kocher, T.; Allmaier, G.; Wilm, M. J. Mass Spectrom. 2003, 38, 131–137.
30. Stover, D.R.; Caldwell, J.; Marto, J.; Root, R.; Mestan, J.; Stumm, M.; Ornatsky, O.; Orsi, C.; Radosevic, N.; Liao, L.; Fabbro, D.; Moran, M.F. Clin. Proteomics 2004, 1, 069–080.
31. Gruhler, A.; Olsen, J.V.; Mohammed, S.; Mortensen, P.; Faergeman, N.J.; Mann, M.; Jensen, O.N. Mol. Cell. Proteomics 2005, 4, 310–327.
32. Zhang, Y.; Wolf-Yadlin, A.; Ross, P.L.; Pappin, D.J.; Rush, J.; Lauffenburger, D.A.; White, F.M. Mol. Cell. Proteomics 2005, 4, 1240–1250.
33. Nilsson, C.L.; Dillon, R.; Devakumar, A.; Shi, S.D.-H.; Greig, M.; Rogers, J.C.; Krastins, B.; Rosenblatt, M.; Kilmer, G.; Major, M.; Kaboord, B.J.; Sarracino, D.; Rezai, T.; Prakash, A.; Lopez, M.; Ji, Y.; Priebe, W.; Lang, F.F.; Colman, H.; Conrad, C.A. Quantitative phosphoproteomic analysis of the STAT3/IL-6/HIF1alpha signaling network: an initial study in GSC11 glioblastoma stem cells. J. Proteome Res. 2010, 9, 430–443.
34. Zhang, G.A.; Neubert, T.A. Use of detergents to increase selectivity of immunoprecipitation of tyrosine phosphorylated peptides prior to identification by MALDI quadrupole-TOF MS. Proteomics 2006, 6, 571–578.
35. Schmidt, S.R.; Schweikart, F.; Andersson, M.E. Current methods for phosphoprotein isolation and enrichment. J. Chromatogr. B Analyt. Technol. Biomed. Life Sci. 2007, 849, 154–162.
36. Pinkse, M.W.H.; Uitto, P.M.; Hilhorst, M.J.; Ooms, B.; Heck, A.J.R. Selective isolation at the femtomole level of phosphopeptides from proteolytic digests using 2D-nanoLC-ESI-MS/MS and titanium oxide precolumns. Anal. Chem. 2004, 76, 3935–3943.
37. Ndassa, Y.M.; Orsi, C.; Marto, J.A.; Chen, S.; Ross, M.M. J. Proteome Res. 2006, 10, 2789–2799.
38. Garrels, J. J. Biol. Chem. 1979, 254, 7961–7977.
39. Leimgruber, R.M. In The Proteomics Protocols Handbook, Walker, J.M., Ed. Totowa, NJ: Humana Press, 2005; p. 4.
40. Leimgruber, R.M. Proteomics 2002, 2, 135–144.
41. de Souza, G.A.; Godoy, L.M.F.; Mann, M. Genome Biol. 2006, 7, R72.
42. Ham, B.M.; Jacob, J.T.; Cole, R.B. Anal. Bioanal. Chem. 2007, 387, 889–900.
43. Ham, B.M.; Yang, F.; Jayachandran, H.; Jaitly, N.; Monroe, M.E.; Gritsenko, M.A.; Livesay, E.A.; Zhao, R.; Purvine, S.O.; Orton, D.; Adkins, J.N.; Camp, D.G., II; Rossie, S.; Smith, R.D. J. Proteome Res. 2008, 7, 2215–2221.
44. Kelly, R.T.; Page, J.S.; Luo, Q.; Moore, R.J.; Orton, D.J.; Tang, K.; Smith, R.D. Anal. Chem. 2006, 78, 7796–7801.
45. Yi, E.C.; Marelli, M.; Lee, H.; Purvine, S.O.; Aebersold, R.; Aitchison, J.D.; Goodlett, D.R. Electrophoresis 2002, 23, 3205–3216.
46. Elias, J.E.; Gygi, S.P. Nat. Methods 2007, 4, 207–214.
47. Jaitly, N.; Monroe, M.E.; Vladislav, P.A.; Clauss, T.R.W.; Adkins, J.N.; Smith, R.D. Anal. Chem. 2006, 78, 7397–7409.
48. Harkewicz, R.; Belov, M.E.; Anderson, D.A.; Pasa-Tolic, L.; Masselon, C.D.; Prior, D.C.; Udseth, H.R.; Smith, R.D. J. Am. Soc. Mass Spectrom. 2002, 13, 144–154.
49. Gan, C.S.; Chong, P.K.; Pham, T.K.; Wright, P.C. J. Proteome Res. 2007, 6, 821–827.
50. Al-Shahrour, F.; Minguez, P.; Vaquerizas, J.M.; Conde, L.; Dopazo, J. Nucleic Acids Res. 2005, 33 (Web Server issue), W460–W464.
51. Fountoulakis, M.; Tsangaris, G.; Oh, J.; Maris, A.; Lubec, G. J. Chromatogr. A 2004, 1038, 247–265.
5
Eukaryote PTM as Phosphorylation: Perturbed State Studies

5.1 STUDY OF THE PHOSPHOPROTEOME OF HELA CELLS UNDER PERTURBED CONDITIONS BY NANO-HIGH-PERFORMANCE LIQUID CHROMATOGRAPHY (HPLC) ELECTROSPRAY IONIZATION (ESI) LINEAR ION TRAP (LTQ)-FT/MASS SPECTROMETRY (MS)

5.1.1 Introduction

The approach used to study the deoxyribonucleic acid (DNA) damage response (DDR) signaling pathways using HeLa cells involved four cellular conditions that enabled the unambiguous monitoring of protein phosphatase type 5 (PP5) targets by comparing a catalytically inactive form of PP5 to its wild-type (WT) counterpart. Figure 5.1 illustrates the major steps involved in the search for targets of PP5. Key steps include the cellular treatment with doxycycline for PP5 overexpression, DNA damage induction by bleomycin treatment, nucleic acid removal by ultracentrifugation, and nonlabeled mass spectral differential quantification followed by bioinformatic discovery treatment of the data. The nonlabeled mass spectral analysis of the treated HeLa cells resulted in a total of 122 unique phosphorylated peptides identified in biological replicate 1 (BR1) and 227 unique phosphorylated peptides identified in BR2. Spectra for all phosphopeptides and their SEQUEST identification information are included in the SpectrumLook Software Package (see Section 5.1.9) in compliance with the recent standards for the identification of phosphorylation sites.1 The 122 unique phosphorylated peptides identified in BR1 and the 227 identified in BR2 were subsequently used in the search for potential PP5 targets.
[Figure 5.1: workflow diagram. Biological replicates 1 and 2; HeLa WT PP5 and HeLa HQ PP5 cells, either untreated for 48 hrs (control PP5) or treated with doxycycline for 48 hrs (to overexpress PP5); bleomycin for 1 hr (DNA damage); cell lysis and nucleic acid removal by ultracentrifugation; in-solution digest; SPE; esterification; IMAC; phosphopeptides; nano-RP-LC ESI-MS/MS (two example mass spectra, m/z 400 to 1400); IPI SEQUEST database searching; bioinformatics.]

Figure 5.1. Major steps involved in the search for targets of PP5 including the cellular treatment with doxycycline for PP5 overexpression, DNA damage induction by bleomycin treatment, nucleic acid removal by ultracentrifugation, and nonlabeled mass spectral differential quantification followed by bioinformatic discovery treatment of the data. Methodology resulted in a total of 122 phosphorylated peptides identified in biological replicate 1, and 227 phosphorylated peptides identified in biological replicate 2, both used in the search for potential PP5 targets. (Reprinted with permission from Ham et al. J. Proteome Res. 2010, 9, 945–953. Copyright 2010 American Chemical Society.)
5.1.2 Ataxia Telangiectasia Mutated (ATM) and ATM and Rad3-Related (ATR)

The DDR is a global phosphorylation-signaling cascade process involved in sensing the damaged DNA condition and coordinating a number of responses to cope with and repair the cellular perturbed state. Large-scale phosphoproteomic studies of the DDR in cells have usually been conducted on protein kinases such as ATM and ATR, which are primary serine (Ser)/threonine (Thr) kinase proteins involved in the DNA damage repair biological process within cells, rather than on specific types of phosphatase. Global phosphoproteomic studies can give insight into overall processes involving kinase phosphorylation but are limited in targeting specific protein-signaling pathways. We performed a targeted label-free approach under cellular DNA damage conditions using LC-MS accurate mass and time (AMT) tags to evaluate changes in protein phosphorylation associated with PP5, a Ser/Thr protein phosphatase involved in the cellular DNA damage response. Analysis of two biological replicates (BRs) of bleomycin-treated, DNA-damaged HeLa cells expressing either WT PP5 or mutant inactive PP5 resulted in the identification of 122 (BR1) and 227 (BR2) unique phosphorylated peptides, which were further used in the search for potential PP5 targets. Searching against the human International Protein Index (IPI) database resulted in four potential unique target proteins of PP5. Localization and functional analysis of the phosphorylated proteins indicated a cytoskeletal-associated component for the DDR.

5.1.3 Background of Study

5.1.3.1 PP5. PP5 is a Ser/Thr protein phosphatase involved in the cellular DNA damage response.
Although both protein phosphatases and kinases work together to control cellular processes and signaling pathways,2,3 greater attention has been given in the literature to the study of signaling pathways primarily involved with protein kinases as compared with specific types of phosphatase (e.g., PP5).4–9 However, the importance of studying protein phosphatase enzymes and their targets has been demonstrated in disease states that are attributed at least in part to malfunctioning protein phosphatase enzymes.10–12 PP5 is a 58-kDa protein that is present in all eukaryotic cells, primarily localized in the nucleus or cytoplasm, with the highest levels observed in the brain.13–15 PP5 is involved in the activation of ATM and ATR, both Ser/Thr kinases involved in the DNA damage repair biological process within cells.16 Figure 5.2 illustrates the activation of the PP5 complex
[Figure 5.2: schematic. Labels: DNA damage (oxidation, IR radiation); DNA damage response; targets of PP5 required?; PP5 complex; DNA-PKcs, ATM, and ATR kinases; cell cycle arrest and DNA repair or apoptosis.]

Figure 5.2. DNA damage response mechanism and place of PP5 unknown substrates. In the DNA damage response mechanism, the targets of PP5 required for the activation of ATM and ATR kinases are not well understood. PP5 is a major part of the glucocorticoid receptor (GR)-hsp90 heterocomplex that is composed of a dimeric hsp90, a GR, and a p23 protein that stabilizes the complex and is directly involved in GR regulation.
upon DNA damage. In the DDR mechanism, the targets of PP5 required for the activation of ATM and ATR kinases are not well understood. ATM, a 370-kDa kinase, is activated by double-strand breaks in DNA, often due to chemotherapeutic drugs or ionizing radiation, while the ATR kinase is usually activated by replication-associated problems with DNA brought about by ultraviolet light and hypoxia or hydroxyurea treatment.17,18 PP5 has been shown to be activated by arachidonic acid in vitro. However, the levels needed are in excess of what is expected to be present in vivo; therefore, a natural activator of PP5 has yet to be identified.16 PP5 is a major part of the glucocorticoid receptor (GR)-hsp90 heterocomplex that is composed of a dimeric hsp90, a GR, and a p23 protein that stabilizes the complex and is directly involved in GR regulation.19,20 A possible target of PP5 is p53, a tumor suppressor protein known to induce expression of p21, and under normal conditions, p53 may be dephosphorylated by PP5, thus allowing the cell cycle to progress to the S-phase.21 Other possible roles for PP5 include maintaining the dephosphorylated state of the anaphase-promoting complex (APC), which is involved in the initiation of anaphase and exit from mitosis,22 and phosphatase-Big Potassium (BK) channel activation.16
5.1.3.2 Functions of PP5. Experiments have shown that when PP5 is absent or overexpressed (OE) in a catalytically inactive state, ATM and ATR are not activated.23,24 This may ultimately lead to cell cycle arrest or apoptosis. Okadaic acid (OA) has been used as a known inhibitor of PP5 in studies where the response of cells to DNA damage was ascertained by studying the phosphoproteome using mass spectrometric methodologies. Using the microbial toxin okadaic acid, WT PP5 was inhibited in a study of the Rac-dependent KCNH2 channel stimulation mechanism in a rat pituitary cell line (GH4C1). The signaling was restored using an engineered toxin-insensitive mutant of PP5 (Y451A).25 The results of the study demonstrated that PP5 is a direct molecular effector for activated Rac-GTP. Further studies investigating the function and targets of PP5 include recent work using T51B rat liver epithelial cells where p53 is activated using low concentrations of okadaic acid.26 In this model, the OA acts as a tumor promoter where the two phosphatase negative regulators of p53, PP2A and PP5, are activated. It was observed that for low tumor-promoting doses of OA in T51B cells, the blockade of PP5 is not required for p53 activation. This indicates that there is a specific role for PP2A in this particular cellular process.

5.1.3.3 DDR of PP5. Here, a study and methodology are reported that compare phosphoproteins observed in cells expressing WT PP5 or a catalytically inactive mutant form of PP5 in the presence of DNA damage. The HeLa Tet-ON cell line was used for the investigation of targets of WT PP5 and an OE form of PP5 (two cellular conditions). A mutant HeLa cell line (H304Q) was also used in two cellular conditions: the catalytically inactive PP5 form as a control and an OE catalytically inactive PP5 form. The four conditions for the protein expression study are illustrated in Figure 5.3.
DNA damage was induced with the addition of the glycosylated linear nonribosomal peptide antibiotic bleomycin (produced by the bacterium Streptomyces verticillus) 1 hour prior to cell lysis. In the mutant form of PP5, targets of the phosphatase enzyme are expected to be upregulated in their phosphorylation. Bleomycin will create a break in the DNA double-strand, initiating the DNA damage phosphoprotein signaling cascade event as illustrated in Figure 5.4.

5.1.4 Review of Optimized Approach to Study

5.1.4.1 Producing Cell Cultures. HeLa Tet-ON cell lines (BD Biosciences Clontech, Palo Alto, CA) stably transfected with either WT
Figure 5.3. Four cultures for phosphoprotein expression. Design of HeLa cell DNA damage response mechanism study. HeLa Tet-ON cell line was used for the investigation of targets of WT PP5 and an overexpressed (OE) form of PP5 (two cellular conditions). A mutant HeLa cell line (H304Q) was also used in two cellular conditions as the catalytically inactive PP5 form as a control and an OE catalytically inactive PP5 form.
Control: HeLa cells with WT PP5; HeLa cells with mutant PP5 H304Q.
Induced: HeLa cells with OE WT PP5; HeLa cells with OE mutant PP5 H304Q.
PP5 or catalytically inactive H304Q (MT) PP5 under inducible conditions were used for this study:
1. HeLa cells were grown in Dulbecco's modified Eagle medium (DMEM)/high-glucose media (Invitrogen, Carlsbad, CA) supplemented with 10% Tet system-approved fetal bovine serum (FBS) (Clontech, Mountain View, CA), penicillin/streptomycin (Invitrogen), 200 µg/mL geneticin (Invitrogen Gibco, Carlsbad, CA), and 200 µg/mL hygromycin B (BD Biosciences, San Jose, CA) at 37°C in 5% CO2.
2. Twenty-four hours after plating, both the WT and MT PP5 cell lines were treated with 2 µg/mL doxycycline (BD Biosciences Clontech) for 48 hours to overexpress the WT and MT PP5 or left untreated as a control.
3. After 48 hours of induction, all cells (control and OE) were treated with 12.5 µg/mL bleomycin sulfate (EMD Biosciences Calbiochem, San Diego, CA) to induce DNA damage for 1 hour at 37°C.

5.1.4.2 Protein Extraction

5.1.4.2.1 Roche Complete Lysis-M, EDTA-Free Kit. Nearly confluent 100-mm plates for MT PP5 (seven plates for control sample and eight plates for OE sample) and WT PP5 (seven plates for control sample and eight plates for OE sample) HeLa cells were extracted using the Roche Complete Lysis-M, ethylenediaminetetraacetic acid
Figure 5.4. Design of HeLa cell DNA damage response mechanism study. [Scheme: chemical structure of bleomycin → DNA degradation (strand breaks and free base release) → DNA damage initiates phosphoprotein signaling cascade event.] DNA damage was induced with the addition of the glycosylated linear nonribosomal peptide antibiotic bleomycin (produced by the bacterium Streptomyces verticillus) 1 hour prior to cell lysis. Bleomycin will create a break in the DNA double-strand, initiating the DNA damage phosphoprotein signaling cascade event.
(EDTA)-free kit (Roche Applied Science, Mannheim, Germany) according to the manufacturer's suggested guidelines:
1. Urea was added to a final concentration of 8 M to denature the proteins and halt phosphatase activity.
2. Ultracentrifugation (100,000 × g, 1 hour, 4°C) was used to remove nucleic acids prior to tryptic digestion.27
3. The proteins were reduced with dithiothreitol (DTT), and free sulfhydryl (–SH) groups were alkylated with iodoacetamide.
4. Proteins were then digested with modified trypsin at a 1:50 ratio for 4 hours at 37°C; a second trypsin digestion at a 1:50 ratio was then performed overnight at 37°C.
5. The digestions were stopped with the addition of acetic acid to an approximate pH of 3.5–4.
6. Strata carbon 18 (C18)-T columns (500 mg/3 mL) were used to desalt the tryptic digests.
7. The tryptic peptides were then converted to peptide methyl esters to prevent the nonspecific binding of free carboxyl groups to the immobilized metal affinity chromatography (IMAC) resin. The general procedure of White et al.28 was followed, modified by either a second methylation step to ensure that complete methylation had taken place or the use of thionyl chloride in place of acetyl chloride.
8. Samples were reconstituted in IMAC loading solution composed of 1:1:1 methanol/acetonitrile/0.01% acetic acid at a ratio of 100 µL for 100–200 µg of peptide.

5.1.4.3 Phosphopeptide Enrichment by IMAC. IMAC was employed for phosphopeptide enrichment.29–31 Advances and optimization in IMAC methodology have recently been summarized by Ross et al.,32 and most have previously been incorporated into our standard protocol except for the use of thionyl chloride during the methylation process. A custom-packed IMAC Macrotrap cartridge (Michrom BioResources, Inc., Auburn, CA) was used for phosphopeptide enrichment. The general methodology used in these studies was composed of the following steps:
1. Strip the column with 500 µL of 50 mM EDTA (adjusted to pH 9–10 with ammonium hydroxide) at a flow rate of 50 µL/min.
2. Wash the column with 1000 µL of nanopure water at 100 µL/min.
3. Activate the column with 375 µL of 100 mM FeCl3 at 25 µL/min.
4. Remove excess metal ions with 400 µL of 0.1% acetic acid solution at 50 µL/min.
5. Load the sample at 4 µL/min.
6. Wash with 400 µL of wash buffer composed of 100 mM NaCl, 1% acetic acid, and 25% acetonitrile at 25 µL/min.
7. Re-equilibrate the column with 300 µL of 0.01% acetic acid.
8. Elute the phosphopeptides with 250 µL of 50 mM Na2HPO4 (pH ∼8.5).
9. Immediately acidify the eluate with acetic acid to a pH of ∼4.

5.1.4.4 Reversed-Phase (RP)/Nano-HPLC Separation. Peptide mixtures from HeLa cell extracts were separated using an automated dual-column phosphoproteome nano-HPLC platform assembled in-house that has been described elsewhere.33 Briefly, all portions of the separation system that come in contact with peptide mixtures, with the exception of the autosampler syringe, are nonmetal to minimize the loss of phosphopeptides. Two pairs of solid-phase extraction (SPE) and analytical columns are used on the system during analysis. The SPE precolumns are 150-µm i.d. fused silica ∼10 cm long, packed in-house with 5-µm ODS-AQ C18 material (YMC Co., Ltd., Kyoto, Japan) to a bed length of 4 cm. The SPE precolumns are double fritted (one Kasil® potassium silicate, PQ Corporation, Valley Forge, PA, chemical frit at each end) because the SPE columns are backwashed directly after sample loading and prior to analytical column separation. The two analytical separation columns are composed of 50-µm i.d. fused silica (Polymicro Technologies Inc., Phoenix, AZ), 40 cm long, packed in-house with 5-µm ODS-AQ C18 RP material. The tips coupled to the columns for electrospray are 10-µm i.d. open tubular fused silica that have been etched with hydrofluoric acid (HF) for uniform tip bevel and opening.34 The SPE precolumn and tips are connected to the analytical column using PicoClear unions (New Objective, Inc., Woburn, MA).
An in-house constructed rack assembly supports the valve and column system and was fitted to a PAL autosampler (Leap Technologies, Carrboro, NC) for automated sample loading and analysis. The HPLC mobile phases were composed of 0.1 M acetic acid in nanopure water (A), and 70% acetonitrile/0.1 M acetic acid in nanopure water (B). The system was equilibrated at 1000 psi for 20 minutes with 100% mobile phase A. Next, an exponential gradient was created by valve switching from pump A to B, which displaced mobile phase A in the mixer with mobile phase B. The gradient was controlled by the
split flow (∼9 µL/min) under constant pressure conditions. The final composition of mobile phase B was approximately 70% by the end of the HPLC run (180 minutes).

5.1.4.5 LTQ-FT/MS/MS. A linear ion trap/Fourier transform hybrid mass spectrometer (Thermo Electron Corp., Bremen, Germany) was used for both peptide quantitative data set collection and product ion spectral data set collection. For the label-free quantitative measurement, the FT was scanned at a resolution of 100,000 for a mass range of m/z 400–2000. A total of six technical replicates each for quantitation were collected for the OE WT PP5 and the OE MT PP5 conditions. A total of three technical replicates were collected for the WT PP5 control condition, and a total of four replicates were collected for the MT PP5 control condition for quantitation. For peptide fragmentation and sequencing, data sets were collected for the top 10 most abundant species in the inclusion list after each high-resolution MS scan by the FT mass spectrometer (100,000 resolution and mass scan range of m/z 400–2000). Data sets were also collected with high mass accuracy precursor FT scans (100,000 resolution), data-dependent MS/MS of the top five peptides, followed by MS3 of the neutral loss peak in the MS2 scan that was correlated with a precursor peak loss associated with phosphorylation (i.e., a neutral loss of 32.7 Da [+3], 49.0 Da [+2], or 98.0 Da [+1]). To enhance the identification of the phosphorylated peptides associated with the OE WT PP5 condition and the OE mutant PP5 condition, top five data-dependent MS2 coupled with MS3 phospho neutral loss scanning data sets were collected with the further addition of gas-phase fractionation (GPF)35 within the mass spectrometer. This essentially entails scanning shorter, predefined m/z ranges such as m/z 300–850 and m/z 750–1575, both with the precursor scan at 100,000 resolution.
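The neutral-loss values quoted above for MS3 triggering (32.7, 49.0, and 98.0) are what one obtains by dividing the mass of phosphoric acid by the precursor charge state. The short Python sketch below reproduces that arithmetic and shows one way a neutral-loss peak might be picked from an MS2 peak list. The function names, the 0.5 m/z tolerance, and the peak-list layout are illustrative assumptions and are not taken from the instrument software used in the study.

```python
# Monoisotopic mass of H3PO4 (Da), the neutral typically lost from
# phosphoserine/phosphothreonine peptides during fragmentation.
H3PO4_MASS = 97.9769


def neutral_loss_mz(charge):
    """m/z shift observed in MS2 when a precursor of this charge loses H3PO4."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return H3PO4_MASS / charge


def find_neutral_loss_peak(precursor_mz, charge, ms2_peaks, tol=0.5):
    """Return the most intense (mz, intensity) peak near precursor_mz minus the
    neutral loss, or None. `tol` (in m/z units) is an assumed tolerance."""
    target = precursor_mz - neutral_loss_mz(charge)
    hits = [p for p in ms2_peaks if abs(p[0] - target) <= tol]
    return max(hits, key=lambda p: p[1]) if hits else None


if __name__ == "__main__":
    for z in (1, 2, 3):
        print(f"+{z}: neutral loss of {neutral_loss_mz(z):.1f} m/z")
```

Tabulating the loss per charge state in this way makes it easy to extend the trigger list to higher charge states (for example +4) if they are expected in the sample.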
5.1.4.6 Protein Identification and False Discovery Rate (FDR) Determination. All results collected from LC-MS/MS analyses were searched by SEQUEST as fully tryptic with static methylation on D, E, and the C-terminus of the peptides in conjunction with dynamic phosphorylation of S, T, and Y residues, and a cutoff of ±2.5 Da for the precursor masses. For an FDR of ≤5%, the following filtering criteria were applied: DelCn2 ≥ 0.13; +1 CS XCorr ≥ 1.4; +2 CS XCorr ≥ 2.4; +3 CS XCorr ≥ 3.3; +4 CS XCorr ≥ 3.3. The identified phosphorylated peptides were also constrained to a precursor mass error of ≤6.5 ppm. The human IPI database (version 3.20 containing 61,225 protein entries, available at http://www.ebi.ac.uk/IPI) was
searched for protein identification. For FDR determination, the IPI database was searched using a decoy database where the reversed human IPI was appended to the forward database and included in the SEQUEST search. The error rate (FDR) was estimated from the forward and reverse (decoy) filtered matches. The FDR is calculated as the percentage of false positives relative to the total number of identified phosphorylated peptides.36

5.1.4.7 Phosphopeptide Quantitative Differential Comparison. The in-house developed programs "Decon-2LS," "MultiAlign," and "DAnTE" were used to process the mass spectral data of the nonlabeled quantitative comparison of the WT and mutant PP5 phosphoproteome of the HeLa cells (available in downloadable software packages at http://ncrr.pnl.gov/software). The program Decon-2LS is used to extract "features" (peptide mass and retention time) from the deisotoped data sets. The SEQUEST search results were then used to construct an AMT database. Using the filtering cutoff values listed earlier for an FDR ≤ 5%, the identified phosphorylated peptides were aligned using the program MultiAlign, which allows the comparison of the same phosphorylated peptide measured in multiple data sets. The identified and aligned phosphorylated peptides are then response-normalized using the DAnTE program. The filtered and identified phosphorylated peptides obtained from the experiments are log (base 2)-transformed and response-normalized separately as BR1 and BR2. For each BR, this entails the application of an initial linear regression method that tries to fit a regression line for each data set within a selected factor (e.g., replicate) against a reference data set chosen as the one with the least amount of missing data. The data sets then undergo a median absolute deviation (MAD) adjustment so that all of the data sets will have the same spread of abundance intensity.
This is achieved by choosing a suitable scaling factor for each data set to adjust its abundance values. Finally, the data sets undergo mean centering, where the mean of each data set (in a column) is subtracted from each of its values, so that the resulting data sets have centered means, and all means are set to the maximum across all the data sets.

5.1.4.8 Data Set Peak Matching and Alignment. The algorithms used in the MultiAlign software are an extension of the recently reported methodology of Jaitly et al.,37 allowing a direct comparison of the data from multiple nano-RP-LC-MS experiments. The differentially expressed phosphorylated peptides are measured using mass spectrometric techniques and are subsequently listed as potential
Figure 5.5. Cluster chart of the alignment of 23 data sets obtained from nano-RP-LC-MS experiments of the mutant PP5 condition versus the WT PP5 condition in biological replicate 1. Monoisotopic masses and their associated abundances, called "features," are aligned to a base file, generating a normalized elution time (NET) and mass-corrected precursor peptide. Cluster chart regions can be expanded to look for associated features.
targets of the PP5 phosphatase enzyme. Figure 5.5 is a cluster chart illustrating the alignment of the 23 data sets obtained by nano-RP-LC-MS experiments of the mutant PP5 condition versus the WT PP5 condition in BR1. The cluster chart contains the unique mass classes (UMCs) that have been extracted from the deisotoped data sets (pek files). A UMC represents a cumulative feature from an LC-MS analysis where a single peptide's mass is detected and often spread over a number of mass spectral scans. In BR1, there were 8147 features observed in the 23 data sets. The log of the intensities is normalized to a selected baseline file, and a plot is drawn of the log intensity of the aligned data sets (y-axis) against the log intensity of the baseline file (x-axis). A normalized cluster (log) intensity plot of the aligned data sets is illustrated in Figure 5.6a. Figure 5.6b is an intensity ratio histogram of the 23 aligned data sets. The intensity patterns observed can be used to identify data sets that have suffered significant changes in associated abundances. The 23 data sets used in the study exhibit similar behavior in the abundances of their cluster populations, indicating that repeatable analyses were obtained during the nano-RP-LC-MS experiments. The MultiAlign peak matching resulted in 4141 peptides identified that equated to 1400 phosphorylated peptides. BR2 was treated in an identical fashion, resulting in the identification of 1785
Figure 5.6. (a) Normalized cluster (log) intensity plot of the 23 aligned data sets obtained from biological replicate 1. (b) Intensity ratio histogram of the 23 aligned data sets. The 23 data sets used in the study exhibit similar behavior in the abundances of their cluster populations, indicating that repeatable analyses were obtained during the nano-RP-LC-MS experiments.
phosphorylated peptides. The 1400 phosphorylated peptides identified in BR1 and the 1785 identified in BR2 were used for further processing in the search for potential PP5 targets.

5.1.4.9 Phosphopeptide Response Normalization. The filtered and identified phosphorylated peptides obtained from the experiments are log (base 2)-transformed and response-normalized separately as BR1 and BR2. For each BR, this entails the application of an initial linear regression method that tries to fit a regression line for each data set within a selected factor (e.g., replicate) against a reference data set chosen as the one with the least amount of missing data. The data sets then undergo a MAD adjustment so that all of the data sets will have the same spread of abundance intensity. This is achieved by choosing a suitable scaling factor for each data set to adjust its abundance values. The scaling factor for the ith data set can be obtained as
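$$\frac{\mathrm{MAD}_i}{\sqrt[I]{\prod_{k=1}^{I} \mathrm{MAD}_k}}, \tag{5.1}$$

where

$$\mathrm{MAD}_i = \operatorname{median}_j \left\{ \left| y_{ij} - \operatorname{median}_j ( y_{ij} ) \right| \right\}. \tag{5.2}$$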
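As a minimal sketch of Equations 5.1 and 5.2, the Python below computes the MAD of each data set and the corresponding scaling factor (the MAD divided by the geometric mean of all MADs). The function names and the example values are hypothetical. Dividing each data set's deviations from its median by the factor is one plausible way to apply it and is an assumption here, not a description of the DAnTE source code.

```python
import math
from statistics import median


def mad(values):
    """Median absolute deviation of one data set (Eq. 5.2)."""
    center = median(values)
    return median([abs(v - center) for v in values])


def mad_scaling_factors(datasets):
    """Scaling factor per data set (Eq. 5.1): MAD_i over the geometric mean of all MADs."""
    mads = [mad(d) for d in datasets]
    if any(m == 0 for m in mads):
        raise ValueError("a data set with zero MAD cannot be scaled")
    geo_mean = math.exp(sum(math.log(m) for m in mads) / len(mads))
    return [m / geo_mean for m in mads]


def mad_adjust(datasets):
    """Rescale each data set about its own median so that all share one MAD.
    Dividing by the factor is an assumed way of applying Eq. 5.1."""
    adjusted = []
    for data, factor in zip(datasets, mad_scaling_factors(datasets)):
        center = median(data)
        adjusted.append([center + (v - center) / factor for v in data])
    return adjusted


if __name__ == "__main__":
    # Hypothetical log2 abundances for two replicate data sets.
    runs = [[20.0, 21.0, 22.0, 23.0, 24.0], [18.0, 20.0, 22.0, 24.0, 26.0]]
    print(mad_scaling_factors(runs))
    print([mad(r) for r in mad_adjust(runs)])
```

Using the geometric mean as the denominator means the factors multiply to one, so the adjustment equalizes the spreads without changing their overall scale.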
Finally, the data sets undergo mean centering, where the mean of each data set (in a column) is subtracted from each of its values, so that the resulting data sets have centered means, and all means are set to the maximum across all the data sets. Figure 5.7 illustrates the normalization of the data sets. In Figure 5.7a, the box plots are of the log-transformed data sets prior to the normalization processes. The normalized data sets are illustrated in the box plot of Figure 5.7b, where the data sets are well-centered according to their means.

5.1.5 Phosphoproteome Gene Ontology (GO) Comparison

The Babelomics38 bioinformatics suite of tools was used to compare the cellular component, intracellular association, and molecular functions of the phosphorylated proteins identified from the WT and mutant PP5 containing HeLa cells with DNA damage. This comparison is based on the entire complement of identified phosphorylated proteins in each BR (122 and 227 unique phosphorylated peptides, respectively). The comparisons of the identified phosphoproteins in BR1 with BR2
Figure 5.7. Box plots illustrating the normalization of the data sets in biological replicate 1. (a) Box plots of the log-transformed data sets prior to the normalization processes. (b) Box plots of the normalized data sets illustrating well-centered means. [Box plots; x-axis: individual data sets (.pek files) grouped as HQ, HQOE, WT, and WTOE.]
showed extensive similarity between the two replicates and are illustrated in Figure 5.8.

5.1.5.1 GO Cellular Component. The GO cellular component percentages of the identified phosphoproteins (Fig. 5.8a) included 100% and 94% intracellular associated (BR1 and BR2, respectively), 20% and 17% membrane, and the remainder at 20% and 14% associated with organelle lumen and cell fraction. A deeper look at the GO cellular component association (Fig. 5.8b) reveals 68% and 73% intracellular membrane-bound organelle, and 32% and 33% intracellular nonmembrane-bound organelle. This is followed by 53% and 66% nucleus, and 16% and 21% cytoskeleton. The association of the phosphorylated proteins observed with the nucleus is expected due to the induced DNA damage of the cells in the study; however, the GO association of 20% and 16% actin cytoskeleton, 10% and 16% microtubule cytoskeleton, and 30% and 32% cytoskeletal part was not. The cytoskeleton association was observed in the differential study with the phosphorylation of the lamin-A/C cytoskeletal protein (see Tables 5.1 and 5.2). The highest associations observed for the molecular function (Fig. 5.8c) were similar for the two replicates with 81% and 62% protein binding, 38% and 34% nucleic acid binding, 24% and 25% nucleotide binding, and 10% and 20% ion binding.

5.1.6 Potential Regulated Target Proteins of PP5

Upregulated species in the mutant condition as compared with the WT condition can be considered possible targets of PP5, as the catalytic activity of PP5 has been inhibited in the mutant condition. The PP5 enzyme is a minor enzyme in the PP-type family; thus, a BR was included in the study for increased target identification confidence.

5.1.6.1 Analysis of Variance (ANOVA). The aligned and normalized phosphopeptides for each BR underwent ANOVA for statistical testing of the differences in phosphopeptide response associated with the conditions of the study, filtered for a probability factor P ≤ 0.05.
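As a rough illustration of the kind of test described above, the sketch below computes a one-way ANOVA F statistic for a single phosphopeptide across study conditions. The function name, the plain one-way design, and the example log2 responses are assumptions made for illustration; the study's ANOVA was run in DAnTE, whose exact model is not detailed here. Converting F into a P value needs the F distribution (for example, the survival function `scipy.stats.f.sf`), which is left out so the sketch has no third-party dependencies.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA.
    `groups` is a list of lists, one list of replicate responses per condition."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = mean([v for g in groups for v in g])
    # Variation of the condition means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the replicates around their own condition mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


if __name__ == "__main__":
    # Hypothetical normalized log2 responses of one phosphopeptide.
    responses = {
        "WT Cntl": [24.1, 24.3, 23.9],
        "H304Q Cntl": [24.8, 25.1, 24.9, 25.0],
        "WT OE": [24.0, 24.2, 23.8, 24.1, 24.3, 23.9],
        "H304Q OE": [23.1, 23.4, 22.9, 23.2, 23.0, 23.3],
    }
    f_stat, df_b, df_w = one_way_anova_f(list(responses.values()))
    print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

A large F relative to the critical value for the given degrees of freedom corresponds to a small P value, which is the quantity compared against the 0.05 cutoff.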
For all of the processing steps of this study, the phosphopeptides for each BR were ANOVA-tested separately. In BR1, a total of 32 unique phosphopeptides were observed to possess a statistically significant difference in their response according to the conditions of the study (includes both upregulation and downregulation). In BR2, a total of 56 unique phosphopeptides were observed to possess a statistically significant response difference according to study conditions (includes both
Figure 5.8. Comparison of the (a) cellular component, (b) cellular component (extended, intracellular association), and (c) molecular functions of the phosphorylated proteins identified from the DNA-damaged HeLa cells containing WT and mutant forms of PP5. [Bar charts; y-axis: Percent (%); paired bars for BR1 and BR2.]
AQT*PPGPSLSGSKS* PCPQEK VKAQT*PPGPSLSGSKS* PCPQEK
00099730
−2.07 −1.98
−2.39 −0.46
0.61
1.05 −1.11
−1.42 −2.09
0.52 −0.36
0.82 0.12
0.02
−1.61
0.83
0.37
−2.19
OEa
Cntla
BR2 OEa
Cntla
BR1
Y box-binding protein 1 (YB-1) Lamin A/C Apoptotic chromatin condensation inducer in the nucleus (acinus) Splicing coactivator SRm300
Protein
T1003,6,57,59 S101457
S226,57–61 S384,59 S3886,59,62
Y162, S1763
54–56
Residue (Previously Identified)
Yes44
Yes38 Yes37,39,40
Yes
34–36
Involved in DDR
Down
Up Down
Down
Regulation
Reprinted with permission from Ham et al. J. Proteome Res. 2010, 9, 945–953. Copyright 2010 American Chemical Society. a The values in each of the columns BR1 Cntl, BR1 OE, BR2 Cntl, and BR2 OE represent the difference of the (log base 2) average response between PP5 (H304Q) and WT PP5 (i.e., average PP5 [H304Q] response − average WT PP5 response). If the average difference (within each biological replicate) increases going from the control (Cntl) to the overexpressed (OE) condition, then the regulation is up; if the average difference decreases from the control to the OE condition, then the regulation is down. DDR, DNA damage response.
00021405 00007334
NYQQNY* QNSESGEKNEGSES* APEGQAQQR SGAQASSTPLS*PTR S*KSPS*PPRLTEDR
Peptide
00031812
Reference (IPI)
TABLE 5.1. Overlap of Differentially Expressed Phosphoproteins between Biological Replicates (BR) 1 and 2, with Identical Phosphopeptides
[SGT]PPRQGSIT* SPQANEQ[SVT]PQRR [SGT]PPRQGSITSPQANEQ [SVT]PQRR L[SS]LRAS*TSKSESSQK RL[SS]LRAS*TSKSESSQK L[SS]LRAS*TSKS*ESSQK RLS*S*LRAS*TSKSESSQK RLS*S*LRASTSK RLS*S*LRASTSKSESSQK SGPKPFS*APKPQTSPS*PK SGPKPFSAPKPQT*SPSPK
b
Peptide
0.52 −0.91 0.77
−2.28
−1.61
−1.16
OEa
BR1
2.35 1.33 1.36
1.24
Cntla
BR1
−5.19 −3.20 −2.87 0.95
−0.97
1.10
OEa
BR2
−2.06 −1.35 −0.53
1.25
Cntla
BR2
Adenylyl cyclaseassociated protein 1
40S ribosomal protein S6
Splicing coactivator SRm300
Protein
S310 (No) T307 (No)
S235,3,59,62 S236,3,59,62 S240,43,59,62 S24443,59
6
S846, T848, T856 (No), T8666,60
6
Residue (Previously Identified)
No
Yes 41–43
Yes
44
Protein Involved in DDR
Down Down Down Down Down Down Down Up
Down
Down
Regulation
Reprinted with permission from Ham et al. J. Proteome Res. 2010, 9, 945–953. Copyright 2010 American Chemical Society. a The values in each of the columns BR1 Cntl, BR1 OE, BR2 Cntl, and BR2 OE represent the difference of the (log base 2) average response between PP5 (H304Q) and WT PP5 (i.e., average PP5 [H304Q] response − average WT PP5 response). If the average difference (within each biological replicate) increases going from the control (Cntl) to the overexpressed (OE) condition, then the regulation is up; if the average difference decreases from the control to the OE condition, then the regulation is down. b Residues placed within brackets represent ambiguity in phosphorylation site assignment. DDR, DNA damage response.
00008274
00021840
00099730
Reference (IPI)
TABLE 5.2. Overlap of Differentially Expressed Phosphoproteins between Biological Replicates (BR) 1 and 2, with Similar but Not Identical Peptides
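The footnote to Tables 5.1 and 5.2 defines each tabulated value as a difference of log2 average responses (mutant minus WT) and assigns "up" or "down" regulation from how that difference changes between the control and OE conditions. A minimal Python sketch of that rule follows. The function names, the "Unchanged" label for a tie, and the example numbers are illustrative assumptions and do not come from the study's data.

```python
from statistics import mean


def log2_difference(mutant_log2, wt_log2):
    """Average PP5 (H304Q) response minus average WT PP5 response.
    Both inputs are replicate responses that are already log2-transformed."""
    return mean(mutant_log2) - mean(wt_log2)


def regulation(cntl_diff, oe_diff):
    """Direction of regulation going from the control to the OE condition."""
    if oe_diff > cntl_diff:
        return "Up"
    if oe_diff < cntl_diff:
        return "Down"
    return "Unchanged"  # tie case; not covered by the footnote, added for completeness


if __name__ == "__main__":
    # Hypothetical log2 responses for one phosphopeptide in one biological replicate.
    cntl = log2_difference([24.9, 25.1], [24.0, 24.2])
    oe = log2_difference([22.8, 23.0], [24.1, 24.3])
    print(round(cntl, 2), round(oe, 2), regulation(cntl, oe))
```

Because the values are log2 differences, a difference of 1.0 corresponds to a twofold change in the average response between the mutant and WT conditions.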
up- and downregulation). Of these two sets of differentially expressed phosphopeptides, a total of four unique phosphopeptides were observed to overlap between the two BRs. Searching against the human IPI database resulted in four unique potential target proteins of PP5.

5.1.6.2 Four Potential Target Proteins. All four of the potential target proteins of PP5 are known to be involved in the DDR mechanism. All four potential target proteins were observed in each of the four sets of comparisons, where the numeric values represent the log2 average response difference (i.e., a negative value represents downregulation, while a positive value represents upregulation).

5.1.6.2.1 Nuclease-Sensitive Element-Binding Protein 1. The nuclease-sensitive element-binding protein 1 phosphorylated peptide, NYQQNY*QNSESGEKNEGSES*APEGQAQQR, IPI00021405.3, identified in the study contained phosphorylation on two known sites: the tyrosine (Tyr)162 residue39 and the Ser176 residue.3 This protein has been identified previously as having a possible role in DNA repair,40 and it has been reported to shuttle between the cytoplasm (localization primarily in the nonphosphorylated state) and the nucleus (translocation to the nucleus in the phosphorylated state) after DNA damage.37 Its previously known involvement in the DDR mechanism makes nuclease-sensitive element-binding protein 1 a strong PP5 target candidate.

5.1.6.2.2 Cytoskeletal-Associated Protein Lamin A/C. Of special note is the observance of the phosphorylation of the cytoskeletal-associated protein lamin A/C (SGAQASSTPLS*PTR, IPI00514204.3).
The phosphorylated peptide observed in both BRs was identified with phosphorylation on the Ser23 residue, which has also previously been reported in a phosphorylated state.3 Cytoskeletal proteins may play a part in signal transduction and intracellular transport through the cytoplasm to the nucleus.41 A potential target of PP5, p53, a tumor suppressor protein, has been shown to be transported toward the nucleus by the microtubule-cytoskeleton network as a response to DNA damage.42 PP5 has also been shown to participate in the nucleocytoplasmic shuttling of the GR that makes up part of the PP5 heterocomplex.43 The observance of differentially expressed cytoskeletal phosphoproteins with DNA damage may suggest a signaling pathway link between DNA damage and cytoskeletal protein phosphorylation. Furthermore, the observance of the phosphorylation event with the cytoskeletal protein suggests a link between the action of the PP5 heterocomplex and these proteins.
5.1.6.2.3 Apoptotic Chromatin Condensation Inducer. The next protein that contains phosphorylation is the apoptotic chromatin condensation inducer in the nucleus protein (IPI00007334.1), which is a caspase-3-activated protein that is required for apoptotic chromatin condensation.44 There are two phosphorylation sites identified with the peptide S*KSPS*PPRLTEDR on the Ser384 and Ser388 residues, both previously reported phosphorylation sites.3,5 Caspase-3 is a known enzyme in the p53 DDR mechanism pathway and checkpoint system.45

5.1.6.2.4 Ser/Thr Protein Kinase KKIALRE-Like 1. The last protein identified as a strong potential target of PP5 is the Ser/Thr protein kinase KKIALRE-like 1 (cyclin-dependent kinase-like 1, CDKL1). The peptide identified, KT*LGDLIPR, has one phosphorylation site on the Thr889 residue, which is located in the Pkinase domain spanning amino acid residues 636 to 946.46 In contrast, it has been reported that phosphorylation in the activation loop on Thr161 by Cdk-activating kinase (Cak) activates Cdk1.47,48 The kinase Cdk1 is a known component involved in the G2/M DNA damage checkpoint mechanism.49 The mitotic process is placed into arrest when DNA double-strand breaks are detected either during DNA synthesis or repair. Upon successful DNA synthesis or repair, Cdk1 kinase activity is inhibited by phosphorylation of the Thr14 and Tyr15 residues by the kinases Wee-1 and Myt1.50,51 Interestingly, the bleomycin-induced DDR mechanism involves proteins that are also associated with the mitosis process.
In mitosis, Cdk1 is involved in the phosphorylation of lamins leading to nuclear membrane breakdown,52 microtubule-associated proteins involved in the assembly of the mitotic spindle and centrosome separation,53 and lastly, chromosome condensation proteins.54

5.1.7 GO Differential Comparison

To investigate the localization (cellular component) and molecular function of the phosphorylated proteins with differential regulation due to the mutant PP5 condition, GO terms were determined for the entire complement of statistically significant regulated phosphorylated proteins for each of the BRs using the Blast2GO55 suite of bioinformatic programs. For each BR, a multivariate ANOVA analysis was performed on the four study conditions (i.e., WT Cntl, H304Q Cntl, WT OE, and H304Q OE) to ascertain which phosphorylated peptides' mass spectral responses indicated statistically significant differences (P ≤ 0.05) in regulation. Using this subset of phosphorylated peptides, the cellular component localizations were observed to be similar
between the two BRs, with BR2 being slightly more intricate due to the greater number of study identifications (e.g., 33 phosphopeptides in BR1 vs. 56 in BR2).

5.1.7.1 GO Cellular Component. The cellular component distributions of the differentially expressed phosphorylated peptides after ANOVA analysis are similar to those illustrated in Figure 5.8 (entire complement of phosphorylated peptides). Figure 5.9 illustrates the cellular component distribution of the phosphorylated peptides after ANOVA analysis for BR2. In Figure 5.9a, a third level of localization is presented, where the localization observed is very similar to that illustrated in Figure 5.8 for the entire phosphorylated proteome measured in the study. A very interesting result was a comparison of the sixth level of the cellular component, where numerous cytoskeleton-associated localizations are observed. A pie chart of the sixth level of the cellular component localization of ANOVA-treated phosphorylated peptides in BR2 is shown in Figure 5.9b. Observed are numerous cytoskeleton-associated localizations (54% in total), such as the 6% cortical, 18% microtubule, 18% actin, and 12% intermediate filament.

5.1.7.2 Influence of Classes or Categories of Proteins. The overlap of statistically significant, differentially expressed phosphorylated proteins between BRs can be influenced by the phosphorylation of multiple components of the signaling pathways rather than that of specific key individual proteins. Thus, classes or categories of proteins may be observed to differentiate under study conditions instead of specific proteins. This was observed by Matsuoka et al.56 in an ATM and ATR substrate analysis responding to DNA damage induced by irradiation, for example, the molecular function of binding. The difference associated with the cytoplasm may be due to PP5's primary localization in the cytoplasm.

5.1.7.3 Molecular Function Interacting Modules. The molecular function interacting modules were derived for the two BRs and observed to have similar interactions. Figure 5.10 illustrates the molecular function interacting module for BR2 broken into three sections for a total of five primary nodes. Each node in the figure contains a descriptive GO term. In Figure 5.10a, there are three primary nodes presented as signal transducer activity, catalytic activity, and transcription activity. Figure 5.10b contains the most extensive interactive module composed of binding. The binding includes nucleic acid, protein,
[Figure 5.9 pie charts. Slice labels as extracted:
(a) 12% membrane-bound organelle; 2% cell projection; 5% membrane; 2% organelle lumen; 18% intracellular part; 6% non-membrane-bound organelle; 8% intracellular organelle part; 4% protein complex; 2% membrane part; 5% ribonucleoprotein complex; 2% cell fraction; 14% intracellular organelle; 3% organelle membrane.
(b) 6% cortical cytoskeleton; 12% microtubule cytoskeleton; 6% cytoplasmic microtubule; 18% intracellular; 6% microsome; 12% intermediate filament cytoskeleton; 6% cell-substrate junction; 6% pigment granule; 18% actin cytoskeleton; 6% stress fiber; 6% nuclear body; 6% nuclear replication fork; 6% nuclear DNA-directed RNA polymerase complex; 6% adherens junction.]
Figure 5.9. (a) Third level of the cellular component localization of ANOVA-treated phosphorylated peptides in biological replicate 2. (b) Sixth level of the cellular component localization of ANOVA-treated phosphorylated peptides in biological replicate 2. Observed are numerous cytoskeleton-associated localizations (54% in total), such as the 6% cortical, 18% microtubule, 18% actin, and 12% intermediate filament.
[Figure 5.10 GO term graphs, each rooted at "molecular function". Node labels as extracted:
(a) signal transducer activity; transcription regulator activity; catalytic activity; kinase activity; nucleotidetriphosphatase activity; protein kinase activity; GTPase activity; transcription cofactor activity; transcription elongation factor activity; transcription corepressor activity; protein Ser/Thr kinase activity.
(b) binding; nucleic acid binding; protein binding; DNA binding; RNA binding; cytoskeletal protein binding; rRNA binding; actin binding; tubulin binding; microtubule binding; nucleotide binding; purine nucleotide binding; guanyl nucleotide binding; adenyl nucleotide binding; GTP binding; ATP binding.
(c) structural molecule activity; binding; ion binding; metal ion binding; magnesium ion binding; cation binding; structural constituent of ribosome; transition metal ion binding; zinc ion binding.]
Figure 5.10. Molecular function interacting modules for the ANOVA-treated phosphorylated peptides in biological replicate 2 broken into three sections for a total of five primary nodes. (a) Three primary nodes as signal transducer activity, catalytic activity, and transcription activity. (b) Most extensive interactive module composed of nucleic acid, protein, and nucleotide binding. (c) Ion binding and structural molecule activity.
nucleotide, and finally, ion (included as part of Fig. 5.10c). The module in Figure 5.10c continues the binding interactions and also includes structural activity. The interactions illustrated in Figure 5.10a are those typically observed in a DDR study, where signal transduction and catalytic activity are present. However, the binding and structural activities in Figure 5.10b, c indicate the presence of other DDR processes. PP5 is primarily a cytoplasmic protein; thus, we propose that an effect similar to that observed with p53 is taking place with PP5: upon DNA damage, cytoskeletal protein signaling events and binding interactions take place to move PP5 from the cytoplasm to the nucleus.

5.1.7.3.1 Validation of Target Phosphorylation Sites. In cells overexpressing PP5 (H304Q), the phosphorylation of PP5 substrates is predicted to be upregulated. In addition, the phosphorylation of targets indirectly influenced by PP5 activity should also be observed; in these cases, target sites may increase or decrease. Because WT and inactive PP5 were OE for 48 hours, it is also possible that the expression, rather than the phosphorylation, of the candidate targets identified here was altered. This scenario would also constitute indirect regulation since PP5 influences several pathways that modify protein expression.57 In the case of ribosomal S6, we confirmed the LC-MS finding that phosphorylation of S235 and/or S236 was decreased in cells overexpressing PP5 (H304Q) using quantitative Western blot analysis with a phospho site-specific antibody together with an antibody recognizing total S6 (Fig. 5.11 and Table 5.2). Phospho-specific antibodies are not available for analyzing the other target proteins. Nevertheless, for two other putative PP5 targets, we documented that changes in protein expression do not
[Figure 5.11 Western blot panels. Upper panel: total S6 with cadherin. Lower panel: phospho-S6 with cadherin. Lanes in each panel: control WT, control H304Q, induced WT, induced H304Q.]
Figure 5.11. Phosphorylation of S6 Ser 235/236 in bleomycin-treated HeLa cells overexpressing PP5 (H304Q) or WT PP5. HeLa cells expressing control or induced WT PP5 or PP5 (H304Q) were treated with bleomycin, then lysates were prepared and subjected to Western blot analysis for phospho-Ser 235/236 S6 or total S6. Cadherin was monitored as the loading control. The log base 2 value of the difference of control PP5 (H304Q) response–WT PP5 response was 1.842 and for overexpressed PP5 (H304Q)–WT PP5 was 0.580 in the blots shown. Western blot analyses performed for two independent biological replicates yielded similar results.
[Figure 5.12 Western blot panel: YB-1 with GAPDH. Lanes: control WT, control H304Q, induced WT, induced H304Q.]
Figure 5.12. YB-1 levels in bleomycin-treated cells overexpressing PP5 (H304Q) relative to cells overexpressing WT PP5. HeLa cells expressing control or induced WT PP5 or PP5 (H304Q) were treated with bleomycin, then lysates were prepared and subjected to Western blot analysis for YB-1. Glyceraldehyde 3-phosphate dehydrogenase (GAPDH) was monitored as the loading control. Western blot analyses were performed twice with similar results.
account for the observed changes in phosphopeptide levels. Quantitative Western blot analysis shows that levels of Y box-binding protein 1 (YB-1), a nuclear protein involved in translational regulation, remain unchanged in the conditions represented by the four different samples (Fig. 5.12). In the case of SRm300, we also identified phosphopeptides whose levels did not differ between cells overexpressing WT PP5 and PP5 (H304Q), which is consistent with the conclusion that phosphorylation of a subset of sites, rather than protein expression, was selectively altered by PP5 (H304Q). For all overlapping peptides found to be differentially regulated in both BRs, the trend in regulation was consistent.

5.1.7.3.2 Preferential Changes in Proline (Pro)-Directed Phosphorylation as a Function of PP5 Activity during DNA Damage. We also performed analyses to explore whether PP5 preferentially dephosphorylates certain kinase consensus sequences. Figure 5.13 compares phosphorylation sites identified in HeLa cells that express only native PP5 (ref. 33) with phosphorylated peptides that pass the ANOVA test across the four conditions (the average of BR1 and BR2) for 11 distinct kinase phosphorylation motifs. The percentage of the phosphorylation sites appears similar for each of the motifs with the exception of Pro-directed (p[ST]P) and casein kinase II (p[ST]XX[DE]) signature sites. Among the differentially abundant phosphopeptides representing the six PP5 candidates, over 50% of the serine/threonine (S/T) phosphorylation sites were Pro-directed compared with 25% in an untreated extract prepared in a similar manner from HeLa cells expressing only native PP5.33 This finding suggests that targets for Pro-directed S/T kinases such as cyclin-dependent protein kinases, mitogen-activated protein (MAP) kinases, and GSK3 kinase were selectively altered by changing PP5 activity, which is consistent with the established involvement of these
[Figure 5.13 bar chart. Vertical axis: % of total phosphorylation sites (0.0 to 30.0). Series: HeLa cell phosphoproteome; average of BR1 and BR2 passing ANOVA.]
Figure 5.13. A comparison of 11 distinct kinase phosphorylation motifs, correlating phosphorylation sites identified in HeLa cells expressing only native PP5 and in phosphorylated peptides passing the ANOVA test across all four PP5 conditions (the average of biological replicates 1 and 2).
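The motif comparison summarized in Figure 5.13 can be sketched in code. The minimal Python example below assumes that each phosphosite is supplied as a short sequence window in which a lowercase "p" precedes the phosphorylated residue (a notation chosen here for illustration). It tests only three motifs named in the text: Pro-directed (p[ST]P), casein kinase II (p[ST]XX[DE]), and the SQ/TQ sites preferred by ATM and related kinases. The function names and the peptide windows are hypothetical and are not taken from the study.

```python
import re
from collections import Counter

# Motif patterns written relative to the phosphorylated residue.
# "p" marks the phosphosite; "." stands for any residue (the X in the text).
MOTIFS = {
    "Pro-directed p[ST]P": re.compile(r"p[ST]P"),
    "Casein kinase II p[ST]XX[DE]": re.compile(r"p[ST]..[DE]"),
    "ATM/ATR p[ST]Q": re.compile(r"p[ST]Q"),
}

def classify_site(window):
    """Return the names of all motifs matched by one phosphosite window."""
    return [name for name, pattern in MOTIFS.items() if pattern.search(window)]

def motif_percentages(windows):
    """Percentage of sites matching each motif (a site may match several)."""
    counts = Counter()
    for window in windows:
        counts.update(classify_site(window))
    total = len(windows)
    return {name: 100.0 * counts[name] / total for name in MOTIFS}

# Hypothetical sequence windows, for illustration only.
example_sites = ["RLpSPAK", "GApTPEE", "KEpSQEL", "AApSDDE", "LRpSLGK"]
percentages = motif_percentages(example_sites)
for name, value in percentages.items():
    print(f"{name}: {value:.1f}%")
```

Running the same function on a background site list and on the ANOVA-passing subset would give the two series that a chart such as Figure 5.13 compares. Because a window can satisfy more than one pattern (here "GApTPEE" matches both the Pro-directed and the casein kinase II pattern), the percentages need not sum to 100.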
pathways and the associated kinases in DDRs,58,59 and with reports showing that PP5 can inhibit or block MAP kinase pathways.60,61 Although PP5 is required for the activation of ATM and ATR, no serine-glutamine (SQ)- or threonine-glutamine (TQ)-directed phosphorylation sites, which represent optimal sites for ATM and related DNA damage activated kinases, were found in our study. Peptides containing these sites may be in low abundance and require selective enrichment to be effectively detected.56

5.1.7.3.3 Comparing GO Differences in Cells with DNA Damage Overexpressing WT or Inactive PP5. We also explored the molecular function of the differentially phosphorylated proteins identified. To do so, GO terms were determined for the entire complement of identified phosphorylated proteins using the Blast2GO55 suite of bioinformatic programs. The terms were then compared with the statistically significant regulated phosphorylated proteins for each of the BRs. Figure 5.14 compares the functional categories and percent distributions for the entire complement of phosphorylated proteins identified (average of the two BRs) with the phosphorylated proteins passing ANOVA. The proportion of proteins identified in the two major functional categories, that is, binding proteins and proteins with catalytic activity, was the
[Figure 5.14 bar chart. Vertical axis: percent distribution (%). Series: average of entire BR1 and BR2; average of ANOVA BR1 and BR2. Categories: auxiliary transport protein activity; transporter activity; catalytic activity; translation regulator activity; enzyme regulator activity; binding; chaperone regulator activity; structural molecule activity; motor activity; transcription regulator activity; molecular transducer activity.]
Figure 5.14. Functional categories and the percent distributions (the average of biological replicates 1 and 2) for the entire complement of phosphorylated proteins identified versus the phosphorylated proteins passing the ANOVA test.
same in both cases. Proteins involved in translation regulator activity (e.g., ribosomal protein S6) doubled, increasing from 3% (total phosphoproteins identified) to 6% (phosphoproteins elevated or decreased in bleomycin-treated cells expressing PP5 (H304Q) compared with phosphoproteins from cells expressing WT PP5). This result suggests that PP5 influences the phosphorylation status and/or expression levels of proteins controlling translation, a process that is dramatically affected by DNA damage.62 Because the overlap between BRs in statistically significant differentially expressed phosphorylated proteins can be influenced by the phosphorylation action of multiple components of a signaling pathway,56 classes or categories of proteins, as opposed to specific key individual proteins, may undergo differential regulation under study conditions. This observation was made in the analysis of ATM and ATR substrates during DNA damage, where kinases activated by DNA damage influenced the phosphorylation of multiple components of particular pathways rather than key individual proteins within the process. The increase in phosphoproteins controlling translation activity in cells overexpressing inactive PP5 and subjected to DNA damage may represent a similar pattern.
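The comparison behind this kind of analysis amounts to computing the percent distribution of functional categories in two protein sets and taking the ratio for each category. A minimal Python sketch follows. The counts are hypothetical placeholders, chosen only so that translation regulator activity rises from 3% to 6% as in the text; they are not data from the study, and the sketch treats categories as mutually exclusive, which GO annotations generally are not.

```python
def percent_distribution(category_counts):
    """Convert raw counts per category into percentages of the total."""
    total = sum(category_counts.values())
    return {cat: 100.0 * n / total for cat, n in category_counts.items()}

def fold_changes(background, subset):
    """Ratio of subset percentage to background percentage per category."""
    bg = percent_distribution(background)
    sub = percent_distribution(subset)
    return {cat: sub[cat] / bg[cat] for cat in bg if cat in sub and bg[cat] > 0}

# Hypothetical counts, for illustration only (not taken from the study).
all_phosphoproteins = {"binding": 500, "catalytic activity": 300,
                       "translation regulator activity": 30, "other": 170}
anova_passing = {"binding": 50, "catalytic activity": 30,
                 "translation regulator activity": 6, "other": 14}

ratios = fold_changes(all_phosphoproteins, anova_passing)
for category, ratio in ratios.items():
    print(f"{category}: {ratio:.2f}x")
```

With these placeholder counts, binding and catalytic activity keep the same share in both sets, while translation regulator activity doubles. A ratio alone says nothing about statistical significance; with real data, small category counts would call for an appropriate enrichment test.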
5.1.8 Conclusion

These studies indicate that PP5 is involved in a signaling network associated with the cytoskeletal proteins, including phosphorylation, binding, and a nucleocytoplasmic shuttling mechanism. This study also demonstrates the effectiveness of protein phosphatase-based models in DDR phosphorylated signaling experiments. Possible targets of a phosphatase such as PP5 can be identified based on mass spectrometric analyses of HeLa cell extracts where upregulated, downregulated, and unique phosphorylated peptides are measured. Application of bioinformatic tools based on mass spectral phosphorylated peptide identifications can be effectively utilized to further explore mass spectral results giving insights into localization, molecular functions, and interacting modules.

5.1.9 Reviewing Spectra Using the SpectrumLook Software Package

A software package called SpectrumLook is available that allows readers to inspect the fragmentation (MS/MS) spectra for the phosphopeptides identified in this study. Using this software, readers can visually browse the MS/MS spectra that led to the phosphopeptide identifications, including viewing annotations for the identified b and y ions, and neutral loss ions where appropriate. This software is supported by the Microsoft Windows platform. There are six files included with the SpectrumLook package that can be accessed at http://ncrr.pnl.gov/data/#HomoSapiensData, supplemental data to Ser/Thr PP5 publication (registration not required). Note: To access the file, right click on the aforementioned link and select “Open Weblink in Browser.”

1. SpectrumLook_Installer.msi: the installer. To install, double-click on the file and follow the installation prompts. During installation, a shortcut to run the SpectrumLook program is placed at Start → Programs → PAST Toolkit → SpectrumLook. Alternatively, navigate to the C:\Program Files\SpectrumLook\ folder and double-click file “SpectrumLook.exe.”
2. MT_Human_PP5_grouped.mzXML: the phosphopeptide spectra in mzXML format.
3. MT_Human_PP5_grouped_syn.txt: a summary of the identifications determined by SEQUEST. See the Readme.txt file for a description of the columns in this file.
4. MT_Human_PP5_grouped.ini: a parameter file that specifies the appropriate parameters for these data when browsing them with SpectrumLook.
5. Readme.txt and RevisionHistory.txt: text files that describe the SpectrumLook software.

REFERENCES

1. Bradshaw, R.A.; Burlingame, A.L.; Carr, S.; Aebersold, R. Reporting protein identification data: the next generation of guidelines. Mol. Cell. Proteomics 2006, 5, 787–788.
2. Chalmers, M.J.; Kolch, W.; Emmett, M.R.; Marshall, A.G.; Mischak, H. Identification and analysis of phosphopeptides. J. Chromatogr. B 2004, 803, 111–120.
3. Olsen, J.V.; Blagoev, B.; Gnad, F.; Macek, B.; Kumar, C.; Mortensen, P.; Mann, M. Global, in vivo, and site-specific phosphorylation dynamics in signaling networks. Cell 2006, 127, 635–648.
4. Zhou, H.; Watts, J.D.; Aebersold, R. A systematic approach to the analysis of protein phosphorylation. Nat. Biotechnol. 2001, 19, 375–378.
5. Beausoleil, S.A.; Jedrychowski, M.; Schwartz, D.; Elias, J.E.; Villen, J.; Li, J.; Cohn, M.A.; Cantley, L.C.; Gygi, S.P. Large-scale characterization of HeLa cell nuclear phosphoproteins. Proc. Natl. Acad. Sci. U.S.A. 2004, 101, 12130–12135.
6. Amanchy, R.; Kalume, D.E.; Iwahori, A.; Zhong, J.; Pandey, A. Phosphoproteome analysis of HeLa cells using stable isotope labeling with amino acids in cell culture (SILAC). J. Proteome Res. 2005, 4, 1661–1671.
7. Ballif, B.A.; Roux, P.P.; Gerber, S.A.; MacKeigan, J.P.; Blenis, J.; Gygi, S.P. Phosphoproteomic analysis of the developing mouse brain. Proc. Natl. Acad. Sci. U.S.A. 2005, 102, 667–672.
8. Hoffert, J.D.; Pisitkun, T.; Wang, G.; Shen, R.F.; Knepper, M.A. Quantitative phosphoproteomics of vasopressin-sensitive renal cells: regulation of aquaporin-2 phosphorylation at two sites. Proc. Natl. Acad. Sci. U.S.A. 2006, 103, 7159–7164.
9. Yang, F.; Stenoien, D.L.; Strittmatter, E.F.; Wang, J.H.; Ding, L.H.; Lipton, M.S.; Monroe, M.E.; Nicora, C.D.; Gritsenko, M.A.; Tang, K.Q.; Fang, R.H.; Adkins, J.N.; Camp, D.G.; Chen, D.J.; Smith, R.D. Phosphoproteome profiling of human skin fibroblast cells in response to low- and high-dose irradiation. J. Proteome Res. 2006, 5, 1252–1260.
10. Arroyo, J.D.; Hahn, W.C. Involvement of PP2A in viral and cellular transformation. Oncogene 2005, 24, 7746–7755.
11. Brady, M.J.; Saltiel, A.R. The role of protein phosphatase-1 in insulin action. Recent Prog. Horm. Res. 2001, 56, 157–173.
12. Oliver, C.J.; Shenolikar, S. Physiologic importance of protein phosphatase inhibitors. Front. Biosci. 1998, 3, D961–D972.
13. Ollendorff, V.; Donoghue, D.J. The serine/threonine phosphatase PP5 interacts with CDC16 and CDC27, two tetratricopeptide repeat-containing subunits of the anaphase-promoting complex. J. Biol. Chem. 1997, 272, 32011–32018.
14. Russell, L.C.; Whitt, S.R.; Chen, M.-S.; Chinkers, M. Identification of conserved residues required for the binding of a tetratricopeptide repeat domain to heat shock protein 90. J. Biol. Chem. 1999, 274, 20060–20063.
15. Brown, L.; Borthwick, E.B.; Cohen, P.T.W. Drosophila protein phosphatase 5 is encoded by a single gene that is most highly expressed during embryonic development. Biochim. Biophys. Acta 2000, 1492, 470–476.
16. Chinkers, M. Protein phosphatase 5 in signal transduction. Trends Endocrinol. Metab. 2001, 12, 28–32.
17. Roy, K.; Wang, L.; Makrigiorgos, G.M.; Price, B.D. Methylation of the ATM promoter in glioma cells alters ionizing radiation sensitivity. Biochem. Biophys. Res. Commun. 2006, 344, 821–826.
18. Hammond, E.M.; Giaccia, A.J. The role of ATM and ATR in the cellular response to hypoxia and re-oxygenation. DNA Repair 2004, 3, 1117–1122.
19. Chen, M.-S.; Silverstein, A.M.; Pratt, W.B.; Chinkers, M. The tetratricopeptide repeat domain of protein phosphatase 5 mediates binding to glucocorticoid receptor heterocomplexes and acts as a dominant negative mutant. J. Biol. Chem. 1996, 271, 32315–32320.
20. Silverstein, A.M.; Galigniana, M.D.; Chen, M.-S.; Owens-Grillo, J.K.; Chinkers, M.; Pratt, W.B. Protein phosphatase 5 is a major component of glucocorticoid receptor·hsp90 complexes with properties of an FK506-binding immunophilin. J. Biol. Chem. 1997, 272, 16224–16230.
21. Zuo, Z.; Dean, N.M.; Honkanen, R.E. Serine/threonine protein phosphatase type 5 acts upstream of p53 to regulate the induction of p21WAF1/Cip1 and mediate growth arrest. J. Biol. Chem. 1998, 273, 12250–12258.
22. Peters, J.-M.; King, R.W.; Höög, C.; Kirschner, M.W. Identification of BIME as a subunit of the anaphase-promoting complex. Science 1996, 274, 1199–1201.
23. Ali, A.; Zhang, J.; Bao, S.; Liu, I.; Otterness, D.; Dean, N.M.; Abraham, R.T.; Wang, X.F. Requirement of protein phosphatase 5 in DNA-damage-induced ATM activation. Genes Dev. 2004, 18, 249–254.
24. Zhang, J.; Bao, S.; Furumai, R.; Kucera, K.S.; Ali, A.; Dean, N.M.; Wang, X.F. Protein phosphatase 5 is required for ATR-mediated checkpoint activation. Mol. Cell. Biol. 2005, 22, 9910–9919.
25. Gentile, S.; Darden, T.; Erxleben, C.; Romeo, C.; Russo, A.; Martin, N.; Rossie, S.; Armstrong, D.L. Rac GTPase signaling through the PP5 protein phosphatase. Proc. Natl. Acad. Sci. U.S.A. 2006, 103, 5202–5206.
26. Messner, D.J.; Romeo, C.; Boynton, A.; Rossie, S. Inhibition of PP2A, but not PP5, mediates P53 activation by low levels of okadaic acid in rat liver epithelial cells. J. Cell Biochem. 2005, 99(1), 241–255.
27. Giavalisco, P.; Nordhoff, E.; Lehrach, H.; Gobom, J.; Klose, J. Extraction of proteins from plant tissues for two-dimensional electrophoresis analysis. Electrophoresis 2003, 24, 207–216.
28. Ficarro, S.B.; McCleland, M.L.; Stukenberg, P.T.; Burke, D.J.; Ross, M.M.; Shabanowitz, J.; Hunt, D.F.; White, F.M. Phosphoproteome analysis by mass spectrometry and its application to Saccharomyces cerevisiae. Nat. Biotechnol. 2002, 20, 301–305.
29. Posewitz, M.C.; Tempst, P. Immobilized gallium(III) affinity chromatography of phosphopeptides. Anal. Chem. 1999, 71, 2883–2892.
30. Cao, P.; Stults, J.T. Mapping the phosphorylation sites of proteins using on-line immobilized metal affinity chromatography/capillary electrophoresis/electrospray ionization multiple stage tandem mass spectrometry. Rapid Commun. Mass Spectrom. 2000, 14, 1600–1606.
31. Kocher, T.; Allmaier, G.; Wilm, M. Nanoelectrospray-based detection and sequencing of substoichiometric amounts of phosphopeptides in complex mixtures. J. Mass Spectrom. 2003, 38, 131–137.
32. Ndassa, Y.M.; Orsi, C.; Marto, J.A.; Chen, S.; Ross, M.M. Improved immobilized metal affinity chromatography for large-scale phosphoproteomics applications. J. Proteome Res. 2006, 10, 2789–2799.
33. Ham, B.M.; Yang, F.; Jayachandran, H.; Jaitly, N.; Monroe, M.E.; Gritsenko, M.A.; Livesay, E.A.; Zhao, R.; Purvine, S.O.; Orton, D.; Adkins, J.N.; Camp, D.G., 2nd; Rossie, S.; Smith, R.D. The influence of sample preparation and replicate analyses on HeLa cell phosphoproteome coverage. J. Proteome Res. 2008, 7(6), 2215–2221.
34. Kelly, R.T.; Page, J.S.; Luo, Q.; Moore, R.J.; Orton, D.J.; Tang, K.; Smith, R.D. Chemically etched open tubular and monolithic emitters for nanoelectrospray ionization mass spectrometry. Anal. Chem. 2006, 78, 7796–7801.
35. Yi, E.C.; Marelli, M.; Lee, H.; Purvine, S.O.; Aebersold, R.; Aitchison, J.D.; Goodlett, D.R. Approaching complete peroxisome characterization by gas-phase fractionation. Electrophoresis 2002, 23, 3205–3216.
36. Li, X.; Gerber, S.A.; Rudner, A.D.; Beausoleil, S.A.; Haas, W.; Villen, J.; Elias, J.E.; Gygi, S.P. Large-scale phosphorylation analysis of α-factor-arrested Saccharomyces cerevisiae. J. Proteome Res. 2007, 6, 1190–1197.
37. Gaudreault, I.; Guay, D.; Lebel, M. YB-1 promotes strand separation in vitro of duplex DNA containing either mispaired bases or cisplatin modifications, exhibits endonucleolytic activities and binds several DNA repair proteins. Nucleic Acids Res. 2004, 32, 316–327.
38. Al-Shahrour, F.; Minguez, P.; Vaquerizas, J.M.; Conde, L.; Dopazo, J. Nucleic Acids Res. 2005, 33(Web Server issue), W460–W464.
39. Rush, J.; Moritz, A.; Lee, K.A.; Guo, A.; Goss, V.L.; Spek, E.J.; Zhang, H.; Zha, X.-M.; Polakiewicz, R.D.; Comb, M.J. Immunoaffinity profiling of tyrosine phosphorylation in cancer cells. Nat. Biotechnol. 2005, 23, 94–101.
40. Chen, C.-Y.; Gherzi, R.; Anderson, J.S.; Gaietta, G.; Juerchott, K.; Royer, H.-D.; Mann, M.; Karin, M. Nucleolin and YB-1 are required for JNK-mediated interleukin-2 mRNA stabilization during T-cell activation. Genes Dev. 2000, 14, 1236–1248.
41. Gundersen, G.G.; Cook, T.A. Curr. Opin. Cell Biol. 1999, 11, 81–94.
42. Giannakakou, P.; Nakano, M.; Nicolaou, K.C.; O’Brate, A.; Yu, J.; Blagosklonny, M.V.; Greber, U.F.; Fojo, T. Enhanced microtubule-dependent trafficking and p53 nuclear accumulation by suppression of microtubule dynamics. Proc. Natl. Acad. Sci. U.S.A. 2002, 99, 10855–10860.
43. Dean, D.A.; Urban, G.; Aragon, I.V.; Swingle, M.; Miller, B.; Rusconi, S.; Bueno, M.; Dean, N.M.; Honkanen, R.E. Serine/threonine protein phosphatase 5 (PP5) participates in the regulation of glucocorticoid receptor nucleocytoplasmic shuttling. BMC Cell Biol. 2001, 2(6), 1471–2121.
44. Sahara, S.; Aoto, M.; Eguchi, Y.; Imamoto, N.; Yoneda, Y.; Tsujimoto, Y. Acinus is a caspase-3-activated protein required for apoptotic chromatin condensation. Nature 1999, 401, 168–173.
45. Becker, W.M.; Kleinsmith, L.J.; Hardin, J. The World of the Cell. Delhi, India: Pearson Education, 2003.
46. Finn, R.D.; Mistry, J.; Schuster-Böckler, B.; Griffiths-Jones, S.; Hollich, V.; Lassmann, T.; Moxon, S.; Marshall, M.; Khanna, A.; Durbin, R.; Eddy, S.R.; Sonnhammer, E.L.L.; Bateman, A. Pfam: clans, web tools and services. Nucleic Acids Res. 2006, 34(Database Issue), D247–D251.
47. Devault, A.; Martinez, A.M.; Fesquet, D.; Labbe, J.C.; Morin, N.; Tassan, J.P. MAT1 (“menage à trois”) a new RING finger protein subunit stabilizing cyclin H-cdk7 complexes in starfish and Xenopus CAK. EMBO J. 1995, 14, 5027–5036.
48. Tassan, J.P.; Jaquenoud, M.; Fry, A.M.; Frutiger, S.; Hughes, G.J.; Nigg, E.A. In vitro assembly of a functional human CDK7-cyclin H complex requires MAT1, a novel 36 kDa RING finger protein. EMBO J. 1995, 14, 5608–5617.
49. Ferrari, S. Protein kinases controlling the onset of mitosis. Cell. Mol. Life Sci. 2006, 63, 781–795.
50. Gould, K.L.; Nurse, P. Tyrosine phosphorylation of the fission yeast cdc2+ protein kinase regulates entry into mitosis. Nature 1989, 342, 39–45.
51. Atherton-Fessler, S.; Parker, L.L.; Geahlen, R.L.; Piwnica-Worms, H. Mechanisms of p34cdc2 regulation. Mol. Cell. Biol. 1993, 13, 1675–1685.
52. Peter, M.; Heitlinger, E.; Haner, M.; Aebi, U.; Nigg, E.A. Disassembly of in vitro formed lamin head-to-tail polymers by CDC2 kinase. EMBO J. 1991, 10, 1535–1544.
53. Blangy, A.; Lane, H.A.; d’Herin, P.; Harper, M.; Kress, M.; Nigg, E.A. Phosphorylation by p34cdc2 regulates spindle association of human Eg5, a kinesin-related motor essential for bipolar spindle formation in vivo. Cell 1995, 83, 1159–1169.
54. Kimura, K.; Hirano, M.; Kobayashi, R.; Hirano, T. Phosphorylation and activation of 13S condensin by Cdc2 in vitro. Science 1998, 282, 487–490.
55. Conesa, A.; Gotz, S.; Garcia-Gomez, J.M.; Terol, J.; Talon, M.; Robles, M. Bioinformatics 2005, 21, 3674–3676.
56. Matsuoka, S.; Ballif, B.A.; Smogorzewska, A.; McDonald, E.R., III; Hurov, K.E.; Luo, J.; Bakalarski, C.E.; Zhao, Z.; Solimini, N.; Lerenthal, Y.; Shiloh, Y.; Gygi, S.P.; Elledge, S.J. ATM and ATR substrate analysis reveals extensive protein networks responsive to DNA damage. Science 2007, 316, 1160–1166.
57. Hinds, T.D., Jr.; Sanchez, E.R. Protein phosphatase 5. Int. J. Biochem. Cell Biol. 2008, 40(11), 2358–2362.
58. Dent, P.; Yacoub, A.; Fisher, P.B.; Hagan, M.P.; Grant, S. MAPK pathways in radiation responses. Oncogene 2003, 22(37), 5885–5896.
59. Watcharasit, P.; Bijur, G.N.; Zmijewski, J.W.; Song, L.; Zmijewska, A.; Chen, X.; Johnson, G.V.; Jope, R.S. Direct, activating interaction between glycogen synthase kinase-3beta and p53 after DNA damage. Proc. Natl. Acad. Sci. U.S.A. 2002, 99(12), 7951–7955.
60. Morita, K.; Saitoh, M.; Tobiume, K.; Matsuura, H.; Enomoto, S.; Nishitoh, H.; Ichijo, H. Negative feedback regulation of ASK1 by protein phosphatase 5 (PP5) in response to oxidative stress. EMBO J. 2001, 20(21), 6028–6036.
61. von Kriegsheim, A.; Pitt, A.; Grindlay, G.J.; Kolch, W.; Dhillon, A.S. Regulation of the Raf-MEK-ERK pathway by protein phosphatase 5. Nat. Cell Biol. 2006, 8(9), 1011–1016.
62. Wek, R.C.; Jiang, H.Y.; Anthony, T.G. Coping with stress: eIF2 kinases and translational control. Biochem. Soc. Trans. 2006, 34(Pt 1), 7–11.
6
Prokaryotic Phosphorylation of Serine, Threonine, and Tyrosine

6.1 INTRODUCTION

6.1.1 Serine (Ser)/Threonine (Thr)/Tyrosine (Tyr) Phosphorylation

Numerous mass spectrometry (MS)-based studies have been applied to identify Ser/Thr/Tyr protein phosphorylation sites for signaling pathways involved with protein kinases and phosphatases in eukaryotic systems. Figure 6.1 illustrates the major amino acids that are studied in systems for phosphorylation as a posttranslational modification (PTM). This includes (a) the hydroxy phosphoamino acid residues phosphoserine (pS), phosphothreonine (pT), and phosphotyrosine (pY); and (b) the phosphoramidate phosphohistidine and the phosphorylated carboxylic acids phosphoaspartate and phosphoglutamate. The typical or usual acid-stable hydroxy phosphoamino acid residues pS, pT, and pY that are studied using phosphoproteomic techniques, also called the O-phosphates, are illustrated in Figure 6.2.

6.1.2 Histidine (His) Phosphorylation

Also included (briefly, covered more extensively in Chapter 7) are the phosphoramidate phosphohistidine and the phosphorylated acid phosphoaspartate (last two in Fig. 6.1). Figure 6.3 illustrates the N-phosphate phosphohistidine, which is unstable in acid and an important residue in prokaryotic signaling systems.

6.1.3 Caulobacter crescentus

Until this point, less has been studied in prokaryotic biological systems in identifying Ser/Thr/Tyr protein phosphorylation sites for signaling
[Figure 6.1 chemical structures. (a) Serine (S), threonine (T), and tyrosine (Y), with phosphorylated serine (pS), phosphorylated threonine (pT), and phosphorylated tyrosine (pY). (b) Histidine (H), aspartate (D), and glutamate (E), with phosphorylated histidine (pH), phosphorylated aspartate (pD), and phosphorylated glutamate (pE).]
Figure 6.1. Major amino acids studied in systems using phosphoproteomic techniques for phosphorylation as a posttranslational modification including (a) the hydroxy phosphoamino acid residues phosphoserine, phosphothreonine, and phosphotyrosine; and (b) the phosphoramidate phosphohistidine and the phosphorylated carboxylic acid phosphoaspartate and phosphoglutamate.
[Figure 6.2 chemical structures: serine (S), threonine (T), and tyrosine (Y), with phosphorylated serine (pS), phosphorylated threonine (pT), and phosphorylated tyrosine (pY). Panel annotations: the O-phosphates; found in eukaryotes; the “normal” phosphoproteome; stable in acid.]
Figure 6.2. The three O-phosphates (hydroxyl amino acid residues) serine, threonine, and tyrosine normally studied in eukaryotic systems.
pathways specifically under stressed conditions. Here, we will look at a phosphoproteomic methodology applied to a differential study of C. crescentus under carbon-rich versus carbon-starved environmental conditions. C. crescentus is an aquatic gram-negative α-proteobacterium (α-purple bacterium) that can exist in two forms: either flagellated or possessing a sessile stalk. C. crescentus undergoes an unequal asymmetric cell division that produces a motile swarmer cell and a sessile stalked cell at the end of each cell cycle. C. crescentus is a key model system used for the study of the developmental processes of bacteria.

6.1.4 Ser/Thr/Tyr Phosphorylation of C. crescentus

The study examined extensively in the following sections, which may also serve as a specific example for prokaryotic Ser/Thr/Tyr phosphoproteomic studies, resulted in the identification of 259 phosphorylation sites on 149 C. crescentus phosphorylated proteins including 112 sites on Ser, 107 on Thr, 24 on Tyr, and 16 on aspartate. pY immunoprecipitation (IP) enrichment verified 64% of the pY-containing
[Figure 6.3 chemical structures: histidine (H) and phosphorylated histidine (pH). Panel annotations: N-phosphates; found in prokaryotes; unstable in acid; phosphoramidate; found in the two-component signaling system.]
Figure 6.3. The N-phosphate phosphohistidine, the important phosphorylated residue utilized in the prokaryotic two-component signaling system. In the phosphorylated form, the residue is a phosphoramidate that is highly unstable in acidic conditions.
proteins identified in the global phosphoproteome including the verification of three specific kinases’ pY sites and the measurement of multiple pY-containing peptides (from three to nine) for TonB-receptor proteins. Under conditions of carbon deficiency, 15 phosphoproteins were detected that were not present in the presence of carbon. The differential study of carbon-rich versus carbon-starved conditions indicated a reduced signaling state, as judged by upregulation or downregulation in phosphorylation, in C. crescentus grown under carbon-depleted conditions. Six to nine proteins observed to be upregulated are likely involved in elevated signaling processes associated with an adaptive response to the carbon-starved growth environment.

6.1.5 Ser/Thr/Tyr Phosphorylation of Bacillus subtilis and Escherichia coli

Let us consider the amino acid residues listed in Figure 6.1 again. The phosphoryl modification of Ser, Thr, and Tyr is a PTM primarily
associated with regulating enzyme activity in eukaryotic systems.1–6 In contrast, prokaryotic biological systems utilize His for the purpose of transferring the phosphate group from one biomolecule to another7,8 as illustrated by the two-component systems of the bacterial His kinases.9,10 However, Ser/Thr/Tyr phosphorylation has recently been reported in the model gram-positive bacterium B. subtilis using phosphoproteomic techniques, where 103 unique phosphopeptides were identified from 78 B. subtilis proteins,11 and in E. coli, where 81 phosphorylation sites on 79 proteins were identified.12 This work demonstrated that signaling through Ser/Thr/Tyr phosphorylation is a more general regulatory process applicable to both eukaryotes and prokaryotes.

6.1.6 C. crescentus as Cell Cycle Model

Let us take a closer look at C. crescentus. C. crescentus is a key model system used for the study of the developmental processes of bacteria.13,14 It is an aquatic gram-negative α-proteobacterium (α-purple bacterium) that can bear either a polar flagellum or a stalk. A picture of C. crescentus is shown in Figure 6.4 with the stalk at the top of the bacterium and the flagellum at the bottom. The bacterium is found in aquatic environments, often attached by its stalk to plant matter, other aquatic organisms, or particulate matter. C. crescentus undergoes an asymmetric cell division that produces a motile swarmer cell and a sessile stalked cell at the end of each cell cycle. Figure 6.5 illustrates the volume change in C. crescentus as it undergoes the asymmetric cell division. The two cells produced are morphologically distinct in their construction and purpose.13 The motile swarmer cell possesses a polar flagellum and also polar pili, and until it differentiates into a stalked cell, it cannot initiate deoxyribonucleic acid (DNA) replication. 
After 30–45 minutes of swimming, the swarmer cell sheds its flagellum and differentiates into a stalked cell. The stalked cell, in contrast, immediately starts chromosome replication and develops into an elongated predivisional cell that includes a chemotaxis apparatus and a new flagellum at the pole opposite to the stalk. Figure 6.6 illustrates the phases that C. crescentus passes through as it undergoes the asymmetric cell division. Besides membrane-associated two-component signaling systems involved in adaptive response mechanisms,7 phosphorylation also plays an important role in the proper management of cell cycle progression15 and morphogenesis.16 For example, the ctrA gene encodes a response regulator that plays multiple roles in controlling ∼55 operons including cell division, chemotaxis, and metabolism through cell
Figure 6.4. A picture of Caulobacter crescentus with the stalk at the top of the bacterium and the bottom being flagellated (picture from Yves Brun at Indiana University).
[Figure 6.5 diagram: cell volumes labeled 0.5, 0.6, 0.7, 0.9, and 1.2 µ3 across the swarmer, stalk, and predivisional stages, followed by volumes of 0.5 and 0.7 µ3.]
Figure 6.5. Volume change in C. crescentus as it undergoes asymmetric cell division (http://caulo.stanford.edu/caulo/).
[Figure 6.6 diagram: cell stages swarmer (SW), stalked (ST), predivisional (PD), and daughters, with the new SW pole, flagellated pole, and ST pole marked. Events labeled: flagellum ejection, flagellum biogenesis, pili retraction, chemotaxis machinery, holdfast synthesis, pili biogenesis, stalk synthesis and elongation, replication initiation, DNA methylation, chromosome replication, chromosome segregation, and cell separation.]
Figure 6.6. Phases C. crescentus passes through as it undergoes asymmetric cell division (Jacobs-Wagner et al. Molecular Microbiology 2004, 51, 7–13).
cycle-regulated phosphorylation,17 and the divK gene encodes a response regulator involved in the signal transduction pathways controlling polar differentiation and cell division.16

6.1.7 Bacteria Starvation Response

Bacteria are known to exhibit a starvation survival response18 to sustain themselves in oligotrophic natural environments through cell size reduction and decreased susceptibility to harsh conditions. This cross-protection includes decreased susceptibility to high temperature, oxidative stress, and high osmotic pressure.19 The starvation survival response resulting in cross-protection has been observed in E. coli,20–22 in Salmonella typhimurium,23 in Pseudomonas putida,24,25 and in Vibrio spp.26,27 While the proteomes of these model bacterial systems have been studied and shown to exhibit a starvation survival response resulting in cross-protection, the phosphorylation-based signaling events associated with this response have not yet been studied.
6.1.8 First Coverage of C. crescentus Phosphoproteome

Herein is reported the first comprehensive coverage of the Ser/Thr/Tyr phosphoproteome of C. crescentus, including both the soluble protein fraction and the membrane-bound insoluble protein fraction. Non-gel-based, quantitative, label-free shotgun proteomics28 and phosphoproteome differential studies29–31 have been reported for a number of eukaryotic models utilizing high-resolution MS, and with this methodology, it is possible to accurately measure and quantitate from hundreds up to thousands of phosphorylated species. C. crescentus proteins were extracted, enzymatically digested, enriched by immobilized metal affinity chromatography (IMAC), and measured using the hybrid linear ion trap-Orbitrap mass spectrometer, which allows high mass accuracy measurement of the phosphorylated peptides in conjunction with product ion spectral collection for identification and phosphorylation site determination. This study also reports the first phosphoproteome differential comparison for C. crescentus under carbon-rich versus carbon-starved conditions. Finally, the first measurement and identification by MS of phosphorylated aspartate-containing peptides from C. crescentus is presented.
6.2 OPTIMIZED METHODOLOGY FOR PHOSPHO SER/THR/TYR STUDIES

6.2.1 Bacterial Strain and Growth Conditions

Wild-type C. crescentus (strain CB15N) was grown and harvested as follows:
1. Cells were grown to mid-exponential phase (0.5 UA600) in M2G media (17.4 g/L Na2HPO4, 10.6 g/L KH2PO4, 10.0 g/L NH4Cl, 0.5 mM MgSO4, 10 µM FeSO4, 10 µM ethylenediaminetetraacetic acid (EDTA), 0.5 mM CaCl2, 0.2% glucose) at 28°C.
2. Cells were then harvested by centrifugation.
3. Cells were washed four times with M2 (M2G media without glucose) and resuspended in either M2 or M2G.
4. After a 3-hour incubation at 28°C, the cells were harvested and washed twice with 50 mM NH4HCO3.
5. Finally, the cells were frozen in liquid nitrogen and stored at −80°C until further processing.
6.2.2 C. crescentus Cell Protein Extraction: Phosphoproteomics

1. To a C. crescentus sample containing ∼42 UA600 cells (equates to 12 mg total protein by bicinchoninic acid (BCA) analysis [Pierce, Inc. Rockford, IL]), add Roche Complete, Mini, EDTA-free Protease Inhibitor Cocktail (Roche Applied Science, Mannheim, Germany) according to the manufacturer’s suggested guidelines, along with 6 M guanidine HCl, for the soluble protein fraction.
2. Add phosphatase inhibitors to give final concentrations of 5 mM β-glycerophosphate, 5 mM sodium fluoride, and 1 mM sodium orthovanadate.
3. Vortex sample for 1 minute and then place it on ice.
4. After a period of 10 minutes, sonicate the sample with five pulses of 30 seconds each.
5. Vortex sample and centrifuge at 13,000 rpm for 5 minutes.
6. Remove the supernate representing the soluble protein fraction and wash the pellet with water.
7. Combine the wash with the supernate and perform BCA assay resulting in 11 mg total protein on average (n = 3).
8. Extract the protein in the pellet with a solution consisting of Roche Complete Lysis-M, EDTA-free lysis buffer, Roche protease inhibitor, 6 M guanidine HCl, and phosphatase inhibitors (same as earlier).
9. Strongly vortex for 2 minutes and then centrifuge at 13,000 rpm for 5 minutes.
10. Remove the supernate representing the insoluble protein fraction and run BCA assay resulting in 0.7 mg total protein on average equating to 5% of the global extracted protein content (n = 3).
11. Digest the proteins with modified trypsin at a 1:50 ratio for 4 hours at 37°C after threefold dilution with 50 mM NH4HCO3 (pH 7.4).
12. After fivefold further dilution, perform a second trypsin digestion at a 1:50 ratio overnight at 37°C to ensure complete digestion.
13. Stop the digestion by adding trifluoroacetic acid (TFA) to a final concentration of 0.5% (pH 3–4).
14. Desalt the tryptic digests with C18 reversed-phase (RP) Sep-Pak cartridge (Waters Corporation, Milford, MA).
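The dilution and enzyme amounts implied by steps 11 and 12 can be worked through with a short calculation. The following Python sketch is illustrative only: it assumes the guanidine HCl concentration starts at the nominal 6 M (ignoring the pellet wash and other volume changes) and that the 1:50 ratio is trypsin to protein by mass, as stated for the normal proteomics digest. The function names are hypothetical.

```python
def dilute(concentration_molar, fold):
    """Concentration after an n-fold dilution (simple volumetric dilution)."""
    return concentration_molar / fold


def trypsin_mass_mg(protein_mg, protein_per_trypsin=50.0):
    """Trypsin mass for a 1:50 trypsin-to-protein ratio (assumed to be by mass)."""
    return protein_mg / protein_per_trypsin


# Nominal guanidine HCl concentration from step 1 (assumption: no other dilution).
guanidine_start = 6.0
guanidine_first_digest = dilute(guanidine_start, 3)          # after threefold dilution
guanidine_second_digest = dilute(guanidine_first_digest, 5)  # after fivefold further dilution

# Average soluble-fraction yield reported in step 7.
trypsin_for_soluble = trypsin_mass_mg(11.0)

print(f"Guanidine HCl, first digestion:  {guanidine_first_digest:.1f} M")
print(f"Guanidine HCl, second digestion: {guanidine_second_digest:.1f} M")
print(f"Trypsin per digestion for 11 mg protein: {trypsin_for_soluble:.2f} mg")
```

Under these assumptions the first digestion runs at about 2 M guanidine HCl and the second at about 0.4 M, with roughly 0.22 mg of trypsin added each time for the soluble fraction.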
6.2.3 Solid-Phase Extraction (SPE) Desalting

Use a C18 RP peptide SPE cartridge to desalt the tryptic digests. A 1 mL/100 mg tube is usually sufficient for up to 5 mg of protein extract:
1. Condition the SPE column with 3 mL of methanol on an SPE vacuum chamber.
2. Rinse the column with 2 mL of acidified water (0.1% TFA).
3. Slowly pass the protein extract sample through the column using minimal vacuum (∼0.5–1 mL per minute flow rate).
4. Wash the column containing the sample with 4 mL of 95:5 H2O:acetonitrile (ACN), 0.1% TFA.
5. Allow the column to go to dryness and wipe the needles below the columns dry.
6. Place appropriate collection tubes under the columns.
7. Close off the tubes from the vacuum and add 1 mL of 80:20 ACN:H2O, 0.1% TFA.
8. Allow the elution buffer to slowly flow through the tube until the column is dry (∼0.5–1 mL per minute flow rate).
9. When completed, remove the sample from the SPE vacuum chamber.

6.2.4 In Vitro Methylation of Peptides

Thionyl chloride (Sigma-Aldrich, St. Louis, MO) was used for methyl esterification of the peptides.32 The steps involved in methylating the peptides are as follows:
1. Use only peptide material that has been extensively dried in a SpeedVac.
2. Carefully add 40 µL of thionyl chloride dropwise to 1 mL methanol (both anhydrous).
3. Add thionyl chloride/methanol mixture to dry peptide at a ratio of 75 µL thionyl chloride/methanol solution per 100 µg peptide.
4. Vortex the reaction mixture for 5–10 minutes to ensure dissolution of the dry peptide material.
5. Sonicate the reaction mixture for 10 minutes.
6. Let the mixture react at room temperature for 1 hour.
7. Bring the methylated peptide to dryness in a SpeedVac (Eppendorf, Hamburg, Germany) and store at −80°C until further processed.
8. Reconstitute the sample in IMAC loading solution that is composed of 1:1:1 methanol/ACN/0.01% acetic acid at a ratio of 100 µL solution to 100–200 µg peptides.
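The reagent ratios in steps 3 and 8 scale linearly with the amount of peptide, so the volumes for a given sample can be computed directly. The following Python sketch is a minimal illustration under that assumption; the function names are hypothetical, and the 1500 µg example simply corresponds to the ∼1.5 mg sample load used in the IMAC procedure.

```python
def methylation_reagent_ul(peptide_ug, ul_per_100_ug=75.0):
    """Volume of thionyl chloride/methanol solution at 75 uL per 100 ug peptide."""
    return peptide_ug / 100.0 * ul_per_100_ug


def imac_loading_volume_ul(peptide_ug):
    """(min, max) volume of IMAC loading solution at 100 uL per 100-200 ug peptide."""
    minimum = peptide_ug / 200.0 * 100.0  # most concentrated: 200 ug per 100 uL
    maximum = peptide_ug / 100.0 * 100.0  # most dilute: 100 ug per 100 uL
    return minimum, maximum


peptide_ug = 1500.0  # example amount (about 1.5 mg)
reagent = methylation_reagent_ul(peptide_ug)
low, high = imac_loading_volume_ul(peptide_ug)

print(f"Thionyl chloride/methanol solution: {reagent:.0f} uL")
print(f"IMAC loading solution: {low:.0f}-{high:.0f} uL")
```

For 1.5 mg of peptide this gives 1125 µL of the thionyl chloride/methanol solution and 750 to 1500 µL of loading solution.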
6.2.5 Phosphopeptide Enrichment by IMAC

Advances and optimizations recently summarized by Ross et al.32 have been incorporated into this standard IMAC protocol, including the use of thionyl chloride during the methyl esterification process. Also employed is a custom-packed IMAC Macrotrap cartridge with a 50-µL bed volume (Michrom BioResources, Inc., Auburn, CA) for phosphopeptide enrichment. The general IMAC procedure was composed of the following steps:
1. Strip the column with 50 mM EDTA for 15 minutes at 50 µL/min.
2. Wash the column with nanopure water for 10 minutes at 150 µL/min.
3. Activate the column with 500 µL 100 mM FeCl3 at 25 µL/min.
4. Remove excess metal ions for 5 minutes with 0.01% acetic acid (HOAc) at 50 µL/min.
5. Equilibrate the column with 500 µL 1:1:1 methanol/ACN/0.01% acetic acid at 50 µL/min.
6. Load sample (∼1.5 mg) onto the IMAC cartridge at 4 µL/min.
7. Wash the column with 100 µL 1:1:1 methanol/ACN/0.01% acetic acid at 25 µL/min.
8. Wash the column with 600 µL 1:1:1 methanol/ACN/0.01% acetic acid at 50 µL/min.
9. Re-equilibrate the column with 500 µL of 0.01% acetic acid.
10. Elute the sample with 250 µL of 250 mM Na2HPO4 (pH ∼8.5) at 10 µL/min.
11. Immediately acidify the eluate with acetic acid to a pH of ∼4 (∼30–50 µL).
12. Sample is ready for injection on liquid chromatography (LC)MS for phosphopeptide analysis.
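Several of the IMAC steps above are specified by volume and flow rate rather than by time, so it can be useful to estimate how long each takes. The following Python sketch is a rough planning aid, not part of the published protocol: it covers only the steps for which either a duration or both a volume and a flow rate are given, and it leaves out sample loading (volume not stated) and re-equilibration (flow rate not stated). The variable names are hypothetical.

```python
# Steps given as (description, volume in uL, flow rate in uL/min).
VOLUME_STEPS = [
    ("Activate with 100 mM FeCl3", 500.0, 25.0),
    ("Equilibrate with 1:1:1 methanol/ACN/0.01% acetic acid", 500.0, 50.0),
    ("First wash", 100.0, 25.0),
    ("Second wash", 600.0, 50.0),
    ("Elute with 250 mM Na2HPO4", 250.0, 10.0),
]

# Steps given directly as (description, minutes).
TIMED_STEPS = [
    ("Strip with 50 mM EDTA", 15.0),
    ("Wash with nanopure water", 10.0),
    ("Remove excess metal ions", 5.0),
]


def step_minutes(volume_ul, flow_ul_per_min):
    """Time needed to deliver a volume at a constant flow rate."""
    return volume_ul / flow_ul_per_min


volume_minutes = [step_minutes(v, f) for _, v, f in VOLUME_STEPS]
total_minutes = sum(volume_minutes) + sum(m for _, m in TIMED_STEPS)

for (name, _, _), minutes in zip(VOLUME_STEPS, volume_minutes):
    print(f"{name}: {minutes:.0f} min")
print(f"Total for the steps listed here: {total_minutes:.0f} min")
```

On these figures the elution (25 minutes) and the FeCl3 activation (20 minutes) are the longest individual steps, and the listed steps together take about 101 minutes before loading and re-equilibration are added.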
6.2.6 Normal Proteomics

Aliquots of cell cultures at time points of 0, 30, 60, and 120 minutes in both carbon-rich growth media and carbon-depleted growth media were taken for global protein preparation. Cell cultures were lysed using bead beating (zirconia/silica beads), and the extracted proteins were solubilized, reduced with dithiothreitol, and digested with trypsin at a 1:50 trypsin-to-protein ratio.

6.2.7 pY Enrichment by IP

A C. crescentus sample containing ∼40 UA600 cells (equating to 10 mg total protein by BCA analysis [Pierce Inc.]) was processed with the global protein extraction described in “C. crescentus Cell Protein Extraction” to obtain ∼10 mg total protein. The protein extract was buffer exchanged using PD-10 desalting columns (GE Healthcare, Waukesha, WI) to a buffer composed of 100 mM Tris (pH 7.4), 150 mM NaCl, 0.5% NP40, and protease and phosphatase inhibitors. The pY-containing proteins were enriched by IP using Dynabeads M-280 sheep anti-mouse IgG cross-linked with anti-pY, clone 4G-10 (Upstate-Millipore, Billerica, MA). The manufacturer’s protocol was used with the exception of the elution solutions, where elution solution 1 was composed of 0.15% TFA, and elution solution 2 was composed of 10 mM phenyl phosphate.

6.2.8 RP/Nano-High-Performance Liquid Chromatography (HPLC) Separation

Peptide mixtures from C. crescentus extracts were separated using an automated dual-column phosphoproteome nano-HPLC platform assembled in-house that has been reported elsewhere.33 Briefly, all portions of the separation system that come in contact with peptide mixtures, including the valve apparatus and transfer lines but excepting the autosampler syringe, are nonmetal to minimize the loss of phosphopeptides. Two pairs of SPE and analytical columns are used on the system during analysis. The tips coupled to the columns for electrospray are 10-µm i.d. open tubular fused silica that have been etched with hydrofluoric acid (HF) for uniform tip bevel and opening.34 The HPLC mobile phases were composed of 0.1 M acetic acid in nanopure water (A), and 70% ACN/0.1 M acetic acid in nanopure water (B). The system was equilibrated at 1000 psi for 20 minutes with 100% mobile phase A. Next, an exponential gradient was created by valve switching from pump A to B, which displaced mobile phase A in
the mixer with mobile phase B. The gradient was controlled by the split flow (∼9 µL/min) under constant pressure conditions. The final composition of mobile phase B was approximately 70% by the end of the HPLC run (180 minutes).

6.2.9 LC-Linear Ion Trap (LTQ)-Orbitrap MS/MS

A linear ion trap/Orbitrap hybrid MS was used for product ion spectral data set collection. For peptide fragmentation and sequencing, data-dependent data sets were collected for the 10 most abundant species after each high-resolution MS scan by the LTQ-Orbitrap (100,000 resolution and mass scan range of m/z 300–2000). To enhance identification of phosphopeptides, data sets were also collected with high mass accuracy precursor scans by the LTQ-Orbitrap, data-dependent MS/MS of the top six peptides, followed by multistage activation (MSA) of the neutral loss peak in the MS2 scan that was associated with a precursor peak loss corresponding to phosphate loss (i.e., a neutral loss of 32.7, 49.0, 65.4, or 98.0 Da).

6.2.10 LTQ-Fourier Transform (FT)/MS/MS

A linear ion trap/Fourier transform hybrid mass spectrometer (Thermo Electron Corp., Bremen, Germany) was used for the analysis of the pY IP enrichment samples. The FT was scanned at a resolution of 100,000 for a mass range of m/z 400–2000. For peptide fragmentation and sequencing, data sets were collected for the top six most abundant species after each high-resolution MS scan by the FT mass spectrometer.

6.2.11 Peptide Identification and False Discovery Rate (FDR) Determination

To identify peptides, all data collected from LC-MS/MS analyses were analyzed using SEQUEST and the following search criteria for phosphorylated peptides (LC-LTQ-Orbitrap MS/MS): static methylation of D, E, and the C-terminus of the peptides in conjunction with dynamic phosphorylation of S, T, and Y residues, all searched as fully tryptic cleavage products. 
As the precursor masses were collected with high mass accuracy, the SEQUEST parameter file also contained a search criterion cutoff of ±1.5 Da for the precursor masses. Data were searched against the GenBank entry for C. crescentus CB15 (AE005673.faa, containing 3737 protein entries, available at www.ncbi.nih.gov/).
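Two pieces of mass arithmetic sit behind the acquisition and search settings described above: the m/z offset of a phosphoric acid neutral loss at a given precursor charge, and the mass added by methyl esterification of the carboxyl groups searched as static modifications. The following Python sketch illustrates both, assuming standard monoisotopic masses for H3PO4 (97.97690 Da) and for the CH2 added per methyl ester (14.01565 Da). The function names are hypothetical, and the example peptide is taken from Table 6.1 with its asterisks removed.

```python
H3PO4_MONO = 97.97690   # monoisotopic mass of phosphoric acid, Da
CH2_MONO = 14.01565     # mass added per methyl ester (COOH -> COOCH3), Da


def neutral_loss_mz(charge, losses=1):
    """m/z shift observed when `losses` H3PO4 molecules leave a precursor of given charge."""
    return losses * H3PO4_MONO / charge


def methyl_ester_sites(sequence):
    """Carboxyl groups methylated under the static search: each D, each E, plus the C-terminus."""
    return sequence.count("D") + sequence.count("E") + 1


def methyl_ester_shift(sequence):
    """Total mass added to a peptide by complete methyl esterification."""
    return methyl_ester_sites(sequence) * CH2_MONO


for z in (1, 2, 3):
    print(f"{z}+ precursor, single H3PO4 loss: {neutral_loss_mz(z):.1f}")

peptide = "DLQSALADR"  # from Table 6.1, asterisks removed
print(f"{peptide}: {methyl_ester_sites(peptide)} methyl esters, "
      f"+{methyl_ester_shift(peptide):.4f} Da")
```

A single loss reproduces three of the values listed for MSA triggering (98.0, 49.0, and 32.7 for 1+, 2+, and 3+ precursors). The 65.4 value in the text is not produced by this single-loss calculation and is left as given.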
To determine the FDR, the C. crescentus database was searched together with a decoy database; that is, the reversed C. crescentus database was appended to the forward database and included in the SEQUEST search. The FDR was estimated from the forward and reverse (decoy) filtered matches and was calculated as the ratio of two times the number of false positives to the total number of identified peptides. For phosphorylated peptide search results (fully tryptic only), the following filtering criteria were applied for an FDR ≤ 5%: charge states (CSs) of 2+, 3+, and 4+ with XCorr ≥ 3.0, and all CSs with DelCn2 ≥ 0.09. All phosphopeptide filtering criteria included a mass error cutoff within ±10 ppm. For normal peptide search results, the following filtering criteria were applied for an FDR ≤ 5%: XCorr ≥ 1.9, 2.2, or 3.75 for 1+, 2+, or ≥3+ CS, respectively; all CSs partially or fully tryptic or nontryptic protein terminal peptide; and a minimum length of six.

6.2.12 Peptide Quantitative Comparison

The in-house developed programs “Decon-2LS,” “Viper,” and “MultiAlign” (available in downloadable software packages at http://ncrr.pnl.gov/software) were used to process the mass spectral data for the label-free quantitative phosphopeptide comparison of the carbon-rich versus the carbon-starved C. crescentus samples. The differentially expressed phosphorylated peptides are measured using mass spectrometric techniques and are subsequently listed as either statistically significantly increased or decreased in abundance. A t-test is performed on the aligned data sets to determine the number of statistically significant upregulated phosphorylated peptides. At least three observations (at least two in one condition vs. one in the compared condition) were required, with a t-test probability of ≤0.05, to indicate a statistically significant difference.35
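The decoy-based FDR estimate and the phosphopeptide filtering criteria of Section 6.2.11 can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the code used in the study: the `Psm` record and its field names are hypothetical, the filter reads the criteria as requiring charge 2+ to 4+, XCorr ≥ 3.0, DelCn2 ≥ 0.09, and a mass error within ±10 ppm all at once, and "false positives" are taken to be the decoy (reversed-database) matches that pass the filter, with the total counted over all passing matches.

```python
from dataclasses import dataclass


@dataclass
class Psm:
    """One peptide-spectrum match (hypothetical record layout)."""
    charge: int
    xcorr: float
    delcn2: float
    ppm_error: float
    is_decoy: bool  # True if the match came from the reversed database


def ppm_error(observed_mz, theoretical_mz):
    """Mass measurement error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def passes_phospho_filter(psm):
    """Assumed reading of the phosphopeptide criteria in Section 6.2.11."""
    return (
        psm.charge in (2, 3, 4)
        and psm.xcorr >= 3.0
        and psm.delcn2 >= 0.09
        and abs(psm.ppm_error) <= 10.0
    )


def estimate_fdr(psms):
    """FDR = 2 x (decoy matches passing the filter) / (all matches passing the filter)."""
    passing = [p for p in psms if passes_phospho_filter(p)]
    if not passing:
        return 0.0
    decoys = sum(1 for p in passing if p.is_decoy)
    return 2 * decoys / len(passing)


# Made-up example: 39 forward matches and 1 decoy match pass the filter.
example = [Psm(2, 3.5, 0.20, 1.0, False)] * 39 + [Psm(3, 3.1, 0.10, -2.0, True)]
print(f"Estimated FDR: {estimate_fdr(example):.1%}")
```

The factor of two reflects the usual reasoning for a concatenated forward and reversed search: for every decoy match that passes, roughly one incorrect forward match is expected to pass as well.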
6.3 IDENTIFICATION OF THE COMPONENTS OF THE SER/THR/TYR PHOSPHOPROTEOME IN C. CRESCENTUS GROWN IN THE PRESENCE AND ABSENCE OF GLUCOSE

6.3.1 Total Phosphoprotein Identifications

Overall, 259 phosphorylation sites on 149 C. crescentus proteins were identified in both carbon-rich and carbon-starved conditions. The methodology includes (see Fig. 6.7) the two environmental conditions (carbon rich vs. carbon starved), cell lysis and protein digestion,
[Figure 6.7 workflow: biological replicates 1 and 2, all grown to mid-exponential phase in M2 glucose media; harvested by centrifugation; resuspended for 3 hours in M2G (carbon-rich environment) or M2 (carbon-starved environment); harvested by centrifugation; bacterial cell wall lysis/in-solution digest; SPE; IMAC; phosphopeptides; nano-RP-LC; ESI-MS/MS; SEQUEST database searching; bioinformatics.]
Figure 6.7. Methodology used in the study including the treatment of two biological replicates in the two environmental conditions (carbon rich vs. carbon starved), cell lysis and protein digestion, phosphorylated peptide IMAC enrichment and measurement, followed by protein identification and bioinformatic differential comparison.
phosphorylated peptide IMAC enrichment and measurement, followed by protein identification and bioinformatic differential comparison. The phosphorylation sites include 112 on Ser, 107 on Thr, 24 on Tyr, and 16 on aspartate. Among these identifications, 137 unique phosphopeptides from 135 proteins were identified in the soluble protein fractions, and 14 unique phosphopeptides from 14 proteins were identified in the insoluble protein fractions. All the identified phosphorylated peptides and protein information are listed in Table 6.1. Spectra for all 149 phosphopeptides and their SEQUEST identification information are included in the SpectrumLook Software Package (see Section 6.3.16) in compliance with the recent standards for the identification of phosphorylation sites.35

6.3.2 MSA Spectra

The MSA approach was observed to produce both extensive peptide sequence coverage and highly confident phosphorylation site determination; however, other unassigned product ion peaks were also observed in the MSA product ion spectra. These were determined to be a combination of a-type product ions and internal fragment product ions (mostly associated with proline [Pro]-containing peptides). Peptide amino acid sequence coverage and phosphorylation site determination were made with the a-, b-, and y-type ions only (i.e., excluding the internal fragments). In all studies, both MS/MS (MS2) and MSA product ion spectra were collected and used for identification purposes. Figure 6.8 illustrates the MS2 (Fig. 6.8a) and MSA (Fig. 6.8b) product ion spectra for the peptide pTPLAALpSAQSRRAR; collecting both spectrum types afforded confirmatory, high-confidence identifications of all phosphorylated proteins reported. All spectral identifications were also hand annotated.

6.3.3 Phosphorylation Sites Identified

Overall, 226 phosphorylation sites on 135 C. crescentus proteins were identified in both carbon-rich and carbon-starved conditions for the Ser/Thr/Tyr phosphoproteome. 
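The peptide entries in Table 6.1 mark each phosphorylated residue with a trailing asterisk (for example, T*PLAALS*AQSRRAR for pTPLAALpSAQSRRAR), so site positions and per-residue tallies can be read off programmatically. The following Python sketch is a minimal illustration of that bookkeeping; the function name is hypothetical, and the four peptides are a small sample from the table, not the full data set. The last two lines simply check that the per-residue totals quoted in the text add up.

```python
from collections import Counter


def phosphosites(annotated):
    """Return (residue, 1-based position) for every residue followed by '*'."""
    sites = []
    position = 0
    residue = None
    for char in annotated:
        if char == "*":
            if residue is None:
                raise ValueError("asterisk with no preceding residue")
            sites.append((residue, position))
        else:
            position += 1
            residue = char
    return sites


# A small sample of annotated peptides from Table 6.1.
sample = [
    "VVS*ENTATGRILGAHMR",
    "MNS*T*KGCVRAR",
    "DANVGGEVLCRVY*",
    "DLS*GAGRILFVED*EDAVRS*VAAR",
]

counts = Counter(res for peptide in sample for res, _ in phosphosites(peptide))
print(phosphosites("DLS*GAGRILFVED*EDAVRS*VAAR"))
print(dict(counts))

# Totals quoted in the text: all sites, and the Ser/Thr/Tyr-only subset.
assert 112 + 107 + 24 + 16 == 259
assert 107 + 97 + 22 == 226
```

For the sample above, the tally is four Ser sites and one each on Thr, Tyr, and Asp.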
The methodology includes (see Fig. 6.7) the two environmental conditions (carbon rich vs. carbon starved), cell lysis and protein digestion, phosphorylated peptide IMAC enrichment and measurement, followed by protein identification and bioinformatic differential comparison. This method yielded phosphorylation sites on Ser (107), Thr (97), and Tyr (22). Among these identifications, 135 unique phosphopeptides from 135 proteins were identified in the global
VVS*ENTATGRILGAHMR MNS*T*KGCVRAR PLDGKPGLT*GSVGVK S*Y*NLNRPSAAVAR LIQT*T*GGLT*AR GLTGYGLSY*DLS*AR GQVLCKPGSITPHT*K DANVGGEVLCRVY* ES*AARS*AVEGAKR
SRPEAAIAS*GF Y*KGASLPSTESLATTLVR NLGDAAS*KRSDY*LR
gi|13421938|gb|AAK22698.1| gi|13422195|gb|AAK22909.1| gi|13422287|gb|AAK22983.1| gi|13422327|gb|AAK23017.1| gi|13422328|gb|AAK23018.1| gi|13422397|gb|AAK23077.1| gi|13422569|gb|AAK23221.1| gi|13422593|gb|AAK23243.1| gi|13422661|gb|AAK23299.1|
gi|13422926|gb|AAK23520.1| gi|13423772|gb|AAK24232.1| gi|13423038|gb|AAK23612.1|
GKIVPS*RITAVS*AK
S*NQSTCINQRPLVK
gi|13421681|gb|AAK22489.1|
gi|13423076|gb|AAK23646.1|
GGMT*SHAAVVAR LSAQVVGNS*EALAK LIT*MGFVT*NMLNPK MSKS*DPSDYSR VT*DALT*LT*PGAR AT*LIEAGAS*PAAAYK STFLAAAS*AAKPK IIDS*T*GALS*LPEVPK
Peptide
gi|13422841|gb|AAK23451.1| gi|13421090|gb|AAK21992.1| gi|13421119|gb|AAK22017.1| gi|13421159|gb|AAK22051.1| gi|13421306|gb|AAK22172.1| gi|13421459|gb|AAK22301.1| gi|13421460|gb|AAK22302.1| gi|13421491|gb|AAK22329.1|
Reference
TABLE 6.1. Total Phosphorylated Proteins Identified in Caulobacter crescentus
Pyruvate phosphate dikinase Kinase, putative Efflux protein, LysE family Tryptophanyl-tRNA synthetase TonB-dependent receptor Glutamate 5-kinase GTP-binding protein CgtA 2-Oxoglutarate dehydrogenase, E3 component, lipoamide dehydrogenase DNA-directed RNA polymerase, beta subunit Transcriptional regulator, AraC family OmpA-related protein TonB-dependent receptor, putative C-5 cytosine-specific DNA methylase GTP-binding protein LepA TonB-dependent receptor, putative Translation elongation factor EF-Tu Ribosomal protein S8 Outer membrane protein TolC, putative 2-Isopropylmalate synthase Phosphoglycerate mutase Oxidoreductase, glucose-methanolcholine (GMC) family Ribosomal protein S18 (Continued)
Description
AEPVLVQASPAAGEAPS*PK DAGMT*ACVAKPVS*AR
LAGES*GLACVR T*AAPIVRVATAGPASDGER LVVAT*S*WLPARSATPR
IGPSSGVSAT*RMATLSRLAR NQVWAIPAPTGGS*R VT*KIT*PGAVATLDS*VR ISATVT*PKVVELPQK AGLAGT*GVEAAAGADAVR
EVAAAGGRVLFVGT*KR VGFAATGGT*TPAPVYDR EEVLEAT*PSVT*LAR ES*TAPPPAAGS*AF
EAET*RALLAS*GR TVARAT*AARLEEAAK VASS*AAVVRR ES*LQAGLTAYGARTLGK
FDLGNET*S*ALTAK T*AEGGLVMTAADIT*AIK
gi|13423193|gb|AAK23743.1| gi|13423217|gb|AAK23763.1| gi|13423234|gb|AAK23778.1|
gi|13423283|gb|AAK23819.1| gi|13423341|gb|AAK23867.1| gi|13423343|gb|AAK23869.1| gi|13423368|gb|AAK23890.1| gi|13423370|gb|AAK23892.1|
gi|13423378|gb|AAK23898.1| gi|13423535|gb|AAK24029.1| gi|13423671|gb|AAK24145.1| gi|13423884|gb|AAK24326.1|
gi|13423900|gb|AAK24340.1| gi|13423925|gb|AAK24361.1| gi|13423936|gb|AAK24370.1| gi|13423960|gb|AAK24390.1|
gi|13423998|gb|AAK24422.1| gi|13424035|gb|AAK24453.1|
Peptide
gi|13423080|gb|AAK23650.1| gi|13423117|gb|AAK23681.1|
Reference
Transcriptional regulator, AraC family Sensor histidine kinase/response regulator Response regulator HlyD family secretion protein Flavin mononucleotide (FMN) oxidoreductase Transcription-repair coupling factor Aspartyl-tRNA synthetase Rotamase family protein Outer membrane protein 1-Deoxy-d-xylulose 5-phosphate reductoisomerase Ribosomal protein S2 6-Phospho-glucono-lactonase Magnesium transporter Sal operon transcriptional repressor SalR Hydantoinase/oxoprolinase AcrB/AcrD/AcrF family protein Transcriptional regulator, AraC family Type IV secretion system protein B4, putative DNA topoisomerase I Nonmotile and phage-resistance protein
Description
T*GLT*AARALIAGGAK
RTVY*ISPADFS*K VKAAGFTGS*R ALACDPTSAFGGIVAVNS*R
DAT*HKT*VT*AALK
PS*Y*VLGGRGMEIIR
S*PQT*GWT*LVVAIPR LSSAGNRVS*T*GRR LDSATSTSALRASEFET*Y*GAR LDSATSTSALRASEFET*Y*GAR DTTGAGVTVS*AGKKIEK GQVLCKPGSITPHT*K TGDT*LCDPLKSPVILER T*AAKAPAAET*APAAK MKTCLVVDDS*R DLQS*ALADR Y*GNFDKLAELSEARTK MLKFTT*VAR VT*QGSAAAGMIAGLTER
GIIVTNTPGVLTEDT* ADLTMT*LIMAAS*R
gi|13424241|gb|AAK24627.1| gi|13424391|gb|AAK24755.1| gi|13421187|gb|AAK22073.1|
gi|13421953|gb|AAK22711.1|
gi|13424518|gb|AAK24862.1|
gi|13424529|gb|AAK24871.1| gi|13424608|gb|AAK24938.1| gi|13424812|gb|AAK25108.1| gi|13424815|gb|AAK25109.1| gi|13424870|gb|AAK25156.1| gi|13424877|gb|AAK25161.1| gi|13424878|gb|AAK25162.1| gi|13425157|gb|AAK25403.1| gi|13425193|gb|AAK25433.1| gi|13425208|gb|AAK25446.1| gi|13425232|gb|AAK25466.1| gi|13425352|gb|AAK25568.1| gi|13425483|gb|AAK25677.1|
gi|13425492|gb|AAK25684.1|
Peptide
gi|13424123|gb|AAK24527.1|
Reference
UDP-N-acetylmuramoylalanine-dglutamate ligase TonB-dependent receptor Fatty aldehyde dehydrogenase Phosphoribosylaminoimidazolecarbox amide formyltransferase/IMP cyclohydrolase Electron transfer flavoprotein, alpha subunit Carbamoyl-phosphate synthase, large subunit Sensor histidine kinase, putative Flagellin, putative TonB-dependent receptor TonB-dependent receptor TonB-dependent receptor Translation elongation factor EF-Tu Translation elongation factor G Arylesterase-related protein Chemotaxis protein CheYIV Tyrosine kinase DivL Peptidase M13 family protein Glutamate synthase, small subunit Drug resistance transporter, EmrB/ QacA subfamily D-isomer specific 2-hydroxyacid dehydrogenases family protein (Continued)
Description
T*FAKPAVSNLDLT*VR TVT*S*LTNNTVKAVR MAAGKVY*VPET*AR NETNDKQLSLLVS*A LDMEAY*KGLLSDKTK NT*LATVQSMAAQT*LR AVRRTLPGAVT*LMAS*AVDPR LLRCVVGS*IFDVAVDIR
LAT*AS*AAVS*RLIAR MS*VLSALT*SLT*PR
ALAYRAGGDY*ETVLR
QT*LLVIDGGEVRS*R TADGVVIT*PANGPAK FS*ARLAGVEAQIK
IPT*T*T*PAETLAR T*PLAALS*AQSRRAR FFDS* LGPALLSELAQAGAATLADAALGER TS*PVIGAGRLALDT*R
gi|13422091|gb|AAK22823.1| gi|13422511|gb|AAK23171.1| gi|13422989|gb|AAK23573.1| gi|13423064|gb|AAK23636.1| gi|13423303|gb|AAK23835.1| gi|13424840|gb|AAK25132.1| gi|13424978|gb|AAK25248.1| gi|13425385|gb|AAK25595.1|
gi|13423689|gb|AAK24159.1| gi|13424153|gb|AAK24553.1|
gi|13425213|gb|AAK25451.1|
gi|13425378|gb|AAK25588.1| gi|13422030|gb|AAK22774.1| gi|13422305|gb|AAK22999.1|
gi|13422448|gb|AAK23118.1| gi|13422644|gb|AAK23286.1| gi|13423366|gb|AAK23888.1|
gi|13422643|gb|AAK23285.1|
DGDPTTALKAIAAAY*GKATAR IAALLYDLAGISLPDS*KATLVY*S*R
Peptide
gi|13421333|gb|AAK22195.1| gi|13421602|gb|AAK22422.1|
Reference
ThiJ/PfpI family protein Chemotaxis protein methyltransferase CheR ABC transporter, ATP-binding protein spoU rRNA methylase family protein Sensor histidine kinase KdpD Amidophosphoribosyltransferase Aminotransferase, class V Sensor histidine kinase, putative Response regulator dTDP-4-dehydrorhamnose 3,5-epimerase Transcriptional regulator, MarR family Chemotactic signal-response protein CheL Penicillin-binding protein AmpH, putative DNA-cytosine methyltransferase Glycosyl hydrolase, family 31 Type I secretion system outer membrane protein RsaF Fructokinase Sensor histidine kinase UDP-3-O-3-hydroxymyristoyl glucosamine N-acyltransferase DNA-binding response regulator
Description
VAWSTLFKDAETLAADGFVVS*PR T*QRSALS*DLLEGGGTVLTR
ITAAALPLIIS* DPAVAGAFDSNALLTLPPAPADA SPEELS*LAERLR VS*GVRAGADQIADAAVNLS*R
gi|13421716|gb|AAK22518.1| gi|13423843|gb|AAK24291.1|
gi|13424010|gb|AAK24432.1|
RS*RGALS*EINVT* PLVDVMLVLLIIFMISAPLLT AGVPLELPK QIRVLDLPGTY*SLR LTVIGCGLIGGS*VIR RLT*QAQLAQAAGVS*KR
AGCS*LASSITGADVR AY*GGQDGT*NLGMAVIR
PS*ADPLFESAAR DLS*GAGRILFVED*EDAVRS*VAAR AASAGALD*APLSGVD*RER QSPTGVTVS*FKGPD*GK ELT*IT*VRAD*AAPETPAR IDPGAPADKNLYD*LPPR VAFSAKVAGKDEAWT*T*NFD*IWSR
gi|13421937|gb|AAK22697.1| gi|13423729|gb|AAK24195.1| gi|13424367|gb|AAK24735.1|
gi|13424532|gb|AAK24874.1| gi|13425172|gb|AAK25416.1|
gi|13421799|gb|AAK22583.1| gi|13422380|gb|AAK23062.1| gi|13422395|gb|AAK23075.1| gi|13422395|gb|AAK23075.1| gi|13422617|gb|AAK23263.1| gi|13423432|gb|AAK23944.1| gi|13423453|gb|AAK23961.1|
gi|13424916|gb|AAK25194.1|
gi|13424459|gb|AAK24811.1|
EAVGS*IAEVEQSALR
Peptide
gi|13424414|gb|AAK24774.1|
Reference
Ferrous iron transport protein B Cyclohexadienyl dehydrogenase Transcriptional regulator, Cro/CI family Quinolinate synthetase A Acyl-CoA dehydrogenase family protein Protein-glutamate methylesterase Cell cycle histidine kinase CckA Amine oxidase, flavin containing Amine oxidase, flavin containing Serine protease Glutamine synthetase, class I Prolyl oligopeptidase family protein (Continued)
Methyl-accepting chemotaxis protein McpI ExbD/TolR family protein
Methyl-accepting chemotaxis protein McpR, putative Gamma-glutamyltransferase ABC transporter, periplasmic substrate-binding protein, putative Alkaline phosphatase, putative
Description
gi|13424829|gb|AAK25121.1| gi|13425099|gb|AAK25353.1| gi|13425252|gb|AAK25484.1| gi|13424864|gb|AAK25152.1| gi|13424254|gb|AAK24638.1| gi|13423187|gb|AAK23739.1| gi|13422331|gb|AAK23021.1| gi|13423120|gb|AAK23682.1|
gi|13421417|gb|AAK22267.1| gi|13422203|gb|AAK22915.1| gi|13423102|gb|AAK23668.1| gi|13422849|gb|AAK23457.1| gi|13422486|gb|AAK23150.1| gi|13424526|gb|AAK24868.1|
gi|13423642|gb|AAK24120.1| gi|13423937|gb|AAK24371.1| gi|13425150|gb|AAK25396.1| gi|13424700|gb|AAK25016.1| gi|13424791|gb|AAK25091.1| gi|13423889|gb|AAK24331.1| gi|13421189|gb|AAK22075.1| gi|13421247|gb|AAK22123.1| gi|13422817|gb|AAK23431.1| gi|13423187|gb|AAK23739.1|
Reference
TABLE 6.1. (Continued)
NDDLY*GAKLNLD*LT*DK PVLMHGGAILVS*D*GFEPAR DPPPALLVS*PGD*AR TADAKTAT*PD*AVALAR LLAEAALGETGPVD*LLSR DT*S*QAAVAEAVK QQSGLDPS*QQVQEAALR RGRIYTTAPET*LK VVSVGY*LGLT*PK PD*GAGPASDLSSPTD* WVRDLVGAQR EAGT*AIKES*VK LT*DQVTGNSAHVS*LQPAT*RVAR KTKPAT*PAPAFS*GAA MGAT*GLGY*DIAK RLARLAAEES*S*APR ERYPVALHGVS*MS*VGS* ADGVKLDYLR LVVCT*GGEPFLQLDDAAIAALHAR S*IGIT*ILLLSLAR QTLKS*AS*AGGVFSR NQSAAAAS*ILAPAFVSKR AKALT*S*FPDRAPQFAGVTK LMLS*ETLAHAS*R DKGLVLVT*ET*R WASLSKTVAGTLT*AR
Peptide
Conserved hypothetical protein Conserved hypothetical protein Conserved hypothetical protein Conserved hypothetical protein Conserved hypothetical protein Conserved hypothetical protein Hypothetical protein Hypothetical protein
Conserved hypothetical protein Conserved hypothetical protein Conserved hypothetical protein Conserved hypothetical protein Conserved hypothetical protein Conserved hypothetical protein
TonB-dependent receptor Acid-CoA ligase, putative Peptidase, M23/M37 family Xylosidase/arabinosidase Conserved hypothetical protein Conserved hypothetical protein Conserved hypothetical protein Conserved hypothetical protein Conserved hypothetical protein Conserved hypothetical protein
Description
RVS*VVAALPPSDSR ESIATRGAGS*NS*R TWPES*QQAAVK ELWIRS*T*ISAPK EARMT*PLY*AQLIRAS*AAR Y*LNLQEATS*SLRR GPAGAPVAAY*AAR SAT*APSPNRRPPAR ASEIVSASIKDS*VK QY*ELAFAPEAAK LPELTNAQPS*LLS*LGDTEWSK S*RS*DEGLKALSGEAR LDLT*T*PGGRAR PT*HFVVAAR GEGENIT*TTER RLS*LPSKDLMAAIAATR RPGSVGS*T*GLIDDLR LQAAGVPAS*CS*VR HSACGGQLALDKARIT*PQT*S*R R.EAAIEVSGSGSGAVFAT*GKAK HRLRLS*ADART*LS*IAP EPDGALT*AT*AAAVR S*IMGGLPPLKPGEK VTGS*FYDS*EILAGRWKT*AGGR GYPNIIQPT*LVMS*RD*T*LR
Peptide
Hypothetical protein Hypothetical protein Hypothetical protein Hypothetical protein Hypothetical protein Hypothetical protein Hypothetical protein Hypothetical protein Hypothetical protein Hypothetical protein Hypothetical protein Hypothetical protein Hypothetical protein Hypothetical protein Hypothetical protein Hypothetical protein Hypothetical protein Hypothetical protein Hypothetical protein Hypothetical protein Hypothetical protein Hypothetical protein Hypothetical protein Hypothetical protein Hypothetical protein
Description
A total of 259 phosphorylation sites on 149 C. crescentus proteins were identified in both carbon-rich and carbon-starved conditions. The phosphorylation sites include 112 on serine, 107 on threonine, 24 on tyrosine, and 16 on aspartate.
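The per-residue site totals quoted in the table footnote can be tallied directly from the asterisk notation used in the Peptide column, where an asterisk follows each phosphorylated residue. The following is a minimal sketch of such a tally, not the authors' workflow; the function name `count_phosphosites` is hypothetical, and the three peptides are taken from rows of Table 6.1 purely as sample input.

```python
from collections import Counter


def count_phosphosites(peptides):
    """Count phosphorylated residues in asterisk notation (e.g. 'S*').

    Each '*' marks the residue immediately before it as phosphorylated.
    """
    counts = Counter()
    for peptide in peptides:
        for i, ch in enumerate(peptide):
            if ch == "*" and i > 0:
                counts[peptide[i - 1]] += 1
    return counts


# Three peptides copied from Table 6.1, used only as sample input.
example = [
    "EAVGS*IAEVEQSALR",
    "AY*GGQDGT*NLGMAVIR",
    "DLS*GAGRILFVED*EDAVRS*VAAR",
]
counts = count_phosphosites(example)
total_sites = sum(counts.values())
```

Applied to the full Peptide column, the same tally would be expected to give the serine, threonine, tyrosine, and aspartate totals listed in the footnote, provided the extracted peptide strings are intact.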
gi|13421505|gb|AAK22341.1| gi|13421663|gb|AAK22473.1| gi|13421391|gb|AAK22245.1| gi|13421742|gb|AAK22538.1| gi|13421759|gb|AAK22551.1| gi|13424286|gb|AAK24666.1| gi|13424075|gb|AAK24489.1| gi|13424308|gb|AAK24684.1| gi|13423655|gb|AAK24131.1| gi|13424869|gb|AAK25155.1| gi|13425244|gb|AAK25476.1| gi|13422196|gb|AAK22910.1| gi|13425287|gb|AAK25513.1| gi|13423702|gb|AAK24170.1| gi|13424437|gb|AAK24793.1| gi|13423443|gb|AAK23953.1| gi|13425416|gb|AAK25620.1| gi|13424861|gb|AAK25149.1| gi|13424696|gb|AAK25012.1| gi|13425122|gb|AAK25372.1| gi|13421650|gb|AAK22462.1| gi|13422255|gb|AAK22957.1| gi|13423599|gb|AAK24083.1| gi|13424584|gb|AAK24918.1| gi|13423573|gb|AAK24061.1|
Reference
PROKARYOTIC PHOSPHORYLATION
Figure 6.8. MS2 and MSA product ion spectra, panels (a) and (b), for the peptide pTPLAALpSAQSRRAR, plotted as relative intensity versus m/z. All reported identifications are supported by both product ion spectra, which provides confirmatory, high-confidence identification of the phosphorylated proteins. All spectral identifications were hand-annotated.
protein extract and are listed in Table 6.2. In the phosphoproteome of C. crescentus, the levels of Ser and Thr phosphorylation were similar to each other and higher than the level of Tyr phosphorylation, with a Ser/Thr/Tyr ratio of 47:43:10. This contrasts with the predominance of Ser phosphorylation found in B. subtilis, at a ratio of 70:20:10,¹¹ and in E. coli, at 68:23:9.¹² This comparison is illustrated in Figure 6.9.
THE COMPONENTS OF THE SER/THR/TYR PHOSPHOPROTEOME
Figure 6.9. Phosphoproteome comparison for C. crescentus versus B. subtilis and E. coli. The proportion of tyrosine phosphorylation is high compared with eukaryotic systems, where it is normally at a ratio < 1. ¹Macek et al., Mol. Cell Proteomics 2007, 6, 697–707. ²Macek et al., Mol. Cell Proteomics 2008, 7, 299–307.
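The Ser/Thr/Tyr ratio quoted for C. crescentus follows from the site counts reported with Table 6.2 (107 Ser, 97 Thr, and 22 Tyr sites out of 226). The short sketch below shows the arithmetic; the helper name `percent_ratio` is hypothetical, and rounding to whole percentages is an assumption about how the published ratios were formed.

```python
def percent_ratio(site_counts):
    """Convert raw site counts to whole-number percentages of the total."""
    total = sum(site_counts.values())
    return {residue: round(100 * n / total) for residue, n in site_counts.items()}


# Site counts reported for the C. crescentus Ser/Thr/Tyr phosphoproteome.
c_crescentus_sites = {"Ser": 107, "Thr": 97, "Tyr": 22}
ratio = percent_ratio(c_crescentus_sites)

# Ratios quoted in the text for the comparison organisms.
b_subtilis = {"Ser": 70, "Thr": 20, "Tyr": 10}
e_coli = {"Ser": 68, "Thr": 23, "Tyr": 9}

# Difference in Ser share between C. crescentus and each comparison organism.
ser_gap = {
    "B. subtilis": b_subtilis["Ser"] - ratio["Ser"],
    "E. coli": e_coli["Ser"] - ratio["Ser"],
}
```

The Tyr share is close to 10% in all three organisms; the difference lies in how the remaining sites are split between Ser and Thr.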
Spectra for all 135 phosphopeptides and their SEQUEST identification information are included in the SpectrumLook Software Package (see Section 6.3.16). The MSA approach produced both extensive peptide sequence coverage and highly confident determination of phosphorylation sites; however, unassigned product ion peaks were also observed in the MSA product ion spectra. These were determined to be a combination of a-type product ions and internal fragment product ions (mostly associated with Pro-containing peptides). Peptide amino acid sequence coverage and phosphorylation site determination were made with the a-, b-, and y-type ions only (i.e., excluding the internal fragments).

6.3.4 Ser/Thr/Tyr Phosphoproteome of C. crescentus

In the phosphoproteome of C. crescentus, a high extent of Tyr phosphorylation was observed, at a Ser/Thr/Tyr ratio of 47:43:10. The Tyr proportion is similar to that reported by Macek et al.¹¹ for the phosphoproteome of B. subtilis, where the Ser/Thr/Tyr ratio was 70:20:10, and for E. coli, at 68:23:9.¹² Interestingly, three of the TonB-dependent receptors identified in the study contain a pY residue, as seen in the Tyr phosphorylated peptides LDSATSTSALRASEFEpTpYGAR (AAK25109.1), GLTGYGLSpYDLpSAR (AAK23077.1), and RTVpYISPADFpSK (AAK24627.1). The MSA product ion spectrum and pY site location of the Tyr phosphorylated peptide
CC #
0004 0029 0064 0086 0088 0136 0185 0208 0258 0280 0314 0315 0342 0354 0435 0475 0486 0502 0531
Peptide
LSAQVVGNS*EALAK
LIT*MGFVT*NMLNPK MSKS*DPSDYSR ALACDPTSAFGGIVAVNS*R
QQSGLDPS*QQVQEAALR RGRIYTTAPET*LK VT*DALT*LT*PGAR DGDPTTALKAIAAAY*GKATAR TWPES*QQAAVK EAGT*AIKES*VK AT*LIEAGAS*PAAAYK STFLAAAS*AAKPK IIDS*T*GALS*LPEVPK
RVS*VVAALPPSDSR IAALLYDLAGISLPDS*KATLVY*S*R
HRLRLS*ADART*LS*IAP ESIATRGAGS*NS*R S*NQSTCINQRPLVK
VAWSTLFKDAETLAADGFVVS*PR
Annotation
Efflux protein, LysE family trpS tryptophanyl-tRNA synthetase purH phosphoribosylaminoimidazolecarboxamide formyltransferase/IMP cyclohydrolase Conserved hypothetical protein Conserved hypothetical protein TonB-dependent receptor ThiJ/PfpI family protein Hypothetical protein Conserved hypothetical protein proB glutamate 5-kinase cgtA GTP-binding protein CgtA lpdA 2-oxoglutarate dehydrogenase, E3 component, lipoamide dehydrogenase Hypothetical protein cheR chemotaxis protein methyltransferase CheR Hypothetical protein Hypothetical protein rpoB DNA-directed RNA polymerase, beta subunit ggt gamma-glutamyltransferase
Kinase, putative
TABLE 6.2. Total Phosphorylated Proteins Identified in C. crescentus
Biosynthesis of cofactors, prosthetic groups, and carriers
Transcription
Cellular processes
Hypothetical Amino acid biosynthesis Unknown Energy metabolism
Central intermediary metabolism Transport and binding Protein synthesis Purines, pyrimidines, nucleosides, and nucleotides Hypothetical Hypothetical Transport and binding Unknown
Predicted Function
S*Y*NLNRPSAAVAR LIQT*T*GGLT*AR DKGLVLVT*ET*R DLS*GAGRILFVED*EDAVRS* VAAR GLTGYGLSY*DLS*AR IPT*T*T*PAETLAR RLARLAAEES*S*APR TVT*S*LTNNTVKAVR DANVGGEVLCRVY*
TonB-dependent receptor, putative cscK fructokinase Conserved hypothetical protein spoU rRNA methylase family protein rpsH ribosomal protein S8
1033 1034 1037 1078 1093 1134 1166 1187 1262
0789 0838 0925 0926 0931
TADGVVIT*PANGPAK T*FAKPAVSNLDLT*VR MNS*T*KGCVRAR S*RS*DEGLKALSGEAR LT*DQVTGNSAHVS*LQPAT* RVAR EPDGALT*AT*AAAVR PLDGKPGLT*GSVGVK FS*ARLAGVEAQIK
Hypothetical protein Hypothetical protein cheB protein-glutamate methylesterase feoB ferrous iron transport protein B Transcriptional regulator, AraC family etfA electron transfer flavoprotein, alpha subunit Glycosyl hydrolase, family 31 ABC transporter, ATP-binding protein OmpA-related protein Hypothetical protein Conserved hypothetical protein
Annotation
Hypothetical protein TonB-dependent receptor, putative rsaF type I secretion system outer membrane protein RsaF C-5 cytosine-specific DNA methylase GTP-binding protein LepA Hypothetical protein cckA cell cycle histidine kinase CckA
0551 0565 0597 0712 0713 0726
ELWIRS*T*ISAPK EARMT*PLY*AQLIRAS*AAR PS*ADPLFESAAR QIRVLDLPGTY*SLR VVS*ENTATGRILGAHMR DAT*HKT*VT*AALK
0973 0999 1015
CC #
Peptide
Transport and binding Energy metabolism Cell envelope Protein synthesis Protein synthesis (Continued)
Signal transduction
DNA metabolism Unknown
Transport and binding Protein fate
Hypothetical
Energy metabolism Transport and binding Transport and binding
Cellular processes Transport and binding Regulatory Energy metabolism
Cell envelope
Predicted Function
CC #
1304 1305 1318 1450 1471 1477 1541 1594 1634 1658 1668 1672 1691 1705 1706 1763 1767 1787 1802 1844 1860
Peptide
TS*PVIGAGRLALDT*R T*PLAALS*AQSRRAR ES*AARS*AVEGAKR VVSVGY*LGLT*PK GGMT*SHAAVVAR MGAT*GLGY*DIAK SRPEAAIAS*GF MAAGKVY*VPET*AR NLGDAAS*KRSDY*LR
NETNDKQLSLLVS*A
GKIVPS*RITAVS*AK AEPVLVQASPAAGEAPS*PK KTKPAT*PAPAFS*GAA DAGMT*ACVAKPVS*AR WASLSKTVAGTLT*AR LMLS*ETLAHAS*R LAGES*GLACVR T*AAPIVRVATAGPASDGER LVVAT*S*WLPARSATPR
IGPSSGVSAT*RMATLSRLAR LDMEAY*KGLLSDKTK
TABLE 6.2. (Continued)
mfd transcription-repair coupling factor Aminotransferase, class V
rpsR ribosomal protein S18 Transcriptional regulator, AraC family Conserved hypothetical protein Sensor histidine kinase/response regulator Hypothetical protein Conserved hypothetical protein Response regulator HlyD family secretion protein FMN oxidoreductase
purF amidophosphoribosyltransferase
DNA-binding response regulator Sensor histidine kinase Outer membrane protein TolC, putative Conserved hypothetical protein ppdK pyruvate phosphate dikinase Conserved hypothetical protein leuA 2-isopropylmalate synthase kdpD sensor histidine kinase KdpD oxidoreductase, GMC family
Annotation
Unknown Regulatory Transport and binding Central intermediary metabolism DNA metabolism Central intermediary metabolism
Signal transduction Signal transduction Cell envelope Hypothetical Energy metabolism Hypothetical Amino acid biosynthesis Regulatory Central intermediary metabolism Purines, pyrimidines, nucleosides, and nucleotides Protein synthesis Regulatory Cell envelope Regulatory
Predicted Function
Outer membrane protein dxr 1-deoxy-D-xylulose 5-phosphate reductoisomerase
1915 1917 1923 1978 2056 2112 2160 2174 2188 2199 2224 2261 2320 2355 2360 2369 2390 2399
EVAAAGGRVLFVGT*KR RLS*LPSKDLMAAIAATR VGFAATGGT*TPAPVYDR S*IMGGLPPLKPGEK ASEIVSASIKDS*VK EEVLEAT*PSVT*LAR LAT*AS*AAVS*RLIAR PT*HFVVAAR LTVIGCGLIGGS*VIR Y*KGASLPSTESLATTLVR T*QRSALS*DLLEGGGTVLTR
ES*TAPPPAAGS*AF DT*S*QAAVAEAVK EAET*RALLAS*GR TVARAT*AARLEEAAK VASS*AAVVRR
rpsB ribosomal protein S2 Hypothetical protein pgl 6-phospho-glucono-lactonase Hypothetical protein Hypothetical protein mgtE magnesium transporter Transcriptional regulator, MarR family Hypothetical protein tyrA cyclohexadienyl dehydrogenase gpm phosphoglycerate mutase ABC transporter, periplasmic substrate-binding protein, putative Sal operon transcriptional repressor SalR cutC conserved hypothetical protein Hydantoinase/oxoprolinase AcrB/AcrD/AcrF family protein Transcriptional regulator, AraC family
aspS aspartyl-tRNA synthetase Rotamase family protein lpxD UDP-3-O-3-hydroxymyristoyl glucosamine N-acyltransferase
1892 1894 1913
NQVWAIPAPTGGS*R VT*KIT*PGAVATLDS*VR FFDS* LGPALLSELAQAGAATLADA ALGER ISATVT*PKVVELPQK AGLAGT*GVEAAAGADAVR
Annotation
CC #
Peptide
(Continued)
Energy metabolism Unknown Unknown Transport and binding Regulatory
Amino acid biosynthesis Energy metabolism Transport and binding
Transport and binding Regulatory
Energy metabolism
Cell envelope Biosynthesis of cofactors, prosthetic groups, and carriers Protein synthesis
Protein synthesis Protein fate Cell envelope
Predicted Function
CC #
2419 2451 2461 2482 2518 2556 2583 2660 2671 2701 2719 2771 2791 2810 2829 2847
Peptide
ES*LQAGLTAYGARTLGK
FDLGNET*S*ALTAK ITAAALPLIIS* DPAVAGAFDSNALLTLPPAP ADASPEELS*LAERLR T*AEGGLVMTAADIT*AIK
GPAGAPVAAY*AAR T*GLT*AARALIAGGAK
MS*VLSALT*SLT*PR
RTVY*ISPADFS*K AKALT*S*FPDRAPQFAGVTK Y*LNLQEATS*SLRR SAT*APSPNRRPPAR RLT*QAQLAQAAGVS*KR VKAAGFTGS*R EAVGS*IAEVEQSALR
GEGENIT*TTER VS*GVRAGADQIADAAVNLS*R
TABLE 6.2. (Continued)
pleC nonmotile and phage-resistance protein Hypothetical protein murD UDP-N-acetylmuramoylalanine–D-glutamate ligase cheL chemotactic signal-response protein CheL TonB-dependent receptor Conserved hypothetical protein Hypothetical protein Hypothetical protein Transcriptional regulator, Cro/CI family Fatty aldehyde dehydrogenase Methyl-accepting chemotaxis protein McpR, putative Hypothetical protein mcpI methyl-accepting chemotaxis protein McpI
virB4 type IV secretion system protein B4, putative topA DNA topoisomerase I Alkaline phosphatase, putative
Annotation
Cellular processes
Regulatory Energy metabolism Cellular processes
Transport and binding Hypothetical
Cellular processes
Cell envelope
Regulatory
Protein fate, transport and binding DNA metabolism Central intermediary metabolism
Predicted Function
Conserved hypothetical protein Sensor histidine kinase, putative nadA quinolinate synthetase A
tufA translation elongation factor EF-Tu fusA translation elongation factor G ExbD/TolR family protein Response regulator Conserved hypothetical protein
2906 2909 2912 2956 2976 3050 3146 3159 3170 3187 3190 3193 3194 3199 3200 3232 3286 3391
ERYPVALHGVS*MS*VGS* ADGVKLDYLR S*PQT*GWT*LVVAIPR AGCS*LASSITGADVR
VTGS*FYDS*EILAGRWKT*AGGR LSSAGNRVS*T*GRR HSACGGQLALDKARIT*PQT*S*R LDSATSTSALRASEFET*Y*GAR LVVCT*GGEPFLQLDDAAIAALHAR NT*LATVQSMAAQT*LR LQAAGVPAS*CS*VR NQSAAAAS*ILAPAFVSKR QY*ELAFAPEAAK DTTGAGVTVS*AGKKIEK
GQVLCKPGSITPHT*K TGDT*LCDPLKSPVILER RS*RGALS*EINVT* PLVDVMLVLLIIFMISAP LLTAGVPLELPK AVRRTLPGAVT*LMAS*AVDPR S*IGIT*ILLLSLAR
Hypothetical protein Flagellin, putative Hypothetical protein TonB-dependent receptor Conserved hypothetical protein Sensor histidine kinase, putative Hypothetical protein Conserved hypothetical protein Hypothetical protein TonB-dependent receptor
carB carbamoyl-phosphate synthase, large subunit
2900
PS*Y*VLGGRGMEIIR
Annotation
CC #
Peptide
Regulatory Hypothetical
(Continued)
Transport and binding proteins Protein synthesis Protein synthesis Transport and binding
Cellular processes
Transport and binding Hypothetical Regulatory
Cellular processes
Regulatory Biosynthesis of cofactors, prosthetic groups, and carriers
Purines, pyrimidines, nucleosides, and nucleotides Hypothetical
Predicted Function
3410 3441 3454 3471 3484 3489 3504 3514 3522 3551 3606 3626 3633 3658 3715 3722
R.EAAIEVSGSGSGAVFAT*GKAK T*AAKAPAAET*APAAK AY*GGQDGT*NLGMAVIR MKTCLVVDDS*R DLQS*ALADR ALAYRAGGDY*ETVLR Y*GNFDKLAELSEARTK LPELTNAQPS*LLS*LGDTEWSK QTLKS*AS*AGGVFSR LDLT*T*PGGRAR MLKFTT*VAR QT*LLVIDGGEVRS*R LLRCVVGS*IFDVAVDIR
RPGSVGS*T*GLIDDLR VT*QGSAAAGMIAGLTER
GIIVTNTPGVLTEDT*ADLTMT* LIMAAS*R
Hypothetical protein Arylesterase-related protein Acyl-CoA dehydrogenase family protein cheYIV chemotaxis protein CheYIV divL tyrosine kinase DivL Penicillin-binding protein AmpH, putative Peptidase M13 family protein Hypothetical protein Conserved hypothetical protein Hypothetical protein gltD glutamate synthase, small subunit DNA-cytosine methyltransferase rfbC dTDP-4-dehydrorhamnose 3,5-epimerase Hypothetical protein Drug resistance transporter, EmrB/QacA subfamily D-isomer-specific 2-hydroxyacid dehydrogenases family protein
Annotation
Central intermediary metabolism
Transport and binding
Amino acid biosynthesis DNA metabolism Cell envelope
Hypothetical
Unknown Unknown Regulatory Regulatory Cell envelope Protein fate
Predicted Function
Overall, 226 phosphorylation sites on 135 C. crescentus proteins were identified in both carbon-rich and carbon-starved conditions for the Ser/Thr/Tyr phosphoproteome. This method yielded phosphorylation sites on Ser (107), Thr (97), and Tyr (22). Among these identifications, 135 unique phosphopeptides from 135 proteins were identified in the global protein extract (the values differ from those in Table 6.1 because phosphoaspartate is excluded).
CC #
Peptide
TABLE 6.2. (Continued)
Figure 6.10. The MSA product ion spectrum (relative intensity versus m/z) and pY site location of the tyrosine phosphorylated peptide LDSATSTSALRASEFEpTpYGAR from a TonB-dependent receptor protein at m/z 817.37.
LDSATSTSALRASEFEpTpYGAR (AAK25109.1) is illustrated in Figure 6.10.

6.3.5 Phosphorylated His and Aspartate

A search of the phosphorylated peptides for modification of His or aspartate residues yielded no phosphorylated His-containing peptides and 14 phosphorylated aspartate-containing peptides, which identified 13 C. crescentus proteins. The phosphoaspartate-containing peptides observed and their identifications are listed in Table 6.3. The low number of phosphorylated aspartate-containing peptides and the absence of phosphorylated His-containing peptides are likely due to the acid lability of these two modifications, which could result in loss of the phosphoryl group during peptide sample cleanup steps at low pH. However, some of the phosphorylated aspartate-containing peptides are stable enough to be measured with our methodology. Figure 6.11 illustrates the identification of the phosphoaspartate-containing peptide LLAEAALGETGPVpDLLSR from an MSA product ion spectrum at m/z 974.01 for a 45,473 Da conserved hypothetical protein. This is a singly phosphorylated peptide and its product ion spectrum suggests that the
TABLE 6.3. Identification of Phosphoaspartate-Containing Proteins

GenBank ID | Peptide | Description | TIGR Gene Symbol | Function
AAK22381.1 | GS*D*VGGIAVFTATDGK | Hypothetical protein | No data | No data
AAK23062.1 | DLS*GAGRILFVED*EDAVRS*VAAR | Cell cycle histidine kinase CckA | cckA | Signal transduction
AAK23075.1 | QSPTGVTVS*FKGPD*GK | Amine oxidase, flavin containing | No data | Central intermediary metabolism
AAK23075.1 | AASAGALD*APLSGVD*RER | Amine oxidase, flavin containing | No data | Central intermediary metabolism
AAK23263.1 | ELT*IT*VRAD*AAPETPAR | Serine protease | No data | Protein fate
AAK23739.1 | PD*GAGPASDLSSPTD*WVRDLVGAQR | Conserved hypothetical protein | No data | Unknown function
AAK23944.1 | IDPGAPADKNLYD*LPPR | Glutamine synthetase, class I | glnA | Amino acid biosynthesis
AAK23961.1 | VAFSAKVAGKDEAWT*T*NFD*IWSR | Prolyl oligopeptidase family protein | No data | Protein fate
AAK24061.1 | GYPNIIQPT*LVMS*RD*T*LR | Hypothetical protein | No data | No data
AAK24120.1 | NDDLY*GAKLNLD*LT*DK | TonB-dependent receptor | No data | Transport and binding proteins
AAK24371.1 | PVLMHGGAILVS*D*GFEPAR | Acid-CoA ligase, putative | No data | Fatty acid and phospholipid metabolism
AAK25016.1 | TADAKTAT*PD*AVALAR | Xylosidase/arabinosidase | xarB | Energy metabolism
AAK25091.1 | LLAEAALGETGPVD*LLSR | Conserved hypothetical protein | No data | Hypothetical proteins
AAK25396.1 | DPPPALLVS*PGD*AR | Peptidase, M23/M37 family | No data | Protein fate
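The statement in Section 6.3.5 that 14 phosphoaspartate-containing peptides identify 13 proteins can be checked against the GenBank ID column of Table 6.3, since one accession appears twice. The sketch below is only an illustrative check; the variable names are hypothetical, and the accession list is copied from the table.

```python
from collections import Counter

# GenBank accessions as listed in Table 6.3, one entry per peptide.
genbank_ids = [
    "AAK22381.1", "AAK23062.1", "AAK23075.1", "AAK23075.1", "AAK23263.1",
    "AAK23739.1", "AAK23944.1", "AAK23961.1", "AAK24061.1", "AAK24120.1",
    "AAK24371.1", "AAK25016.1", "AAK25091.1", "AAK25396.1",
]

peptides_per_protein = Counter(genbank_ids)
n_peptides = len(genbank_ids)            # one row per peptide
n_proteins = len(peptides_per_protein)   # distinct accessions
# Accessions represented by more than one peptide.
repeated = [acc for acc, n in peptides_per_protein.items() if n > 1]
```

The repeated accession corresponds to the flavin-containing amine oxidase, which is represented by two different phosphoaspartate-containing peptides.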
Figure 6.11. MSA product ion spectrum (relative intensity versus m/z) at m/z 974.01, [M + 2H]²⁺, of the phosphoaspartate-containing peptide LLAEAALGETGPVpDLLSR. The peptide is derived from a 45,473 Da conserved hypothetical protein. Its product ion spectrum suggests that the singly phosphorylated, phosphoaspartate-containing peptide behaves similarly to phosphotyrosine-containing peptides, in which the phosphate group is not easily lost during collision-induced dissociation (CID) in some peptides.
phosphoaspartate residue has similar behavior to that of pY, where the phosphate group is not easily lost during collisionally activated dissociation (CAD) in some peptides.

6.3.6 Cell Cycle His Kinase CckA

A very interesting phosphoaspartate-containing peptide that was observed in both biological replicates (BRs) is the DLpSGAGRILFVEpDEDAVRpSVAAR peptide, identified as belonging to the cell cycle His kinase CckA. This protein is involved in one of the three signal transduction pathways that control the phosphorylation level of CtrA, a master transcriptional regulator of C. crescentus cell cycle progression and polar morphogenesis. Cell cycle His kinase CckA is also involved in the stability of CtrA through regulation of CtrA proteolysis.³⁶ The site of phosphorylation was determined to be the Asp578 residue, whereas the predicted phosphorylation sites include His322 in the transmitter domain and Asp623 in the C-terminal receiver domain.
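Whether a phosphate group is retained or lost during CID/CAD, as discussed above for phosphoaspartate and pY, is usually judged from the presence or absence of a neutral-loss peak below the precursor. The sketch below computes where such a peak would be expected. It is an illustration under stated assumptions rather than part of the reported method: the monoisotopic masses of H3PO4 (97.9769 Da) and HPO3 (79.9663 Da) are standard values, while the function name and the example precursor (m/z 800.00, charge 2+) are hypothetical.

```python
H3PO4 = 97.97690  # monoisotopic mass of phosphoric acid, Da
HPO3 = 79.96633   # monoisotopic mass of metaphosphoric acid, Da


def neutral_loss_mz(precursor_mz, charge, n_losses=1, loss_mass=H3PO4):
    """Return the m/z expected after n neutral losses from a precursor ion.

    A neutral loss leaves the charge unchanged, so the m/z shift is the
    lost mass divided by the charge state.
    """
    return precursor_mz - n_losses * loss_mass / charge


# Hypothetical doubly charged precursor, for illustration only.
one_loss = neutral_loss_mz(800.00, charge=2)               # loss of one H3PO4
two_losses = neutral_loss_mz(800.00, charge=2, n_losses=2)  # loss of two H3PO4
hpo3_loss = neutral_loss_mz(800.00, charge=2, loss_mass=HPO3)
```

For a doubly charged ion, each loss of H3PO4 shifts the peak by about 49 m/z units, so the absence of an intense peak at that offset is consistent with a phosphate group that is retained during fragmentation.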
6.3.7 Phosphoglutamate

Finally, besides a TonB-dependent receptor and four hypothetical proteins, the remaining seven phosphoaspartate-containing peptides all belong to enzymes with various activities apparently involved in the two-component signaling system. Surprisingly, a phosphoglutamate-containing peptide, GLpSALLGEVDAAPAQAPGpEQLGGpSR, was also identified in the filtered SEQUEST results; it belongs to the chromosome partitioning protein ParB. Inspection of the MSA product ion spectrum showed good coverage of the y-type ion series and a fragmentation behavior similar to that observed for phosphoaspartate, in which the phosphate group was not lost from the glutamate residue.

6.3.8 Enriched Tyr Phosphoproteome of C. crescentus

Of the 22 peptides identified containing pY residues (see Table 6.1), 14 of the associated pY-containing proteins (64%) were confirmed by the pY IP enrichment.

6.3.8.1 Sensor His Kinase KdpD. The sensor His kinase KdpD (CC 1594) was originally identified in the global phosphorylation study by the peptide MAAGKVpYVPEpTAR and in the pY enrichment study as pYVPETAR, both phosphorylated at the Tyr-198 residue. Sensor His kinase KdpD is an 892-residue protein that contains six protein family (Pfam)³⁷ domains. The KdpD N-terminal domain (residues 21 through 231) regulates the kdpFABC operon that is responsible for potassium transport.³⁸ The KdpD domain is cytoplasmic and may have a sensor domain role for turgor pressure sensing.³⁹ The present study suggests that the conserved KdpD domain may be a Tyr kinase domain that autophosphorylates in a manner similar to DivL. The originally identified pY sites from the global phosphorylation study (Table 6.1) of the phosphoglycerate mutase Tyr-138 residue and the acyl-CoA dehydrogenase family protein Tyr-85 and Thr-91 were also confirmed by the pY IP enrichment.
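The text writes phosphosites with a lowercase "p" prefix (MAAGKVpYVPEpTAR), whereas the tables use a trailing asterisk (MAAGKVY*VPET*AR). The sketch below converts between the two and checks that the enrichment peptide pYVPETAR covers the same tyrosine as the global-study peptide. It is a minimal illustration, not the authors' software; the function names are hypothetical, and the positions it returns are counted within the peptide, not within the protein (the text places this site at Tyr-198 of KdpD).

```python
import re


def to_asterisk(peptide):
    """Convert inline 'pX' notation to the tables' 'X*' notation."""
    return re.sub(r"p([A-Z])", r"\1*", peptide)


def phosphosites(peptide):
    """Return ([(1-based position, residue), ...], bare sequence)."""
    sites, bare = [], []
    i = 0
    while i < len(peptide):
        if peptide[i] == "p" and i + 1 < len(peptide):
            bare.append(peptide[i + 1])
            sites.append((len(bare), peptide[i + 1]))
            i += 2
        else:
            bare.append(peptide[i])
            i += 1
    return sites, "".join(bare)


global_sites, global_seq = phosphosites("MAAGKVpYVPEpTAR")
enriched_sites, enriched_seq = phosphosites("pYVPETAR")

# Offset of the enrichment peptide inside the global-study peptide.
offset = global_seq.find(enriched_seq)
# Position of the enriched pY expressed in global-peptide coordinates.
shared_py = (enriched_sites[0][0] + offset, enriched_sites[0][1])
```

Here `shared_py` coincides with the pY entry in `global_sites`, which is the peptide-level counterpart of the statement that both peptides report the same Tyr residue.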
6.3.8.2 TonB-Dependent Receptor Proteins. Finally, the TonB-dependent receptor proteins were found to contain multiple pY-containing peptides in the IP enrichment study. For example, the enrichment study identified five pY-containing peptides for CC 1093, nine for CC 2660, and four for CC 3146.
6.3.9 Carbon Environment-Shared Phosphoproteome

Out of the 149 identified phosphorylated proteins, 92 are common to both of the tested environments. These 92 common phosphorylated proteins are listed in Table 6.4.

6.3.9.1 Two-Component His Kinases. There are a number of interesting proteins observed in the two environments. Both carbon environments contained 10 phosphorylated proteins that are involved in the two-component signaling system (CC 1594, 1705, 3170, 1305, 1767, 2482, 1304, 3471, 0597, 1078). Of these 10, five different sensor His kinases were observed to be phosphorylated on Tyr and Thr (KdpD), Thr and Ser (three histidine protein kinase (HPK) response regulators), and aspartate and Ser (CckA) amino acid residues (see Table 6.3).

6.3.9.2 Multiply Phosphorylated Kinases. Our studies indicate that the sensor His kinases, along with the other reported two-component signaling system proteins, are also multiply phosphorylated on these amino acid residues. For the five sensor His kinase phosphorylated peptides observed in both conditions, the MS2 product ion spectra contain product ion peaks for a neutral loss of one to two phosphorylation moieties, and the MSA spectra have good coverage of the phosphorylated residues in the form of b- and y-type ions.

6.3.9.3 pTPLAALpSAQSRRAR Peptide as Sensor His Kinase. For example, for the pTPLAALpSAQSRRAR peptide identified as a 45,804-Da sensor His kinase, the MS2 (Figure 6.12a) and MSA (Figure 6.12b) product ion spectra illustrate that both the Thr and Ser are phosphorylated. Sensor His kinases contain two domains, a His kinase domain and a sensor domain. The His kinase domain contains an adenosine triphosphate (ATP)-binding catalytic subdomain and a His phospho-accepting autophosphorylation subdomain.³⁹⁻⁴² This phosphorylation may control the antagonistic state equilibrium or may be a conformational change required for signaling activity.
Many HPKs, besides containing kinase-regulating domains, have phosphatase activity,⁴³ where the two antagonistic states are in equilibrium as determined by ligands.⁴⁴

6.3.9.4 Aspartate Phosphorylated Tyr Kinase DivL. The aspartate phosphorylated Tyr kinase DivL protein (CC 3484), which is related to the HPKs, was identified in both environmental conditions as the phosphorylated peptide DLQpSALADR (see Fig. 6.13a for the ribbon
CC #
0004 0029 0064 0086 0088 0136 0185 0258 0280 0314 0315 0342
0475 0486 0502 0531
Peptide
LSAQVVGNS*EALAK
LIT*MGFVT*NMLNPK MSKS*DPSDYSR ALACDPTSAFGGIVAVNS*R
QQSGLDPS*QQVQEAALR RGRIYTTAPET*LK VT*DALT*LT*PGAR
TWPES*QQAAVK EAGT*AIKES*VK AT*LIEAGAS*PAAAYK
STFLAAAS*AAKPK IIDS*T*GALS*LPEVPK
HRLRLS*ADART*LS*IAP ESIATRGAGS*NS*R S*NQSTCINQRPLVK
VAWSTLFKDAETLAADGFVVS *PR
Annotation
Conserved hypothetical protein Conserved hypothetical protein TonB-dependent receptor (transduce cytoplasmic membrane energy to outer membrane Hypothetical protein Conserved hypothetical protein proB glutamate 5-kinase (cytosol, proline biosyn) cgtA GTP-binding protein CgtA lpdA 2-oxoglutarate dehydrogenase, E3 component, lipoamide dehydrogenase (complex 3 enzyme system, transfer e- to NAD) Hypothetical protein Hypothetical protein rpoB DNA-directed RNA polymerase, beta subunit ggt gamma-glutamyltransferase
Efflux protein, LysE family trpS tryptophanyl-tRNA synthetase purH phosphoribosylaminoimidazolecarboxamide formyltransferase/IMP cyclohydrolase
Kinase, putative
TABLE 6.4. Phosphorylated Proteins Observed in Both Carbon Environments
Biosynthesis of cofactors, prosthetic groups, and carriers
Transcription
Hypothetical Amino acid biosynthesis Unknown Energy metabolism
Central intermediary metabolism Transport and binding Protein synthesis Purines, pyrimidines, nucleosides, and nucleotides Hypothetical Hypothetical Transport and binding
Predicted Function
TonB-dependent receptor, putative cscK fructokinase tufB translation elongation factor EF-Tu DNA-binding response regulator
0712 0726 0925 0926 0973 1015 1033 1034 1037 1078 1093 1134 1240 1304 1305 1318 1450 1477 1594
QIRVLDLPGTY*SLR DAT*HKT*VT*AALK
MNS*T*KGCVRAR S*RS*DEGLKALSGEAR EPDGALT*AT*AAAVR FS*ARLAGVEAQIK
S*Y*NLNRPSAAVAR LIQT*T*GGLT*AR DKGLVLVT*ET*R DLS*GAGRILFVED* EDAVRS*VAAR GLTGYGLSY*DLS*AR IPT*T*T*PAETLAR GQVLCKPGSITPHT*K TS*PVIGAGRLALDT*R
T*PLAALS*AQSRRAR
ES*AARS*AVEGAKR
VVSVGY*LGLT*PK MGAT*GLGY*DIAK MAAGKVY*VPET*AR
Outer membrane protein TolC, putative (multifunctional) Conserved hypothetical protein Conserved hypothetical protein kdpD sensor histidine kinase KdpD
Sensor histidine kinase
Hypothetical protein cheB protein-glutamate methylesterase (phosphorelay) feoB ferrous iron transport protein B etfA electron transfer flavoprotein, alpha subunit OmpA-related protein (transport and receptor) Hypothetical protein Hypothetical protein rsaF type I secretion system outer membrane protein RsaF C-5 cytosine-specific DNA methylase GTP-binding protein LepA Hypothetical protein Cell cycle histidine kinase CckA
0565 0597
EARMT*PLY*AQLIRAS*AAR PS*ADPLFESAAR
Annotation
CC #
Peptide
Hypothetical Hypothetical Regulatory (Continued)
Transport and binding Energy metabolism Protein synthesis Regulatory functions, signal transduction Regulatory functions, signal transduction Cell envelope
Signal transduction
DNA metabolism Unknown
Protein fate
Transport and binding
Transport and binding Energy metabolism
Cellular processes
Predicted Function
CC #
1658 1668 1672 1691 1705 1706 1763 1767 1802 1844 1860 1894 1917 2056 2112 2149 2160 2174 2188
Peptide
NETNDKQLSLLVS*A
GKIVPS*RITAVS*AK AEPVLVQASPAAGEAPS*PK KTKPAT*PAPAFS*GAA DAGMT*ACVAKPVS*AR WASLSKTVAGTLT*AR LMLS*ETLAHAS*R LAGES*GLACVR LVVAT*S*WLPARSATPR
IGPSSGVSAT*RMATLSRLAR LDMEAY*KGLLSDKTK
VT*KIT*PGAVATLDS*VR
AGLAGT*GVEAAAGADAVR
VGFAATGGT*TPAPVYDR S*IMGGLPPLKPGEK NDDLY*GAKLNLD*LT*DK ASEIVSASIKDS*VK EEVLEAT*PSVT*LAR LAT*AS*AAVS*RLIAR
TABLE 6.4. (Continued)
pgl 6-phospho-glucono-lactonase Hypothetical protein TonB-dependent receptor Hypothetical protein mgtE magnesium transporter Transcriptional regulator, MarR family (multiple antibiotic resistance regulator)
Rotamase family protein (membrane receptor with TonB) dxr 1-deoxy-d-xylulose 5-phosphate reductoisomerase
rpsR ribosomal protein S18 Transcriptional regulator, AraC family Conserved hypothetical protein Sensor histidine kinase/response regulator Hypothetical protein Conserved hypothetical protein Response regulator FMN oxidoreductase (enzymatic release of iron from ferritin) Transcription-repair coupling factor Aminotransferase, class V
purF amidophosphoribosyltransferase
Annotation
Transport and binding Regulatory
Transport and binding
Biosynthesis of cofactors, prosthetic groups, and carriers Energy metabolism
Unknown Regulatory Central intermediary metabolism DNA metabolism Central intermediary metabolism Protein fate
Purines, pyrimidines, nucleosides, and nucleotides Protein synthesis Regulatory Cell envelope Regulatory
Predicted Function
Conserved hypothetical protein Flagellin, putative Hypothetical protein TonB-dependent receptor
2451 2482 2556 2660 2671 2701 2771 2829 2900 2906 2976 3050 3147
FDLGNET*S*ALTAK T*AEGGLVMTAADIT*AIK T*GLT*AARALIAGGAK
RTVY*ISPADFS*K AKALT*S*FPDRAPQFAGVTK Y*LNLQEATS*SLRR RLT*QAQLAQAAGVS*KR GEGENIT*TTER PS*Y*VLGGRGMEIIR
ERYPVALHGVS*MS*VGS* ADGVKLDYLR LSSAGNRVS*T*GRR HSACGGQLALDKARIT*PQT* S*R LDSATSTSALRASEFET*Y* GAR
topA DNA topoisomerase I pleC nonmotile and phage-resistance protein murD UDP-N-acetylmuramoylalanine-dglutamate ligase (catalyze addition of d-glutamate to nucleotide precursor) TonB-dependent receptor Conserved hypothetical protein Hypothetical protein Transcriptional regulator, Cro/CI family Hypothetical protein carB carbamoyl-phosphate synthase, large subunit (arginine, urea biosyn)
Hypothetical protein Sal operon transcriptional repressor SalR cutC conserved hypothetical protein Hydantoinase/oxoprolinase (hydrolase activity) AcrB/AcrD/AcrF family protein (transporter)
2199 2355 2360 2369 2390
PT*HFVVAAR ES*TAPPPAAGS*AF DT*S*QAAVAEAVK EAET*RALLAS*GR TVARAT*AARLEEAAK
Annotation
CC #
Peptide
(Continued)
Transport and binding
Cellular processes
Purines, pyrimidines, nucleosides, and nucleotides Hypothetical
Regulatory
Transport and binding Hypothetical
Energy metabolism Unknown Unknown Cellular processes, Transport and binding DNA metabolism Regulatory Cell envelope
Predicted Function
TABLE 6.4. (Continued)
CC # | Peptide | Annotation | Predicted Function
3159 | LVVCT*GGEPFLQLDDAAIAALHAR | Conserved hypothetical protein | Hypothetical
3170 | NT*LATVQSMAAQT*LR | Sensor histidine kinase, putative | Regulatory
3187 | LQAAGVPAS*CS*VR | Hypothetical protein |
3193 | QY*ELAFAPEAAK | Hypothetical protein |
3391 | S*IGIT*ILLLSLAR | Conserved hypothetical protein | Hypothetical
3410 | EAAIEVSGSGSGAVFAT*GKAK | Hypothetical protein |
3441 | T*AAKAPAAET*APAAK | Arylesterase-related protein | Unknown
3471 | MKTCLVVDDS*R | cheYIV chemotaxis protein CheYIV | Regulatory
3484 | DLQS*ALADR | divL tyrosine kinase DivL (cell viability and division) | Cellular processes
3489 | ALAYRAGGDY*ETVLR | Penicillin-binding protein AmpH, putative | Cell envelope
3504 | Y*GNFDKLAELSEARTK | Peptidase M13 family protein | Protein fate
3522 | QTLKS*AS*AGGVFSR | Conserved hypothetical protein | Hypothetical
3551 | LDLT*T*PGGRAR | Hypothetical protein |
3606 | MLKFTT*VAR | gltD glutamate synthase, small subunit (a complex iron–sulfur flavoprotein that participates in ammonia assimilation processes) | Amino acid biosynthesis
3626 | QT*LLVIDGGEVRS*R | DNA-cytosine methyltransferase | DNA metabolism
3658 | RPGSVGS*T*GLIDDLR | Hypothetical protein |
3715 | VT*QGSAAAGMIAGLTER | Drug resistance transporter, EmrB/QacA subfamily | Transport and binding
3722 | GIIVTNTPGVLTEDT*ADLTMT*LIMAAS*R | D-isomer-specific 2-hydroxyacid dehydrogenases family protein | Central intermediary metabolism
THE COMPONENTS OF THE SER/THR/TYR PHOSPHOPROTEOME
[Figure 6.12: product ion spectra of the peptide pTPLAALpSAQSRRAR, panels (a) and (b); horizontal axis m/z, vertical axis Relative Intensity (0 to 100), with b- and y-type fragment ion labels.]
Figure 6.12. (a) MS2 and (b) MSA product ion spectra of the pTPLAALpSAQSRRAR peptide at m/z 836.47, identified as a 45,804-Da sensor histidine kinase, part of a large family of the two-component response regulators typically activated through the phosphorylation of a conserved histidine residue. The sensor histidine kinase is multiply phosphorylated on the acid-stable serine and threonine residues. This phosphorylation may be a control of the antagonistic state equilibrium or a required conformational change needed for signaling activity. In the figure, loss associated with the phosphate group is labeled with (‘).
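The two notations used for phosphosites in this chapter (the "pS" prefix in Figure 6.12 and the "S*" suffix in the tables) and the phosphate-associated losses mentioned in the caption can be illustrated with a short sketch. This is only an illustration under stated assumptions: the function names are hypothetical, the loss is taken to be neutral phosphoric acid (H3PO4, 97.9769 Da monoisotopic), and the precursor charge of 2 is an assumption that the text does not state.

```python
H3PO4 = 97.9769  # monoisotopic mass of phosphoric acid in Da


def phosphosites(peptide):
    """Return (residue, 1-based position) for each marked residue.

    Accepts the prefix notation ('pS') and the suffix notation ('S*').
    """
    sites = []
    position = 0      # index of the last residue read
    last = ""         # last residue read
    pending = False   # True right after a 'p' prefix
    for ch in peptide:
        if ch == "p":
            pending = True
        elif ch == "*":
            if position:
                sites.append((last, position))
        else:
            position += 1
            last = ch
            if pending:
                sites.append((ch, position))
                pending = False
    return sites


def neutral_loss_mz(precursor_mz, charge, n_losses):
    """m/z expected after n neutral losses of H3PO4 at a fixed charge."""
    return precursor_mz - n_losses * H3PO4 / charge


print(phosphosites("pTPLAALpSAQSRRAR"))
print(phosphosites("DLQS*ALADR"))
# Charge 2 is assumed here for illustration only.
for n in (1, 2):
    print(n, round(neutral_loss_mz(836.47, 2, n), 2))
```

With these assumptions, each neutral loss shifts a doubly charged ion by about 49 m/z units; a different charge state would change the spacing accordingly.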
PROKARYOTIC PHOSPHORYLATION
[Figure 6.13: (a) ribbon structure of tyrosine kinase DivL; (b) domain map plotted against amino acid (AA) residue number, showing the transmembrane potential region and the domains Pfam-B 10903, PAS-4, Pfam-B 16139, His KA, and HATPase-C (residue ranges shown: 6-26, 272-465, 411-465, 466-512, 540-606, 652-757), with the sequence QSALADRSAALAEAERLKRDFVGNVSYE marking the phosphorylation at Ser-525 and the site M at Tyr-550.]
Figure 6.13. (a) Ribbon structure of tyrosine protein kinase DivL. (b) The five protein family (Pfam)37 domains of DivL include a histidine kinase and HATPase domain and a transmembrane potential region. Phosphorylation was observed at the serine residue (Ser-525) in close proximity to the predicted modification site on the Tyr-550 residue designated as M.
structure of DivL). DivL is a multipass membrane protein kinase that is required for cell viability and controls the activity of the response regulator CtrA through a signal transduction network.45 CtrA is involved in the transcriptional control of at least 55 cell cycle-regulated operons12,46 and is activated by phosphorylation of an aspartate residue through the multicomponent phosphorelay process with the Tyr kinase DivL.47 DivL differs from its homologous His protein kinases in that it contains a Tyr residue (Tyr-550) instead of a His residue in the conserved H-box, the site of predicted autophosphorylation. DivL contains five Pfam37 domains that include a His kinase and ATPase domain and a transmembrane potential region, as illustrated in Figure 6.13b. In the present study, under both environmental conditions, we observed phosphorylation at a Ser residue (Ser-525) that lies in close proximity to the predicted modification site on the Tyr-550 residue (designated as M in Fig. 6.13b). In the IP pY enrichment study, the phosphorylated Tyr-550 residue was observed in the peptide DFVGNVSpYE, verifying its presence but at
very low levels. In the normal proteome of C. crescentus, the peptide DFVGNVSYELR has also been observed in its unmodified form.
6.3.10 Carbon-Rich versus Carbon-Starved Class/Category
6.3.10.1 Localization of Phosphoproteome of C. crescentus. The localization of the Ser/Thr/Tyr phosphoproteome of C. crescentus was predicted for the identified proteins using the PSORTb(v.2.0)48 subcellular localization prediction tool. A comparison of the localization for the glucose (carbon)-rich environment (+G) versus the glucose (carbon)-starved environment (−G) is illustrated in Figure 6.14a. Apart from the unknown assignments, the majority of the identified phosphorylated proteins are cytoplasmic, followed by the cytoplasmic membrane. The −G condition contained fewer unknown locations than the +G condition, resulting in a greater percentage for each assigned location. This may be due to reduced and more specific signaling within C. crescentus under the −G condition. It is analogous to the way bacteria under starved conditions shut down the synthesis of nonessential proteins not needed for survival, resulting in an overall reduced expression of proteins during stress.49
6.3.10.2 Integral Membrane Proteins. Out of the 3767 putative open reading frames (ORFs) in the 4 Mb C. crescentus genome, 731 are predicted to be integral membrane proteins, and 140 are predicted to be outer membrane proteins (OMPs).50 Sixteen membrane-associated proteins were observed in both conditions (see Table 6.4), including six TonB-dependent receptor proteins. TonB is a periplasmic protein that contains an N-terminal cytoplasmic membrane anchor and is part of a six-component regulatory system spanning the periplasm from the cytoplasmic membrane to the outer membrane.51 Its primary purpose is the uptake of iron and heme complexes into the cell.
At least 60 of the predicted OMPs contain a TonB box, and it is suspected that their functions are broader than just the uptake of iron and heme complexes.50
6.3.10.3 Function of Phosphoproteome of C. crescentus. The functions of the identified phosphorylated proteins were obtained from the C. crescentus functional database downloaded from The Institute for Genomic Research (TIGR) Comprehensive Microbial Resource (CMR). A comparison of the carbon-rich versus carbon-starved functions is illustrated in Figure 6.14b. Important differences include decreases in cellular processes, metabolism (central intermediary and energy), and protein synthesis in the carbon-starved environment.
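The localization and function comparisons above are expressed as the percentage of proteins per category within each condition. As a minimal sketch of that bookkeeping, the following shows why fewer "Unknown" assignments in one condition raise the percentage of every other category even when their counts are unchanged. The counts are hypothetical and are not the study's data; the function name is an illustrative choice.

```python
from collections import Counter


def category_percentages(assignments):
    """Return {category: percent of proteins} for one condition."""
    counts = Counter(assignments)
    total = sum(counts.values())
    return {cat: 100.0 * n / total for cat, n in counts.items()}


# Hypothetical localization calls, NOT the values plotted in Figure 6.14.
plus_g = ["Cytoplasmic"] * 30 + ["Cytoplasmic membrane"] * 10 + ["Unknown"] * 60
minus_g = ["Cytoplasmic"] * 30 + ["Cytoplasmic membrane"] * 10 + ["Unknown"] * 40

pct_plus = category_percentages(plus_g)
pct_minus = category_percentages(minus_g)
for cat in sorted(pct_plus):
    print(f"{cat:22s} +G {pct_plus[cat]:5.1f}%   -G {pct_minus[cat]:5.1f}%")
```

In this toy case the cytoplasmic count is 30 in both conditions, yet its share rises from 30.0% to 37.5% purely because the denominator shrinks.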
[Figure 6.14: bar charts of Percent (%) for the −G and +G conditions. (a) Localization categories: cytoplasmic, cytoplasmic membrane, periplasm, outer membrane, unknown. (b) Functional categories: amino acid biosynthesis; biosynthesis of cofactors; cell envelope; cellular processes; central intermediary metabolism; DNA metabolism; energy metabolism; fatty acid and phospholipid metabolism; hypothetical proteins; no data; protein fate; protein synthesis; purines, pyrimidines, nucleosides, and nucleotides; regulatory functions; transcription; transport and binding proteins; unknown function.]
Figure 6.14. (a) Localization of the Ser/Thr/Tyr phosphoproteome of C. crescentus. The majority of the identified phosphorylated proteins for the glucose (carbon)-rich environment (+G) versus the glucose (carbon)-starved environment (−G) are cytoplasmic, followed by the cytoplasmic membrane. The −G condition contained fewer unknown locations than the +G condition, resulting in a greater percentage for each assigned location due to reduced and more specific signaling within C. crescentus under the −G condition. (b) Comparative functions of the Ser/Thr/Tyr phosphoproteome of C. crescentus under the two conditions studied. Important differences include the reduction in cellular processes, metabolisms, and protein synthesis in the carbon-starved environment.
6.3.11 Carbon-Rich versus Carbon-Starved Unique Phosphorylated Proteins
To investigate the difference between the phosphoproteomes of the carbon-rich and carbon-starved environments, the phosphoproteome was compared for proteins apparently unique to each environmental condition. There were 43 phosphorylated proteins found in the carbon-rich environment (Table 6.5) that were not detected in the carbon-starved environment, and 15 found in the carbon-starved environment (Table 6.6) that were not detected in the carbon-rich environment.
6.3.11.1 Carbon-Rich Environment Phosphorylated Proteins. In the carbon-rich environment, nine phosphorylated proteins are involved in protein and/or amino acid biosynthesis (CC 0257, 0482, 0713, 2399, 1923, 1262, 1892, 1187, 3286). For example, the MSA product ion spectrum identified the pY of adenosylhomocysteinase (CC 0257), an amino acid metabolism enzyme that converts S-adenosylhomocysteine to homocysteine. Also observed only in the carbon-rich environment are six phosphorylated proteins that are either glycolytic enzymes or involved in glycolysis processes (CC 1634, 2791, 2261, 2912, 0789, 3633), five phosphorylated proteins involved in the two-component signaling system (CC 2909, 0435, 2810, 2847, 2583), substrate transport proteins (CC 0838, 2320, 3232, 2419), periplasmic and membrane-associated proteins (CC 2461, 0999, 3194, 1915), and a number of hypothetical proteins (see Table 6.5).
6.3.11.2 Carbon-Starved Environment Phosphorylated Proteins. Fifteen phosphorylated proteins were found to be unique to the carbon-starved environment (Table 6.6). Of these, five are associated with membrane processes (first five listed in Table 6.6), two with secondary energy source processes (second two listed), four with biosynthesis, and one with the two-component signaling system. Under carbon-starved conditions, P. putida was shown to generate 72 new proteins (unidentified) not observed under normal growth conditions.24,25
6.3.11.2.1 Carbon-Starved Programmed Expression. The 15 uniquely phosphorylated proteins observed for C. crescentus under carbon-starved conditions may be associated with starvation-specific genes that undergo programmed expression during the nutrient-stressed condition.24,25 The gram-positive bacterium Mycobacterium smegmatis undergoes changes in its cell surface glycopeptidolipid
TABLE 6.5. Phosphorylated Proteins Unique to Growth of C. crescentus in Minimal Media with Glucose as the Sole Carbon Source
CC # | Peptide | Annotation | Predicted Function
0208 | DGDPTTALKAIAAAY*GKATAR | ThiJ/PfpI family protein | Unknown
0224 | GNY*S*APLAGGKIEATAR | Hypothetical protein |
0257 | VAVVCGY*GDVGKGSAASLR | achY adenosylhomocysteinase | Central intermediary metabolism
0354 | RVS*VVAALPPSDSR | Hypothetical protein |
0435 | IAALLYDLAGISLPDS*KATLVY*S*R | cheR chemotaxis protein methyltransferase CheR | Chemotaxis
0482 | ASVAAALEAS*AQAARARATSPR | metE 5-methyltetrahydropteroyltriglutamate-homocysteine methyltransferase | Amino acid biosynthesis
0509 | LRGQLS*T*MAPGK | phaR polyhydroxyalkanoate synthesis repressor PhaR | Fatty acid and phospholipid metabolism
0551 | ELWIRS*T*ISAPK | Membrane protein, putative | Cell envelope
0713 | VVS*ENTATGRILGAHMR | Transcriptional regulator, AraC family | Regulatory
0789 | TADGVVIT*PANGPAK | Glycosyl hydrolase, family 31 | Energy metabolism
0838 | T*FAKPAVSNLDLT*VR | ABC transporter, drug exporter-1 (drugE1) family, ATP-binding protein | Transport and binding proteins
0931 | LT*DQVTGNSAHVS*LQPAT*RVAR | Conserved hypothetical protein | Hypothetical proteins
0999 | PLDGKPGLT*GSVGVK | TonB-dependent receptor, putative | Transport and binding proteins
1109 | KT*T*AVLEASDLAR | Hypothetical protein |
1187 | TVT*S*LTNNTVKAVR | spoU rRNA methylase family protein | Protein synthesis
1262 | DANVGGEVLCRVY* | rpsH ribosomal protein S8 | Protein synthesis
1634 | NLGDAAS*KRSDY*LR | Oxidoreductase, GMC family | Intermediary metabolism
1892 | NQVWAIPAPTGGS*R | aspS aspartyl-tRNA synthetase | Protein synthesis
gpm phosphoglycerate mutase ABC transporter, periplasmic substrate-binding protein, putative Transcriptional regulator, AraC family virB4 type IV secretion system ATPase VirB4 Alkaline phosphatase, putative cheL chemotactic signal-response protein CheL Hypothetical protein Fatty aldehyde dehydrogenase Methyl-accepting chemotaxis protein McpR, putative mcpl methyl-accepting chemotaxis protein Mcpl Sensor histidine kinase, putative
1915 1923 1978 2224 2261 2320 2399 2419 2461 2583 2719 2791 2810 2847 2909
GAS*LPSTESLATTLVR T*QRSALS*DLLEGGGTVLTR
VASS*AAVVRR ES*LQAGLTAYGARTLGK
ITAAALPLIIS*DPAVAGAFDSNALL TLPPAPADASPEELS*LAERLR MS*VLSALT*SLT*PR
SAT*APSPNRRPPAR VKAAGFTGS*R EAVGS*IAEVEQSALR
VS*GVRAGADQIADAAVNLS*R
S*PQT*GWT*LVVAIPR or NT*LATVQSMAAQT*LR
lpxD UDP-3-O-3-hydroxymyristoyl glucosamine N-acyltransferase Outer membrane protein rpsB ribosomal protein S2 Hypothetical protein tyrA cyclohexadienyl dehydrogenase
1913
FFDS*LGPALLSELAQAGAAT LADAALGER ISATVT*PKVVELPQK EVAAAGGRVLFVGT*KR RLS*LPSKDLMAAIAATR LTVIGCGLIGGS*VIR
Annotation
CC #
Peptide
(Continued)
Regulatory
Energy metabolism
Amino acid biosynthesis Energy metabolism Transport and binding proteins Regulatory functions Protein fate, transport, and binding Central intermediary metabolism
Cell envelope Protein synthesis
Cell envelope
Predicted Function
Transport energizing protein, ExbD/TolR family Response regulator Acyl-CoA dehydrogenase family protein Hypothetical protein rfbC dTDP-4-dehydrorhamnose 3,5-epimerase
2956 3190 3194 3232 3286 3454 3514 3633
VTGS*FYDS*EILAGRWKT*AGGR NQSAAAAS*ILAPAFVSKR DTTGAGVTVS*AGKKIEK
RS*RGALS*EINVT* PLVDVMLVLLIIFMISAPLL TAGVPLELPK AVRRTLPGAVT*LMAS*AVDPR AY*GGQDGT*NLGMAVIR LPELTNAQPS*LLS*LGDTEWSK LLRCVVGS*IFDVAVDIR
Hypothetical protein Beta-lactamase TonB-dependent receptor
nadA quinolinate synthetase A
2912
AGCS*LASSITGADVR
Annotation
CC #
Peptide
TABLE 6.5. (Continued)
Cell envelope
Regulatory functions Unknown
Transport and binding proteins Transport and binding proteins
Biosynthesis of cofactors, prosthetic groups, and carriers
Predicted Function
TABLE 6.6. Phosphorylated Proteins Unique to Growth of C. crescentus in Minimal Media without Glucose as the Sole Carbon Source
CC # | Peptide | Annotation | Predicted Function
1357 | LIIAGGS*AYSR | glyA serine hydroxymethyltransferase (involved in secondary energy source besides glucose) | Amino acid biosynthesis
1541 | SRPEAAIAS*GF | leuA 2-isopropylmalate synthase (biosynthesis of amino acids) | Amino acid biosynthesis
2068 | AFGDKPVLVHCVT*QK | dxs 1-deoxyxylulose-5-phosphate synthase (isopentenyl diphosphate biosynthesis) | Biosynthesis of cofactors, prosthetic groups, and carriers
0325 | KIRGGSTIS*QQTAK | mtgA monofunctional biosynthetic peptidoglycan transglycosylase (peptidoglycan layer (cell wall) metabolism) | Cell envelope
1141 | QMT*ALS*IEEKPK | rfbA glucose-1-phosphate thymidylyltransferase (synthesis of cell surface glycolipid) | Cell envelope
1166 | RLARLAAEES*S*APR | Conserved hypothetical protein | Cell envelope
1122 | FT*VFKT*HRGPIVR | Penicillin amidase family protein (serine peptidase) | Cellular processes
1471 | GGMT*SHAAVVAR | ppdK pyruvate phosphate dikinase (catalyzes ATP in anaerobic conditions as energy source) | Energy metabolism
2518 | GPAGAPVAAY*AAR | Hypothetical protein |
1997 | FNTLGLT*NVIT*K | pcm protein-L-isoaspartate(D-aspartate) O-methyltransferase (cytosol protein for protein repair) | Protein fate
3200 | TGDT*LCDPLKSPVILER | fusA translation elongation factor G (protein synthesis) | Protein synthesis
1137 | LAGVSTSTVS*R | Transcriptional regulator, LacI family (transport ligands (e.g., glucose) in periplasm) | Regulatory
1339 | GAEY*QVNFVPK | glnK nitrogen regulatory protein P-II 2 (two-component signaling system) | Amino acid biosynthesis
1787 | T*AAPIVRVATAGPASDGER | HlyD family secretion protein (membrane fusion protein) | Transport and binding
3002 | HFGLSEAGAAT*IRK | Oxidoreductase, aldo/keto reductase family (membrane protein activate K+ transport) | Unknown
(GPL) composition under carbon-starved conditions, resulting in a colony morphology change from a large, regular rugose colony to a small, smooth colony.26 The unique phosphorylation of MtgA (CC 0325), a monofunctional biosynthetic peptidoglycan transglycosylase involved in peptidoglycan layer (cell wall) metabolism, and of RfbA (CC 1141), a glucose-1-phosphate thymidylyltransferase involved in the synthesis of cell surface glycolipid, may indicate a change in the C. crescentus cell surface or in cell size.
6.3.11.3 Decreased Normal Activity. The stark contrast between the carbon-rich and carbon-starved conditions indicates a significant decrease in the normal activity of the bacteria in association with the PTM of phosphorylation. For example, the six phosphorylated proteins that are either glycolytic enzymes or involved in glycolysis processes and the five phosphorylated proteins involved in the two-component signaling system, which were seen in the carbon-rich environment and were absent in the carbon-starved environment, may reflect signaling processes observed under normal environmental conditions.
6.3.12 Confirmation of Decreased Energy Pathways
The Blast2GO52 tool was used to extract the gene ontology (GO) terms of the C. crescentus phosphorylated proteins identified in this study. Combining the results from the two biological replicates (BRs), the GO-annotated orthologs collected from the Swiss-Prot database gave totals of 99 and 85 phosphorylated proteins in the carbon-rich and carbon-starved environments, respectively. The majority of the phosphorylated proteins in both conditions were found to be cytosolic or membrane associated. This would be expected, as C. crescentus contains no intracellular organelles. A difference between the two conditions was observed in mitochondrial-associated phosphorylation of proteins.
6.3.12.1 Carbon-Rich Mitochondrial Localization. In the localization of the carbon-rich environment phosphorylated proteins, 22% were associated with the mitochondrial part and 5% with the mitochondrial lumen. The mitochondrial-associated phosphorylated proteins identified included 2-oxoglutarate dehydrogenase E3 component, lipoamide dehydrogenase (AAK22329.1), fatty aldehyde dehydrogenase (AAK24755.1), acyl-CoA dehydrogenase family protein (AAK25416.1), and electron transfer flavoprotein alpha subunit (AAK22711.1). This is
in contrast to the carbon-starved environment, where mitochondrial-associated localization was not observed. This is not surprising, as a dominant role of the mitochondria is the production of ATP by oxidizing the products of glycolysis that are produced within the cytosol.
6.3.12.2 Normal Proteome Glycolytic Pathway. In the normal C. crescentus proteome, we see strong evidence that glycolytic pathway proteins such as glucose-6-phosphate isomerase (CC 0222), fructose-bisphosphate aldolase (CC 3250), glyceraldehyde 3-phosphate dehydrogenase, GAPDH or GA3PDH (CC 3248), and enolase (CC 1724) are present and expressed, verifying that C. crescentus does indeed utilize this pathway.
6.3.12.3 Starvation Survival Response. In the absence of high levels of glucose in the carbon-starved environment, the phosphorylation-mediated signaling pathways involving the mitochondria-associated proteins (by similarity) appear to have decreased. This may indicate a reduction in the production of ATP during the environmental stress of glucose deficiency. This decrease in the production of ATP during carbon starvation stress is similar to the decrease observed in oxygen-limited stress.27 The aerobic respiration process (cellular respiration) is oxygen dependent, and in the absence of oxygen, the process of anaerobic respiration metabolizes the glycolytic products outside of the mitochondria. It has been reported that the yield of ATP from glucose in eukaryotes is approximately 13-fold higher during aerobic respiration than during anaerobic respiration.27 The decrease in the production of ATP during carbon starvation stress is also similar to the starvation survival response, a decrease in endogenous metabolic activity, observed in the dissimilatory metal-reducing bacterium Shewanella alga cultured under carbon-starved conditions.18 The change in the metabolic response of S. alga in the carbon-starved environment was demonstrated through a measured decrease in culture absorbance, cell viability, and Fe(III) reductase activity.18
6.3.13 Phosphopeptide Quantitative Differential Comparison
Quantitative analysis of the abundance of the phosphorylated peptides allows the observation of protein expression changes due to the conditions of the study. The phosphorylated peptides identified from the two conditions (carbon rich vs. carbon starved) were compared for either upregulation or downregulation using a modified quantitative label-free accurate mass and time (AMT) tag approach.28
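The general idea behind an accurate mass and time tag lookup can be pictured as matching each observed LC-MS feature to a database entry when both its mass and its normalized elution time agree within tolerances. The sketch below illustrates only that generic matching step; it is not the modified procedure of reference 28, and the `Tag` record, the tolerance values, and all masses and elution times are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tag:
    peptide: str
    mass: float  # monoisotopic mass in Da (hypothetical values below)
    net: float   # normalized elution time on a 0 to 1 scale


def match_feature(mass, net, tags, ppm_tol=5.0, net_tol=0.02):
    """Return the tags that agree with a feature in both mass and time."""
    hits = []
    for tag in tags:
        ppm_error = (mass - tag.mass) / tag.mass * 1e6
        if abs(ppm_error) <= ppm_tol and abs(net - tag.net) <= net_tol:
            hits.append(tag)
    return hits


# Hypothetical tag database; masses are placeholders, not computed values.
database = [
    Tag("PEPTIDE_A", 1000.000, 0.355),
    Tag("PEPTIDE_B", 1000.020, 0.350),  # 17 ppm from the feature: rejected
    Tag("PEPTIDE_C", 1000.001, 0.600),  # mass agrees, elution time does not
]

matches = match_feature(1000.003, 0.350, database)
print([t.peptide for t in matches])
```

Once features are matched to tags in each run, their intensities can be compared between conditions without isotopic labels, which is what makes the approach label-free.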
6.3.13.1 Upregulation in Phosphorylation. In BR1, 36 unique phosphorylated peptides showed a statistically significant upregulation in response to the carbon-starved environment as compared with the carbon-rich environment, with a t-test probability of P ≤ 0.05. These corresponded to 33 identified unique phosphorylated proteins. Similarly, 18 unique phosphorylated peptides, corresponding to 18 identified unique phosphorylated proteins, showed a statistically significant upregulation in BR2. In both BRs, eight unique phosphorylated proteins were shown to be upregulated in the carbon-rich environment. The upregulated phosphorylated peptides and protein information in each BR are listed in Table 6.7.
6.3.13.2 Adaptive Response with Phosphorylation. The upregulation in the phosphorylated form of the sal operon transcriptional repressor SalR appears to be an adaptive response to the carbon-starved growth environment. The expression of the sal operon transcriptional repressor SalR is required for the catabolism of salicin, a common glucoside used by bacteria as a carbon source.53 The upregulation of the peptidase M13 family protein is also probably an adaptive response associated with catabolism. The upregulated DNA-binding response regulator and response regulator are both part of the two-component response regulator activity, in which a regulator phosphorylated by a sensor that detects the presence (or absence) of a particular signal substance outside the cell either transfers the signal to another receptor or binds to DNA and alters the level of transcription of target genes. The D-isomer-specific 2-hydroxyacid dehydrogenases family protein and the conserved hypothetical protein, which has similarity to an NAD-specific glutamate dehydrogenase (GDH), are probably involved in elevated signaling processes associated with an adaptive response to the carbon-starved growth environment.
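The P ≤ 0.05 criterion in Section 6.3.13.1 comes from a t-test on peptide abundances in the two conditions. As a hedged illustration of what such a test computes, the sketch below calculates a pooled-variance two-sample t statistic and compares it with the two-sided critical value for 4 degrees of freedom (2.776), which is equivalent to P ≤ 0.05 for three measurements per condition. The replicate count, the log2 intensities, and the choice of the pooled-variance form are assumptions for illustration; the text does not state which t-test variant was used.

```python
import math
from statistics import mean, variance


def pooled_t(a, b):
    """Student's two-sample t statistic with pooled variance."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    sp2 = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / math.sqrt(sp2 * (1 / na + 1 / nb))
    return t, df


# Two-sided critical value of t for P = 0.05 at 4 degrees of freedom.
T_CRIT_DF4 = 2.776

# Hypothetical log2 intensities for one phosphopeptide (not study data).
starved = [21.9, 22.3, 22.1]
rich = [20.0, 20.4, 20.2]

t, df = pooled_t(starved, rich)
direction = "up" if t > 0 else "down"
significant = df == 4 and abs(t) > T_CRIT_DF4
print(f"t = {t:.2f}, df = {df}, {direction}, significant = {significant}")
```

A positive t here means higher abundance in the carbon-starved samples; with other replicate counts the critical value would have to be looked up for the corresponding degrees of freedom.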
6.3.13.3 Upregulation of NAD-Dependent GDH. A sequence search of the upregulated conserved hypothetical protein (CC 0088) on the Pfam37 Web site showed close similarity to the Bac GDH bacterial NAD-GDH protein of Streptomyces clavuligerus, which has previously been reported to show strong sequence similarity to unidentified ORFs in Mycobacterium, Rickettsia, Pseudomonas, Vibrio, Shewanella, and Caulobacter.54 NAD-GDH is an enzyme involved in glutamate catabolism regulation55 of intermediates in catabolic and biosynthetic pathways. Studies have shown that the active form of the NAD-GDH enzyme is in the phosphorylated state in Candida utilis56 and Saccharomyces cerevisiae.24,25 The present study has measured the
upregulation in the phosphorylation event of NAD-dependent GDH in C. crescentus, which indicates that responses in metabolic pathways are taking place in the carbon-starved environment.
6.3.13.4 Downregulation of Flagellin Protein. The downregulation in the phosphorylation event of the flagellin protein (CC 2976) may result in signaling a change in motility of the bacteria, indicating an adaptive response to the carbon-starved growth environment.
6.3.14 Carbon-Rich versus Carbon-Starved Normal Proteome Time Course Study
Overall, 25,076 peptides (observed at least twice) equating to 2465 unique proteins were identified in the normal (nonmodified) proteome of C. crescentus grown in the presence and absence of glucose as the sole carbon source. The time course study included three points for the carbon-rich environment at 0 hour, 0.5 hour, and 1 hour, and three points for the carbon-starved environment at 0.5 hour, 1 hour, and 2 hours. Each time point was analyzed in triplicate, giving 18 data sets.
6.3.14.1 Entire Proteome Localization and Function. In the carbon-rich environment, 23,321 peptides (at least two peptides) equating to 2432 unique proteins were identified, and in the carbon-starved environment, 22,598 peptides equating to 2410 unique proteins were identified. The localization and functional category assignments of the entire proteomes were quite similar between the two environments. When compared against each other, there were very few unique proteins found in each of the environments (43 unique to the carbon rich and 28 unique to the carbon starved). Of the 135 phosphorylated proteins identified, 105 were found to overlap with the normal proteome: 104 phosphorylated proteins overlap with the normal proteome under carbon-rich conditions and 103 with the carbon starved.
This indicates that the phosphorylation event is evenly distributed between the normal (nonmodified) proteomes of the carbon-rich and carbon-starved environments; thus, a direct relationship may not exist between phosphorylation and the normal proteome growth environment in general, but may be limited to more specific processes and events such as upregulation and downregulation, as discussed in the next section.
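The overlap and uniqueness counts quoted above are, in effect, set operations on protein identifiers. A minimal sketch of that comparison follows, assuming each list of identified proteins is available as a set of locus tags; the identifiers and the function name are hypothetical and do not reproduce the study's lists.

```python
def overlap_summary(phospho, proteome_rich, proteome_starved):
    """Count phosphoproteins found in each normal proteome and proteins
    unique to each growth condition."""
    return {
        "phospho_in_any": len(phospho & (proteome_rich | proteome_starved)),
        "phospho_in_rich": len(phospho & proteome_rich),
        "phospho_in_starved": len(phospho & proteome_starved),
        "unique_to_rich": len(proteome_rich - proteome_starved),
        "unique_to_starved": len(proteome_starved - proteome_rich),
    }


# Hypothetical locus tags for illustration only.
phospho = {"CC_A", "CC_B", "CC_C", "CC_D", "CC_E"}
proteome_rich = {"CC_A", "CC_B", "CC_C", "CC_D", "CC_X"}
proteome_starved = {"CC_A", "CC_C", "CC_D", "CC_Y"}

summary = overlap_summary(phospho, proteome_rich, proteome_starved)
for key, value in summary.items():
    print(f"{key}: {value}")
```

Applied to the real identifier lists, the same operations would yield the kind of totals reported in the text (for example, the number of phosphorylated proteins also present in each normal proteome).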
TABLE 6.7. Overlap of Statistically Regulated Phosphorylated Proteins in Carbon-Starved Environment Observed in Biological Replicates 1 and 2
CC # | Peptide | Annotation | Predicted Function | BioRep1 Regulation | BioRep2 Regulation
0088 | QQSGLDPS*QQVQEAALR | Conserved hypothetical protein | Unknown | 4.66 up | 6.47 up
1134 | IPT*T*T*PAETLAR | Fructokinase | Energy metabolism | 0.91 | −4.01
2360 | DT*S*QAAVAEAVK | Copper homeostasis protein CutC | Unknown | 1.79 | −3.60
2660 | RTVY*ISPADFS*K | TonB-dependent receptor | Transport and binding | 2.09 | −6.61
2976 | LSSAGNRVS*T*GRR | Flagellin, putative | Chemotaxis and motility | −2.56 down | −2.58 down
3441 | T*AAKAPAAET*APAAK | Arylesterase-related protein | Unknown | −1.17 down | −2.11 down
[Figure 6.15: plot titled "Scaled and Outlier removed" showing traces for peptides 1 to 5 (vertical scale −2 to 2) against time: carbon rich at 0 hour, 0.5 hour, and 1 hour; carbon starved at 0.5 hour, 1 hour, and 2 hours.]
Figure 6.15. Tracking of five peptides for the CBS domain protein (CC 2626) across the entire time course of the study (indicative of upregulation), including three points for the carbon-rich environment at 0 hour, 0.5 hour, and 1 hour (analyzed in triplicate), and three points for the carbon-starved environment at 0.5 hour, 1 hour, and 2 hours (analyzed in triplicate).
6.3.14.2 Regulated Proteins. The time course study identified 388 upregulated and 260 downregulated proteins that showed a statistically significant difference (P < 0.05), according to analysis of variance (ANOVA) testing, in response to the carbon starvation condition. Figure 6.15 illustrates the tracking of five peptides identified in the study as belonging to the cystathionine-β-synthase (CBS) domain protein (CC 2626) across the entire time course of the study, indicative of upregulation.
6.3.14.3 Localization of Regulated Proteins. The localization of the regulated normal proteome of C. crescentus was predicted for the identified proteins using the PSORTb(v.2.0)48 subcellular localization prediction tool. A comparison of the localization for the upregulated and downregulated proteins in the glucose (carbon)-starved versus the glucose (carbon)-rich environment is illustrated in Figure 6.16. As was also observed for the localization of the phosphorylated proteins in Figure 6.14a, apart from the unknown assignments, the majority of the regulated proteins are cytoplasmic, followed by the cytoplasmic membrane. The downregulated proteins contained fewer unknown locations than the upregulated proteins, resulting in a greater percentage for each assigned location. Analogous to Figure 6.14a, the downregulated proteins involve more specific (known) localization within C. crescentus under the −G condition, while the upregulated proteins encompass a broader scope of localization, including a greater number of unknown assignments.
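The ANOVA criterion in Section 6.3.14.2 asks whether a protein's abundance differs among the six time points more than would be expected from the scatter among triplicates. The following is a rough sketch of a one-way ANOVA F statistic for that layout, compared with the tabulated critical value of F at P = 0.05 for 5 and 12 degrees of freedom (about 3.11). The abundance values are hypothetical, and a plain one-way design is assumed; the text does not describe the exact ANOVA model that was used.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    grand = mean(x for g in groups for x in g)
    k = len(groups)
    n = sum(len(g) for g in groups)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n - k
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within


# Tabulated critical value of F for P = 0.05 with (5, 12) degrees of freedom.
F_CRIT_5_12 = 3.11

# Hypothetical scaled abundances, three replicates per time point.
time_course = [
    [10.0, 10.2, 9.8],   # carbon rich, 0 hour
    [10.1, 9.9, 10.0],   # carbon rich, 0.5 hour
    [10.0, 10.1, 9.9],   # carbon rich, 1 hour
    [10.8, 11.0, 10.9],  # carbon starved, 0.5 hour
    [11.4, 11.6, 11.5],  # carbon starved, 1 hour
    [12.0, 12.2, 12.1],  # carbon starved, 2 hours
]

f, df_b, df_w = one_way_anova_f(time_course)
print(f"F = {f:.1f} on ({df_b}, {df_w}) df; significant = {f > F_CRIT_5_12}")
```

The F test only says that the time points differ; the direction (upregulated or downregulated) has to be read from the group means, as in the peptide traces of Figure 6.15.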
[Figure 6.16: bar chart of % of Total (axis 0 to 45) for upregulated and downregulated proteins by localization: cytoplasmic, cytoplasmic membrane, periplasmic, outer membrane, extracellular, unknown.]
Figure 6.16. Localization comparison of the upregulated and downregulated proteins in the glucose (carbon)-starved versus the glucose (carbon)-rich environment, where the majority of the regulated proteins are cytoplasmic followed by the cytoplasmic membrane. Downregulated involves more specific (known) localization assignments.
6.3.14.4 Function of Regulated Proteins. The functions of the upregulated and downregulated proteins were obtained from the C. crescentus functional database downloaded from TIGR CMR. A comparison of the upregulated and downregulated proteins in the carbon-starved environment is given in Table 6.8. The biosynthesis-related category and the protein synthesis category both have a greater percentage assignment in the upregulated proteins (15.7% and 15.2%, respectively) than in the downregulated proteins (8.1% and 9.2%). For amino acid biosynthesis, there is a greater percentage of upregulated predicted function associated with the pyruvate family. As stated previously, under carbon-starved conditions, P. putida was shown to generate 72 new proteins (unidentified) not observed under normal growth conditions.56,57 The greater percentage of upregulated proteins involved in protein synthesis may be a
THE COMPONENTS OF THE SER/THR/TYR PHOSPHOPROTEOMEâ•…â•… 239
TABLE 6.8.╇ Predicted Function of the Statistically Regulated Proteins in the Time Course Study
Predicted Function
Amino acid biosynthesis Biosynthesis of cofactors, prosthetic groups, carriers Cell envelope Biosynthesis related (total) Protein synthesis Central intermediary metabolism DNA metabolism Energy metabolism Fatty acid and phospholipid metabolism Metabolism related (total) Cellular processes Hypothetical, unknown, no data Protein fate: folding, trafficking, degradation Purines, pyrimidines, nucleosides, nucleotides Regulatory functions Signal transduction Transcription Transport and binding
Upregulated
Downregulated
(% of 388)
(% of 260)
8.5 3.1
4.2 1.2
5.2 15.7 15.2 3.6 2.3 13.1 4.9 23.9 5.2 17.3 5.9 2.8 2.6 <1 1.8 10.8
5.4 8.1 9.2 4.2 1.5 14.6 6.2 26.5 8.5 16.5 7.7 4.2 1.9 0 3.8 10.8
similar response in the C. crescentus under carbon-starved conditions of that of P. putida. 6.3.14.5╅ Normal Proteome Energy Pathways.╇ Out of the 12 known C. crescentus glycolytic pathway proteins, six were observed to be upregulated including glyceraldehyde 3-phosphate dehydrogenase (CC 3248), phosphoglycerate kinase (CC 3249), glucose-6-phosphate isomerase (CC 0222), phosphoglycerate mutase (CC 2261, also phosphorylated and unique to the carbon-rich environment), pyruvate kinase (CC 2051), and enolase (CC 1724), and two were downregulated including glucokinase (CC 2054) and fructose-bisphosphate aldolase, class II (CC 3250). A predominance of upregulated proteins would be expected from the starvation environment. 6.3.14.6╅ Overlap of Phosphorylated Proteins and Regulated Normal Proteome.╇ Though there appears to be a similar distribution of phosphorylated proteins within the normal proteomes of carbon-rich and
240â•…â•… PROKARYOTIC PHOSPHORYLATION
carbon-starved environments, the overlapping phosphorylated proteins within the upregulated and downregulated normal proteins are different and indicate relationships between phosphorylation and normal proteome regulation. Table 6.9 lists the overlap of phosphorylated peptides with upregulated proteins that have been identified. The overlap of phosphorylated peptides with downregulated proteins is listed in Table 6.10. 6.3.14.7â•… Differences of Phosphorylated Proteins.╇ Differences between the tables include five phosphorylated peptides that overlapped with upregulated proteins that have energy metabolism predicted functions (CC 1471, 0726, 2056, 0342, 2261), while there were no predicted energy metabolism that overlapped with the downregulated proteins. For the predicted protein synthesis function, there were three downregulated ribosomal proteins (CC 1262, 1668, 1923) that overlapped with phosphorylated peptides, while the upregulated proteins overlapped with two synthetase proteins (CC 0064, 1892) and two translation elongation factor proteins (CC 3199, 3200). 6.3.14.8â•… Localization of Phosphorylated Proteins.╇ The localization of the overlapping proteins listed in Tables 6.9 and 6.10 is also different. The upregulated proteins’ localization included 52% cytoplasmic, 13% cytoplasmic membrane, 4% periplasmic, 9% outer membrane, and 22% unknown. The downregulated proteins’ localization included 20% cytoplasmic, 7% cytoplasmic membrane, 20% periplasmic, 33% outer membrane, and 20% unknown. The localization of the overlap of the phosphorylated peptides with the upregulated proteins is primarily cytoplasmic at 52% followed by 26% membrane associated, while the downregulated is 20% cytoplasmic with a 60% localization that is membrane associated. 6.3.14.9â•… Direct Relationships Observed.╇ Finally, a direct relationship was observed between the regulated proteins of the normal (unmodified) proteome and the regulated phosphorylated peptides listed in Table 6.7. 
Specifically, we have seen upregulation in the phosphorylation of the NAD-dependent GDH (CC 0088) and upregulation in the normal (unmodified) proteome of CC 0088, both under carbonstarved conditions. A second direct relationship was observed for the arylesterase-related protein (CC 3441) where both downregulation in its phosphorylation was observed along with downregulation in the normal (unmodified) proteome, both under carbon-starved conditions.
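The localization percentages quoted in Section 6.3.14.8 can be checked with a short sketch. The per-category counts below are not stated in the text; they are assumed values back-calculated from the percentages and from the 23 and 15 proteins listed in Tables 6.9 and 6.10. Treating periplasmic proteins as "membrane associated" is likewise an inference, made because it is the grouping that yields the quoted 26% and 60%.

```python
def percent_shares(counts):
    """Convert category counts to whole-number percentages of the total."""
    total = sum(counts.values())
    return {name: round(100 * n / total) for name, n in counts.items()}


# Assumed counts (back-calculated, not given in the text).
upregulated = {"cytoplasmic": 12, "cytoplasmic membrane": 3, "periplasmic": 1,
               "outer membrane": 2, "unknown": 5}          # 23 proteins
downregulated = {"cytoplasmic": 3, "cytoplasmic membrane": 1, "periplasmic": 3,
                 "outer membrane": 5, "unknown": 3}        # 15 proteins

# Assumed grouping behind the "membrane associated" figures.
MEMBRANE_ASSOCIATED = ("cytoplasmic membrane", "periplasmic", "outer membrane")

up_shares = percent_shares(upregulated)
down_shares = percent_shares(downregulated)
up_membrane = sum(up_shares[k] for k in MEMBRANE_ASSOCIATED)
down_membrane = sum(down_shares[k] for k in MEMBRANE_ASSOCIATED)

print("upregulated:", up_shares, "membrane associated:", up_membrane)
print("downregulated:", down_shares, "membrane associated:", down_membrane)
```

Under these assumptions the sketch reproduces the rounded percentages given in the text for both groups.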
TABLE 6.9. Overlap of Phosphorylated Peptides and Upregulated Proteins
CC # | Peptide | Annotation | Predicted Function
1541 | SRPEAAIAS*GF | 2-Isopropylmalate synthase | Amino acid biosynthesis
2556 | T*GLT*AARALIAGGAK | UDP-N-acetylmuramoylalanine-D-glutamate ligase | Cell envelope
0435 | IAALLYDLAGISLPDS*KATLVY*S*R | Chemotaxis protein methyltransferase CheR | Cellular processes
1471 | GGMT*SHAAVVAR | Pyruvate phosphate dikinase | Energy metabolism
0726 | DAT*HKT*VT*AALK | Electron transfer flavoprotein, alpha subunit | Energy metabolism
2056 | VGFAATGGT*TPAPVYDR | 6-Phosphogluconolactonase | Energy metabolism
0342 | IIDS*T*GALS*LPEVPK | 2-Oxoglutarate dehydrogenase, E3 component, lipoamide dehydrogenase | Energy metabolism
2261 | Y*KGASLPSTESLATTLVR | Phosphoglycerate mutase | Energy metabolism
0088 | QQSGLDPS*QQVQEAALR | Conserved hypothetical protein | Hypothetical proteins
1037 | DKGLVLVT*ET*R | Hypothetical protein | No data
1015 | FS*ARLAGVEAQIK | Type I secretion system outer membrane protein RsaF | Protein fate
3504 | Y*GNFDKLAELSEARTK | Peptidase M13 family protein | Protein fate
0064 | MSKS*DPSDYSR | Tryptophanyl-tRNA synthetase | Protein synthesis
1892 | NQVWAIPAPTGGS*R | Aspartyl-tRNA synthetase | Protein synthesis
3199 | GQVLCKPGSITPHT*K | Translation elongation factor EF-Tu | Protein synthesis
3200 | TGDT*LCDPLKSPVILER | Translation elongation factor G | Protein synthesis
1658 | NETNDKQLSLLVS*A | Amidophosphoribosyltransferase | Purines, pyrimidines, nucleosides, and nucleotides
2900 | PS*Y*VLGGRGMEIIR | Carbamoyl-phosphate synthase, large subunit | Purines, pyrimidines, nucleosides, and nucleotides
0086 | ALACDPTSAFGGIVAVNS*R | Phosphoribosylaminoimidazolecarboxamide formyltransferase/IMP cyclohydrolase | Purines, pyrimidines, nucleosides, and nucleotides
1705 | DAGMT*ACVAKPVS*AR | Sensor histidine kinase/response regulator | Regulatory functions
0502 | S*NQSTCINQRPLVK | DNA-directed RNA polymerase, beta subunit | Transcription
0712 | QIRVLDLPGTY*SLR | Ferrous iron transport protein B | Transport and binding proteins
2369 | EAET*RALLAS*GR | Hydantoinase/oxoprolinase family protein | Unknown function
TABLE 6.10. Overlap of Phosphorylated Peptides and Downregulated Proteins
CC # | Peptide | Annotation | Predicted Function
3606 | MLKFTT*VAR | Glutamate synthase, small subunit | Amino acid biosynthesis
1318 | ES*AARS*AVEGAKR | Outer membrane protein TolC, putative | Cell envelope
1691 | KTKPAT*PAPAFS*GAA | Membrane protein, putative | Cell envelope
1915 | ISATVT*PKVVELPQK | Outer membrane protein | Cell envelope
3489 | ALAYRAGGDY*ETVLR | Penicillin-binding protein AmpH, putative | Cell envelope
2451 | FDLGNET*S*ALTAK | DNA topoisomerase I | DNA metabolism
3551 | LDLT*T*PGGRAR | Hypothetical protein | No data
2160 | ASEIVSASIKDS*VK | Phasin family protein | No data
1894 | VT*KIT*PGAVATLDS*VR | Rotamase family protein | Protein fate
1262 | DANVGGEVLCRVY* | Ribosomal protein S8 | Protein synthesis
1668 | GKIVPS*RITAVS*AK | Ribosomal protein S18 | Protein synthesis
1923 | EVAAAGGRVLFVGT*KR | Ribosomal protein S2 | Protein synthesis
0925 | MNS*T*KGCVRAR | TonB-dependent receptor | Transport and binding proteins
3146 | LDSATSTSALRASEFET*Y*GAR | TonB-dependent receptor | Transport and binding proteins
3441 | T*AAKAPAAET*APAAK | Arylesterase-related protein | Unknown function
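The peptide entries in Tables 6.9 and 6.10 mark modified residues with an asterisk. The following sketch shows one way to turn such a string into a plain sequence plus a list of site positions. It assumes the common convention that the asterisk follows the phosphorylated residue (the tables do not spell this out), and the function name is hypothetical.

```python
def phosphosites(annotated):
    """Split an asterisk-annotated peptide into (sequence, [(residue, position), ...])."""
    sequence = []
    sites = []
    for ch in annotated:
        if ch == "*":
            if not sequence:
                raise ValueError("an asterisk must follow a residue")
            # Assumed convention: the asterisk marks the residue just before it.
            # Positions are 1-based within the peptide.
            sites.append((sequence[-1], len(sequence)))
        else:
            sequence.append(ch)
    return "".join(sequence), sites


seq_a, sites_a = phosphosites("T*GLT*AARALIAGGAK")   # CC 2556, Table 6.9
seq_b, sites_b = phosphosites("DANVGGEVLCRVY*")      # CC 1262, Table 6.10
print(seq_a, sites_a)
print(seq_b, sites_b)
```

Counting the residue letters in the returned site lists gives a quick tally of pS, pT, and pY sites per table.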
6.3.15 Conclusions
We have covered the report of the first in vivo, site-specific, global, and gel-free phosphoproteome of C. crescentus under both normal and carbon-starved environmental conditions. The normal phosphoproteome of C. crescentus was found, as in other recent bacterial studies, to contain a relatively large ratio of pY-containing proteins as compared with eukaryotic systems. pY IP enrichment in conjunction with the global phosphoproteome analysis identified the pY sites of three specific kinases and allowed the measurement of multiple pY-containing peptides (from three to nine) for TonB-receptor proteins. Using a label-free approach, a differential signaling response was qualitatively and quantitatively observed for the environment-stressed growth conditions, demonstrating that Ser/Thr/Tyr protein phosphorylation studies are applicable to prokaryotic systems. In a carbon-starved environment, C. crescentus undergoes a number of environmental adaptive processes including reduced and more specific signaling processes, less downregulation as compared with upregulation, with a more specific known localization assignment for the downregulated proteins, starvation-specific genes that undergo programmed expression, a decrease in the phosphorylation events associated with energy pathways, an increase in some protein catabolism, and decreased motility.
6.3.16 Supplementary Material
6.3.16.1 Reviewing Spectra Using the SpectrumLook Software Package. We suggest using a software package called SpectrumLook that allows readers to inspect the fragmentation (MS/MS) spectra for the phosphopeptides identified in this chapter. Using this software, readers can visually browse the MS/MS spectra that led to the phosphopeptide identifications, including viewing annotations for the identified b and y ions, and neutral loss ions where appropriate. This software runs on the Microsoft Windows platform.
The SpectrumLook package can be accessed at http://omics.pnl.gov/software/SpectrumLook.php. The C. crescentus mass spectral data can be accessed at www.HamBooksOnline.com under PTM Book. Note: To access the files, go to the aforementioned address. The files provided are as follows:
1. SpectrumLook_Installer.msi: the installer. To install, double-click on the file and follow the installation prompts. During installation, a shortcut to run the SpectrumLook program is placed at Start → Programs → PAST Toolkit → SpectrumLook. Alternatively, navigate to the C:\Program Files\SpectrumLook\ folder and double-click the file "SpectrumLook.exe."
2. Caulobacter_grouped.mzXML: the phosphopeptide spectra in mzXML format.
3. Caulobacter_grouped_syn.txt: a summary of the identifications determined by SEQUEST. See the Readme.txt file for a description of the columns in this file.
4. Caulobacter_grouped.ini: a parameter file that specifies the appropriate parameters for these data when browsing them with SpectrumLook.
5. Readme.txt and RevisionHistory.txt: text files that describe the SpectrumLook software.
REFERENCES
1. Amanchy, R.; Kalume, D.E.; Iwahori, A.; Zhong, J.; Pandey, A. Phosphoproteome analysis of HeLa cells using stable isotope labeling with amino acids in cell culture (SILAC). J. Proteome Res. 2005, 4(5), 1661–1671.
2. Ballif, B.A.; Villen, J.; Beausoleil, S.A.; Schwartz, D.; Gygi, S.P. Phosphoproteomic analysis of the developing mouse brain. Mol. Cell. Proteomics 2004, 3(11), 1093–1101.
3. Chalmers, M.J.; Kolch, W.; Emmett, M.R.; Marshall, A.G.; Mischak, H. Identification and analysis of phosphopeptides. J. Chromatogr. B 2004, 803, 111–120.
4. Olsen, J.V.; Blagoev, B.; Gnad, F.; Macek, B.; Kumar, C.; Mortensen, P.; Mann, M. Global, in vivo, and site-specific phosphorylation dynamics in signaling networks. Cell 2006, 127, 635–648.
5. Zhou, H.; Watts, J.D.; Aebersold, R. A systematic approach to the analysis of protein phosphorylation. Nat. Biotechnol. 2001, 19, 375–378.
6. Beausoleil, S.A.; Jedrychowski, M.; Schwartz, D.; Elias, J.E.; Villen, J.; Li, J.; Cohn, M.A.; Cantley, L.C.; Gygi, S.P. Large-scale characterization of HeLa cell nuclear phosphoproteins. Proc. Natl. Acad. Sci. U.S.A. 2004, 101, 12130–12135.
7. Stock, J.B.; Stock, A.M.; Mottonen, J.M. Signal transduction in bacteria. Nature 1990, 344, 395–400.
8. Klumpp, S.; Krieglstein, J. Phosphorylation and dephosphorylation of histidine residues in proteins. Eur. J. Biochem. 2002, 269, 1067–1071.
9. Pirrung, M.C. Histidine kinases and two-component signal transduction systems. Chem. Biol. 1999, 6, R167–R175.
10. Saito, H. Histidine phosphorylation and two-component signaling in eukaryotic cells. Chem. Rev. 2001, 101, 2497–2509.
11. Macek, B.; Mijakovic, I.; Olsen, J.V.; Gnad, F.; Kumar, C.; Jensen, P.R.; Mann, M. The serine/threonine/tyrosine phosphoproteome of the model bacterium Bacillus subtilis. Mol. Cell. Proteomics 2007, 6, 697–707.
12. Macek, B.; Gnad, F.; Soufi, B.; Kumar, C.; Olsen, J.V.; Mijakovic, I.; Mann, M. Phosphoproteome analysis of E. coli reveals evolutionary conservation of bacterial Ser/Thr/Tyr phosphorylation. Mol. Cell. Proteomics 2008, 7(2), 299–307.
13. Nierman, W.C.; Feldblyum, T.V.; Laub, M.T.; Paulsen, I.T.; Nelson, K.E.; Eisen, J.; Heidelberg, J.F.; Alley, M.R.K.; Ohta, N.; Maddock, J.R.; Potocka, I.; Nelson, W.C.; Newton, A.; Stephens, C.; Phadke, N.D.; Ely, B.; DeBoy, R.T.; Dodson, R.J.; Durkin, A.S.; Gwinn, M.L.; Haft, D.H.; Kolonay, J.F.; Smit, J.; Craven, M.B.; Khouri, H.; Shetty, J.; Berry, K.; Utterback, T.; Tran, K.; Wolf, A.; Vamathevan, J.; Ermolaeva, M.; White, O.; Salzberg, S.L.; Venter, J.C.; Shapiro, L.; Fraser, C.M. Complete genome sequence of Caulobacter crescentus. Proc. Natl. Acad. Sci. U.S.A. 2001, 98, 4136–4141.
14. Jacobs-Wagner, C. Regulatory proteins with a sense of direction: cell cycle signaling network in Caulobacter. Mol. Microbiol. 2004, 51, 7–13.
15. Iniesta, A.A.; McGrath, P.T.; Reisenauer, A.; McAdams, H.H.; Shapiro, L. A phospho-signaling pathway controls the localization and activity of a protease complex critical for bacterial cell cycle progression. Proc. Natl. Acad. Sci. U.S.A. 2006, 103, 10935–10940.
16. Jacobs, C.; Hung, D.; Shapiro, L. Dynamic localization of a cytoplasmic signal transduction response regulator controls morphogenesis during the Caulobacter cell cycle. Proc. Natl. Acad. Sci. U.S.A. 2001, 98, 4095–4100.
17. Ryan, K.R.; Huntwork, S.; Shapiro, L. Recruitment of a cytoplasmic response regulator to the cell pole is linked to its cell cycle-regulated proteolysis. Proc. Natl. Acad. Sci. U.S.A. 2004, 101, 7415–7420.
18. Caccavo, F., Jr.; Ramsing, N.B.; Costerton, J.W. Morphological and metabolic responses to starvation by the dissimilatory metal-reducing bacterium Shewanella alga BrY. Appl. Environ. Microbiol. 1996, 62(12), 4678–4682.
19. van Overbeek, L.S.; Eberl, L.; Givskov, M.; Molin, S.; van Elsas, J.D. Survival of, and induced stress resistance in, carbon-starved Pseudomonas fluorescens cells residing in soil. Appl. Environ. Microbiol. 1995, 61(12), 4202–4208.
20. Jenkins, D.E.; Chaisson, S.A.; Matin, A. Starvation-induced cross protection against osmotic challenge in Escherichia coli. J. Bacteriol. 1990, 172(5), 2779–2781.
21. Jenkins, D.E.; Schultz, J.E.; Matin, A. Starvation-induced cross protection against heat or H2O2 challenge in Escherichia coli. J. Bacteriol. 1988, 170(9), 3910–3914.
22. Matin, A. The molecular basis of carbon-starvation-induced general resistance in Escherichia coli. Mol. Microbiol. 1991, 5(1), 3–10.
23. Spector, M.P.; Park, Y.K.; Tirgari, S.; Gonzalez, T.; Foster, J.W. Identification and characterization of starvation-regulated genetic loci in Salmonella typhimurium by using Mu d-directed lacZ operon fusions. J. Bacteriol. 1988, 170(1), 345–351.
24. Givskov, M.; Eberl, L.; Molin, S. Responses to nutrient starvation in Pseudomonas putida KT2442: two-dimensional electrophoretic analysis of starvation- and stress-induced proteins. J. Bacteriol. 1994, 176(16), 4816–4824.
25. Givskov, M.; Eberl, L.; Moller, S.; Poulsen, L.K.; Molin, S. Responses to nutrient starvation in Pseudomonas putida KT2442: analysis of general cross-protection, cell shape, and macromolecular content. J. Bacteriol. 1994, 176(1), 7–14.
26. Jouper-Jaan, A.; Goodman, A.E.; Kjelleberg, S. Bacteria starved for prolonged periods develop increased protection against lethal temperatures. FEMS Microbiol. Ecol. 1992, 101, 229–236.
27. Nystrom, T.; Olsson, R.M.; Kjelleberg, S. Survival, stress resistance, and alterations in protein expression in the marine Vibrio sp. strain S14 during starvation for different individual nutrients. Appl. Environ. Microbiol. 1992, 58(1), 55–65.
28. Qian, W.J.; Goshe, M.B.; Camp, D.G., II; Yu, L.R.; Tang, K.; Smith, R.D. Phosphoprotein isotope-coded solid-phase tag approach for enrichment and quantitative analysis of phosphopeptides from complex mixtures. Anal. Chem. 2003, 75, 5441–5450.
29. Garcia, B.A.; Shabanowitz, J.; Hunt, D.F. Analysis of protein phosphorylation by mass spectrometry. Methods 2005, 35, 256–264.
30. Salih, E. Phosphoproteomics by mass spectrometry and classical protein chemistry approaches. Mass Spectrom. Rev. 2005, 24, 828–846.
31. Smith, R.D.; Anderson, G.A.; Lipton, M.S.; Pasa-Tolic, L.; Shen, Y.; Conrads, T.P.; Veenstra, T.D.; Udseth, H.R. An accurate mass tag strategy for quantitative and high-throughput proteome measurements. Proteomics 2002, 2, 513–523.
32. Ndassa, Y.M.; Orsi, C.; Marto, J.A.; Chen, S.; Ross, M.M. Improved immobilized metal affinity chromatography for large-scale phosphoproteomics applications. J. Proteome Res. 2006, 10, 2789–2799.
33. Ham, B.M.; Yang, F.; Jaitly, N.; Jayachandran, H.; Monroe, M.E.; Gritsenko, M.A.; Zhao, R.; Purvine, S.O.; Orton, D.; Anderson, D.J.; Moore, R.J.; Camp, D.G., II; Rossie, S.; Smith, R.D. The influence of sample preparation and replicate analyses on HeLa cell phosphoproteome coverage. J. Proteome Res. 2008, 7(6), 2215–2221.
34. Kelly, R.T.; Page, J.S.; Luo, Q.; Moore, R.J.; Orton, D.J.; Tang, K.; Smith, R.D. Chemically etched open tubular and monolithic emitters for nanoelectrospray ionization mass spectrometry. Anal. Chem. 2006, 78, 7796–7801.
35. Bradshaw, R.A.; Burlingame, A.L.; Carr, S.; Aebersold, R. Reporting protein identification data: the next generation of guidelines. Mol. Cell. Proteomics 2006, 5, 787–788.
36. Jacobs, C.; Ausmees, N.; Cordwell, S.J.; Shapiro, L.; Laub, M.T. Functions of the CckA histidine kinase in Caulobacter cell cycle control. Mol. Microbiol. 2003, 47, 1279–1290.
37. Finn, R.D.; Mistry, J.; Schuster-Böckler, B.; Griffiths-Jones, S.; Hollich, V.; Lassmann, T.; Moxon, S.; Marshall, M.; Khanna, A.; Durbin, R.; Eddy, S.R.; Sonnhammer, E.L.L.; Bateman, A. Pfam: clans, web tools and services. Nucleic Acids Res. 2006, 34(Database Issue), D247–D251.
38. Walderhaug, M.O.; Polarek, J.W.; Voelkner, P.; Daniel, J.M.; Hesse, J.E.; Altendorf, K.; Epstein, W. KdpD and KdpE, proteins that control expression of the kdpABC operon, are members of the two-component sensor-effector class of regulators. J. Bacteriol. 1992, 174(7), 2152–2159.
39. Schlessinger, J. Cell signaling by receptor tyrosine kinases. Cell 2000, 103(2), 211–225.
40. Shiro, Y.; Nakamura, H.; Kumita, H.; Kikuchi, A. Roles of non-covalent interactions in gas sensing by sensor histidine kinases of two-component regulatory systems. RIKEN Rev. 2002, 46, 39–41.
41. Wingrove, J.A.; Gober, J.W. Identification of an asymmetrically localized sensor histidine kinase responsible for temporally and spatially regulated transcription. Science 1996, 274(5287), 597–601.
42. Wolanin, P.M.; Thomason, P.A.; Stock, J.B. Histidine protein kinases: key signal transducers outside the animal kingdom. Genome Biol. 2002, 3(10), REVIEWS3013.
43. Perego, M.; Hoch, J.A. Protein aspartate phosphatases control the output of two-component signal transduction systems. Trends Genet. 1996, 12(3), 97–101.
44. Russo, F.D.; Silhavy, T.J. The essential tension: opposed reactions in bacterial two-component regulatory systems. Trends Microbiol. 1993, 1(8), 306–310.
45. Wu, J.; Ohta, N.; Zhao, J.L.; Newton, A. A novel bacterial tyrosine kinase essential for cell division and differentiation. Proc. Natl. Acad. Sci. U.S.A. 1999, 96(23), 13068–13073.
46. Laub, M.T.; Chen, S.L.; Shapiro, L.; McAdams, H.H. Genes directly controlled by CtrA, a master regulator of the Caulobacter cell cycle. Proc. Natl. Acad. Sci. U.S.A. 2002, 99(7), 4632–4637.
47. Hung, D.Y.; Shapiro, L. A signal transduction protein cues proteolytic events critical to Caulobacter cell cycle progression. Proc. Natl. Acad. Sci. U.S.A. 2002, 99(20), 13160–13165.
48. Gardy, J.L.; Laird, M.R.; Chen, F.; Rey, S.; Walsh, C.J.; Ester, M.; Brinkman, F.S.L. PSORTb v.2.0: expanded prediction of bacterial protein subcellular localization and insights gained from comparative proteome analysis. Bioinformatics 2005, 5, 617–623.
49. Gupta, S.; Pandit, S.B.; Srinivasan, N.; Chatterji, D. Proteomics analysis of carbon-starved Mycobacterium smegmatis: induction of Dps-like protein. Protein Eng. 2002, 15(6), 503–512.
50. Phadke, N.D.; Molloy, M.P.; Steinhoff, S.A.; Ulintz, P.J.; Andrews, P.C.; Maddock, J.R. Analysis of the outer membrane proteome of Caulobacter crescentus by two-dimensional electrophoresis and mass spectrometry. Proteomics 2001, 1(5), 705–720.
51. Koebnik, R. TonB-dependent trans-envelope signalling: the exception or the rule? Trends Microbiol. 2005, 13(8), 343–347.
52. Conesa, A.; Gotz, S.; Garcia-Gomez, J.M.; Terol, J.; Talon, M.; Robles, M. Blast2GO: a universal tool for annotation, visualization and analysis in functional genomics research. Bioinformatics 2005, 21, 3674–3676.
53. Somers, E.; Keijers, V.; Ptacek, D.; Halvorsen Ottoy, M.; Srinivasan, M.; Vanderleyden, J.; Faure, D. The salCAB operon of Azospirillum irakense, required for growth on salicin, is repressed by SalR, a transcriptional regulator that belongs to the LacI/GalR family. Mol. Gen. Genet. 2000, 263, 1038–1046.
54. Minambres, B.; Olivera, E.R.; Jensen, R.A.; Luengo, J.M. A new class of glutamate dehydrogenases (GDH). Biochemical and genetic characterization of the first member, the AMP-requiring NAD-specific GDH of Streptomyces clavuligerus. J. Biol. Chem. 2000, 275(50), 39529–39542.
55. Duncan, P.A.; White, B.A.; Mackie, R.I. Purification and properties of NADP-dependent glutamate dehydrogenase from Ruminococcus flavefaciens FD-1. Appl. Environ. Microbiol. 1992, 58(12), 4032–4037.
56. Hemmings, B.A. Phosphorylation and proteolysis regulate the NAD-dependent glutamate dehydrogenase from Saccharomyces cerevisiae. FEBS Lett. 1980, 122(2), 297–302.
57. Uno, I.; Matsumoto, K.; Adachi, K.; Ishikawa, T. Regulation of NAD-dependent glutamate dehydrogenase by protein kinases in Saccharomyces cerevisiae. J. Biol. Chem. 1984, 259(2), 1288–1293.
7
Prokaryotic Phosphorylation of Histidine
7.1 PHOSPHOHISTIDINE AS POSTTRANSLATIONAL MODIFICATION (PTM)
The phosphoryl modification of serine (Ser), threonine (Thr), and tyrosine (Tyr) is a PTM primarily associated with regulating enzyme activity in eukaryotic systems. The phosphoester linkage resulting from the PTM of Ser, Thr, and Tyr is relatively stable at physiological pH (~7.36) and thus generally requires a phosphatase for removal of the modification. The phosphorylation and subsequent dephosphorylation of proteins are activities associated with cellular signaling pathways in biological systems. In prokaryotic biological systems, the histidine (His) residue is phosphorylated primarily for the purpose of transferring the phosphate group from one biomolecule to another. These biomolecules are known as phosphodonor and phosphoacceptor molecules, and the phosphohistidine acts as a high-energy intermediate in a biological process at the molecular level.1 The phosphohistidine transfer potential (ΔG° of transfer) of the phosphate is estimated at −12 to −14 kcal/mol, reflecting a relatively high-energy system.2 The phosphorylation of the His residue produces a phosphoramidate with a large standard free energy of hydrolysis, making phosphoramidates the most unstable of the phosphoamino acids. The stability of the phosphohistidine is largely influenced by its neighboring amino acid residues and by the nature and makeup of the associated protein. It is speculated that, because of this instability, a His phosphatase is often not required. It has been estimated that up to 6% of the phosphorylation that takes place in eukaryotes3 and prokaryotes4 is on the His amino acid residue. It is also estimated that the abundance of phosphohistidine is 10- to 100-fold greater than that of phosphotyrosine but much less than that of phosphoserine (pS).3 The phosphorylation of the His amino acid residue can take place on either the nitrogen in the 1-position or the nitrogen in the 3-position. This is illustrated in Figure 7.1. Studies have demonstrated that 3-phosphohistidine is more stable than 1-phosphohistidine5 and is therefore more likely to be the positional isomer observed in biological systems. The His amino acid residue is phosphorylated by a kinase using the reactive intermediate adenosine triphosphate (ATP). This process is illustrated in Figure 7.2, where the modification produces the 3-phosphohistidine residue.
Figure 7.1. Structures of unmodified histidine, 1-phosphohistidine, and 3-phosphohistidine.
Figure 7.2. Kinase phosphorylation of the histidine amino acid residue producing 3-phosphohistidine.
7.2 BACTERIAL KINASES AND THE TWO-COMPONENT SYSTEM
The bacterial His kinases of the two-component system are an example of phosphohistidines found in prokaryotic systems. Figure 7.3 is a drawing of a two-component signaling system that involves His phosphorylation and dephosphorylation. The figure illustrates a membrane-bound protein that contains a carboxy-terminal His kinase domain. The subsequent steps include the binding of a ligand, which is followed by dimerization of two kinase molecules. Phosphorylation of the His residue takes place through the reactive intermediate ATP. The next step is the binding of a cytosolic response regulator protein that contains an aspartate residue in its amino-terminal domain. The aspartate residue in the response regulator is then phosphorylated by the membrane-bound protein, and the response regulator is released to perform its downstream function. Various bacterial two-component His kinase signaling systems have been observed and characterized. Table 7.1 lists some of the bacteria that have been observed to have the two-component system.
7.3 MEASUREMENT OF PHOSPHORYLATED HIS (PH)
7.3.1 Stabilities of Phosphorylated Amino Acids
The major difficulty in measuring the phosphorylation of a His residue in a peptide is the instability of the modification. Phosphohistidine (pH) is very unstable in an acidic environment, which is typically the matrix that contains the phosphorylated peptide during normal proteomic preparation steps prior to mass spectral analysis. Table 7.2 lists the acidic and alkaline stabilities of the phosphorylated amino acids. Due to the acid lability of the pH residue, it is not observed during most phosphoproteome studies using mass spectrometric techniques.
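The stability data of Table 7.2 lend themselves to a small lookup that shows why acidic sample preparation favors O-phosphates and loses phosphohistidine. This is a minimal sketch: the dictionary simply transcribes the yes/no/partial entries of Table 7.2, while the function name and data layout are hypothetical.

```python
# (acid_stable, alkali_stable) as listed in Table 7.2; "partial" is kept as a string.
STABILITY = {
    "phosphoarginine": (False, False),
    "phosphohistidine": (False, True),
    "phospholysine": (False, True),
    "phosphoserine": (True, False),
    "phosphothreonine": (True, "partial"),
    "phosphotyrosine": (True, True),
    "phosphoaspartate": (False, False),
}


def survives(condition):
    """Names of phosphoamino acids listed as fully stable under 'acid' or 'alkali'."""
    index = {"acid": 0, "alkali": 1}[condition]
    # Only a literal True counts; "partial" is deliberately excluded.
    return sorted(name for name, flags in STABILITY.items() if flags[index] is True)


acid_stable = survives("acid")
alkali_stable = survives("alkali")
print("acid stable:  ", acid_stable)
print("alkali stable:", alkali_stable)
```

Phosphohistidine appears only in the alkali-stable list, which is consistent with its absence from studies that rely on acidic preparation steps.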
Figure 7.3. A model for two-component signaling systems involving the phosphorylation of the histidine amino acid residue. RR, response regulator. (Reprinted with permission. This article was published in Chem Biol, Pirrung, M.C. Histidine kinases and two-component signal transduction systems. 1999, 6, R167–R175. Copyright Elsevier 1999.) [Diagram steps: binding of ligand; kinase dimerization; autophosphorylation (ATP); binding of response regulator; phosphorylation of response regulator; dissociation of response regulator; DNA or protein binding; dephosphorylation.]
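The ordering of the steps in Figure 7.3 can be made concrete with a toy state tracker that follows the phosphoryl group from the kinase His residue to the response regulator Asp residue. This is only an illustrative sketch: the class and method names are hypothetical, and the model captures nothing beyond the step order described in Section 7.2.

```python
from dataclasses import dataclass, field


@dataclass
class TwoComponentSystem:
    """Toy state tracker for the steps drawn in Figure 7.3 (all names hypothetical)."""
    ligand_bound: bool = False
    dimerized: bool = False
    his_phosphorylated: bool = False  # phospho-His on the kinase
    asp_phosphorylated: bool = False  # phospho-Asp on the response regulator
    regulator_bound: bool = False
    log: list = field(default_factory=list)

    def _require(self, condition, message):
        if not condition:
            raise RuntimeError(message)

    def bind_ligand(self):
        self.ligand_bound = True
        self.log.append("binding of ligand")

    def dimerize(self):
        self._require(self.ligand_bound, "ligand must bind first")
        self.dimerized = True
        self.log.append("kinase dimerization")

    def autophosphorylate(self):
        self._require(self.dimerized, "kinase must dimerize first")
        self.his_phosphorylated = True  # ATP is the phosphoryl donor
        self.log.append("autophosphorylation of His")

    def bind_regulator(self):
        self._require(self.his_phosphorylated, "no phospho-His to hand over")
        self.regulator_bound = True
        self.log.append("binding of response regulator")

    def transfer_to_regulator(self):
        self._require(self.regulator_bound, "response regulator must bind first")
        # The phosphoryl group moves from His (kinase) to Asp (regulator).
        self.his_phosphorylated = False
        self.asp_phosphorylated = True
        self.log.append("phosphorylation of response regulator")

    def release_regulator(self):
        self._require(self.asp_phosphorylated, "regulator is not phosphorylated")
        self.regulator_bound = False
        self.log.append("dissociation of response regulator")


system = TwoComponentSystem()
for step in (system.bind_ligand, system.dimerize, system.autophosphorylate,
             system.bind_regulator, system.transfer_to_regulator,
             system.release_regulator):
    step()
print(" -> ".join(system.log))
```

Calling a step out of order raises an error, which mirrors the dependence of each stage in the figure on the one before it.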
7.3.2 Immobilized Metal Affinity Chromatography (IMAC) and Mass Spectrometry (MS) Current studies of His phosphorylated peptides are incorporating neutral-level pH methodologies to preserve the modification in conjunction with enrichment approaches such as immobilized copper(II) ion affinity chromatography.6 Figure 7.4a illustrates a
MEASUREMENT OF PHOSPHORYLATED HIS (PH)â•…â•… 253
TABLE 7.1.╇╛Some Bacterial Two-Component Signaling System Histidine Kinases Histidine Kinase
CheA
Control Function
Source
Receiver
E. coli
CheY
E. coli
OmpR
Enterococcus faecium E. coli E. coli
VanR
Myxococcus xanthus E. coli
FrzE (internal receiver)
Agrobacter tumefaciens E. coli
VirA (internal receiver)
ArcB
Chemotaxia, flagellar motor Osmosensing, outer membrane proteins Cell wall biosynthesis Osmosensing Phosphate metabolism Fruiting body formation Cell capsule synthesis Host recognition, transformation Anaerobiosis
BvgS
Virulence
Bordetella pertussis
EnvZ VanS KinA PhoR FrzE RscC VirA
SpoOF PhoB
RscC (internal receiver)
ArcB (internal receiver) and ArcA BvgS (internal receiver)
Reprinted with permission. This article was published in Chem Biol, Pirrung, M.C. Histidine kinases and two-component signal transduction systems, 1999, 6, R167–R175. Copyright Elsevier 1999.
TABLE 7.2.╇╛Acid and Alkaline Stabilities of the Phosphorylated Amino Acids Phosphoamino Acid
Acid Stable
Alkali Stable
N-phosphates
â•… Phosphoarginine â•… Phosphohistidine â•… Phospholysine
No No No
No Yes Yes
O-phosphates
â•… Phosphoserine â•… Phsophothreonine â•… Phosphotyrosine
Yes Yes Yes
No Partial Yes
Acyl-phosphate
â•… Phosphoaspartate
No
No
(a)
995.5
100 90 288.1
Relative Abundance
80
813.4 884.5
70 544.5
60
627.3
50 40 446.6
30 20
726.5
399.3
10
1108.7 1243.6
0 200 (b)
400
600
800 m/z
1000
1200
1400
584.6
100 90
Relative Abundance
80 70
622.4
60 1322.8
50 40 30
1324.8
564.7
20 485.6 288.1 368.7
10
664.5 682.8
1166.5 699.5 893.6 1034.6
0 200
400
1075.5 1320.7 1304.7
600
800 m/z
1000
1200
1400
Figure 7.4.╇ (a) Product ion mass spectrum of a nonphosphorylated peptide at m/z 642.4 as [M╯+╯2H]2+. Good coverage of the peptide backbone is observed with the y series ions allowing the sequencing of the peptide. (b) Product ion spectrum of the same peptide after phosphorylation of a histidine residue, at m/z 682.0 as [M╯+╯2H]2+. Major product ion observed is for the neutral loss of the phosphate moiety and water at m/z 633.1 as [M╯+╯2H╯−╯HPO3╯−╯H2O]2+.
STUDY OF PH-CONTAINING PEPTIDES BY NANO-ESI TANDEM MSâ•…â•… 255
nonphosphorylated peptide’s product ion mass spectrum collected using electrospray ionization (ESI) and a quadrupole ion trap. Very good coverage of the peptide backbone is observed with the y series ions allowing the sequencing of the peptide. The precursor was a doubly protonated species at m/z 642.4 as [M╯+╯2H]2+. Figure 7.4b is the product ion spectrum of the same peptide after phosphorylation of a His residue contained in the peptide chain. The mass spectrum was collected on the doubly protonated species at m/z 682.0 as [M╯+╯2H]2+. The major product ion observed in the spectrum is for the neutral loss of the phosphate moiety and water at m/z 633.1 as [M╯+╯2H╯−╯HPO3╯−╯H2O]2+. There are a couple of important considerations when considering product ion spectra of phosphorylated amino acid residues. When measuring these species in positive ion mode, the phosphorylation of the His residue, as illustrated in Figures 7.1 and 7.2, actually results in the addition of 79.9663╯Da (HPO3) to the mass of the peptide. The ATP is contributing, in a sense, HPO3 to the peptide as illustrated in Figure 7.5a. In physiological conditions, the addition is more likely to be PO3−, but measured by MS in positive ion mode, it is HPO3. During collisioninduced dissociation (CID) product ion spectral accumulation, the loss due to the phosphoryl modification is also at 79.9663╯Da for HPO3. This would suggest that in positive ion mode, the gas-phase fragmentation mechanism would be something similar to that illustrated in Figure 7.5b. The phosphoryl group on the His amino acid residue is presented as being fully protonated. The channel pathway during the fragmentation process results in the neutral loss of HPO3 and the reprotonated, original form of the His residue without a change in the charge state (CS) of the peptide. 
Similar to the product ion spectra of the phosphorylated Ser, Thr, and Tyr (also sulfated) amino acid residues, the phosphohistidine (pH) product ion spectra are dominated by the neutral loss of the modification. Future work utilizing mass spectrometric methodologies such as electron capture dissociation (ECD) and electron transfer dissociation (ETD) will also be beneficial in the gas-phase analysis of pH peptides.
7.4 IN VITRO AND IN VIVO STUDY OF PH-CONTAINING PEPTIDES BY NANO-ESI TANDEM MS

7.4.1 Introduction

The phosphoryl PTM of Ser, Thr, and Tyr amino acid residues is primarily associated with regulating enzyme activity in eukaryotic systems for
Figure 7.5. (a) Adenosine triphosphate (ATP) contributing HPO3 (79.9663 Da) to the peptide when considered mass spectrometrically in positive ion mode. (b) Collision-induced dissociation product ion spectra illustrate the neutral loss due to the phosphoryl modification at 79.9663 Da for HPO3.
signal transduction cellular processes. Numerous MS-based studies have been applied to identify Ser/Thr/Tyr protein phosphorylation sites for signaling pathways involved with protein kinases and phosphatases. More studies are now being reported that identify Ser/Thr/Tyr protein phosphorylation sites for signaling pathways in prokaryotic biological systems. It is known that the His amino acid residue in prokaryotes is phosphorylated primarily for the purpose of transferring the phosphate group from a donor to an acceptor that has a conserved aspartate
residue accepting the phosphorylation. However, the measurement of the pH-containing peptides utilizing mass spectral techniques has been challenging due to the extreme lability of the phosphohistidine moiety. Phosphohistidine-containing peptides were synthesized and used to study the product ions produced during CID in the presence and absence of peptide methylation and to set up sample preparation methodologies that would preserve the phosphate moiety for mass spectral measurement. An approximately neutral pH (6.7) nano-high-performance liquid chromatography (HPLC) system was set up, allowing the preservation and separation of the phosphohistidine-containing peptides. This, in conjunction with a rapid digestion using the Glu-C proteinase and copper(II)-based IMAC, allowed the preservation and measurement of the phosphohistidine-containing peptides. The developed global methodology was applied to the model bacterium Caulobacter crescentus where 102 unique phosphohistidine-containing peptides from 99 proteins were qualitatively measured and identified.

7.4.2 Background of Study

7.4.2.1 Bacteria Models of Ser/Thr/Tyr Phosphorylation. Reversible phosphorylation of Ser, Thr, and Tyr residues in proteins represents a prominent mechanism in eukaryotes for regulating cellular processes involving signal transduction.7 Analytical approaches utilizing MS have become broadly applied to identify Ser/Thr/Tyr protein phosphorylation sites.8–11 Numerous MS-based studies have been reported for signaling pathways primarily involved with protein kinases,12–17 and the study of networks and pathways involving phosphatase enzymes.18–21 Recently, the observance of Ser/Thr/Tyr phosphorylation has been reported in the model gram-positive bacterium Bacillus subtilis using phosphoproteomic techniques where 103 unique phosphopeptides were identified from 78 B.
subtilis proteins,22 and Escherichia coli where 81 phosphorylation sites on 79 proteins were identified.23 Another recent study of the model α-proteobacterium (α-purple bacterium) C. crescentus reported the identification of 226 phosphorylation sites on 135 C. crescentus phosphorylated proteins including 107 sites on Ser, 97 on Thr, and 22 on Tyr (see Chapter 6). These studies demonstrate that signaling through Ser/Thr/Tyr phosphorylation is a more general regulatory process applicable to both eukaryotes and prokaryotes. However, the events involving the prokaryotic two-component signaling system, particularly the phosphorylation of the His amino acid residue, have not seen a similar global analysis.
7.4.2.2 Prokaryotic Phosphorylation of His. In prokaryotic biological systems, the His group is phosphorylated primarily for the purpose of transferring the phosphate group from one biomolecule to another. These biomolecules are known as phosphodonor and phosphoacceptor molecules, where the phosphohistidine acts as a high-energy intermediate in a biological process on the molecular level.1 The phosphoryl group on the donor is transferred to an aspartate residue on the receiver. The bacterial His kinases of the two-component system are an example of phosphohistidines found in prokaryotic systems. The phosphohistidine transfer potential (ΔG° of transfer) of the phosphate is estimated at −12 to −14 kcal/mol, reflecting a relatively high-energy system.2 The phosphorylation of the His residue produces a phosphoramidate that has a large standard free energy of hydrolysis, making phosphohistidine the most unstable of the phosphoamino acids. The stability of the phosphohistidine is largely influenced by its local amino acid residues and the nature and makeup of the associated protein. It is estimated that up to 6% of the phosphorylation that takes place in eukaryotes3 and prokaryotes4 is with the His amino acid residue. It is also estimated that the abundance of phosphohistidine is 10- to 100-fold greater than that of phosphotyrosine but much less than that of phosphoserine (pS).3 Phosphorylation of the His residue is an important protein modification but, because of its lability, has remained relatively unstudied by mass spectral techniques.

7.4.2.3 C. crescentus. C. crescentus is an aquatic gram-negative α-proteobacterium (α-purple bacterium) that can exist in two forms, either flagellated or possessing a stalk. C. crescentus undergoes an asymmetric cell division that produces a motile swarmer cell and a sessile stalked cell at the end of each cell cycle. C. crescentus is a key model system used for the study of the developmental processes of bacteria.
We have applied mass spectrometric techniques aimed at the preservation and subsequent measurement of synthetically prepared phosphohistidine-containing peptides and pH-containing peptides from C. crescentus, and reported on the CID fragmentation behavior of phosphohistidine-containing peptides and the initial application of the methodology to C. crescentus.

7.4.2.4 Mass Spectral Measurement of Phosphohistidine. In spite of the phosphohistidine residue's instability, there are an increasing number of studies being reported of its measurement using mass spectrometric techniques, though primarily confined to single protein isolation and measurement. These studies include a His-569 phosphorylated protein (designated as protein1) from Thauera aromatica involved in
phosphorylation of phenol by phenylphosphate synthase,24 a His-48 phosphorylation of a purified bacterial chemotaxis protein CheA,25 a His-15 phosphorylation of the His-containing protein (HPr) from the E. coli phosphoenolpyruvate-sugar phosphotransferase system (PTS),6 the purified cytoplasmic part of the His kinase EnvZ of E. coli,26 a His-18 and His-75 phosphorylation of purified bovine histone H4 His kinase (HHK), and recombinant Drosophila phosphohistone H4.27 Finally, a review of the methodology for His phosphorylation enrichment and measurement has been reported in Methods in Enzymology, but the chapter presents the PTS protein (HPr) analysis and suggests that large-scale identification of His-phosphorylated proteins may be obtainable with the adaptation of that methodology.28 We will look at studies that have applied mass spectrometric techniques aimed at the preservation and subsequent measurement of synthetically prepared phosphohistidine-containing peptides and phosphorylated HPrs from C. crescentus, and report on the CID fragmentation behavior of phosphohistidine-containing peptides and a more global measurement of the phosphohistidine proteome in C. crescentus.

7.4.3 Optimized Methodology for Phosphohistidine Studies

7.4.3.1 In Vitro Selective pHis Phosphorylation. The synthetic peptide INHDLR (EZBiolab Inc., Westfield, IN), the anaphylatoxic peptide C3a (human) HLGLAR (Sigma-Aldrich, St. Louis, MO), angiotensin I (angio I) DRVYIHPFHL (Fluka BioChemika, Buchs, Switzerland, ≥ 97.0%), and angiotensin II (angio II) DRVpYIHPF (Calbiochem, Darmstadt, Germany) were all selectively phosphorylated on the His residue using potassium phosphoramidate. Figure 7.6 illustrates the peptides used for the in vitro phosphohistidine studies including the nonphosphorylated form on the left and the phosphorylated form of the peptide on the right.
Potassium phosphoramidate (H3PO3NK) is a mild phosphorylating reagent that does not modify the hydroxy-amino acid residues Ser, Thr, and Tyr.29–31 Potassium phosphoramidate (not commercially available) was prepared using a modified method of Pirrung et al.5 Briefly, the following steps were used to synthesize the potassium phosphoramidate:
1. Chill 2.5 mL of 30% NH4OH for 30 minutes in a 10-mL round bottom reaction flask in an ice/water bath.
2. Add 200 µL of phosphorus oxychloride (POCl3, Fluka puriss ≥ 99.0% pure) to the chilled flask (caution: highly exothermic reaction; add slowly/dropwise with shaking).
Nonphosphorylated                     Phosphorylated Forms
INHDLR                                INpHDLR
HLGLAR (anaphylatoxic peptide C3a)    pHLGLAR
DRVYIHPFHL (angio I)                  DRVYIpHPFHL, DRVYIHPFpHL, DRVYIpHPFpHL
DRVpYIHPF (angio II)                  DRVpYIpHPF
SarRVYIHPT (angio II (Sar1Thr8))      SarRVpYIHPT, SarRVpYIpHPT, SarRVpYIpHPpT
Figure 7.6. Peptides used for the in vitro phosphohistidine studies. The nonphosphorylated form of the peptide is on the left and the phosphorylated form of the peptide on the right. The complexity increases from top to bottom, from peptides carrying a single phosphohistidine to peptides carrying multiple phosphorylated residues including pY, pT, and pH.
3. Allow the mixture to react for 30 minutes in the ice/water bath.
4. Extract the solution with 1 mL acetone and recover the aqueous layer.
5. Adjust the aqueous layer to pH 6 with acetic acid.
6. Precipitate the mono-ammonium salt by adding ethanol, usually about 2–5 mL.
7. Collect the salt by centrifugation and dry in a SpeedVac (Eppendorf, Hamburg, Germany).
8. Dissolve the salt in 2.5 mL of 50% KOH.
9. Reflux for 30 minutes at 60°C.
10. Cool the solution and adjust to pH ∼6 with acetic acid.
11. Precipitate the potassium phosphoramidate with ethanol, collect by centrifugation, and dry in a SpeedVac.

For His phosphorylation, a modified version of Lasker et al.32 was used. The reaction scheme for the selective phosphorylation of the His amino acid residue is illustrated in Figure 7.7. The steps involved are as follows:
1. Dissolve ∼1 mg of peptide in 200 µL 10 mM ammonium bicarbonate.
2. Adjust the solution to pH ∼8 with 6 N KOH.
Figure 7.7. Reaction scheme: (Step 1) preparation of potassium phosphoramidate; (Step 2) selective phosphorylation of the histidine residue with the mild phosphorylating agent potassium phosphoramidate, giving 3-phosphohistidine. Potassium phosphoramidate is not commercially available and is generated in the lab for use.
3. To the peptide solution, add 40 mg of the potassium phosphoramidate prepared earlier.
4. React the solution overnight at room temperature.

7.4.3.2 In Vitro Phosphorylation of Angio II (Sar1Thr8). For the nonmethylated phosphorylated form of angio II (Sar1Thr8), the following synthesis steps were followed:
1. Add 100 µL of phosphorus oxychloride to 1 mL of anhydrous acetone.
2. Add 500 µL of the phosphorus oxychloride/acetone solution to ∼1 mg of peptide.
3. Allow the solution to react for 1 hour at room temperature, effectively phosphorylating the Tyr, His, and Thr residues.
4. Remove the solvent from the mixture in a SpeedVac.
5. Dissolve the resultant viscous oil in 200 µL of 10 mM ammonium bicarbonate.
6. Adjust the peptide solution to pH ∼8 with 6 N KOH.
7. Analyze immediately or snap freeze in liquid nitrogen and store at −80°C until further used.

7.4.3.3 In Vitro Methylation of Peptides. Thionyl chloride (Sigma-Aldrich) was used for methyl esterification of the peptides.33 The steps involved in methylating the peptides were as follows:
1. Use only peptide material that has been extensively dried in a SpeedVac.
2. Carefully add dropwise 40 µL of thionyl chloride to 1 mL methanol (both anhydrous).
3. Add the thionyl chloride/methanol mixture to the dry peptide at a ratio of 75 µL thionyl chloride/methanol solution per 100 µg peptide.
4. Vortex the reaction mixture for 5–10 minutes to ensure dissolution of the dry peptide material.
5. Sonicate the reaction mixture for 10 minutes.
6. Let the mixture react at room temperature for 1 hour.
7. Bring the methylated peptide to dryness in a SpeedVac (Eppendorf) and store at −80°C until further processed.

For the methylated form of angio II (Sar1Thr8), 1 mg of peptide was added to 730 µL of a solution composed of 40 µL of thionyl chloride in 1 mL anhydrous methanol. The mixture was sonicated for 10 minutes at 37°C and allowed to react at room temperature for 1 hour. The solvent was removed in the SpeedVac, and the peptide was reconstituted in 500 µL acetone. To the peptide solution, 50 µL of phosphorus oxychloride was added, and the solution was allowed to react at room temperature for 1 hour. The solvent was removed, and the viscous oil was reconstituted in 200 µL 10 mM ammonium bicarbonate with pH adjusted to 8 using 6 N KOH and either analyzed immediately or snap frozen in liquid nitrogen and stored at −80°C until further used.

7.4.3.4 C. crescentus Cell Protein Extraction with V-8 Protease Digestion. To a C.
crescentus sample containing ∼42 UA600 cells (equates to 12 mg total protein by bicinchoninic acid (BCA) analysis [Pierce Biotechnology Inc., Rockford, IL]), Roche Complete, Mini, ethylenediaminetetraacetic acid (EDTA)-free Protease Inhibitor Cocktail (Roche Applied Science, Mannheim, Germany) was added according to the manufacturer's suggested guidelines along with 6 M guanidine HCl. Phosphatase inhibitors were added to give final concentrations of 5 mM β-glycerophosphate, 5 mM sodium fluoride, and 1 mM sodium orthovanadate. The samples were vortexed for 1 minute and then placed on ice. After a period of 10 minutes, the samples were sonicated with five pulses of 30 seconds each. Then, the samples were vortexed and centrifuged at 13,000 rpm for 5 minutes. The supernatant representing the soluble protein fraction was removed, and the pellet was washed with water. The wash was combined with the supernatant, and a BCA assay was performed. Two aliquots containing 3.44 mg total protein each were desalted using Amicon Ultra-4 centrifugal filter devices (5000-Da molecular weight cutoff, Millipore Corp., Billerica, MA) by centrifuging at 4000 × g for 55 minutes at 25°C. The samples were then digested with Staphylococcus aureus V-8 protease (Pierce Biotechnology, Inc.) at an enzyme:protein ratio of 1:30.34 The V-8 protease is specific for cleavage at the carboxy-terminus side of glutamic acid and aspartic acid residues in the presence of a small amount of phosphate ion. The protease is stable at a pH of 7.6 in the presence of 6 M urea, 5.5 M guanidine HCl, and 0.5% sodium dodecyl sulfate (SDS), at 37°C (see Pierce Biotechnology, Inc.). The proteins were digested at pH ∼7.6 in 50 mM NH4HCO3 for 30 minutes at 37°C, and an aliquot was taken for one-dimensional gel electrophoresis (1-DE). Urea was added to make a 2 M solution, and SDS was added to make a 0.1% solution. The solution was then digested for 30 minutes (total 60 minutes) at 37°C, an aliquot was taken for 1-DE, and the sample was split in half (∼1.7 mg for each). One split was dried in the SpeedVac, snap frozen in liquid nitrogen, and stored at −80°C until further processed. Guanidine HCl was added to the second split to make a 2 M solution, together with V-8 protease (1:30).
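The cleavage specificity described above (carboxy-terminal side of Glu and Asp) can be sketched as a simple in silico digest. This is a minimal illustration, not the procedure used in the study: the function name `gluc_digest` is hypothetical, complete cleavage at every site is assumed, and missed cleavages or sequence-context effects are ignored.

```python
def gluc_digest(sequence: str, sites: str = "DE") -> list:
    """Split a protein sequence after every Asp (D) or Glu (E) residue.

    Assumes complete cleavage at each site; a site at the very C-terminus
    produces no extra (empty) fragment.
    """
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        if residue in sites and i < len(sequence) - 1:
            peptides.append(sequence[start:i + 1])
            start = i + 1
    peptides.append(sequence[start:])
    return peptides

# Standard peptides from Figure 7.6 used as toy inputs.
print(gluc_digest("INHDLR"))
print(gluc_digest("DRVYIHPFHL"))
```

Because cleavage occurs only after Asp and Glu, the resulting peptides can be long, which is relevant to the later remark about long peptide chains produced by the V-8 enzyme.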
The sample was then digested for 30 minutes (total 90 minutes) at 37°C. An aliquot was taken for 1-DE, and the sample was dried by SpeedVac, snap frozen in liquid nitrogen, and stored at −80°C until further processed.

7.4.3.5 1-D SDS-Polyacrylamide Gel Electrophoresis (PAGE). To monitor the digestion of the C. crescentus protein extracts by the V-8 endopeptidase, samples were separated according to digestion times using 1-D SDS-PAGE as described elsewhere.35 Briefly, the separations were performed according to the manufacturer's guidelines using a Mini-PROTEAN 3 Cell (Bio-Rad, Hercules, CA) and 1-mm-thick Ready Gel Tris-HCl gels with a 4%–20% gradient acrylamide composition (Bio-Rad). Precision Plus Protein Standards (Bio-Rad) ranged from 10 to 250 kDa. Prior to gel loading, the protein samples were mixed
Figure 7.8. One-dimensional SDS-PAGE gel monitoring the V-8 proteinase digestion of C. crescentus as undigested, after 30 minutes digestion, 60 minutes, and 90 minutes. Lanes, left to right: digested 90 minutes, digested 60 minutes, digested 30 minutes, undigested, and MW marker (250, 150, 100, 75, 50, 37, 25, 20, 15, and 10 kDa). As reported in the literature, it appears that the digestion of the proteins is at least 90% complete within 30 minutes of reaction. MW, molecular weight.
with a dye solution that contained the reducing agent Bond-Breaker TCEP (Pierce Biotechnology, Inc.) and heated at 95°C for 4 minutes. Approximately 15–20 µg of extracted protein determined by the BCA Protein Assay (Pierce) were subjected to SDS-PAGE at a constant voltage of 200 V. Gels were fixed, stained, destained, and then stored until analyzed.35 Figure 7.8 illustrates a 1-D SDS-PAGE gel monitoring the V-8 proteinase digestion of C. crescentus as undigested, after 30 minutes digestion, 60 minutes, and 90 minutes. The digestion is essentially complete after 30 minutes.

7.4.3.6 Phosphohistidine Enrichment by Cu(II)-Based IMAC. The study employed a custom-packed IMAC Macrotrap cartridge with a 50-µL bed volume (Michrom BioResources, Inc., Auburn, CA) for phosphopeptide enrichment. The general Cu(II) IMAC procedure was composed of the following steps (Fig. 7.9):
1. The column was activated with 500 µL 200 mM CuSO4.
2. Washed with 250 µL 0.1% acetic acid.
3. Equilibrated with 500 µL 50 mM MES (2-(N-morpholino)ethanesulfonic acid, Sigma-Aldrich) buffer with 10% acetonitrile (ACN).6
4. Sample (∼1.7 mg) dissolved in 50 mM MES buffer with 10% ACN was loaded onto the IMAC cartridge.
5. Washed with 250 µL MES/10% ACN buffer.
Immobilized Metal Affinity Chromatography ("IMAC") Method Development: used to enrich phosphorylated peptides

Iron(III)-based IMAC
  Serine/threonine/tyrosine residues
  Acidic conditions (∼2.5–3.5 pH); reacts with the phosphate group
  Noncompatible with phosphohistidine

Copper(II)-based IMAC
  Histidine residue
  Neutral conditions; reacts with the imidazole N
  Imidazole nitrogen (pK 6.0) protonated in pH 5.5 MES buffer
  Avoids nonspecific binding of non-PTM histidine
  Compatible with phosphohistidine

Figure 7.9. Summary of a comparison of iron(III)-based IMAC and copper(II)-based IMAC. Iron(III) is used under acidic conditions for the enrichment of the phosphorylated form of serine, threonine, and tyrosine. Copper(II) is used for approximately neutral conditions for the enrichment of the phosphorylated form of histidine.
6. Washed with 250 µL of 0.1% acetic acid.
7. Washed with 250 µL of water.
8. Eluted with 150 µL of 250 mM Na2HPO4 (pH ∼8.5).

7.4.3.7 Reversed-Phase (RP)/Nano-HPLC Separation. Peptide mixtures from in vitro phosphohistidine standards and in vivo phosphohistidine C. crescentus extracts were separated using a modified Agilent Series 1100 HPLC phosphoproteome nano-HPLC platform composed of a detachable solid-phase extraction (SPE) precolumn for sample loading and desalting, a fused-silica capillary nanocolumn for peptide separation, and a flow splitter/restrictor system for mobile-phase flow control to ∼100 nL/min. The SPE precolumns are 150-µm i.d. fused silica, ∼10 cm long, packed in-house with 10-µm POROS R2 material (Applied Biosystems, Carlsbad, CA) to a bed length of 4–6 cm. The SPE precolumns are fritted (Kasil® potassium silicate, PQ Corporation, Valley Forge, PA, chemical frit) for sample loading and desalting prior to analytical column separation. The analytical separation columns were composed of the following: (1) 50-µm i.d. fused silica (Polymicro Technologies Inc., Phoenix, AZ), 20 cm long, packed in-house with 5-µm carbon 18 (C18) Jupiter RP material for in vitro standard phosphohistidine analysis and (2) 50-µm i.d. fused silica (Polymicro Technologies Inc.), 40 cm long, packed in-house with 5-µm C18 Jupiter RP material for in vivo phosphohistidine analysis. The tips coupled to the columns for electrospray are 10-µm i.d. PicoClear tips (New Objective, Inc., Woburn, MA). The SPE precolumn and tips are connected to the analytical column using PicoClear unions (New Objective, Inc.). Standards and samples are bomb pressure-loaded onto the precolumn and desalted on the HPLC using mobile phase A, and the precolumn is then attached to the analytical column for peptide separation and mass spectral collection. The HPLC mobile phases were composed of 10 mM ammonium acetate in nanopure water at pH 6.7 (A) and 80% ACN/10 mM ammonium acetate in nanopure water (B). For in vitro phosphohistidine standard analysis, a linear gradient from 0% to 100% mobile phase B in 17 minutes followed by 0% B after 20 minutes with a stop time of 150 minutes was used. For in vivo phosphohistidine analysis, a linear gradient composed of 0% to 50% B in 120 minutes, 70% B in 140 minutes, 100% B in 150 minutes, hold 100% B until 190 minutes, followed by 0% B after 210 minutes with a stop time of 310 minutes was used.

7.4.3.8 Nano-ESI Nano-HPLC MS. Electrospray nano-HPLC MS of the in vitro phosphorylated peptides was performed on a Thermo model LCQ Deca (ThermoScientific, Bremen, Germany) ion trap mass spectrometer. The spray voltage was kept at 2.3 kV, and the heated capillary temperature was 175°C. The tandem mass spectrometry (MS/MS) spectra of eluting standards were collected using standard data-dependent analysis (DDA) of isolation and excitation procedures. A pictorial representation of the nano-HPLC/nano-ESI/LCQ Deca-MS/MS setup is illustrated in Figure 7.10. The standard or sample was pressure loaded onto an SPE precolumn and coupled to the HPLC after the flow splitter and prior to the analytical column and ESI nanospray tip.
This allows washing/desalting of the loaded sample without coupling to the column and nanospray tip. After washing/desalting, the SPE precolumn is then coupled to the analytical column using a PicoClear (New Objective) junction. Nanoflow rates (∼100 nL/min) were obtained using a three-way flow splitter coupled with a flow restrictor split to waste. The flow splitter is composed of a 50-µm fused-silica capillary column. A length of approximately 1 m is initially used, and the flow is checked from the spray tip (with a graduated 5-µL capillary tube). To decrease back pressure and subsequently decrease flow, the restrictor is cut until a suitable pressure and flow are obtained.
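The in vivo gradient program given in Section 7.4.3.7 can be written as a table of breakpoints and interpolated to give the mobile phase composition at any time. The sketch below is illustrative only. It assumes linear ramps between all stated breakpoints, including a linear return from 100% B at 190 minutes to 0% B at 210 minutes, which the text does not specify; the names `IN_VIVO_GRADIENT` and `percent_b` are hypothetical.

```python
# (time in minutes, %B) breakpoints for the in vivo gradient.
# The 190 -> 210 minute segment is ASSUMED to be a linear ramp back to 0% B.
IN_VIVO_GRADIENT = [(0, 0), (120, 50), (140, 70), (150, 100),
                    (190, 100), (210, 0), (310, 0)]

def percent_b(t: float, program=IN_VIVO_GRADIENT) -> float:
    """Percent mobile phase B at time t (minutes) by piecewise-linear interpolation."""
    if t <= program[0][0]:
        return float(program[0][1])
    for (t0, b0), (t1, b1) in zip(program, program[1:]):
        if t <= t1:
            return b0 + (b1 - b0) * (t - t0) / (t1 - t0)
    return float(program[-1][1])  # after the stop time, hold the final value

for minute in (60, 130, 170, 250):
    print(minute, percent_b(minute))
```

Writing the program as data makes it easy to compare against the shorter in vitro gradient by swapping in a different breakpoint list.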
Figure 7.10. The nano-HPLC/nano-ESI/LCQ Deca-MS/MS setup. Standard or sample is pressure loaded onto an SPE precolumn and coupled to the HPLC after the flow splitter and prior to the analytical column and ESI nanospray tip, allowing washing/desalting of the loaded sample without coupling to the column and nanospray tip. Labeled components: mobile phase A (10 mM ammonium acetate in nanopure water, pH 6.7); mobile phase B (10 mM ammonium acetate in 80% acetonitrile, pH 6.7); flow splitter; restrictor split to waste; SPE precolumn; fused-silica (packed) capillary column.
A quadrupole time-of-flight (Q-TOF) 2 mass spectrometer (Micromass, Manchester, UK) in DDA mode was also used for in vitro phosphorylated peptide analysis product ion spectral collection. Argon was used for collision gas in product ion spectral generation. The C. crescentus in vivo phosphorylated peptides were measured using a Thermo model LTQ linear quadrupole ion trap mass spectrometer (ThermoScientific) in DDA mode. For peptide fragmentation and sequencing, data-dependent data sets were collected for the 10 most abundant species after each mass scan range of m/z 300–2000. To enhance the identification of phosphopeptides, data sets were also collected with data-dependent MS/MS of the top six peptides, followed by multistage activation (MSA) of the neutral loss peak in the MS2 scan that was associated with a precursor peak loss corresponding to phosphate loss (i.e., a neutral loss of 26.7 Da, 32.7, 40.0, 49.0, 80.0, or 98.0).
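The six neutral-loss values listed above correspond to the nominal losses of HPO3 (80 Da) and H3PO4 (98 Da) divided by charge states 1 to 3. The following sketch regenerates that list and uses it to label an observed precursor/product pair. The function names and the 0.5 m/z matching tolerance are assumptions for illustration and are not instrument settings taken from the study.

```python
# Nominal neutral-loss masses (Da) used for MSA triggering in the text.
LOSSES = {"HPO3": 80.0, "H3PO4": 98.0}

def trigger_offsets(max_charge: int = 3) -> dict:
    """Map (loss name, charge) to the m/z offset between precursor and loss peak."""
    return {(name, z): round(mass / z, 1)
            for name, mass in LOSSES.items()
            for z in range(1, max_charge + 1)}

def classify_loss(precursor_mz: float, product_mz: float,
                  tol: float = 0.5, max_charge: int = 3) -> list:
    """Return the (loss, charge) pairs whose offset matches the observed m/z difference."""
    delta = precursor_mz - product_mz
    return [key for key, offset in trigger_offsets(max_charge).items()
            if abs(delta - offset) <= tol]

print(sorted(trigger_offsets().values()))
# INpHDLR example from Section 7.4.5: m/z 424.3 -> 375.3
print(classify_loss(424.3, 375.3))
```

For the INpHDLR pair, the only match is the H3PO4 loss at charge 2, consistent with the [M + 2H − H3PO4]2+ assignment given for that spectrum.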
7.4.3.9 Peptide Identification and False Discovery Rate (FDR) Determination. For initial candidate identification of phosphohistidine-containing peptides from C. crescentus extracts, all data collected from liquid chromatography (LC)-MS/MS analyses were analyzed using SEQUEST and the following search criteria for phosphorylated peptides (LC-LTQ MS/MS): dynamic phosphorylation of S, T, Y, D, and H residues, no enzyme rule, and cutoff of ±1.5 Da for the precursor masses. Data were searched against the Genbank entry for C. crescentus CB15 (AE005673.faa containing 3737 protein entries available at www.ncbi.nih.gov/). To estimate the FDR, the C. crescentus database was searched as a decoy database, that is, the reversed C. crescentus database was appended to the forward database and included in the SEQUEST search. The FDR was estimated from the forward and reverse (decoy) filtered matches and was calculated as a ratio of two times the number of false positives to the total number of identified peptides. Due to high numbers of reverse identifications (as expected from a multiple five-residue dynamic phosphorylation search), the identifications of phosphohistidine-containing peptides were limited to CS 2 peptides that were manually confirmed from the highest scoring results. While CS 3 peptides scored high, ambiguity in the assignment of the modification site and the artifactual effect of long peptide chains produced from the V-8 enzyme limited their final reporting.

7.4.4 C18 RP LC Behavior

With the synthesized standard peptide nano-LC-nano-ESI Q-TOF/MS analysis, qualitative spectral comparisons demonstrated that the conversion of the aspartate and glutamate residues in the standard peptides to methyl esters was at an approximate level of 90% up to 100%. Conversion of the unmodified His residues in the standard peptides to the phosphorylated state was observed at approximate levels of 50%–80%.
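For reference, the decoy-based FDR estimate described in Section 7.4.3.9 can be sketched in a few lines. The function names are hypothetical, reverse (decoy) hits are taken as the count of false positives, and the total is assumed to include both forward and reverse filtered matches; the text does not state which total was used.

```python
def reverse_decoy(sequence: str) -> str:
    """Build a decoy entry by reversing a protein sequence."""
    return sequence[::-1]

def estimate_fdr(n_reverse_hits: int, n_total_hits: int) -> float:
    """FDR = 2 x (reverse/decoy hits) / (all filtered hits).

    ASSUMPTION: n_total_hits counts forward and reverse matches together.
    """
    if n_total_hits <= 0:
        raise ValueError("n_total_hits must be positive")
    return 2 * n_reverse_hits / n_total_hits

# Toy numbers for illustration only; they are not results from the study.
print(reverse_decoy("INHDLR"))
print(estimate_fdr(n_reverse_hits=5, n_total_hits=200))
```

The factor of two reflects the expectation that for every wrong match landing in the reversed database, a similar wrong match lands undetected in the forward database.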
The general trend observed was that phosphorylating the peptide decreased the retention time, while methylating the peptide increased it. This would be expected in C18 RP LC due to changes in the polarity of the peptide, where methylation decreases the polarity while phosphorylation increases it. The relatively high pH of 6.7, however, had a peak-broadening effect on all of the peptides measured in the study. These effects did appear to be somewhat peptide-dependent, with methylation and phosphorylation having less effect on retention times for some peptides than for others. This is
Figure 7.11. (a) Q-TOF/MS total ion chromatogram (TIC) of the methylated and phosphorylated form of the synthetic peptide angio II (Sar1Thr8), which can potentially contain three phosphorylations: one on tyrosine, one on histidine, and one on threonine. Labeled TIC peaks: 33.56, 44.92, 51.44, and 69.23 minutes. (b) Triggering of the product ion scan collection for the methylated and nonphosphorylated form at ∼73 minutes (72.86), for the methylated and doubly phosphorylated form at ∼50 minutes (50.43), and for the triply phosphorylated form at ∼36 minutes (36.22).
probably in relation to the number of methylations and phosphorylations taking place and the polarity of the particular peptide. Figure 7.11 illustrates the effects of methylation and phosphorylation on the peptide retention times. Figure 7.11a is a total ion chromatogram (TIC) of the synthetic peptide angio II (Sar1Thr8) in a methylated and phosphorylated form collected on the Q-TOF mass spectrometer. Angio II (Sar1Thr8) can potentially contain three phosphorylations: one on Tyr, one on His, and one on Thr. The Q-TOF/MS DDA method that generated Figure 7.11a was set up to scan for and measure all possible forms of the methylated and phosphorylated angio II (Sar1Thr8). Figure 7.11b illustrates the triggering of the product ion scan collection for the methylated and nonphosphorylated form at ∼73 minutes, for the methylated and doubly phosphorylated form at ∼50 minutes, and for the triply phosphorylated form at ∼36 minutes. The complex nature of the retention time of phosphorylated peptides in RP LC was recently illustrated by Kim et al.,36 who developed methodology for
predicting elution times based on normalized retention times and gave a more detailed explanation of the complex behavior due to near neighbor residues. In agreement with the study reported by Kim et al.,36 increasing the number of phosphorylation sites on the peptides tends to decrease the peptide's retention time due to an increase in the polarity of the peptide. For example, the angio II (Sar1Thr8) peptide in the methylated form with no phosphorylation sites has a retention time of 73 minutes, while the peptide with two phosphorylations has a retention time of 50 minutes and the triply phosphorylated form 36 minutes. For a more detailed treatment of the complex nature of the retention time of phosphorylated peptides in RP LC, the reader is directed to the Kim et al.36 study.

7.4.5 Phosphohistidine Loses HPO3 and H3PO4

Often, fragmentation of phosphorylated peptides in an ion trap mass spectrometer results in the neutral loss of the PTM with little other peptide sequence information in the way of product ions. It has become standard procedure in phosphopeptide analysis to incorporate mass spectrometric techniques such as neutral loss-triggered MS3 and MSA to enhance the product ion spectral information collected during peptide fragmentation. The same generally holds for phosphohistidine-containing peptides: upon CID of the phosphorylated peptide, the major product ion observed is associated with neutral loss of the phospho modification. The peptide was analyzed using the HPLC system on the LCQ mass spectrometer. The elution and subsequent detection of the peptides is illustrated in the TIC of Figure 7.12a. Figure 7.12b illustrates the product ion spectrum of the in vitro pH-containing peptide INpHDLR at m/z 424.3 as [M + 2H]2+ where the predominant product ion at m/z 375.3 is produced through neutral loss of the phosphate moiety [M + 2H − H3PO4]2+ with little other peptide sequence information in the way of product ions.
For the methylated form of the phosphohistidine-containing peptides, the predominant ion trap product ion arises from the neutral loss of 80 Da for the phosphate group as HPO3, and it is observed in conjunction with a neutral loss of 98 Da as H3PO4. This is illustrated in Figure 7.12c for the methylated form of the pH-containing peptide INpHD#LR# (# represents methylated peptides) at m/z 438.5. This is the doubly charged peptide as [M + 2H]2+, where the predominant product ion at m/z 398.3 is for the neutral loss of the phosphate moiety as [M + 2H − HPO3]2+, followed by the product ion at m/z 389.3 for the neutral loss of the phosphate moiety as [M + 2H − H3PO4]2+.
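The arithmetic behind these neutral loss assignments can be sketched in a few lines of Python. This is a minimal illustration, not part of the original analysis: the function name `neutral_loss_mz` is hypothetical, and the loss masses are the monoisotopic values quoted in Figures 7.16 and 7.17. Because the departing fragment is neutral, the ion keeps its charge z and its m/z drops by the loss mass divided by z.

```python
# Monoisotopic neutral-loss masses in Da, as quoted in Figures 7.16 and 7.17.
NEUTRAL_LOSS_DA = {"HPO3": 79.9663, "H3PO4": 97.9769}


def neutral_loss_mz(precursor_mz: float, charge: int, loss: str) -> float:
    """Return the m/z expected after a neutral loss from a precursor ion.

    The lost fragment carries no charge, so the product ion keeps charge z
    and its m/z decreases by (loss mass) / z.
    """
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return precursor_mz - NEUTRAL_LOSS_DA[loss] / charge


if __name__ == "__main__":
    # Methylated INpHD#LR#, [M + 2H]2+ at m/z 438.5 (Figure 7.12c).
    for loss in ("HPO3", "H3PO4"):
        print(loss, round(neutral_loss_mz(438.5, 2, loss), 2))
```

For the doubly charged precursor at m/z 438.5, the two computed values fall within about 0.2 m/z units of the product ions reported at m/z 398.3 and 389.3, which is consistent with the assignments given in the text.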
Figure 7.12. Data collected on the LCQ ion trap mass spectrometer: (a) total ion chromatogram (TIC), relative abundance versus time (minutes); (b) product ion spectrum of the in vitro phosphorylated histidine-containing peptide INpHDLR at m/z 424.3 as [M + 2H]2+, where the predominant product ion at m/z 375.3 is produced through neutral loss of the phosphate moiety as [M + 2H − H3PO4]2+; and (c) product ion spectrum of the methylated form of the phosphorylated histidine-containing peptide INpHD#LR# (# represents methylated peptides) at m/z 438.5 as [M + 2H]2+, where the predominant product ion at m/z 398.3 is for the neutral loss of the phosphate moiety as [M + 2H − HPO3]2+ and the product ion at m/z 389.3 is for the neutral loss of the phosphate moiety as [M + 2H − H3PO4]2+.
However, there is still little other peptide sequence information in the way of product ions.
7.4.5.1 Rationale for H3PO4 Loss. Methylating the peptide effectively removes any available hydroxyl group that would otherwise promote loss of H3PO4 in the form of HPO3 loss plus water. A brief discussion of phosphate neutral loss from the amino acid residues during CID will illustrate the importance of this observation.
7.4.5.1.1 Ser and Thr H3PO4 Loss. The loss of the phosphorylation that is associated with Ser has been proposed to occur through a β (beta)-elimination mechanism producing dehydroalanine as [M − H3PO4].37 When the phosphorylation modification of a peptide takes place on the Thr amino acid residue, the predominant product ion spectral peak derived from CID has also been proposed to arise through β-elimination of the phosphate group. The neutral loss of the phosphate group, −98 Da as [M + H − H3PO4]+, from the Thr residue produces dehydroaminobutyric acid.37 These two forms of the β-elimination mechanism, producing dehydroalanine from phosphorylated Ser and dehydroaminobutyric acid from Thr, are illustrated in Figure 7.13. Recently, however, Palumbo et al.38 have demonstrated that the neutral loss of the phosphate group as H3PO4 does not occur predominantly through a β-elimination reaction that is charge remote driven but instead takes place through an SN2 neighboring group participation reaction that is charge directed, as illustrated in Figure 7.14. A difference in the product ion spectrum for fragmentation of phosphorylated Thr as compared with phosphorylated Ser is a product ion peak observed for the neutral loss of 80 Da. This product ion represents dephosphorylation through neutral loss of HPO3 from the precursor ion as [M − HPO3] and is illustrated in Figure 7.15. This particular loss, which would return the residue to the structure of the original amino acid, is not observed in the product ion spectrum of phosphorylated Ser.
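The mass bookkeeping in this discussion can be checked with a short Python sketch. It is offered only as an illustration under stated assumptions: the helper names (`monoisotopic_mass`, `residue_after_loss`) are hypothetical, the atomic masses are standard monoisotopic values, and the residue compositions are those of the unmodified Ser and Thr residues within a peptide chain. The sketch shows that H3PO4 is equal in mass to HPO3 plus H2O, and that loss of H3PO4 from phosphorylated Ser or Thr leaves the masses of dehydroalanine and dehydroaminobutyric acid residues, while loss of HPO3 returns the original residue.

```python
# Standard monoisotopic atomic masses in Da.
ATOMIC_MASS = {
    "H": 1.00782503,
    "C": 12.0,
    "N": 14.00307401,
    "O": 15.99491462,
    "P": 30.97376200,
}

# Elemental compositions of the neutral losses and of the unmodified residues.
HPO3 = {"H": 1, "P": 1, "O": 3}
H3PO4 = {"H": 3, "P": 1, "O": 4}
H2O = {"H": 2, "O": 1}
SER_RESIDUE = {"C": 3, "H": 5, "N": 1, "O": 2}
THR_RESIDUE = {"C": 4, "H": 7, "N": 1, "O": 2}


def monoisotopic_mass(composition: dict) -> float:
    """Sum the monoisotopic atomic masses for an elemental composition."""
    return sum(ATOMIC_MASS[element] * count for element, count in composition.items())


def residue_after_loss(residue: dict, loss: dict) -> float:
    """Mass of a phosphorylated residue (residue + HPO3) after a neutral loss."""
    phosphorylated = monoisotopic_mass(residue) + monoisotopic_mass(HPO3)
    return phosphorylated - monoisotopic_mass(loss)


if __name__ == "__main__":
    print("HPO3        ", round(monoisotopic_mass(HPO3), 4))
    print("H3PO4       ", round(monoisotopic_mass(H3PO4), 4))
    print("HPO3 + H2O  ", round(monoisotopic_mass(HPO3) + monoisotopic_mass(H2O), 4))
    print("pSer - H3PO4", round(residue_after_loss(SER_RESIDUE, H3PO4), 4))
    print("pThr - H3PO4", round(residue_after_loss(THR_RESIDUE, H3PO4), 4))
    print("pThr - HPO3 ", round(residue_after_loss(THR_RESIDUE, HPO3), 4))
```

Because the 98 Da loss is isobaric with the sum of an 80 Da loss and a water loss, mass alone cannot distinguish a concerted H3PO4 loss from a two-step pathway; that distinction rests on the mechanistic arguments in the text.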
7.4.5.1.2 Tyr HPO3 and H3PO4 Loss. Product ion spectra of phosphorylated Tyr (pY)-containing peptides also illustrate losses associated with 80 and 98 Da, which would appear to be similar to the losses observed with phosphorylated Thr peptides. The 80-Da loss is due to dephosphorylation of the Tyr residue resulting in the original structure of the Tyr residue, similar to a mechanism for phosphohistidine as illustrated in Figure 7.16. However, due to the structure of Tyr, a similar mechanism of β-elimination for loss of the phosphate (−98 Da, H3PO4)
STUDY OF PH-CONTAINING PEPTIDES BY NANO-ESI TANDEM MSâ•…â•… 273
Figure 7.13. β-Elimination pathways for phosphoserine and phosphorylated threonine (pT). The loss of the phosphorylation from serine (Ser) through a β (beta)-elimination mechanism produces dehydroalanine as [M − H3PO4].31 The neutral loss of the phosphate group, −98 Da as [M + H − H3PO4]+, from the threonine residue produces dehydroaminobutyric acid.31
group is not likely. The neutral loss of phosphate as 98 Da probably does not happen through a two-step mechanism involving both water (H2O) loss of 18 Da and HPO3 loss of 80 Da, in any order. The neutral loss of 98 Da is likely associated with some form of peptide rearrangement in the fragmentation pathway mechanism. The condition of phosphate loss from Tyr is identical to that of pH, where the amino acid structure does not allow a loss of 98 Da in the form of H3PO4, as illustrated in Figure 7.17.
7.4.5.1.3 His HPO3 and H3PO4 Loss. The product ion spectra in Figure 7.12 suggest that methylation is promoting phosphate loss
Figure 7.14. Pathways for the neutral loss of H3PO4: (Pathway A) charge-remote β-elimination, (Pathway B) charge-directed E2 elimination, and (Pathway C) charge-directed SN2 neighboring group participation. The neutral loss of the phosphate group as H3PO4 does not occur predominantly through a β-elimination reaction that is charge remote driven but instead takes place through an SN2 neighboring group participation reaction that is charge directed.
Figure 7.15. Dephosphorylation of threonine: product ion fragmentation pathway for phosphorylated threonine for the neutral loss of 80 Da. This product ion represents dephosphorylation through neutral loss of HPO3 from the precursor ion as [M − HPO3].
Figure 7.16. Neutral loss of the phosphate group upon CID as −80 Da [M − HPO3] (79.9663 Da as HPO3) is similar for tyrosine and histidine, where the original, unmodified residue is obtained.
from the His residue in the form of HPO3, returning the residue to its original state. However, with the observation of a 98 Da loss, and with any free hydroxyl groups removed through methylation, the fragmentation pathway mechanism for H3PO4 involves a complex rearrangement process within the peptide chain backbone, which at first thought could be associated with the gas-phase production of the b ion during CID. It is more likely, however, that a charge-directed SN2 neighboring group participation reaction is taking place, similar to the mechanisms reported by Palumbo et al.38 and illustrated in Figure 7.14.
Figure 7.17. Neutral loss of 97.9769 Da as H3PO4 upon CID. Due to the structure of tyrosine, a similar mechanism of β-elimination for loss of the phosphate (−98 Da, H3PO4) group is not likely. The neutral loss of phosphate as 98 Da probably does not happen through a two-step mechanism involving both water (H2O) loss of 18 Da and HPO3 loss of 80 Da, in any order. The neutral loss of 98 Da is likely associated with some form of peptide rearrangement in the fragmentation pathway mechanism. The condition of phosphate loss from tyrosine is identical to that of phosphorylated histidine, where the amino acid structure does not allow a loss of 98 Da in the form of H3PO4.
Figure 7.18 presents a rationale for the neutral loss of 98 Da as H3PO4. In step 1, the carbonyl backbone attacks the charged phosphate group, transferring the charge to the peptide amide moiety. In step 2, rearrangement takes place with the subsequent neutral loss of the phosphate group.
Figure 7.18. Rationale for the neutral loss of 98 Da (97.9769 Da as H3PO4) from phosphohistidine. In step 1, the carbonyl backbone attacks the charged phosphate group, transferring the charge to the peptide amide moiety. In step 2, rearrangement takes place with the subsequent neutral loss of the phosphate group.
7.4.6 Q-TOF/MS/MS Product Ion Spectra
Enhanced fragmentation efficiency of phosphorylated peptides is observed when using CID on a Q-TOF mass spectrometer, affording higher confidence in the identification of the synthesized phosphohistidine-containing peptide and in the location of the phosphorylation site. To further study the fragmentation behavior of pH-containing peptides, the remaining product ion spectra of the nonmethylated and methylated phosphohistidine-containing peptides were collected on a Q-TOF mass spectrometer.
7.4.6.1 pH-Containing Peptide INpHDLR. Figure 7.19a illustrates the product ion spectrum of the nonmethylated form of the
Figure 7.19. (a) Product ion spectrum of the nonmethylated form of the phosphorylated histidine-containing peptide INpHDLR at m/z 847 as [M + H]+. The predominant product ion at m/z 749.5 is for the neutral loss of the phosphate moiety as [M + H − H3PO4]+ at 98 Da. A small product ion at m/z 767.5 is observed for the neutral loss of the phosphate moiety as [M + H − HPO3]+ at 80 Da. (b) Product ion spectrum of the methylated form of the phosphorylated histidine-containing peptide INpHD#LR# (# represents methylated peptides) at m/z 875 as [M + H]+. The predominant product ion at m/z 795.5 is for the neutral loss of the phosphate moiety as [M + H − HPO3]+ at 80 Da. The product ion at m/z 777.5 is for the neutral loss of the phosphate moiety as [M + H − H3PO4]+ at 98 Da. Methylating the peptide has had an effect on the neutral loss of the phosphate moiety.
pH-containing peptide INpHDLR at m/z 847 as [M + H]+, where the predominant product ion at m/z 749.5 is for the neutral loss of the phosphate moiety as [M + H − H3PO4]+ at 98 Da. Notice the small product ion at m/z 767.5 for the neutral loss of the phosphate moiety as [M + H − HPO3]+ at 80 Da. As can be observed in the product ion spectrum, good coverage of the peptide backbone is achieved. Figure 7.19b illustrates the product ion spectrum of the methylated form of the pH-containing peptide INpHD#LR# (# represents methylated peptides) at m/z 875 as [M + H]+. In the methylated form of the peptide, the predominant product ion at m/z 795.5 is for the neutral loss of the phosphate moiety as [M + H − HPO3]+ at 80 Da. Notice that the product ion at m/z 777.5 for the neutral loss of the phosphate moiety as [M + H − H3PO4]+ at 98 Da is no longer the major loss, as it was for the nonmethylated form of the peptide in Figure 7.19a. Methylating the
Figure 7.20. Product ion spectrum of the doubly charged (2+) methylated form of the phosphorylated histidine-containing peptide INpHD#LR# (# represents methylated peptides) at m/z 438.5 as [M + 2H]2+. The predominant product ion at m/z 398.3 for the neutral loss of the phosphate moiety as [M + 2H − HPO3]2+ is similar to the singly charged peptide, followed by the product ion at m/z 389.3 for the neutral loss of the phosphate moiety as [M + 2H − H3PO4]2+.
peptide has had an effect on the neutral loss of the phosphate moiety, which we will now investigate further.
7.4.6.2 Doubly Charged (2+) Peptide INpHDLR. Figure 7.20 illustrates the product ion spectrum of the doubly charged (2+) methylated form of the pH-containing peptide INpHD#LR# (# represents methylated peptides) at m/z 438.5 as [M + 2H]2+, where the predominant product ion at m/z 398.3 is also for the neutral loss of the phosphate moiety as [M + 2H − HPO3]2+, followed by the product ion at m/z 389.3 for the neutral loss of the phosphate moiety as [M + 2H − H3PO4]2+. This shows similarity with the singly charged product ion spectrum in Figure 7.19b.
7.4.6.3 pH-Containing Peptide pHLGLAR. A second pH-containing peptide was methylated and phosphorylated as a check of this fragmentation behavior. For comparison, the product ion spectrum of the methylated form of the pH-containing peptide INpHD#LR# at m/z 438.5 as [M + 2H]2+ is illustrated in Figure 7.21a. Figure 7.21b illustrates the product ion spectrum of the methylated form of the pH-containing peptide pHLGLAR# at m/z 380.0 as [M + 2H]2+, where the predominant product ion at m/z 340.8 is for the neutral loss of the phosphate moiety as [M + 2H − HPO3]2+, followed by the product ion at m/z 331.8 for the neutral loss of the phosphate moiety as [M + 2H − H3PO4]2+. In this product ion spectrum, a striking difference is observed between the
Figure 7.21. Product ion spectra collected on the Q-TOF mass spectrometer of (a) the methylated form of the phosphorylated histidine-containing peptide INpHD#LR# (# represents methylated peptides) at m/z 438.5 as [M + 2H]2+, where the predominant product ion at m/z 398.3 is for the neutral loss of the phosphate moiety as [M + 2H − HPO3]2+ and the product ion at m/z 389.3 is for the neutral loss of the phosphate moiety as [M + 2H − H3PO4]2+, and (b) the methylated form of the phosphorylated histidine-containing peptide pHLGLAR# at m/z 380.0 as [M + 2H]2+, where the predominant product ion at m/z 340.8 is for the neutral loss of the phosphate moiety as [M + 2H − HPO3]2+ and the product ion at m/z 331.8 is for the neutral loss of the phosphate moiety as [M + 2H − H3PO4]2+.
two neutral losses associated with the phosphate moiety on the His residue.
7.4.6.4 Singly Charged (1+) Peptide pHLGLAR. Finally, the product ion spectrum of the singly charged form of the methylated, phosphohistidine-containing peptide pHLGLAR was collected at m/z 760 as [M + H]+ and is illustrated in Figure 7.22a. For comparison, the doubly charged form of the peptide is illustrated in Figure 7.22b. An enhanced product ion for the neutral loss of the phosphate moiety at −80 Da as [M + H − HPO3]+ is observed along with the loss of −98 Da as [M + H − H3PO4]+.
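The way such neutral loss peaks are picked out of a product ion spectrum can be sketched as follows. This is a minimal illustration and not the data processing used in the study: the function name `annotate_neutral_losses` and the 0.3 m/z matching tolerance are assumptions, and the peak list contains only the m/z labels readable from the singly charged spectrum in Figure 7.22a, without intensities.

```python
# Monoisotopic neutral-loss masses in Da, as quoted in Figures 7.16 and 7.17.
NEUTRAL_LOSS_DA = {"HPO3": 79.9663, "H3PO4": 97.9769}


def annotate_neutral_losses(precursor_mz, charge, peak_mzs, tol=0.3):
    """Map each neutral loss to the closest peak within tol of its expected m/z.

    The tolerance is an illustrative assumption, not an instrument specification.
    Losses with no peak inside the tolerance window are left out of the result.
    """
    found = {}
    for label, loss_da in NEUTRAL_LOSS_DA.items():
        target = precursor_mz - loss_da / charge
        candidates = [mz for mz in peak_mzs if abs(mz - target) <= tol]
        if candidates:
            found[label] = min(candidates, key=lambda mz: abs(mz - target))
    return found


if __name__ == "__main__":
    # m/z labels read from Figure 7.22a (methylated pHLGLAR, [M + H]+ at m/z 760.5).
    peaks = [223.2, 251.2, 308.2, 356.3, 421.3, 430.3, 543.5, 662.5, 680.5, 760.5]
    print(annotate_neutral_losses(760.5, 1, peaks))
```

Applied to the singly charged precursor at m/z 760.5, the sketch assigns the peak at m/z 680.5 to the HPO3 loss and the peak at m/z 662.5 to the H3PO4 loss, in line with the labels in Figure 7.22a.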
Figure 7.22. (a) Product ion spectrum of the singly charged form of the methylated, phosphohistidine-containing peptide pHLGLAR at m/z 760 as [M + H]+. (b) Product ion spectrum of the doubly charged form of the peptide at m/z 380 as [M + 2H]2+. For the singly charged species, an enhanced product ion for the neutral loss of the phosphate moiety at −80 Da as [M + H − HPO3]+ is observed along with the loss of −98 Da as [M + H − H3PO4]+.
7.4.7 Behavior of Monophosphohistidine and Diphosphohistidine Peptide
The selective, specific in vitro phosphorylation of the His residue allows a peptide to be phosphorylated exclusively on His even when it contains other amino acid residues that can be phosphorylated, such as Ser, Thr, or Tyr. This allows further investigation into the fragmentation behavior of phosphohistidine-containing peptides.
7.4.7.1 Peptide Angio I as DRVYIHPFHL
7.4.7.1.1 Singly Phosphorylated Angio I. The peptide angio I, DRVYIHPFHL, contains two His residues that may be phosphorylated and a Tyr residue. Figure 7.23a illustrates the product ion spectrum of the nonmethylated form of the singly pH-containing peptides DRVYIpHPFHL and DRVYIHPFpHL (the product ion spectrum shows a mixture of the two isomeric forms of this phosphorylated peptide) at m/z 688.8 as [M + 2H]2+, where the predominant product ion at m/z 640.8 is for
Figure 7.23. Q-TOF product ion spectra (a) of the nonmethylated form of the singly phosphorylated histidine-containing peptides DRVYIpHPFHL and DRVYIHPFpHL at m/z 688.8 as [M + 2H]2+, where the predominant product ion at m/z 640.8 is for the neutral loss of the phosphate moiety as [M + 2H − H3PO4]2+ and the product ion at m/z 648.8 is for the neutral loss of the phosphate moiety as [M + 2H − HPO3]2+, and (b) of the methylated form of the singly phosphorylated histidine-containing peptides D#RVYIpHPFHL# and D#RVYIHPFpHL# at m/z 702.8 as [M + 2H]2+, where the predominant product ion at m/z 662.9 is for the neutral loss of the phosphate moiety as [M + 2H − HPO3]2+ and the product ion at m/z 653.9 is for the neutral loss of the phosphate moiety as [M + 2H − H3PO4]2+. This illustrates the enhanced loss of 80 Da as HPO3 upon methylation of the peptide.
the neutral loss of the phosphate moiety as [M + 2H − H3PO4]2+, followed by the product ion at m/z 648.8 for the neutral loss of the phosphate moiety as [M + 2H − HPO3]2+. Figure 7.23b illustrates the product ion spectrum of the methylated form of the singly pH-containing peptides D#RVYIpHPFHL# and D#RVYIHPFpHL# at m/z 702.8 as [M + 2H]2+, where the predominant product ion at m/z 662.9 is for the neutral loss of the phosphate moiety as [M + 2H − HPO3]2+, followed by the product ion at m/z 653.9 for the neutral loss of the phosphate moiety as [M + 2H − H3PO4]2+. The product ion spectrum in Figure 7.23b illustrates the enhanced loss of 80 Da as HPO3 upon methylation of the peptide. It is clear from these product ion spectra that methylation is promoting phosphate loss from the His residue in the form of HPO3, returning the residue to its original state. However, with the observation of a 98 Da loss, and the removal of any free hydroxyl
Figure 7.24. Angiotensin I: behavior of the diphosphorylated peptide DRVYIpHPFpHL that is nonmethylated, MS/MS of m/z 728 as [M + 2H]2+. (a) Product ion spectrum of the diphosphorylated form of the peptide in the nonmethylated form and +2 charge state, observed to lose 98 Da as [M + 2H − H3PO4]2+, 178 Da as [M + 2H − H3PO4 − HPO3]2+, and 196 Da as [M + 2H − 2H3PO4]2+. (b) Expanded region of the product ion spectrum associated with the neutral loss of the phosphate moiety.
groups through methylation, the fragmentation pathway mechanism for H3PO4 involves a complex rearrangement process within the peptide chain backbone.
7.4.7.1.2 Doubly Phosphorylated Angio I. The diphosphorylated form of the peptide was observed, in the nonmethylated form and +2 CS, to lose 98 Da as [M + 2H − H3PO4]2+, 178 Da as [M + 2H − H3PO4 − HPO3]2+, and 196 Da as [M + 2H − 2H3PO4]2+. The product ion spectrum is illustrated in Figure 7.24a. The region of the product ion spectrum associated with the neutral loss of the phosphate moiety has been expanded and is illustrated in Figure 7.24b. The methylated form of the diphosphorylated peptide was observed to lose only 98 Da as [M + 2H − H3PO4]2+ in association with the phosphorylation, as illustrated in Figure 7.25. This is likely because loss of H2O is available from one of the phosphate groups.
7.4.7.1.3 CS-Dependent Phosphorylated Angio I. Interestingly, the methylated forms of the monophosphorylated and diphosphorylated
Figure 7.25. Product ion spectrum of the diphosphorylated and methylated peptide DRVYIpHPFpHL, MS/MS of m/z 742 as [M + 2H]2+. The methylated form of the diphosphorylated peptide was observed to lose only 98 Da as [M + 2H − H3PO4]2+ in association with the phosphorylation.
Figure 7.26. (a) The angio I methylated peptide with one phosphohistidine (pHis) residue at m/z 468 as [M + 3H]3+. (b) The angio I methylated peptide with two phosphohistidine residues at m/z 495 as [M + 3H]3+. The methylated forms of the monophosphorylated and diphosphorylated peptides were both observed to be "charge state dependent," where loss associated with the HPO3 group as 80 or 98 Da was not observed in either of the +3 charge states.
peptides were both observed to be "CS dependent," where loss associated with the HPO3 group as 80 or 98 Da was not observed in either of the +3 CSs. Figure 7.26 illustrates this effect, where Figure 7.26a is the angio I methylated peptide with one phosphohistidine residue at m/z 468 as [M + 3H]3+, and Figure 7.26b is the angio I methylated peptide with two phosphohistidine residues at m/z 495 as [M + 3H]3+.
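The relationship between the doubly and triply charged precursors discussed here can be sketched as a simple charge state conversion. This is an illustration only: the function names are hypothetical, the proton mass of 1.007276 Da is the standard value, and the input m/z values are the doubly charged precursors quoted for the methylated monophosphorylated (m/z 702.8) and diphosphorylated (m/z 742) peptides.

```python
PROTON_DA = 1.007276  # mass of a proton in Da


def neutral_mass(mz: float, charge: int) -> float:
    """Neutral monoisotopic mass M from the m/z of an [M + zH]z+ ion."""
    return (mz - PROTON_DA) * charge


def mz_at_charge(mass: float, charge: int) -> float:
    """m/z of the [M + zH]z+ ion for a neutral mass M."""
    return mass / charge + PROTON_DA


def convert_charge_state(mz: float, from_charge: int, to_charge: int) -> float:
    """Predict the m/z of the same peptide at a different protonation state."""
    return mz_at_charge(neutral_mass(mz, from_charge), to_charge)


if __name__ == "__main__":
    # Methylated angio I with one and with two phosphohistidine residues.
    for label, mz_2plus in (("one pHis", 702.8), ("two pHis", 742.0)):
        print(label, round(convert_charge_state(mz_2plus, 2, 3), 2))
```

The predicted triply charged values, near m/z 468.9 and 495.0, agree with the nominal precursors at m/z 468 and 495 selected for Figure 7.26, which supports reading those ions as [M + 3H]3+.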
7.4.8 Behavior of Phosphotyrosine and Phosphohistidine Peptide
The peptide angio II, DRVpYIHPF, contains a pY residue and an unmodified His residue. This allows the study of the behavior of the neutral loss of the phosphate group from His when phosphotyrosine is present.
7.4.8.1 Peptide Angio II as DRVpYIHPF. The product ion spectrum of the nonmethylated and nonphosphohistidine-containing peptide DRVpYIHPF at m/z 563 as [M + 2H]2+ is illustrated in Figure 7.27a. It shows loss of both HPO3 and H3PO4, as is often observed with phosphotyrosine-containing peptides.
7.4.8.2 Phosphorylated Angio II as DRVpYIpHPF. The nonmethylated and phosphohistidine-containing peptide DRVpYIpHPF was also synthesized and analyzed. The product ion spectrum also illustrated loss of both HPO3 and H3PO4 but at higher intensities as compared with the nonmethylated and nonphosphohistidine-containing peptide in Figure 7.27a. The product ion spectrum of the nonmethylated and
Figure 7.27. (a) Product ion spectrum of the nonmethylated and nonphosphohistidine-containing angiotensin II peptide DRVpYIHPF at m/z 563 as [M + 2H]2+, where both HPO3 and H3PO4 losses are shown. (b) Product ion spectrum of the nonmethylated and phosphohistidine-containing peptide DRVpYIpHPF at m/z 603 as [M + 2H]2+.
Figure 7.28. (a) Product ion spectrum of the methylated and phosphotyrosine-containing angiotensin II peptide DRVpYIHPF at m/z 577 as [M + 2H]2+, where both HPO3 and H3PO4 losses are shown. (b) Product ion spectrum of the methylated, phosphotyrosine and phosphohistidine angiotensin II peptide DRVpYIpHPF at m/z 617 as [M + 2H]2+, where only the 80 Da loss as HPO3 is observed.
phosphohistidine-containing peptide DRVpYIpHPF at m/z 603 as [M + 2H]2+ is illustrated in Figure 7.27b, where both HPO3 and H3PO4 losses are shown. These two losses appear to be enhanced with the phosphorylation of the His residue. The methylated form of this peptide was synthesized and was also observed to have losses associated with HPO3 and H3PO4 upon CID. The product ion spectrum of the methylated and phosphotyrosine angio II peptide DRVpYIHPF at m/z 577 as [M + 2H]2+ is illustrated in Figure 7.28a and shows behavior similar to that of Figure 7.27a, where HPO3 loss is greater than H3PO4 loss. The product ion spectrum of the methylated, phosphotyrosine and phosphohistidine angio II peptide DRVpYIpHPF at m/z 617 as [M + 2H]2+ is illustrated in Figure 7.28b. Here, in contrast to Figure 7.27b, only the HPO3 loss is observed. Measurement of the phosphohistidine-containing peptide that also contains phosphotyrosine will result in the observation of both 80 and 98 Da losses. However, with phosphorylation of the His residue, the losses are enhanced, and upon methylation, the loss of 80 Da as HPO3 is greatly enhanced, which may be used for verification of the phosphorylation sites.
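The precursor m/z values quoted for the four angio II forms can be related to one another with a short sketch. Several points here are assumptions rather than statements from the text: the function name `modified_mz` is hypothetical, each methyl esterification is taken to add CH2 (14.01565 Da), and the count of two methyl groups is inferred from the 14 m/z spacing between the doubly charged precursors at m/z 563 and 577. The starting value of 563.8 is the precursor label in Figure 7.27a.

```python
HPO3_DA = 79.9663   # mass added per phosphorylation, Da (Figure 7.16)
CH2_DA = 14.01565   # net mass added per methyl ester, Da (assumed modification)


def modified_mz(mz: float, charge: int, n_phospho: int = 0, n_methyl: int = 0) -> float:
    """Predict the precursor m/z after adding phospho and methyl groups.

    Added mass is shared over the charge, so the m/z shift is (added mass) / z.
    """
    added = n_phospho * HPO3_DA + n_methyl * CH2_DA
    return mz + added / charge


if __name__ == "__main__":
    base = 563.8  # DRVpYIHPF, [M + 2H]2+ (Figure 7.27a)
    print("plus pHis             ", round(modified_mz(base, 2, n_phospho=1), 1))
    print("plus 2 methyl         ", round(modified_mz(base, 2, n_methyl=2), 1))
    print("plus pHis and 2 methyl", round(modified_mz(base, 2, n_phospho=1, n_methyl=2), 1))
```

Under these assumptions the predicted precursors fall near m/z 603.8, 577.8, and 617.8, matching the nominal values of 603, 577, and 617 quoted for Figures 7.27b, 7.28a, and 7.28b.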
7.4.9 Behavior of Phosphotyrosine-, Phosphothreonine-, and Phosphohistidine-Containing Peptide
The nonmethylated phosphorylated form of angio II (Sar1Thr8) and the methylated form were analyzed using the nano-LC nano-ESI Q-TOF mass spectrometer (see in vitro phosphorylation of angio II (Sar1Thr8) in Section 7.4.3.2). The synthetic peptide angio II (Sar1Thr8) with sequence of SarArgValTyrIleHisProThr possesses three amino acid residues that may be phosphorylated (Tyr, His, and Thr), an N-terminal synthetic amino acid residue of N-methylglycine (sarcosine), and a C-terminal residue that may be methylated.
7.4.9.1 Peptide Angio II (Sar1Thr8). The following figures are a series of product ion spectra starting with the nonphosphorylated and nonmethylated synthetic peptide angio II (Sar1Thr8), followed by sets of nonmethylated and phosphorylated versus methylated and phosphorylated spectra at increasing numbers of phosphorylations, from one phospho group to three phospho groups.
7.4.9.1.1 Peptide Angio II (Sar1Thr8): Nonphosphorylated. Figure 7.29 illustrates the nonmethylated and methylated forms of the
Figure 7.29. Product ion spectra collected on the Q-TOF/MS of the (a) methylated and (b) nonmethylated forms of the synthetic peptide angio II (Sar1Thr8) that is nonphosphorylated. Product ion neutral losses associated with the HPO3 group as −80 Da or with H3PO4 as −98 Da are, as expected, not observed.
Figure 7.30. Product ion spectra of (a) the nonmethylated and (b) the methylated forms of the synthetic peptide angio II (Sar1Thr8) that contains one phosphorylation. In (a), for the nonmethylated and phosphorylated peptide, the loss of the HPO3 group as −80 Da is not observed. In (b), the product ion neutral losses associated with the HPO3 group as −80 Da and the H3PO4 group as −98 Da are observed.
synthetic peptide angio II (Sar1Thr8) that is also nonphosphorylated. In both of these product ion spectra, neutral losses associated with the HPO3 group as −80 Da or with H3PO4 as −98 Da are, as expected, not observed.
7.4.9.1.2 Peptide Angio II (Sar1Thr8): One Phosphorylation. Figure 7.30 illustrates the nonmethylated and methylated forms of the synthetic peptide angio II (Sar1Thr8) that also contains one phosphorylation. In Figure 7.30a, for the nonmethylated and phosphorylated peptide, the loss of the HPO3 group as −80 Da is not observed. In Figure 7.30b, the product ion neutral losses associated with the HPO3 group as −80 Da and the H3PO4 group as −98 Da are observed.
7.4.9.1.3 Peptide Angio II (Sar1Thr8): Two Phosphorylations. Figure 7.31 illustrates the nonmethylated and methylated forms of the synthetic peptide angio II (Sar1Thr8) that also contains two phosphorylations. In Figure 7.31a, for the nonmethylated and diphosphorylated peptide, the loss of the HPO3 group as −80 Da is not observed. Losses associated with H3PO4 at −98 Da for one phosphorylation and for
(a)
[M + 2H – H3PO4]2+
[M + 2H]2+
[M + 2H – 2H3PO4]2+
m/z
(b) [M + 2H – HPO3]2+ [M + 2H – H3PO4]2+
[M + 2H]2+
[M + 2H – HPO3 – H3PO4]2+
m/z
Figure 7.31. Product ion spectra for (a) the nonmethylated and (b) the methylated forms of the synthetic peptide angio II (Sar1Thr8) that contains two phosphorylations. In (a), for the nonmethylated and diphosphorylated peptide, the loss of the HPO3 group as −80 Da is not observed. Losses associated with H3PO4 at −98 Da for one phosphorylation and with 2H3PO4 at −196 Da are observed. In (b), the product ion neutral losses associated with the HPO3 group as −80 Da, the H3PO4 group as −98 Da, and HPO3 + H3PO4 at −178 Da are observed.
2H3PO4 at −196 Da are observed in the product ion spectrum. In Figure 7.31b, the product ion neutral losses associated with the HPO3 group as −80 Da, the H3PO4 group as −98 Da, and HPO3 + H3PO4 at −178 Da are observed.

7.4.9.1.4 Peptide Angio II (Sar1Thr8): Three Phosphorylations. Figure 7.32 illustrates the nonmethylated and methylated forms of the synthetic peptide angio II (Sar1Thr8) that contains three phosphorylations. In Figure 7.32a, for the nonmethylated and triphosphorylated peptide, the loss of the HPO3 group as −80 Da is not observed. A loss associated with H3PO4 at −98 Da for one phosphorylation is observed in the product ion spectrum. In Figure 7.32b, the product ion neutral losses associated with the HPO3 group as −80 Da, the H3PO4 group as −98 Da, 2HPO3 at −160 Da, and HPO3 + H3PO4 at −178 Da are observed. The bottom part of the figure (Figure 7.32c) illustrates an expanded x-axis of the associated phosphate moiety neutral loss region.
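The neutral losses above are quoted as mass differences (−80 Da for HPO3, −98 Da for H3PO4), but the ions in these spectra are doubly charged, so on the m/z axis each product ion sits below [M + 2H]2+ by the loss mass divided by the charge. The following Python sketch works through that arithmetic. The monoisotopic masses are standard values rather than figures taken from this chapter, and the function name and the example precursor m/z of 600.0 are purely illustrative assumptions.

```python
# Monoisotopic masses of the neutral fragments (Da); standard values,
# not taken from the chapter.
HPO3 = 79.96633
H3PO4 = 97.97690

# Neutral-loss combinations discussed for the phosphorylated peptides.
LOSSES = {
    "HPO3": HPO3,
    "H3PO4": H3PO4,
    "2HPO3": 2 * HPO3,
    "HPO3 + H3PO4": HPO3 + H3PO4,
    "2H3PO4": 2 * H3PO4,
}


def neutral_loss_mz(precursor_mz: float, charge: int) -> dict[str, float]:
    """Return the expected m/z of each neutral-loss product ion.

    A neutral loss removes mass but not charge, so the peak moves down the
    m/z axis by (loss mass / charge).
    """
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return {name: precursor_mz - mass / charge for name, mass in LOSSES.items()}


if __name__ == "__main__":
    # Hypothetical doubly charged precursor, for illustration only.
    for name, mz in neutral_loss_mz(600.0, 2).items():
        print(f"[M + 2H - ({name})]2+  m/z {mz:.2f}")
```

For a doubly charged ion, the nominal −80, −98, −160, −178, and −196 Da losses therefore appear roughly 40, 49, 80, 89, and 98 m/z units below the precursor.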
[Figure 7.32 image: product ion spectra (x-axis: m/z). Panel (a) peak label: [M + 2H]2+. Panel (b) peak label: [M + 2H]2+. Panel (c) peak labels: [M + 2H – HPO3 – H3PO4]2+, [M + 2H – H3PO4]2+, [M + 2H – HPO3]2+, [M + 2H – 2HPO3]2+.]
Figure 7.32. Product ion spectra of (a) the nonmethylated and (b) the methylated forms of the synthetic peptide angio II (Sar1Thr8) that contains three phosphorylations. In (a), for the nonmethylated and triphosphorylated peptide, the loss of the HPO3 group as −80 Da is not observed. A loss associated with H3PO4 at −98 Da for one phosphorylation is observed. In (b), for the methylated and triphosphorylated peptide, the product ion neutral losses associated with the HPO3 group as −80 Da, the H3PO4 group as −98 Da, 2HPO3 at −160 Da, and HPO3 + H3PO4 at −178 Da are observed. (c) Expanded x-axis of the associated phosphate moiety neutral loss region for the methylated and triphosphorylated peptide.
7.4.9.1.5 Peptide Angio II (Sar1Thr8): Summary. Table 7.3 is a compilation of the product ion spectral results of the different forms of angio II (Sar1Thr8) studied. In the table, the relative spectral peak responses for the neutral loss associated with −98 Da as H3PO4 or −80 Da as HPO3 are compared. All phosphorylated forms of angio II were observed to lose the phosphate moiety as −98 Da in the form of H3PO4; however, the loss associated with −80 Da as HPO3 was only observed with the methylated form of the peptide. Manual inspection of the product ion spectra
TABLE 7.3. Relative Spectral Peak Response for Losses Associated with the Phosphorylation Moiety of Angio II (Sar1Thr8)

Peptide Form | −80 Da Loss as HPO3 | −98 Da Loss as H3PO4 | −160 Da Loss as 2HPO3 | −178 Da Loss as HPO3 + H3PO4 | −196 Da Loss as 2H3PO4
Nonmeth with one phosphoryl | – | Major | – | – | –
Meth with one phosphoryl | Minor | Major | – | – | –
Nonmeth with two phosphoryls | – | Major | – | – | Minor
Meth with two phosphoryls | Minor | Major | – | Minor | –
Nonmeth with three phosphoryls | – | Minor | – | – | –
Meth with three phosphoryls | Major | Major | Minor | Minor | –
associated with both one and two phosphorylation moieties revealed that the spectra represented a mixture of the amino acid site modifications (e.g., for one phosphorylation, three forms of the peptide made up the spectrum as pTyr, pHis, and pThr). The −80 Da loss observed in the methylated and phosphorylated forms of the peptide is associated with the pHis residue, as evidenced by the lack of this loss in the nonmethylated form.

7.4.10 Validation of Cu(II)-Based IMAC Phosphohistidine Enrichment

IMAC is used for phosphopeptide cleanup and enrichment in phosphoproteomic studies. The typical IMAC approach for Ser/Thr/Tyr phosphopeptide enrichment uses iron(III)-based complex formation between the iminodiacetic acid (IDA) resin column bed and the phosphate group under acidic conditions. In the IMAC column, the two terminal carboxyl groups of the IDA resin form a circular metal chelate complex that, in turn, forms a complex with the phosphate group under acidic conditions, allowing the washing and enrichment of the phosphopeptides.
7.4.10.1 Fe(III)-Based IMAC versus Cu(II) Based. For Fe(III)-based IMAC, the optimal acidic conditions are at pH levels of ∼2.5–3.5. This is not compatible with the phosphohistidine amino acid residue, which is acid labile and will release the phosphate group under these conditions. An alternative IMAC approach has been developed based on a copper(II)-based metal chelate complex. Under neutral conditions, the Cu(II)-based IMAC will complex with the imidazole N of the His amino acid residue. The imidazole nitrogen has a pKa of 6.0 and thus is protonated below this pH value. The methodology utilizes a pH 5.5 MES buffer that protonates the nonphosphorylated imidazole ring of the His residue, thus avoiding nonspecific binding of nonphosphorylated His. The imidazole ring of the pHis residue is neutral under pH 5.5 MES buffer conditions and thus will complex with, and be enriched by, the Cu(II)-based IMAC methodology.

7.4.10.2 Cu(II)-Based IMAC of Angio I. Angio I and angio II in the nonmethylated form were used to set up and optimize the Cu(II)-based IMAC methodology. Figure 7.33 illustrates the recovery of nonmethylated angio I containing two phosphohistidine residues. The relative responses in the TIC in Figure 7.33a, while not quantitative, do give an estimation of the stability of the phosphohistidine residue
Figure 7.33. Recovery of nonmethylated angiotensin I containing two phosphohistidine (pHis) residues. The relative responses in the TIC in (a) give an estimation of the stability of the phosphohistidine residue through the sample methodology and analytical system. (b) Product ion spectrum collected of the phosphohistidine-containing peptide, hand annotated and confirmed.
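The buffer choice in Section 7.4.10.1 rests on the imidazole pKa of 6.0 quoted there. A rough Henderson-Hasselbalch estimate of how much of the nonphosphorylated imidazole is protonated at a given pH can be sketched as follows. The sketch treats the side chain as a simple monoprotic base with the quoted pKa, which is an assumption (the effective pKa of a His residue within a peptide can shift), and the function name is illustrative.

```python
def fraction_protonated(ph: float, pka: float = 6.0) -> float:
    """Fraction of a monoprotic base (here, imidazole) that is protonated.

    Henderson-Hasselbalch: [BH+] / ([BH+] + [B]) = 1 / (1 + 10**(pH - pKa)).
    """
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


if __name__ == "__main__":
    # pH values mentioned in the text: Fe(III) IMAC range, MES buffer,
    # the pKa itself, and the HPLC mobile phase.
    for ph in (2.5, 3.5, 5.5, 6.0, 6.7):
        print(f"pH {ph:>4}: {100 * fraction_protonated(ph):5.1f}% protonated")
```

Under this simple model, about 76% of the nonphosphorylated imidazole is protonated at pH 5.5 and about 17% at pH 6.7, so the MES buffer shifts the balance strongly toward the protonated form without making it complete.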
through the sample methodology and analytical system. This comprises the in vitro phosphorylation of the His residue with potassium phosphoramidate, the loading and desalting of the modified peptide on the SPE precolumn, the separation of the modified peptide in the approximately neutral pH HPLC system (10 mM ammonium acetate at pH 6.7), and finally, the ESI and mass spectral measurement. In Figure 7.33a, the response of the non-pHis-containing peptide is larger than that of the 2-pHis-containing peptide, and it is difficult to say at which step the preservation of the pHis residue may have broken down. It may be that the 2-pHis-containing peptide reverted to the non-pHis-containing peptide prior to separation of the peptides by the HPLC column, or that the production of the 2-pHis-containing peptide was low for this particular synthesis. In either case, preservation of the pHis residue through to measurement and identification is demonstrated. Figure 7.33b illustrates the product ion spectrum collected of the phosphohistidine-containing peptide. Both this spectrum and the one collected for the non-pHis-containing peptide in Figure 7.33a were hand annotated and confirmed.

7.4.10.3 Cu(II)-Based IMAC of Angio II. In contrast to angio I illustrated in Figure 7.33, the angio II peptide illustrated in Figure 7.34 indicated good preservation of the pHis-containing peptide, as shown by the overwhelming amount of peptide measured at ∼31 minutes for the pHis-containing peptide as opposed to the non-pHis-containing peptide at ∼33 minutes. In both cases, for angio I and angio II, the system is conducive to the preservation of the phosphohistidine moiety.

7.4.11 In Vivo Measurement of Phosphohistidine

7.4.11.1 Time-Based Digestion Study. The Cu(II) IMAC-enriched C. crescentus sample described in Section 7.4.3.6 was loaded onto the nano-HPLC system and analyzed on the LTQ mass spectrometer using MSA for product ion spectral accumulation.
Figure 7.35 illustrates a comparison of a 60-minute V-8 digestion fraction and a 90-minute V-8 digestion fraction. Of all the time-point fractions collected, the 60-minute digestions gave the most identifications, with minimal peak broadening. Figure 7.36 illustrates the TIC of the peptides collected from the 60-minute V-8 digested sample. The 1-D SDS-PAGE gel inset illustrates the digestion of the C. crescentus sample at various stages of time, including an undigested lane followed by lanes representing 30-, 60-, and 90-minute digestion
[Figure 7.34 image: chromatogram (x-axis: Time, 10–60 minutes; y-axis: %) with peaks annotated "Angio II nonmethylated and pHis" (near 31 minutes, m/z 603.68) and "Angio II nonmethylated and non-pHis" (near 33 minutes, m/z 547.28); mass spectrum (x-axis: m/z, 175–975; y-axis: %).]
Figure 7.34. Recovery of nonmethylated angiotensin II containing one phosphohistidine residue. The angiotensin II peptide indicated good preservation of the pHis-containing peptide, as shown by the overwhelming amount of peptide measured at ∼31 minutes for the pHis-containing peptide as opposed to the non-pHis-containing peptide at ∼33 minutes.
times. As previously reported, it appears that up to 90% proteolysis has taken place after 30–60 minutes of digestion.22 It was observed that many of the peptides derived from the endopeptidase activity of the V-8 protease were of intermediate (8–15 residues) to long (30+ residues) sequences of amino acids, similar in length to tryptic peptides. Peptides were observed and identified in both the +2 and +3 charge states (CSs).

7.4.11.2 Phosphohistidine-Containing Peptides. In the study, a total of 111 pHis sites on 102 unique phosphohistidine-containing peptides equating to 99 C. crescentus proteins were identified and are listed in Table 7.4 (see Section 7.5 for all associated product ion spectra).

7.4.11.3 Phosphohistidine Product Ion Spectra. In the product ion spectra, neutral losses of −80 Da from the b- and y-type ions are also observed. This was previously observed in the fragmentation patterns of the in vitro His-phosphorylated peptides (e.g., see Fig. 7.21). The neutral loss of −80 Da is associated with the loss of the phosphate group from the His amino acid residue as HPO3. Figure 7.37 illustrates
[Figure 7.35 image: total ion chromatograms, panels (a) and (b) (x-axis: Time (minutes), 0–300; y-axis: %).]
Figure 7.35. Comparison of TIC for (a) a 60-minute V-8 digestion fraction and (b) a 90-minute V-8 digestion fraction. Most identifications were made in the 60-minute digestions, with minimal peak broadening.
a product ion mass spectrum of the phosphohistidine-containing peptide YPQDAGAAALS*AHPARH*PLG (hypothetical protein) at m/z 1081.6 as [M + 2H]2+, where neutral losses associated with the phosphoryl group of −98 Da as H3PO4 (designated with Δ, loss associated with
[Figure 7.36 image: total ion chromatogram labeled "60 minutes digestion TIC" (x-axis: Time (minutes), 0–300; y-axis: %). Gel inset lanes: Digested 90 minutes, Digested 60 minutes, Digested 30 minutes, Undigested, MW marker. Spectrum inset: annotated a-, b-, and y-type ions and −HPO3 losses for the sequence Y P Q D A G A A A L pS A H P A R pH P L G.]
Figure 7.36. Total ion chromatogram of the peptides collected from the 60-minute V-8 digested sample. The 1-D SDS-PAGE gel inset illustrates the digestion of the C. crescentus sample at various stages of time, including an undigested lane followed by lanes representing 30-, 60-, and 90-minute digestion times. The second inset shows an example of an MSA product ion spectrum collected. MW, molecular weight.
both Ser and His) and of −80 Da as HPO3 (loss associated only with His) are observed.

7.4.12 Gene Ontology of Phosphorylated Proteins

7.4.12.1 Localization of Phosphorylated Proteins. The localization of the His phosphoproteome of C. crescentus was extracted from the identified proteins using the PSORTb (v.2.0)26 subcellular localization prediction tool. The distribution of the phosphohistidine-containing proteins was determined at 40% cytoplasmic, 13% cytoplasmic membrane, 3% periplasmic, 3% outer membrane, 12% noncytoplasmic signal peptide containing, and 29% unknown, as illustrated in Figure 7.38a. To get a better view of the localization of the phosphohistidine proteome, the 29% unknown assignment was removed and the other assignments were normalized to 100%. This is illustrated in Figure 7.38b, where the distribution of the phosphohistidine-containing proteins is now 56% cytoplasmic, 18% cytoplasmic membrane, 4% periplasmic, 4% outer membrane, and 17% noncytoplasmic signal peptide containing. It can now be seen that the localization is
TABLE 7.4. Phosphohistidine-Containing Peptides Identified from C. crescentus

Peptide | CC # | Annotation | Predicted Function
RS*QAEIDAAAKAAS*APRGFKAALEAH*HAP | 1899 | Indole-3-glycerol phosphate synthase | Amino acid biosynthesis
SVVLRPVPAH*CT*AAGVPARLVNCPT*CEE | 2651 | Serine acetyltransferase | Amino acid biosynthesis
AEAQVH*VET*GDT*PHMVAV | 3008 | AroK shikimate kinase | Amino acid biosynthesis
GVIKAFSVSSS*ADVD*QAAAFDGVAQH* | 3545 | N-(5-phosphoribosyl)anthranilate isomerase | Amino acid biosynthesis
TDDFARGT*GSLVEHH*PLFPEGVNVGFAHIA | 3686 | Diaminopimelate epimerase | Amino acid biosynthesis
AT*ADH*GGPFT*AAVAKDNVAGVQFHPEKSQ | 3735 | Amidotransferase HisH | Amino acid biosynthesis
LMQARIAH*ALGVPAGAS*KDEIE | 0072 | HemC porphobilinogen deaminase | Biosynthesis of cofactors, prosthetic groups, and carriers
KT*PLGVIS*FDPH*PRR | 0703 | RibF riboflavin biosynthesis protein | Biosynthesis of cofactors, prosthetic groups, and carriers
AALT*EAGYVRIGLD*H*Y | 1411 | HemN oxygen-independent coproporphyrinogen III oxidase | Biosynthesis of cofactors, prosthetic groups, and carriers
S*RRGFARKIVALFPH*H*A | 0089 | Membrane protein, putative | Cell envelope
YIH*AGRPDRT*HEIVRGLMAKHY*RL | 0533 | Alpha-1,2-mannosidase family protein | Cell envelope
CDIHWRPAPPPS*CGH*PCAPIPAPTPCASPCGY*GG | 0675 | Articulin, putative | Cell envelope
ACGMPQDAKLALAVGRFH*PEKR | 0755 | Glycosyl transferase, group 1 family protein | Cell envelope
VETVNIAATDTNTT*AH*VDTLTLQATSAKS* | 1007 | S-layer protein RsaA | Cell envelope
GLVSGACFADFGH*VVT*CIDKDPS*KIER | 2379 | UDP-glucose 6-dehydrogenase | Cell envelope
AGRH*IAVLLGGPSSERKVSLVS*GAACAEA | 2543 | D-alanine-D-alanine ligase B | Cell envelope
EMQTALDT*ALSAGAPGVLLLH*C | 2868 | NeuB protein, putative | Cell envelope
S*GNKNQTYSAGGVKAQT*H* | 0902 | Flagellar hook protein | Cellular processes
ALT*DASFRAH*KFDVAFSD*R | 1025 | Penicillin amidase family protein | Cellular processes
Y*QTEQH*RRAMAADLLAKHGLS*FAKWV | 1213 | Transcriptional regulator, MarR family | Cellular processes
H*MPGARNIPLS*ALIAPDGTMLS*AEKLK | 1425 | Rhodanese family protein | Cellular processes
AHPIEVAGILTEYRLDT*ATIVT*ALLH*D | 1553 | Guanosine-3',5'-bis(diphosphate) 3'-pyrophosphohydrolase | Cellular processes
VGQT*GEALRGIH*EKVGGIDELVNAIAAS* | 2317 | Methyl-accepting chemotaxis protein McpM | Cellular processes
M.S*VERTLH*H*FPLDPASRQ | 3639 | Glutathione S-transferase family protein | Cellular processes
LGPT*WILETGLHTGH*LS*AAGHLKYETGA | 0451 | Metallo-beta-lactamase family protein | Central intermediary metabolism
LPAPAPCMQRET*FAPLLH*VVPYNS*FDMAIAI | 1216 | Aldehyde dehydrogenase | Central intermediary metabolism
AHGGATQH*FDVIIVGAGIS*GIGGAY*HL | 1348 | Monooxygenase, flavin-binding family | Central intermediary metabolism
SRRH*VEKGLAGTGGAAGAGRRIAS*NY*F | 2604 | Aminotransferase, class V | Central intermediary metabolism
S*EAFH*HATLDAVLSALFS*RRA | 2997 | Cytochrome P450 family protein | Central intermediary metabolism
FET*WAKEAKS*GGGQVWMQINH*P | 3083 | FMN oxidoreductase | Central intermediary metabolism
DLAH*HLPAQADHKVIAELGGS*RIVTRA | 3132 | Aminotransferase, class III | Central intermediary metabolism
TRFAISTEETRY*YLNGLYVH*TVNEGGE | 0156 | DNA polymerase III, beta subunit | DNA metabolism
LFLLH*LLSKMRPAVDGGSRFGIVLNGS*PLFT*G | 0620 | Type I restriction-modification system, M subunit, putative | DNA metabolism
PANVD*WGVH*LLAMSAT*PIP and AMADAASAGFQS*ALMAPT*EILARQH*FETIA | 1437 | RecG ATP-dependent DNA helicase | DNA metabolism
T*H*HRALAKALGLTGGLAKEAAS*L | 0337 | SucC succinyl-CoA synthetase, beta subunit | Energy metabolism
GAH*ALPPEY*RDDEDGY*V | 0960 | HutI imidazolonepropionase | Energy metabolism
MAAVPIS*LGVLH*QAGAAVLLAVA | 1374 | Cytochrome c oxidase assembly protein, putative | Energy metabolism
VVCTGMGKS*GH*VARKIAAT*L | 2263 | Sugar isomerase, KpsF/GutQ | Energy metabolism
GQESAAH*HGYWITD*FT*DVDPH | 2286 | Alpha-amylase family protein | Energy metabolism
QVREGLPPVFQLH*AADDKAVPVENS*LLMFSAL | 2313 | Pectin acetylesterase | Energy metabolism
YYLTAAEGGT*AEGH*SQVVLRS* | 2802 | XylB xylosidase/arabinosidase | Energy metabolism
VLT*GGDVHGLGGH*FY* | 3140 | Succinate-semialdehyde dehydrogenase | Energy metabolism
INVS*GH*RLGTAEIESALVAHETVAEAAVVGYP | 3581 | Acetyl-CoA synthetase | Energy metabolism
NH*VH*HAGNSSGIVDGAAGVLIGTKE | 0077 | Fatty oxidation complex, beta subunit, putative | Fatty acid and phospholipid metabolism
SSWLLPARVGH*ARAYAMFALGEAVDGAT* | 0397 | Enoyl-CoA hydratase/isomerase family protein | Fatty acid and phospholipid metabolism
FQDYAGH*GFDGIVD*FPPHAIV | 0633 | Conserved hypothetical protein | Hypothetical proteins
IPKASQKPGVPWHDPIIPGAPGH*GEGH*NS* | 1556 | Conserved hypothetical protein | Hypothetical proteins
H*NPAIMQTLAARAPVPVKSAEAY* | 1869 | Conserved hypothetical protein | Hypothetical proteins
VEAVRERH*PRAKVALAGIS*MGGGLAIS*AM | 2083 | Conserved hypothetical protein | Hypothetical proteins
EH*NLES*FT*LMAGLAAVTSKIKIFATVATLT | 2798 | Conserved hypothetical protein | Hypothetical proteins
S*T*LAVIVVTGVLMANAPAEPKDLVAAPPAAVH* | 2922 | Conserved hypothetical protein | Hypothetical proteins
DLVEKLEAS*GT*DLQHALQKIAIPEAEARGQSVH* | 0236 | Hypothetical protein | No data
H*LKLDDQLVYAEG | 0631 | Hypothetical protein | No data
SVGVS*AFIGTLGLCFAFH*GLI | 0710 | Hypothetical protein | No data
FLNPILT*GDRPDPSILKDGADYY*MTH* | 0813 | Hypothetical protein | No data
QAGFLVVAPLHVDS*QKH*PRKT*AYNL | 0858 | Hypothetical protein | No data
IH*GLTSGAIIILGMIAT*IVLS*K | 1079 | Hypothetical protein | No data
AVALMS*GH*DKKPAEDPPASRAGLIVET*GRAD | 1466 | Hypothetical protein | No data
ILICT*LLSLAALS*VLVDGH*RAAPITARAP | 1684 | Hypothetical protein | No data
NH*FT*VSAT*KQITSV | 1978 | Hypothetical protein | No data
YPQDAGAAALS*AHPARH*PLG | 2184 | Hypothetical protein | No data
H*T*LAAISLAASASPLTAPPRAT*LASPRAQ | 2420 | Hypothetical protein | No data
NDVDAAGT*FIH*PE | 2697 | Hypothetical protein | No data
SFPDLPPRSFH*GLPGMLAD*ALPDKY*GHVL | 2770 | Hypothetical protein | No data
H*MAFLAD*DLLE | 2840 | Hypothetical protein | No data
GGLWGAVH*Y*AGVLLVWL | 3114 | Hypothetical protein | No data
MLH*TTIAPTMLKGS*PKE and DDQPAEWDKLHAWLQTT*Y*PQAH*KAM | 0736 | M20/M25/M40 family peptidase | Protein fate
YEVH*HGVRISDSAIVAAAT*LS* | 0878 | ClpB ATP-dependent Clp protease, ATP-binding subunit | Protein fate
LH*LSKTLVS*PGTY*VRRGDEVALTGKEGR | 2248 | Peptidase, M23/M37 family | Protein fate
Y*EDFH*KLKYTADALKVAVELSAKYIT | 2468 | ATP-dependent Clp protease, ATP-binding subunit ClpA | Protein fate
H*RGVLS*NGLKVVLAERRDT | 2578 | Peptidase, M16 family | Protein fate
FT*H*GPKGWTAQTLDLPANSAIGLGSS*DD | 3688 | Prolyl oligopeptidase | Protein fate
IT*VIDSAQPNAFATGRDPD*H*A | 2509 | Heat shock protein HtpX, putative | Protein fate
D*H*GALGIY*AGCAASDAV | 3398 | Peptidase, M16 family | Protein fate
PADPEPEALPLS*ILY*EDAH*LIVVDKAAG | 0453 | Ribosomal large subunit pseudouridine synthase D | Protein synthesis
PGWH*IECS*AMIDKALGQT | 0460 | CysS cysteinyl-tRNA synthetase | Protein synthesis
H*D*GAATMDWMEQEQE | 3200 | FusA translation elongation factor G | Protein synthesis
LAQTGGITEH*EMRRTFN | 1703 | PurM phosphoribosylformylglycinamidine cyclo-ligase | Purines, pyrimidines, nucleosides, and nucleotides
FARTS*QRPAEGH*AVDELVRRIMAAPGEIT* | 1928 | Inosine-uridine preferring nucleoside hydrolase | Purines, pyrimidines, nucleosides, and nucleotides
RINAS*VS*INH*LSFNELDIGDYRTFAKLNPPL | 2444 | Aspartate transcarbamoylase, pyrC subunit | Purines, pyrimidines, nucleosides, and nucleotides
H*ANVAGVAGGEAFASDGLFSVSLDALRAAHE | 2500 | Phosphoribosylformylglycinamidine synthase II | Purines, pyrimidines, nucleosides, and nucleotides
KS*ALSYLAICLT*VGH*APG | 0284 | Response regulator | Regulatory functions
EEAEAT*VH*LVDVQAELAS*RA | 1562 | GTP-binding protein Era | Regulatory functions
H*IEQRTALLASVS*HDLRTPLT*RL | 2932 | Sensor histidine kinase | Regulatory functions
FKT*VNDT*LGH*PLGDALLKIAAERLRGCV | 3094 | Sensory box/GGDEF family protein | Regulatory functions
RAAKSGIPVLIT*GES*GVGKELIARAVH*G | 3315 | Sigma-54 dependent DNA-binding response regulator | Regulatory functions
PGSPDVTRGNEQTLLIS*FAGPPSIY*RH* | 3327 | Sensory box histidine kinase, putative | Regulatory functions
VH*LKRSTWDSAQSFT*AWAHAVARY*K | 3253 | RNA polymerase sigma-70 factor, ECF subfamily | Transcription
EISEVLVLGRGAQRLRIAGS*S*H*VVNE | 0139 | TonB-dependent receptor | Transport and binding proteins
GH*EEESAVTIGLKQT*RIDLRGEY*DADL | 0214 | TonB-dependent receptor | Transport and binding proteins
HVT*IQVES*GHGAH*ACRL | 0303 | Cation efflux family protein | Transport and binding proteins
VVDGQTY*LLT*SPTNGGSAKIKGIELAYQH*LF | 1131 | TonB-dependent receptor | Transport and binding proteins
H*RAVAPDS*YLRDKAAGLPLRRAT*FESLV | 1191 | Iron compound ABC transporter, periplasmic substrate-binding protein | Transport and binding proteins
QVSAFALGAAT*RGVVVGAVTAIVIGVLPGAGLS*VAH*L | 2244 | ABC-2 type transporter | Transport and binding proteins
H*FNNPGGGVIASVLGSFGGEAGQS*FAS*ACPG | 2486 | Major facilitator family transporter | Transport and binding proteins
IIS*KLVVAH*WGVPPLY*YAAVRFALVA | 2568 | DMT transporter, 10 TMS drug/metabolite exporter (DME) family | Transport and binding proteins
T*QVDLLNPRDS*GLAH*AVSILEGVEGV | 0055 | PhoH family protein | Unknown function
Y*AKIFHDILDYH*T*APEVAA | 1176 | Metallo-beta-lactamase family protein | Unknown function
NFQGARFDGARFH*NADMTGSNLRGGIFNS*ADF | 1891 | Pentapeptide repeat family protein | Unknown function
H*LS*FLEEGADGS*S | 2205 | Glutaminyl-tRNA synthetase, putative | Unknown function
D*VYMLNAPYNGGT*H*L and D*LH*LRRFGFVSPTTPLVVETLS*VEAIG | 2369 | Hydantoinase/oxoprolinase | Unknown function
DGAGAGAS*VVFAAIVGTGCGGGVVVD*GKIINGH*NG | 2465 | ROK family protein | Unknown function
GGGHAGCEAAAAS*ARAGARTLLLTH*K | 3755 | Glucose inhibited division protein A | Unknown function
S*H*D*VVAQLDALGVPRSAW | 0705 | HAD-superfamily hydrolase, subfamily IIA | Unknown function
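Table 7.4 marks each phosphorylated residue with a trailing asterisk (for example, H* for phosphohistidine). As a reading aid, the short Python sketch below converts that notation into the bare sequence plus a list of modified residues and their 1-based positions within the peptide. The function name is hypothetical, and the handling of the "." in M.S*VERTLH*H*FPLDPASRQ (simply skipped here) is an assumption about the notation, not something the table states.

```python
def parse_starred_peptide(annotated: str) -> tuple[str, list[tuple[str, int]]]:
    """Split a starred peptide string into (sequence, phosphosites).

    Each '*' flags the residue immediately before it as phosphorylated.
    Positions are 1-based and refer to the peptide, not the parent protein.
    """
    sequence: list[str] = []
    sites: list[tuple[str, int]] = []
    for char in annotated:
        if char == "*":
            if not sequence:
                raise ValueError("'*' must follow a residue")
            sites.append((sequence[-1], len(sequence)))
        elif char.isalpha():
            sequence.append(char.upper())
        # Any other character (spaces, '.') is skipped; this is an assumption.
    return "".join(sequence), sites


if __name__ == "__main__":
    # Response regulator peptide (CC 0284) from Table 7.4.
    seq, sites = parse_starred_peptide("KS*ALSYLAICLT*VGH*APG")
    print(seq)
    print(sites)
```

Positions returned this way are peptide-relative; converting them to protein numbering would additionally require the position of the peptide within the parent protein sequence, which Table 7.4 does not list.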
[Figure 7.37 image: annotated product ion spectrum (x-axis: m/z, 400–1600; y-axis: Relative Abundance (%)) with a-, b-, and y-type ion labels, −HPO3 neutral-loss labels, two ×4 magnified regions, and the sequence Y P Q D A G A A A L pS A H P A R pH P L G.]
Figure 7.37. Product ion mass spectrum of the phosphohistidine-containing peptide YPQDAGAAALS*AHPARH*PLG (hypothetical protein) at m/z 1081.6 as [M + 2H]2+ obtained from the 60-minute digestion illustrated in Figure 7.35. Observed in the spectrum are neutral losses associated with the phosphoryl group of −98 Da as H3PO4 (designated with Δ) and of −80 Da as HPO3.
43% membrane associated and 56% cytoplasmic. This distribution is not unexpected, as the majority of the known phosphohistidine-containing proteins are membrane-bound sensor His kinases.

7.4.12.2 Function of Phosphorylated Proteins. The functions of the identified phosphorylated proteins were obtained from the C. crescentus functional database downloaded from The Institute for Genomic Research (TIGR) Comprehensive Microbial Resource (CMR) and are listed in Table 7.4. The distributions for the functions of the identified phosphohistidine proteome are illustrated in Figure 7.39a, which includes a 29% unknown function assignment. Again, to get a better view of the function assignment of the phosphohistidine proteome, the 29% unknown assignment was removed and the other assignments were normalized to 100%. This is illustrated in Figure 7.39b, where the largest share of the phosphohistidine-containing proteins, 30%, is metabolism related (listed in Table 7.4, these are ATP-dependent or
[Figure 7.38 image: bar charts, panels (a) and (b) (y-axis: Percent of total (%)). Categories: Cytoplasmic, Cytoplasmic membrane, Periplasmic, Outer membrane, Noncytoplasmic signal peptide, and, in panel (a) only, Unknown.]
Figure 7.38. The prokaryote bioinformatic PSORTb (v.2.0) subcellular localization prediction tool was used to extract the localization of the histidine phosphoproteome of C. crescentus. (a) The distribution of the phosphohistidine-containing proteins was determined at 40% cytoplasmic, 13% cytoplasmic membrane, 3% periplasmic, 3% outer membrane, 12% noncytoplasmic signal peptide containing, and 29% unknown. (b) Localization of the phosphohistidine proteome minus the 29% unknown assignment (normalized to 100%). The distribution of the phosphohistidine-containing proteins is 56% cytoplasmic, 18% cytoplasmic membrane, 4% periplasmic, 4% outer membrane, and 17% noncytoplasmic signal peptide containing.
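The step from panel (a) to panel (b) in Figure 7.38 is a simple renormalization: the 29% "unknown" share is dropped and each remaining share is divided by the remaining 71%. A minimal Python sketch of that arithmetic follows. The dictionary keys and function name are illustrative, and the inputs are the rounded percentages quoted in the caption, so small rounding differences from the published panel are possible.

```python
# Rounded percentages quoted for Figure 7.38a.
FIGURE_7_38A = {
    "cytoplasmic": 40,
    "cytoplasmic membrane": 13,
    "periplasmic": 3,
    "outer membrane": 3,
    "noncytoplasmic signal peptide": 12,
    "unknown": 29,
}


def renormalize(shares: dict[str, float], drop: str) -> dict[str, float]:
    """Remove one category and rescale the remaining shares to sum to 100%."""
    kept = {name: value for name, value in shares.items() if name != drop}
    total = sum(kept.values())
    if total <= 0:
        raise ValueError("nothing left to normalize")
    return {name: 100.0 * value / total for name, value in kept.items()}


if __name__ == "__main__":
    for name, value in renormalize(FIGURE_7_38A, "unknown").items():
        print(f"{name}: {value:.0f}%")
```

The rounded results (56, 18, 4, 4, and 17%) reproduce panel (b); they sum to 99% rather than 100% only because of rounding.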
[Figure 7.39 image: bar charts, panels (a) and (b) (y-axis: Percent of total (%)). Categories: Amino acid and cofactor biosynthesis; Cell envelope and cellular processes; DNA, energy, lipid metabolism; Protein fate and synthesis; Regulatory functions; Transcription; Transport and binding proteins; and, in panel (a) only, Unknown function. Bar values, in no recoverable order: panel (a) 29%, 21%, 15%, 11%, 9%, 8%, 6%, 1%; panel (b) 30%, 21%, 15%, 13%, 11%, 8%, 2%.]
Figure 7.39. The functions of the identified phosphorylated proteins were obtained from the C. crescentus functional database. (a) The distributions for the functions of the identified phosphohistidine proteome, including a 29% unknown function assignment. (b) The 29% unknown assignment was removed and the other assignments were normalized to 100%. The largest share of the phosphohistidine-containing proteins, 30%, is metabolism related.
ATP-related proteins). For example, the cytochrome c oxidase assembly protein (CC 1374) is a large transmembrane protein complex that is involved in the membrane potential differential for the production of ATP. Also included are the ATP-dependent Clp protease ATP-binding subunit ClpB (CC 0878) and the ATP-dependent deoxyribonucleic acid (DNA) helicase RecG (CC 1437), both involved in DNA replication and repair. The NeuB protein (CC 2868) is involved in sialic acid synthesis (probably through phospho group transfer); AroK shikimate kinase (CC 3008) is a phosphotransferase; and PurM phosphoribosylformylglycinamidine cycloligase (CC 1703) and glutaminyl-tRNA synthetase (CC 2205) are both ATP-dependent enzymes. This may be due to a role of the phosphohistidine-containing protein in phospho group transfer via the His residue (an extended type of phosphotransferase).

7.4.13 Predicted Regulatory Protein Motif Study

The ScanProsite6 tool was used to investigate the phosphohistidine-associated motifs of six regulatory proteins identified in the study that contained a pHis residue (see Table 7.4, CC 0284, 1562, 2932, 3094, 3315, and 3327). The peptide identified from the sensor His kinase (CC 2932), H*IEQRTALLASV[SH]DLRTPLT*RL, contains a phosphorylation on the H-243 site and also on the predicted H-251 site. The response regulator (CC 0284) contains a regulator domain between residues 7 and 73 where the Asp-57 residue may be phosphorylated. The peptide identified from the response regulator, KS*ALSYLAICLT*VGH*APG, contains a phosphorylation on the H-126 site in close proximity to the conserved domain. The peptide identified from the sensory box/GGDEF protein (CC 3094), FKT*VNDT*LGH*PLGDALLKIAAERLRGCV, contains a phosphorylation on the H-330 site, which is included in the GGDEF domain (residues 309–442).
Finally, the DNA-binding response regulator (CC 3315) contains a response regulator domain between residues 4 and 199 where the Asp-54 residue may be phosphorylated, and an ATP-binding region between residues 168 and 181. The peptide identified, RAAKSGIPVLIT*GES*GVGKELIARAVH*G, contains phosphorylation sites on the Thr-167 and Ser-170 residues within the ATP-binding region and a phosphorylation on the His-186 residue. In C. crescentus, there are 49 predicted His-phosphorylated proteins and 54 predicted aspartate-phosphorylated proteins out of the 3767 putative open reading frames (private communications). The lack of identifications at predicted sites may be due to the highly transitory nature of the two-component signaling system and to the limited
information collected thus far concerning these phosphorylation events. However, we have observed that 82% of the phosphohistidine-containing proteins identified in this study have also been observed and identified in the normal proteome of C. crescentus (data being reported in Chapter 6).

7.4.14 Validation of Phosphohistidine-Containing Proteins

7.4.14.1 Phosphorylation Motif Study. The 102 unique phosphohistidine-containing peptides listed in Table 7.4 were analyzed for 12 different phosphorylation motifs27 and compared with the basic trend of a global analysis of the phosphoproteome of C. crescentus (data reported elsewhere). Figure 7.40 illustrates the distribution of the phosphorylation motifs associated with kinases. As can be seen in Figure 7.40, the general trend in the motif distribution for the enriched phosphohistidine-containing peptides is similar to that of the global phosphoproteome study. There does, however, appear to be a decrease in the proline (Pro)-directed motifs in the pHis-enriched sample along with an elevation in the motifs associated with the pyruvate dehydrogenase kinase, isozyme 1 (PDK1) kinase.
[Figure 7.40 image: bar chart (y-axis: Phosphorylation Sites, 0–20) with two series, Global and pHis Enriched.]
Figure 7.40. Distribution study of 12 different phosphorylation motifs in the 102 unique phosphohistidine-containing peptides identified in the C. crescentus extracts. Also included is the distribution for the global Ser/Thr/Tyr phosphoproteome, illustrating a similar trend in the motif distribution for the enriched phosphohistidine-containing peptides as that of the global phosphoproteome study.
TABLE 7.5. Motif Distribution of C. crescentus Phosphohistidine Proteome

Residue Type (X)          XH* Sites            H*X Sites
Nonpolar [GLIAVPMWF]      71 (64% of total)    65 (58% of total)
  G                       18 (25% of NP)       9 (14% of NP)
  L/I                     15 (21% of NP)       16 (25% of NP)
  A                       15 (21% of NP)       13 (20% of NP)
  V                       13 (18% of NP)       8 (12% of NP)
  P                       2 (3% of NP)         10 (15% of NP)
  M                       0 (0% of NP)         2 (3% of NP)
  W                       1 (1% of NP)         1 (2% of NP)
  F                       7 (10% of NP)        6 (9% of NP)
Polar [STYCQN]            15 (14% of total)    18 (16% of total)
  S                       3 (20% of P)         3 (17% of P)
  T                       3 (20% of P)         4 (22% of P)
  Y                       1 (7% of P)          2 (11% of P)
  C                       0 (0% of P)          2 (11% of P)
  Q                       5 (33% of P)         2 (11% of P)
  N                       4 (27% of P)         5 (28% of P)
Negative [DE]             4 (4% of total)      8 (7% of total)
  D                       2 (50% of Neg)       4 (50% of Neg)
  E                       2 (50% of Neg)       4 (50% of Neg)
Positive [KRH]            9 (8% of total)      18 (16% of total)
  K                       1 (11% of Pos)       4 (22% of Pos)
  R                       7 (78% of Pos)       5 (28% of Pos)
  H                       1 (11% of Pos)       9 (50% of Pos)
Phosphorylated p[STDH]    12 (11% of total)    3 (3% of total)
  S                       2 (17% of p)
  T                       4 (33% of p)
  D                       4 (33% of p)
  H                       2 (17% of p)

The % of total determined from 111 phosphohistidine sites. NP, nonpolar; P, polar; Neg, negative; Pos, positive; p, phosphorylated.
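As a quick arithmetic check, the class-level percentages in the XH* column of Table 7.5 can be recomputed from the site counts and the 111 total sites stated in the table footnote. The following is only a minimal sketch: the counts are copied from the table, while the dictionary and function names are hypothetical and not part of the original study.

```python
# Site counts for the residue class immediately preceding the phosphohistidine
# (XH* column of Table 7.5). All names here are illustrative assumptions.
XH_COUNTS = {
    "nonpolar": 71,
    "polar": 15,
    "negative": 4,
    "positive": 9,
    "phosphorylated": 12,
}
TOTAL_SITES = 111  # total phosphohistidine sites given in the table footnote


def percent_of_total(counts, total):
    """Return each count as a whole-number percentage of `total`."""
    return {name: round(100 * n / total) for name, n in counts.items()}


xh_percent = percent_of_total(XH_COUNTS, TOTAL_SITES)
for name, pct in xh_percent.items():
    print(f"{name}: {pct}% of total")
```

The same helper could be applied within a class (for example, the eight nonpolar residues against the 71 nonpolar sites) to check the "% of NP" entries.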
7.4.14.2 Phosphohistidine Kinase Motif. The kinase motif associated with the His phosphorylation site was determined from the study on the entire phosphohistidine proteome, and the results are listed in Table 7.5. The amino acid residue immediately to the left of the pH residue is nonpolar (G, L, I, A, V, P, M, W, or F) in 64% of cases, while the residue directly to the right is nonpolar in 58% of cases. The occurrence of a polar (S, T, Y, C, Q, or N), negative (D or E), positive (K, R, or H), or phosphorylated (S, T, D, or H) residue is also
listed in Table 7.5. The most generalized motif, at 37% of the total pH sites identified, would be composed of nonpolar-H*-nonpolar, that is, [GLIAVPMWF]H*[GLIAVPMWF].
7.4.15 The pDpH Motif
An additional observation made from these studies is of peptides that contain the dual phosphorylation amino acid sequence “pDpH,” such as the peptide IpTVIDSAQPNAFATGRDPpDpHA from the heat shock protein HtpX. The annotated product ion spectrum of this peptide is shown in Figure 7.41 for the doubly
[Figure 7.41: annotated product ion spectrum of I pT V I D S A Q P N A F A T G R D P pD pH A; x-axis: m/z (600 to 1600); y-axis: Relative Abundance (%); labels include b-, a-, and y-type ions, −HPO3 neutral losses, and internal fragments (IF).]
Figure 7.41. Product ion spectrum illustrating the annotation of the peptide IpTVIDSAQPNAFATGRDPpDpHA from the heat shock protein HtpX that contains the dual phosphorylation amino acid sequence of “pDpH” as the doubly charged precursor [M + 2H]2+ at m/z 1218.0. The product ion spectrum contains continuous coverage of the peptide backbone fragmentation including a number of internal fragment ions associated with the proline amino acid residue.32
charged precursor [M + 2H]2+ at m/z 1218.0. The product ion spectrum contains continuous coverage of the peptide backbone fragmentation, including a number of internal fragment ions associated with the Pro amino acid residue.39 The five peptides identified as containing a “pDpH” moiety are listed in Table 7.6 (see Section 7.5 for all associated product ion spectra).
7.4.16 Conclusions
In addition to the b- and y-type ions typically observed during CID, phosphohistidine-containing peptides lose 80 Da as HPO3 and 98 Da as H3PO4 in a mechanism that is similar to that of phosphotyrosine, where peptide rearrangement is required. Methylation of the phosphohistidine-containing peptides (effectively removing readily accessible OH groups) results in an enhanced loss of 80 Da upon CID, while a loss of 98 Da is still observed, demonstrating the process of peptide backbone rearrangement. Sample lysis at approximately neutral pH (6.7), V-8 protease digestion, Cu(II) IMAC, and LC effectively preserve the phosphohistidine moiety, allowing its mass spectral detection and measurement. Application of the methodology resulted in a global approach in which 99 phosphohistidine-containing proteins were identified in C. crescentus.
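The nominal losses of 80 Da (HPO3) and 98 Da (H3PO4) can be turned into expected m/z positions for a multiply charged precursor. The sketch below is illustrative only: it uses the monoisotopic masses of 1H, 16O, and 31P as tabulated in Appendix I and the doubly charged precursor at m/z 1218.0 quoted above, and it assumes that the neutral fragment leaves without changing the charge state. The function names are hypothetical, and the computed positions are not a claim about which peaks were actually observed.

```python
# Monoisotopic masses (u) of 1H, 16O, and 31P, as listed in Appendix I.
MONO = {"H": 1.0078250321, "O": 15.9949146221, "P": 30.97376151}


def neutral_mass(formula):
    """Sum monoisotopic masses for a composition given as {element: count}."""
    return sum(MONO[element] * count for element, count in formula.items())


def mz_after_loss(precursor_mz, charge, loss_mass):
    """m/z of an ion that loses an uncharged fragment and keeps its charge."""
    return precursor_mz - loss_mass / charge


HPO3 = neutral_mass({"H": 1, "P": 1, "O": 3})   # nominal 80 Da
H3PO4 = neutral_mass({"H": 3, "P": 1, "O": 4})  # nominal 98 Da

precursor_mz, charge = 1218.0, 2  # [M + 2H]2+ precursor quoted in the text
mz_hpo3 = mz_after_loss(precursor_mz, charge, HPO3)
mz_h3po4 = mz_after_loss(precursor_mz, charge, H3PO4)
print(f"HPO3  = {HPO3:.4f} u; precursor - HPO3  expected near m/z {mz_hpo3:.2f}")
print(f"H3PO4 = {H3PO4:.4f} u; precursor - H3PO4 expected near m/z {mz_h3po4:.2f}")
```

Because the loss is divided by the charge, a doubly charged precursor shifts by roughly 40 and 49 m/z units rather than 80 and 98.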
7.5 SUPPLEMENTARY MATERIAL
7.5.1 Reviewing Spectra Using the SpectrumLook Software Package
We suggest using a software package called SpectrumLook, which allows readers to inspect the fragmentation (MS/MS) spectra for the phosphopeptides identified in this chapter. Using this software, readers can visually browse the MS/MS spectra that led to the phosphopeptide identifications, including viewing annotations for the identified b and y ions and neutral loss ions where appropriate. The software runs on the Microsoft Windows platform. The SpectrumLook package can be accessed at http://omics.pnl.gov/software/SpectrumLook.php. The C. crescentus mass spectral data can be accessed at http://www.HamBooksOnline.com under PTM Book. Note: To access the file, type the aforementioned address and follow these steps:
AAK23392.1
AAK24480.1
AAK25360.1
AAK25162.1
AAK22690.1
IT*VIDSAQPNAFATGRDPD*H *A
D*H*GALGIY*AGCAASDAV
H*D*GAATMDWMEQEQE
S*H*D*VVAQLDALGVPRSAW
Reference
AALT*EAGYVRIGLD*H*Y
Peptide
Translation elongation factor G HAD-superfamily hydrolase, subfamily IIA
Peptidase, M16 family
Heat shock protein HtpX, putative
Oxygen-independent coproporphyrinogen III oxidase
Description
No data
fusA
No data
No data
hemN
Gene
Protein synthesis Unknown function
Protein fate
Biosynthesis of cofactors, prosthetic groups, and carriers Protein fate
Main Role
TABLE 7.6. Phosphoaspartate/Phosphohistidine-Containing Peptides Identified from C. crescentus Extracts
Protein folding and stabilization Degradation of proteins, peptides, and glycopeptides Translation factors Enzymes of unknown specificity
Heme, porphyrin, and cobalamin
Sub1 Role
1. SpectrumLook_Installer.msi: the installer. To install, double-click on the file and follow the installation prompts. During installation, a shortcut to run the SpectrumLook program is placed at Start → Programs → PAST Toolkit → SpectrumLook. Alternatively, navigate to the C:\Program Files\SpectrumLook\ folder and double-click the file “SpectrumLook.exe.”
2. Caulobacter_grouped.mzXML: the phosphopeptide spectra in mzXML format.
3. Caulobacter_grouped_syn.txt: a summary of the identifications determined by SEQUEST. See the Readme.txt file for a description of the columns in this file.
4. Caulobacter_grouped.ini: a parameter file that specifies the appropriate parameters for these data when browsing them with SpectrumLook.
5. Readme.txt and RevisionHistory.txt: text files that describe the SpectrumLook software.
REFERENCES
1. Stock, J.; Ninfa, A.; Stock, A.M. Microbiol. Rev. 1989, 53, 450–490.
2. Stock, J.B.; Stock, A.M.; Mottonen, J.M. Nature 1990, 344, 395–400.
3. Matthews, H.R. Pharmacol. Ther. 1995, 67, 323–350.
4. Waygood, E.B.; Mattoo, R.L.; Peri, K.G. J. Cell Biochem. 1984, 25, 139–159.
5. Pirrung, M.C.; James, K.D.; Rana, V.S. Thiophosphorylation of histidine. J. Org. Chem. 2000, 65, 8448–8453.
6. Napper, S.; Kindrachuk, J.; Olson, D.J.H.; Ambrose, S.J.; Dereniwsky, C.; Ross, A.R.S. Selective extraction and characterization of a histidine-phosphorylated peptide using immobilized copper(II) ion affinity chromatography and matrix-assisted laser desorption/ionization time-of-flight mass spectrometry. Anal. Chem. 2003, 75, 1741–1747.
7. Chalmers, M.J.; Kolch, W.; Emmett, M.R.; Marshall, A.G.; Mischak, H. J. Chromatogr. B 2004, 803, 111–120.
8. McLachlin, D.T.; Chait, B.T. Curr. Opin. Chem. Biol. 2001, 5, 591–602.
9. Qian, W.J.; Goshe, M.B.; Camp, D.G., II; Yu, L.R.; Tang, K.; Smith, R.D. Anal. Chem. 2003, 75, 5441–5450.
10. Garcia, B.A.; Shabanowitz, J.; Hunt, D.F. Methods 2005, 35, 256–264.
11. Salih, E. Mass Spectrom. Rev. 2005, 24, 828–846.
12. Smith, R.D.; Anderson, G.A.; Lipton, M.S.; Pasa-Tolic, L.; Shen, Y.; Conrads, T.P.; Veenstra, T.D.; Udseth, H.R. Proteomics 2002, 2, 513–523.
13. Oliver, C.J.; Shenolikar, S. Front. Biosci. 1998, 3, D961–D972.
14. Janssens, V.; Goris, J. Biochem. J. 2001, 353, 417–439.
15. Ceulemans, H.; Bollen, M. Physiol. Rev. 2004, 84, 1–39.
16. Ficarro, S.B.; McCleland, M.L.; Stukenberg, P.T.; Burke, D.J.; Ross, M.M.; Shabanowitz, J.; Hunt, D.F.; White, F.M. Nat. Biotechnol. 2002, 20, 301–305.
17. Kim, J.E.; Tannenbaum, S.R.; White, F.M. J. Proteome Res. 2005, 4, 1339–1346.
18. Moser, K.; White, F.M. J. Proteome Res. 2006, 5, 98–104.
19. Ballif, B.A.; Villen, J.; Beausoleil, S.A.; Schwartz, D.; Gygi, S.P. Mol. Cell. Proteomics 2004, 3, 1093–1101.
20. Beausoleil, S.A.; Jedrychowski, M.; Schwartz, D.; Elias, J.E.; Villen, J.; Li, J.; Cohn, M.A.; Cantley, L.C.; Gygi, S.P. Proc. Natl. Acad. Sci. U.S.A. 2004, 101, 12130–12135.
21. Gentile, S.; Darden, T.; Erxleben, C.; Romeo, C.; Russo, A.; Martin, N.; Rossie, S.; Armstrong, D.L. Proc. Natl. Acad. Sci. U.S.A. 2006, 103, 5202–5206.
22. Macek, B.; Mijakovic, I.; Olsen, J.V.; Gnad, F.; Kumar, C.; Jensen, P.R.; Mann, M. Mol. Cell. Proteomics 2007, 6, 697–707.
23. Macek, B.; Gnad, F.; Soufi, B.; Kumar, C.; Olsen, J.V.; Mijakovic, I.; Mann, M. Mol. Cell. Proteomics 2008, 7, 299–307.
24. Narmandakh, A.; Gad’on, N.; Drepper, F.; Knapp, B.; Haehnel, W.; Fuchs, G. Phosphorylation of phenol by phenylphosphate synthase: role of histidine phosphate in catalysis. J. Bacteriol. 2006, 188(22), 7815–7822.
25. Wind, M.; Wegener, A.; Kellner, R.; Lehmann, W.D. Analysis of CheA histidine phosphorylation and its influence on protein stability by high-resolution element and electrospray mass spectrometry. Anal. Chem. 2005, 77, 1957–1962.
26. Kleinnijenhuis, A.J.; Kjeldsen, F.; Kallipolitis, B.; Haselmann, K.F.; Jensen, O.N. Analysis of histidine phosphorylation using tandem MS and ion-electron reactions. Anal. Chem. 2007, 79, 7450–7456.
27. Zu, X.-L.; Besant, P.G.; Imhof, A.; Attwood, P.V. Mass spectrometric analysis of protein histidine phosphorylation. Amino Acids 2007, 32, 347–357.
28. Ross, A.R.S. Identification of histidine phosphorylations in proteins using mass spectrometry and affinity-based techniques. Two-component signaling systems, part B. Methods Enzymol. 2007, 423, 549–572.
29. Wei, Y.F.; Matthews, H.R. Identification of phosphohistidines in proteins and purification of protein-histidine kinases. Methods Enzymol. 1991, 200, 388–414.
30. Hultquist, D.E.; Moyer, R.W.; Boyer, P.D. The preparation and characterization of 1-phosphohistidine and 3-phosphohistidine. Biochemistry 1966, 5, 322–331.
31. Medzihradszky, K.F.; Phillips, N.J.; Senderowicz, L.; Wang, P.; Turck, C.W. Synthesis and characterization of histidine-phosphorylated peptides. Protein Sci. 1997, 6, 1405–1411.
32. Lasker, M.; Bui, C.D.; Besant, P.G.; Sugawara, K.; Thai, P.; Medzihradszky, K.F.; Turck, C.W. Protein histidine phosphorylation: increased stability of thiophosphohistidine. Protein Sci. 1999, 8, 2177–2185.
33. Ndassa, Y.M.; Orsi, C.; Marto, J.A.; Chen, S.; Ross, M.M. Improved immobilized metal affinity chromatography for large-scale phosphoproteomics applications. J. Proteome Res. 2006, 5(10), 2789–2799.
34. Drapeau, G.; Boily, Y.; Houmard, J. Purification and properties of an extracellular protease of Staphylococcus aureus. J. Biol. Chem. 1972, 247(20), 6720–6726.
35. Ham, B.M.; Jacob, J.T.; Cole, R.B. Anal. Bioanal. Chem. 2007, 387, 889–900.
36. Kim, J.; Petritis, K.; Shen, Y.; Camp, D.G., II; Moore, R.J.; Smith, R.D. Phosphopeptide elution times in reversed-phase liquid chromatography. J. Chromatogr. A 2007, 1172, 9–18.
37. DeGnore, J.P.; Qin, J. Fragmentation of phosphopeptides in an ion trap mass spectrometer. J. Am. Soc. Mass Spectrom. 1998, 9, 1175–1188.
38. Palumbo, A.M.; Tepe, J.J.; Reid, G.E. Mechanistic insights into the multistage gas-phase fragmentation behavior of phosphoserine- and phosphothreonine-containing peptides. J. Proteome Res. 2008, 7(2), 771–779.
39. Fu, Q.; Tang, L.S.; Marder, E.; Li, L. Mass spectrometric characterization and physiological actions of VPNDWAHFRGSWamide, a novel B type allatostatin in the crab, Cancer borealis. J. Neurochem. 2007, 101, 1099–1107.
APPENDIX I  Atomic Weights and Isotopic Compositions

Symbol  Relative Atomic Mass   Abundance       Standard Atomic Weight
1H      1.0078250321(4)        99.9885(70)     1.00794(7)
2D      2.0141017780(4)        0.0115(70)
3T      3.0160492675(11)
3He     3.0160293097(9)        0.000137(3)     4.002602(2)
4He     4.0026032497(10)       99.999863(3)
6Li     6.0151223(5)           7.59(4)         6.941(2)
7Li     7.0160040(5)           92.41(4)
9Be     9.0121821(4)           100             9.012182(3)
10B     10.0129370(4)          19.9(7)         10.811(7)
11B     11.0093055(5)          80.1(7)
12C     12.0000000(0)          98.93(8)        12.0107(8)
13C     13.0033548378(10)      1.07(8)
14C     14.003241988(4)
14N     14.0030740052(9)       99.632(7)       14.0067(2)
15N     15.0001088984(9)       0.368(7)
16O     15.9949146221(15)      99.757(16)      15.9994(3)
17O     16.99913150(22)        0.038(1)
18O     17.9991604(9)          0.205(14)
19F     18.99840320(7)         100             18.9984032(5)
20Ne    19.9924401759(20)      90.48(3)        20.1797(6)
21Ne    20.99384674(4)         0.27(1)
22Ne    21.99138551(23)        9.25(3)
23Na    22.98976967(23)        100             22.989770(2)
24Mg    23.98504190(20)        78.99(4)        24.3050(6)
25Mg    24.98583702(20)        10.00(1)
26Mg    25.98259304(21)        11.01(3)
27Al    26.98153844(14)        100             26.981538(2)
Proteomics of Biological Systems: Protein Phosphorylation Using Mass Spectrometry Techniques, First Edition. Bryan M. Ham. © 2012 John Wiley & Sons, Inc. Published 2012 by John Wiley & Sons, Inc. 317
Symbol
28Si 29Si 30Si 31P 32S 33S 34S 36S 35Cl 37Cl 36Ar 38Ar 40Ar 39K 40K 41K 40Ca 42Ca 43Ca 44Ca 46Ca 48Ca 45Sc 46Ti 47Ti 48Ti 49Ti 50Ti 50V 51V 50Cr 52Cr 53Cr 54Cr 55Mn 54Fe 56Fe 57Fe 58Fe 59Co 58Ni 60Ni 61Ni 62Ni
Relative Atomic Mass
Abundance
27.9769265327(20) 28.97649472(3) 29.97377022(5) 30.97376151(20) 31.97207069(12) 32.97145850(12) 33.96786683(11) 35.96708088(25) 34.96885271(4) 36.96590260(5) 35.96754628(27) 37.9627322(5) 39.962383123(3) 38.9637069(3) 39.96399867(29) 40.96182597(28) 39.9625912(3) 41.9586183(4) 42.9587668(5) 43.9554811(9) 45.9536928(25) 47.952534(4) 44.9559102(12) 45.9526295(12) 46.9517638(10) 47.9479471(10) 48.9478708(10) 49.9447921(11) 49.9471628(14) 50.9439637(14) 49.9460496(14) 51.9405119(15) 52.9406538(15) 53.9388849(15) 54.9380496(14) 53.9396148(14) 55.9349421(15) 56.9353987(15) 57.9332805(15) 58.9332002(15) 57.9353479(15) 59.9307906(15) 60.9310604(15) 61.9283488(15)
92.2297(7) 4.6832(5) 3.0872(5) 100 94.93(31) 0.76(2) 4.29(28) 0.02(1) 75.78(4) 24.22(4) 0.3365(30) 0.0632(5) 99.6003(30) 93.2581(44) 0.0117(1) 6.7302(44) 96.941(156) 0.647(23) 0.135(10) 2.086(110) 0.004(3) 0.187(21) 100 8.25(3) 7.44(2) 73.72(3) 5.41(2) 5.18(2) 0.250(4) 99.750(4) 4.345(13) 83.789(18) 9.501(17) 2.365(7) 100 5.845(35) 91.754(36) 2.119(10) 0.282(4) 100 68.0769(89) 26.2231(77) 1.1399(6) 3.6345(17)
Standard Atomic Weight
28.0855(3) 30.973761(2) 32.065(5)
35.453(2) 39.948(1) 39.0983(1) 40.078(4)
44.955910(8) 47.867(1)
50.9415(1) 51.9961(6)
54.938049(9) 55.845(2)
58.933200(9) 58.6934(2)
Symbol
64Ni 63Cu 65Cu 64Zn 66Zn 67Zn 68Zn 70Zn 69Ga 71Ga 70Ge 72Ge 73Ge 74Ge 76Ge 75As 74Se 76Se 77Se 78Se 80Se 82Se 79Br 81Br 78Kr 80Kr 82Kr 83Kr 84Kr 86Kr 85Rb 87Rb 84Sr 86Sr 87Sr 88Sr 89Y 90Zr 91Zr 92Zr 94Zr 96Zr 93Nb 92Mo
Relative Atomic Mass
63.9279696(16) 62.9296011(15) 64.9277937(19) 63.9291466(18) 65.9260368(16) 66.9271309(17) 67.9248476(17) 69.925325(4) 68.925581(3) 70.9247050(19) 69.9242504(19) 71.9220762(16) 72.9234594(16) 73.9211782(16) 75.9214027(16) 74.9215964(18) 73.9224766(16) 75.9192141(16) 76.9199146(16) 77.9173095(16) 79.9165218(20) 81.9167000(22) 78.9183376(20) 80.916291(3) 77.920386(7) 79.916378(4) 81.9134846(28) 82.914136(3) 83.911507(3) 85.9106103(12) 84.9117893(25) 86.9091835(27) 83.913425(4) 85.9092624(24) 86.9088793(24) 87.9056143(24) 88.9058479(25) 89.9047037(23) 90.9056450(23) 91.9050401(23) 93.9063158(25) 95.908276(3) 92.9063775(24) 91.906810(4)
Abundance
0.9256(9) 69.17(3) 30.83(3) 48.63(60) 27.90(27) 4.10(13) 18.75(51)
Standard Atomic Weight
63.546(3) 65.409(4)
69.723(1) 39.892(9) 20.84(87) 27.54(34) 7.73(5) 36.28(73) 7.61(38) 100 0.89(4) 9.37(29) 7.63(16) 23.77(28) 49.61(41) 8.73(22) 50.69(7) 49.31(7) 0.35(1) 2.28(6) 11.58(14) 11.49(6) 57.00(4) 17.30(22) 72.17(2) 27.83(2) 0.56(1) 9.86(1) 7.00(1) 82.58(1) 100 51.45(40) 11.22(5) 17.15(8) 17.38(28) 2.80(9) 100 14.84(35)
72.64(1)
74.92160(2) 78.96(3)
79.904(1) 83.798(2)
85.4678(3) 87.62(1)
88.90585(2) 91.224(2)
92.90638(2) 95.94(2)
Symbol
94Mo 95Mo 96Mo 97Mo 98Mo 100Mo 97Tc 98Tc 99Tc 96Ru 98Ru 99Ru 100Ru 101Ru 102Ru 104Ru 103Rh 102Pd 104Pd 105Pd 106Pd 108Pd 110Pd 107Ag 109Ag 106Cd 108Cd 110Cd 111Cd 112Cd 113Cd 114Cd 116Cd 113In 115In 112Sn 114Sn 115Sn 116Sn 117Sn 118Sn 119Sn 120Sn 122Sn
Relative Atomic Mass
93.9050876(20) 94.9058415(20) 95.9046789(20) 96.9060210(20) 97.9054078(20) 99.907477(6) 96.906365(5) 97.907216(4) 98.9062546(21) 95.907598(8) 97.905287(7) 98.9059393(21) 99.9042197(22) 100.9055822(22) 101.9043495(22) 103.905430(4) 102.905504(3) 101.905608(3) 103.904035(5) 104.905084(5) 105.903483(5) 107.903894(4) 109.905152(12) 106.905093(6) 108.904756(3) 105.906458(6) 107.904183(6) 109.903006(3) 110.904182(3) 111.9027572(30) 112.9044009(30) 113.9033581(30) 115.904755(3) 112.904061(4) 114.903878(5) 111.904821(5) 113.902782(3) 114.903346(3) 115.901744(3) 116.902954(3) 117.901606(3) 118.903309(3) 119.9021966(27) 121.9034401(29)
Abundance
Standard Atomic Weight
9.25(12) 15.92(13) 16.68(2) 9.55(8) 24.13(31) 9.63(23) [98] 5.54(14) 1.87(3) 12.76(14) 12.60(7) 17.06(2) 31.55(14) 18.62(27) 100 1.02(1) 11.14(8) 22.33(8) 27.33(3) 26.46(9) 11.72(9) 51.839(8) 48.161(8) 1.25(6) 0.89(3) 12.49(18) 12.80(12) 24.13(21) 12.22(12) 28.73(42) 7.49(18) 4.29(5) 95.71(5) 0.97(1) 0.66(1) 0.34(1) 14.54(9) 7.68(7) 24.22(9) 8.59(4) 32.58(9) 4.63(3)
101.07(2)
102.90550(2) 106.42(1)
107.8682(2) 112.411(8)
114.818(3) 118.710(7)
Symbol
124Sn 121Sb 123Sb 120Te 122Te 123Te 124Te 125Te 126Te 128Te 130Te 127I 124Xe 126Xe 128Xe 129Xe 130Xe 131Xe 132Xe 134Xe 136Xe 133Cs 130Ba 132Ba 134Ba 135Ba 136Ba 137Ba 138Ba 138La 139La 136Ce 138Ce 140Ce 142Ce 141Pr 142Nd 143Nd 144Nd 145Nd 146Nd 148Nd 150Nd 145Pm
Relative Atomic Mass
123.9052746(15) 120.9038180(24) 122.9042157(22) 119.904020(11) 121.9030471(20) 122.9042730(19) 123.9028195(16) 124.9044247(20) 125.9033055(20) 127.9044614(19) 129.9062228(21) 126.904468(4) 123.9058958(21) 125.904269(7) 127.9035304(15) 128.9047795(9) 129.9035079(10) 130.9050819(10) 131.9041545(12) 133.9053945(9) 135.907220(8) 132.905447(3) 129.906310(7) 131.905056(3) 133.904503(3) 134.905683(3) 135.904570(3) 136.905821(3) 137.905241(3) 137.907107(4) 138.906348(3) 135.907140(50) 137.905986(11) 139.905434(3) 141.909240(4) 140.907648(3) 141.907719(3) 142.909810(3) 143.910083(3) 144.912569(3) 145.913112(3) 147.916889(3) 149.920887(4) 144.912744(4)
Abundance
5.79(5) 57.21(5) 42.79(5) 0.09(1) 2.55(12) 0.89(3) 4.74(14) 7.07(15) 18.84(25) 31.74(8) 34.08(62) 100 0.09(1) 0.09(1) 1.92(3) 26.44(24) 4.08(2) 21.18(3) 26.89(6) 10.44(10) 8.87(16) 100 0.106(1) 0.101(1) 2.417(18) 6.592(12) 7.854(24) 11.232(24) 71.698(42) 0.090(1) 99.910(1) 0.185(2) 0.251(2) 88.450(51) 11.114(51) 100 27.2(5) 12.2(2) 23.8(3) 8.3(1) 17.2(3) 5.7(1) 5.6(2) [145]
Standard Atomic Weight
121.760(1) 127.60(3)
126.90447(3) 131.293(6)
132.90545(2) 137.327(7)
138.9055(2) 140.116(1)
140.90765(2) 144.24(3)
Symbol
147Pm 144Sm 147Sm 148Sm 149Sm 150Sm 152Sm 154Sm 151Eu 153Eu 152Gd 154Gd 155Gd 156Gd 157Gd 158Gd 160Gd 159Tb 156Dy 158Dy 160Dy 161Dy 162Dy 163Dy 164Dy 165Ho 162Er 164Er 166Er 167Er 168Er 170Er 169Tm 168Yb 170Yb 171Yb 172Yb 173Yb 174Yb 176Yb 175Lu 176Lu 174Hf 176Hf
Relative Atomic Mass
146.915134(3) 143.911995(4) 146.914893(3) 147.914818(3) 148.917180(3) 149.917271(3) 151.919728(3) 153.922205(3) 150.919846(3) 152.921226(3) 151.919788(3) 153.920862(3) 154.922619(3) 155.922120(3) 156.923957(3) 157.924101(3) 159.927051(3) 158.925343(3) 155.924278(7) 157.924405(4) 159.925194(3) 160.926930(3) 161.926795(3) 162.928728(3) 163.929171(3) 164.930319(3) 161.928775(4) 163.929197(4) 165.930290(3) 166.932045(3) 167.932368(3) 169.935460(3) 168.934211(3) 167.933894(5) 169.934759(3) 170.936322(3) 171.9363777(30) 172.9382068(30) 173.9388581(30) 175.942568(3) 174.9407679(28) 175.9426824(28) 173.940040(3) 175.9414018(29)
Abundance
3.07(7) 14.99(18) 11.24(10) 13.82(7) 7.38(1) 26.75(16) 22.75(29) 47.81(3) 52.19(3) 0.20(1) 2.18(3) 14.80(12) 20.47(9) 15.65(2) 24.84(7) 21.86(19) 100 0.06(1) 0.10(1) 2.34(8) 18.91(24) 25.51(26) 24.90(16) 28.18(37) 100 0.14(1) 1.61(3) 33.61(35) 22.93(17) 26.78(26) 14.93(27) 100 0.13(1) 3.04(15) 14.28(57) 21.83(67) 16.13(27) 31.83(92) 12.76(41) 97.41(2) 2.59(2) 0.16(1) 5.26(7)
Standard Atomic Weight
150.36(3)
151.964(1) 157.25(3)
158.92534(2) 162.500(1)
164.93032(2) 167.259(3)
168.93421(2) 173.04(3)
174.967(1) 178.49(2)
Symbol
177Hf 178Hf 179Hf 180Hf 180Ta 181Ta 180W 182W 183W 184W 186W 185Re 187Re 184Os 186Os 187Os 188Os 189Os 190Os 192Os 191Ir 193Ir 190Pt 192Pt 194Pt 195Pt 196Pt 198Pt 197Au 196Hg 198Hg 199Hg 200Hg 201Hg 202Hg 204Hg 203Tl 205Tl 204Pb 206Pb 207Pb 208Pb 209Bi 209Po
Relative Atomic Mass
176.9432200(27) 177.9436977(27) 178.9458151(27) 179.9465488(27) 179.947466(3) 180.947996(3) 179.946706(5) 181.948206(3) 182.9502245(29) 183.9509326(29) 185.954362(3) 184.9529557(30) 186.9557508(30) 183.952491(3) 185.953838(3) 186.9557479(30) 187.9558360(30) 188.9581449(30) 189.958445(3) 191.961479(4) 190.960591(3) 192.962924(3) 189.959930(7) 191.961035(4) 193.962664(3) 194.964774(3) 195.964935(3) 197.967876(4) 196.966552(3) 195.965815(4) 197.966752(3) 198.968262(3) 199.968309(3) 200.970285(3) 201.970626(3) 203.973476(3) 202.972329(3) 204.974412(3) 203.973029(3) 205.974449(3) 206.975881(3) 207.976636(3) 208.980383(3) 208.982416(3)
Abundance
18.60(9) 27.28(7) 13.62(2) 35.08(16) 0.012(2) 99.988(2) 0.12(1) 26.50(16) 14.31(4) 30.64(2) 28.43(19) 37.40(2) 62.60(2) 0.02(1) 1.59(3) 1.96(2) 13.24(8) 16.15(5) 26.26(2) 40.78(19) 37.3(2) 62.7(2) 0.014(1) 0.782(7) 32.967(99) 33.832(10) 25.242(41) 7.163(55) 100 0.15(1) 9.97(20) 16.87(22) 23.10(19) 13.18(9) 29.86(26) 6.87(15) 29.524(14) 70.476(14) 1.4(1) 24.1(1) 22.1(1) 52.4(1) 100
Standard Atomic Weight
180.9479(1) 183.84(1)
186.207(1) 190.23(3)
192.217(3) 195.078(2)
196.96655(2) 200.59(2)
204.3833(2) 207.2(1)
208.98038(2) [209]
Symbol
210Po 210At 211At 211Rn 220Rn 222Rn 223Fr 223Ra 224Ra 226Ra 228Ra 227Ac 230Th 232Th 231Pa 233U 234U 235U 236U 238U 237Np 239Np 238Pu 239Pu 240Pu 241Pu 242Pu 244Pu 241Am 243Am
Relative Atomic Mass
209.982857(3) 209.987131(9) 210.987481(4) 210.990585(8) 220.0113841(29) 222.0175705(27) 223.0197307(29) 223.018497(3) 224.0202020(29) 226.0254026(27) 228.0310641(27) 227.0277470(29) 230.0331266(22) 232.0380504(22) 231.0358789(28) 233.039628(3) 234.0409456(21) 235.0439231(21) 236.0455619(21) 238.0507826(21) 237.0481673(21) 239.0529314(23) 238.0495534(21) 239.0521565(21) 240.0538075(21) 241.0568453(21) 242.0587368(21) 244.064198(5) 241.0568229(21) 243.0613727(23)
Abundance
Standard Atomic Weight
[210] [222] [223] [226]
[227] 232.0381(1) 100 100
231.03588(2) 238.02891(3)
0.0055(2) 0.7200(51) 99.2745(106) [237] [244]
[243]
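The Standard Atomic Weight column in Appendix I is, to a good approximation, the abundance-weighted mean of the isotope masses, whereas a monoisotopic mass uses only the most abundant isotope. The short sketch below illustrates this for carbon and chlorine using values copied from the appendix (uncertainty digits dropped). The function and variable names are hypothetical, and isotopes listed without a natural abundance, such as 14C, are left out by assumption.

```python
# Isotope data copied from Appendix I: (relative atomic mass, abundance in %).
# Isotopes without a listed abundance (for example 14C) are omitted.
ISOTOPES = {
    "C": [(12.0000000, 98.93), (13.0033548378, 1.07)],
    "Cl": [(34.96885271, 75.78), (36.96590260, 24.22)],
}


def average_mass(isotopes):
    """Abundance-weighted mean of the isotope masses (abundances in percent)."""
    total_abundance = sum(abundance for _, abundance in isotopes)
    weighted = sum(mass * abundance for mass, abundance in isotopes)
    return weighted / total_abundance


def monoisotopic_mass(isotopes):
    """Mass of the most abundant isotope."""
    return max(isotopes, key=lambda pair: pair[1])[0]


for element, isotopes in ISOTOPES.items():
    print(element, round(average_mass(isotopes), 4), monoisotopic_mass(isotopes))
```

The weighted means can be compared with the tabulated standard atomic weights of 12.0107 for carbon and 35.453 for chlorine.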
APPENDIX II  Periodic Table of the Elements
APPENDIX III  Fundamental Physical Constants

Quantity                                          Symbol   Value                         Unit
Electron mass                                     me       9.10938215(45) × 10^−31       kg
Proton mass                                       mp       1.672621637(83) × 10^−27      kg
Proton–electron mass ratio                        mp/me    1836.15267247(80)
Speed of light in vacuum                          c        2.99792458 × 10^8             m/s
Magnetic constant                                 μ0       4π × 10^−7                    N/A^2
Rydberg constant α^2 me c/2h                      R∞       10,973,731.568527(73)         m^−1
Avogadro constant                                 NA, L    6.02214179(30) × 10^23        mol^−1
Faraday constant NA e                             F        96,485.3399(24)               C/mol
Molar gas constant                                R        8.314472(15)                  J/(mol·K)
Boltzmann constant R/NA                           k        1.3806504(24) × 10^−23        J/K
Stefan–Boltzmann constant (π^2/60)k^4/(ħ^3 c^2)   σ        5.670400(40) × 10^−8          W/(m^2·K^4)
Electric constant 1/(μ0 c^2)                      ε0       8.854187817 × 10^−12          F/m
Newtonian gravitation constant                    G        6.67428(67) × 10^−11          m^3/(kg·s^2)
Electron volt (e/C) J                             eV       1.602176487(40) × 10^−19      J
Unified atomic mass unit                          u        1.660538782(83) × 10^−27      kg
Planck constant                                   h        6.62606896(33) × 10^−34       J·s
Planck constant h/2π                              ħ        1.054571628(53) × 10^−34      J·s
Elementary charge                                 e        1.602176487(40) × 10^−19      C
Magnetic flux quantum h/2e                        Φ0       2.067833667(52) × 10^−15      Wb
Conductance quantum 2e^2/h                        G0       7.7480917004(53) × 10^−5      S
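Several entries in the table are defined in terms of others (F = NA e, k = R/NA, and ħ = h/2π). As an illustrative consistency check, the sketch below recomputes those three derived constants from the tabulated base values, with the uncertainty digits dropped; the variable names are arbitrary choices made for this example.

```python
import math

# Values as printed in Appendix III, without the uncertainty digits.
N_A = 6.02214179e23   # Avogadro constant, 1/mol
e = 1.602176487e-19   # elementary charge, C
R = 8.314472          # molar gas constant, J/(mol K)
h = 6.62606896e-34    # Planck constant, J s

faraday = N_A * e          # Faraday constant, F = N_A * e
boltzmann = R / N_A        # Boltzmann constant, k = R / N_A
hbar = h / (2 * math.pi)   # reduced Planck constant, h / (2 pi)

print(f"F    = {faraday:.4f} C/mol")
print(f"k    = {boltzmann:.7e} J/K")
print(f"hbar = {hbar:.9e} J s")
```

The results can be compared with the tabulated values of 96,485.3399 C/mol, 1.3806504 × 10^−23 J/K, and 1.054571628 × 10^−34 J·s.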
GLOSSARY
Accelerating voltage. In a mass spectrometer, this is the electrical potential that is used to impart kinetic energy to ions.
Accurate mass. An experimentally determined mass used to estimate a species' elemental formula, usually using high-mass accuracy instrumentation such as the Q-time-of-flight (TOF), electric–magnetic sector, and Fourier transform ion cyclotron resonance (FTICR) mass spectrometers. A measurement with 5 ppm accuracy is sufficient to determine the elemental composition for ions less than m/z 200.
Adduct ion. An ion formed through a noncovalent attraction between a neutral analyte and a cation or anion. For small molecules, the stoichiometry is typically 1:1, but for proteins and nucleic acids, many adducted species may form. Typical cation adducts include H+, Na+, Li+, and K+. Typical anions include I−, Cl−, and SO4−.
Adiabatic ionization. The production of an ion in the ground state through the addition or removal of an electron.
Anion. A species that possesses a net negative charge that can be either atomic or molecular.
Array collector. A detector that is made up of an array of ion collection devices where each individual component is an ion detector.
Associative ionization. The process where an ion is produced through the interaction of two excited species in the gas phase.
Atmospheric pressure chemical ionization (APCI). Ionization of analyte species at atmospheric pressure using a reactive reagent such as a gas.
Atmospheric pressure ionization (API). General term used to describe a system that produces ionized species in the gas phase at atmospheric pressure. Common API sources used include atmospheric pressure chemical ionization, atmospheric pressure photoionization, and electrospray ionization.
Atmospheric pressure photoionization (APPI). Ionization of species at atmospheric pressure using a vacuum ultraviolet (VUV) source and subsequent atmospheric pressure chemical ionization that takes place.
Average mass. An ion or molecule’s mass calculated with the average mass of each element weighted for its natural isotopic abundance (see Appendix I).
α-Cleavage. A homolytic fragmentation process in which the odd electron pairs with one electron of the bond between the atom adjacent to the radical site and its neighboring atom, so that this bond is cleaved and a radical species is lost. The charge is retained on the atom that originally contained the charge.
Base peak. The greatest intensity peak in a mass spectrum that may contain any number of peaks from a single species or numerous species. Often, the mass spectrum is normalized to this peak.
Bioinformatics. The application of statistics and computer science to the field of molecular biology.
Biomarker. A biologically derived molecule in the body that is measured along with many other species present by the omics methods.
Blackbody infrared radiative dissociation (BIRD). The heated surroundings of an analyte (e.g., the vacuum chamber walls) impart infrared photons that are absorbed, causing excitation of the reactant ion, which may subsequently dissociate.
Cation. A species that possesses a net positive charge that can be either atomic or molecular.
Cationized molecule. A neutral molecule that has undergone a noncovalent addition of a cation to form a charged species that can be measured using mass spectrometry. Often produced during matrix-assisted laser desorption ionization (MALDI) and electrospray ionization (ESI); examples include metal adducts such as [M + Na]+, [M + K]+, and [M + Li]+. Protonation as [M + H]+ is also a form of cationization, but the term generally refers to a metal adduct.
Channeltron. A continuous dynode particle multiplier that is horn shaped. The production of secondary electrons is induced by incoming ions striking the inner surface of the channeltron. The secondary electrons then strike the inner surfaces to produce more secondary electrons. This avalanche effect produces an increase in signal that manifests itself in the final measured current pulse.
Charge exchange (charge transfer) ionization. This is an ion–molecule reaction where the charge is transferred from a reactive ionic species to a neutral species without any dissociation taking place for either species involved in the reaction.
Charge-induced fragmentation. This is a process where the fragmentation pathway is driven by a localized charge on the species and can take place with both even-electron ions and odd-electron ions.
Charge-remote fragmentation. This is a process where the fragmentation takes place at a site remote from the charge on the species, without direct involvement of the charge. This process takes place with even-electron ions and is often seen in high-energy collision-induced dissociation processes such as those in a sector mass spectrometer or the time-of-flight (TOF)-TOF mass spectrometer.
Charge transfer reaction. A process where the charge is transferred from a reactive ionic species to a neutral species. These are ion–molecule reactions taking place where an adduct is formed between the neutral analyte species and the surrounding ionized ambient gases. The reactive intermediate ions produced will then interact further with other neutral analyte molecules present, producing ionized species. There are a number of ion–molecule reactions that are possible during the APCI process as follows:
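\[ \mathrm{R}^{+/-} + \mathrm{M} \rightarrow \mathrm{R} + \mathrm{M}^{+/-} \quad \text{(charge transfer)} \]
\[ \mathrm{MX} + \mathrm{R}^{+} \rightarrow \mathrm{M}^{+} + \mathrm{X} + \mathrm{R} \quad \text{(charge transfer with dissociation)} \]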
Chemical ionization (CI). CI processes are ion–molecule reactions between the analyte molecules (M) and the ionized reagent gas ions (G) that produce the analyte ions. These are gas-phase acid–base reactions according to the Brønsted–Lowry theory. In general, these are exothermic reactions taking place in the gas phase. In positive chemical ionization, the three most common reagent gases used are methane (CH4), isobutane (i-C4H10), and ammonia (NH3). Negative chemical ionization is the production of a negatively charged analyte species through proton exchange or abstraction. In this process, a reactive ion that has a large affinity for a proton is used to remove a proton from the analyte. Some examples of commonly used reactive ions are fluoride (F−), chloride (Cl−), oxide radical (O−), hydroxide (OH−), and methoxide (CH3O−).
Chromatography. Chromatography is the separation of analyte species using a combination of a mobile phase and a stationary phase. The analytes will spend different amounts of time in each of the two phases according to their affinities for each phase. For example, if analyte A has a greater affinity for the stationary phase than analyte B, then it will move more slowly through the chromatography column.
Cluster ion. An ion formed through the addition of neutral components, either multiple units of the same species or other species such as solvent molecules. Examples include salt clusters such as [(NaI)nNa]+ and solvent clusters such as [(H2O)nH]+.
Collisionally activated dissociation (CAD). A process used to produce product ions from a precursor ion, in which the precursor ion is accelerated into a region that contains a stationary target gas. The subsequent ion/neutral collisions impart energy to the precursor ion, activating it toward dissociation once the activation threshold for a given dissociation channel has been reached.
Collisional excitation. The process of imparting energy to an analyte through collisions with a neutral species, thereby increasing the internal energy of the analyte.
Collision-induced dissociation (CID). A process used to produce product ions from a precursor ion, in which the precursor ion is accelerated into a region that contains a stationary target gas. The subsequent ion/neutral collisions impart energy to the precursor ion, activating it toward dissociation once the activation threshold for a given dissociation channel has been reached.
Collision gas. The neutral, inert gas introduced into a mass spectrometer either for collision-induced dissociation or as a buffer gas for thermally cooling analyte species through low-energy, relaxing collisions.
Consecutive reaction monitoring (CRM). A scan, performed with quadrupole mass analyzers, in which a multistep reaction path is monitored with three or more stages of mass analysis.
Constant neutral loss (or fixed neutral fragment) scans. Scans, performed with quadrupole mass analyzers, in which the mass filters are scanned so that all product ions formed by loss of a preselected neutral fragment from any precursor ion are measured.
Constant neutral mass gain scan. A scan, performed with quadrupole mass analyzers, for a preset neutral mass gain. It records all product ions that have gained that mass through reaction with, and addition of, a gas in a collision cell.
Constant neutral mass loss spectrum (CNML). The spectrum of precursor ion masses obtained from a constant neutral loss scan, that is, from product ions formed by loss of a preselected neutral fragment from any precursor ion.
Conventional ion. A radical ion in the form of a cation or anion where the radical site and the charge site can be located in the same atom or group of atoms.
Conversion dynode. An electrode used in mass spectrometric detectors to reduce mass discrimination by increasing the secondary emission for heavy ions. To attract these ions to the dynode, a high potential of opposite polarity to the ions detected is used. When the ions hit the dynode, secondary electrons are produced and are recorded by the detector.
Cyclotron. A device that can accelerate charged species using an oscillating electric field that is perpendicular to a magnetic field.
Cyclotron motion. When an ionized particle of charge q moving at velocity v enters a strong magnetic field B, it undergoes a circular motion perpendicular to the magnetic field lines as a result of the force qv × B. The cyclotron motion that the ions exhibit has a resonance frequency that is specific to the ions’ mass-to-charge ratio (m/z).
β-Cleavage. A fragmentation pathway involving the atom that is next to the odd-electron atom, where a single electron is moved and charge is retained on the atom that originally possessed the charge.
Dalton (Da). A mass unit equal to the unified atomic mass unit of 1.66053886(28) × 10⁻²⁷ kg. Named in honor of John Dalton (1766–1844), who pioneered the consideration of mass in terms of atoms and molecules.
Daly detector. A detector in which ions impinge on a rounded metal surface held at high potential. The secondary electrons emitted from the surface are measured with a photomultiplier.
Data acquisition. The collection of instrumental data during a measurement.
Data processing. Organizing and manipulating data using a set of instructions.
Delayed extraction. The correction for initial velocity distributions in which the ions produced in a field-free region have an accelerating voltage pulse applied after a preset time to extract the formed ions.
In matrix-assisted laser desorption ionization (MALDI), the spatial distribution spread is not a significant problem; therefore, “delayed extraction” of the ions from the field-free source region should correct for the initial energy distribution spread of the formed ions as measured in a time-of-flight mass spectrometer.
Desolvation. The removal of solvent molecules from analytes in the gas phase that were ionized from a solvent matrix.
Desorption/ionization on silicon (DIOS). A process in which a sample deposited on porous silicon is irradiated with a laser beam and is desorbed by the transfer of energy from the porous surface, which has gained energy from the laser.
Detection limit. The lowest amount of sample that gives a signal that can be distinguished from the background noise according to a specified signal-to-noise ratio.
Diagnostic ion. A product ion used to indicate or identify a characteristic of its precursor, such as structure or composition.
Dimeric ion. An ionized dimer that is composed of two molecules of the same species.
Dissociative ionization. A process in which a gaseous molecule decomposes during ionization to form products composed of both ions and neutrals.
Distonic ion. A radical ion in the form of a cation or an anion where the radical site and the charge site are not located in the same atom or group of atoms.
Double-focusing mass spectrometer. An instrument that combines a magnetic sector mass analyzer with an electric sector energy filter to achieve a double-focusing effect: the magnetic sector performs directional focusing, while the electric sector performs energy focusing.
Dynamic range. The ratio of the largest to the smallest detectable signal of a detector system.
Dynode. One of the series of electrodes in a photomultiplier tube.
Edman sequencing. A process used to identify the amino acid residues making up a protein sequence.
Einzel lens. An ion-focusing lens composed of three charged lens elements in which the first and third are held at the same voltage.
Elastic collision. A collision resulting in elastic scattering.
Elastic scattering. An ion/neutral species interaction in which the direction of motion is changed but there is no change in the total translational energy of the collision partners.
Electron affinity. The minimum energy required for the removal of an electron from the negative ion of an entity, depicted as M−• → M + e−, where all species are in their ground state.
Electron capture dissociation (ECD). A process in which a protonated molecule with multiple charges combines with an electron of low translational energy (usually less than 3 eV) in an ion/electron interaction to form an ion with odd-electron sites. For example, peptide cation radicals are produced by exposing peptides, already multiply protonated by electrospray ionization, to low-energy electrons. The mixing of the protonated peptides with the low-energy electrons results in exothermic ion–electron recombinations. A number of dissociations can take place after the initial peptide cation radical is formed. These include loss of ammonia, loss of H atoms, loss of side-chain fragments, cleavage of disulfide bonds, and, most importantly, peptide backbone cleavages.
Electron energy. The kinetic energy imparted to electrons by the potential difference used to accelerate them for electron ionization, typically 70 eV.
Electron ionization (EI). In mass spectrometry, the production of ionized species by electrons accelerated usually to 70 eV, producing positive molecular ions that are radical cations.
Electron multiplier. A device used in detection systems to multiply the current derived from a photon or particle beam. Accelerated electrons strike the surface of an electrode, and a cascade of such collisions produces more and more secondary electrons.
Electron volt. A non-SI unit of energy (eV) defined as the energy acquired by a particle carrying one unit of charge when accelerated through a potential difference of 1 V. An electron volt is equal to 1.60217653(14) × 10⁻¹⁹ J.
Electrospray ionization (ESI). A process that enables the transfer of compounds in solution phase to the gas phase in an ionized state, thus allowing their measurement by mass spectrometry. The electrospray process is achieved by placing a potential difference between the ESI capillary and a flat counter electrode. The generated electric field penetrates into the liquid meniscus and creates an excess of charge at the surface. The meniscus becomes unstable and protrudes out, forming a Taylor cone. At the end of the Taylor cone, a jet forms that emits droplets containing an excess of charge.
Enium ion. Nonmetallic positively charged ions of lower valency, such as the methenium ion CH3+ (or the methyl cation).
Erythrocytes (also referred to as red blood cells). The most common type of blood cell and the vertebrate organism’s principal means of delivering oxygen (O2) to the body tissues via the blood flow through the circulatory system. They take up oxygen in the lungs or gills and release it while squeezing through the body’s capillaries.
Eukaryote. An organism whose cells contain complex structures enclosed within membranes.
Even-electron ion. An ion containing no unpaired electrons in its ground state, for example, NH4+. Usually seen in soft ionization techniques such as matrix-assisted laser desorption ionization (MALDI) and electrospray ionization (ESI).
Exact mass. A term used for the calculated mass of an ion or molecule that contains a single isotope of each atom.
Faraday cup. A detector that measures incoming ions through their neutralization at a grounded metal plate, which produces a small electric current that can be amplified.
Fast atom bombardment (FAB). In FAB, the analyte is dissolved and dispersed within a matrix such as glycerol and is bombarded with a high-energy beam of atoms, typically neutral argon (Ar) or xenon (Xe) atoms, or of cesium ions (Cs+). As the high-energy atom beam (6 keV) strikes the FAB matrix/analyte mixture, the kinetic energy from the colliding atom is transferred to the matrix and analyte, effectively desorbing them into the gas phase. The analyte can already be in a charged state or may become charged during the desorption process by the surrounding ionized matrix.
Field desorption (FD). The formation of ions in the gas phase from a material deposited on a solid emitter surface in the presence of an electric field.
Field ionization (FI). The removal of electrons from any species by interaction with a high electric field.
Field-free region (FFR). A mass spectrometer region where there are no electric or magnetic fields, such as the drift tube in a time-of-flight mass spectrometer.
First stability region. The region of the Mathieu stability diagram closest to the origin. Ions within this region can traverse the full length of a transmission quadrupole.
Fixed precursor ion scans. Used in sector mass spectrometers, where a precursor ion is selected by the magnetic sector and all product ions formed from it in the field-free region between the magnetic sector and a following electric sector can then be identified in an ion kinetic energy spectrum.
Fixed product ion scans. Used in sector mass spectrometers, where a spectrum is collected of all precursor ions that fragment to yield a preselected product ion.
Focal plane collector. Used in magnetic sector mass analyzers for spatially dispersed spectra, where all ions simultaneously impinge on the detector plane.
Fourier transform ion cyclotron resonance (FTICR) mass spectrometer. FTICR mass spectrometers are trapping mass analyzers that use the phenomenon of ion cyclotron resonance in the presence of a homogeneous, static magnetic field. When an ionized particle enters a strong magnetic field, it undergoes a circular motion perpendicular to the magnetic field lines, known as cyclotron motion. The cyclotron motion that the ions exhibit has a resonance frequency that is specific to the ions’ mass-to-charge ratio (m/z). Mass analysis is therefore achieved in FTICR mass analyzers by detecting the cyclotron frequencies of the trapped ions, which are unique to each m/z value.
Fragment ion. A product ion formed by the dissociation of a precursor ion.
Frequency domain. The representation of data as a function of frequency, mostly associated with Fourier transform (FT) mass spectrometers.
Fringing field. The magnetic or electric field that extends from the edge of a sector, lens, or other ion optics element.
Genome. The entirety of an organism’s hereditary information, encoded either in DNA or, for some viruses, in RNA.
Glycoproteins. Membrane proteins abundantly found in plasma membranes, where they are involved in cell-to-cell recognition processes.
Glycosylation. The covalent addition of a carbohydrate chain to amino acid side chains of a protein, producing a glycoprotein. The carbohydrate side chains can be anywhere from 1 to 70 sugar units in length, branched or straight-chained, and are most commonly composed of mannose, galactose, N-acetylglucosamine, and sialic acid.
Heterolysis (heterolytic cleavage). Symbolized by a double-barbed arrow. A pair of electrons, located between the atom attached to the atom with the charge and an adjacent atom, moves to the site of the charge, producing fragmentation in which a radical is lost.
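The dependence of cyclotron frequency on m/z described in the FTICR entry can be illustrated numerically. The following is a minimal sketch, assuming the textbook relation f = zeB/(2πm) for the unperturbed cyclotron frequency, which this glossary does not state explicitly. The function name and the 7 T field strength are hypothetical choices for illustration; the physical constants are the values quoted in the electron volt and unified atomic mass unit entries.

```python
import math

# Constants as quoted in this glossary (electron volt, unified atomic mass unit).
ELEMENTARY_CHARGE_C = 1.60217653e-19   # coulombs per elementary charge
UNIFIED_AMU_KG = 1.66053886e-27        # kilograms per u


def cyclotron_frequency_hz(mz, b_tesla):
    """Unperturbed cyclotron frequency f = zeB / (2*pi*m), in hertz.

    mz      -- mass-to-charge ratio in u per elementary charge
    b_tesla -- magnetic field strength in tesla (assumed homogeneous)
    """
    kg_per_coulomb = mz * UNIFIED_AMU_KG / ELEMENTARY_CHARGE_C
    return b_tesla / (2.0 * math.pi * kg_per_coulomb)


if __name__ == "__main__":
    # Hypothetical 7 T magnet; doubling m/z halves the frequency.
    for mz in (500.0, 1000.0, 2000.0):
        f_khz = cyclotron_frequency_hz(mz, 7.0) / 1e3
        print(f"m/z {mz:7.1f} -> {f_khz:8.2f} kHz")
```

Because the frequency is inversely proportional to m/z, each m/z value maps to its own frequency, which is the basis of the mass analysis described in the entry.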
High-energy collision-induced dissociation. A collision-induced dissociation process in which the analyte ion has a translational energy higher than 1 keV. Usually associated with a double-focusing mass spectrometer or a time-of-flight mass spectrometer.
Homolysis (homolytic cleavage). Symbolized by a single-barbed arrow. A single electron moves between two atoms to form a pair with the odd electron, resulting in fragmentation in which a radical is lost and the charge is retained by the atom that carried it when the ion was formed.
Hybrid mass spectrometer. An instrument that couples together two separate types of mass analyzers.
Hydrogen/deuterium exchange. The exchange of surface-accessible hydrogens with deuterium, used in studies exploring the conformational structure of analytes. It can be carried out either in the solution phase or in the gas phase.
i-Cleavage (inductive cleavage). Also known as heterolysis.
IMAC. Immobilized metal affinity chromatography.
Inelastic collision. A collision that results in inelastic scattering.
Infrared multiphoton dissociation (IRMPD). The dissociation of a species by excitation with an infrared CO2 laser, causing decomposition and the generation of product ions. Typically associated with Fourier transform ion cyclotron resonance (FTICR) mass spectrometry.
Ion. A molecular or atomic species having a net negative or positive electric charge.
Ion gate. See mass gate.
Ionization cross-section. A measure of the probability that a given ionization process will occur when an atom or molecule interacts with a photon.
Ionization efficiency. The ratio of the number of ions formed to the number of electrons or photons used.
Ionization energy. The minimum energy required to remove an electron in order to produce a positive ion.
Ion/molecule reaction. A reaction between an ion and a neutral species in which the neutral species is a molecule.
Ion-pair formation. An ionization process where a positive fragment ion and a negative fragment ion constitute part of the products.
Isotopologue ion. An ion that differs from another only in its isotopic composition.
Laser desorption/ionization. A process in which a solid or liquid material is irradiated with a laser beam and the analyte species present absorb the laser energy and are desorbed and ionized.
Laser ionization. A process where a species is ionized due to irradiation by a laser through either a single-photon or multiphoton process.
Low-energy collision-induced dissociation. A process where collision-induced dissociation is achieved through multiple collisions between a target gas and an analyte with translational energy lower than 100 eV.
Magnetic sector. A device that utilizes the principle that when charged particles enter a magnetic field, they follow a circular orbit that is perpendicular to the poles of the magnet. The magnetic field deflects the charged particles along a flight path whose radius of curvature (r) increases with the mass-to-charge ratio (m/z) of the ion.
Mass defect. The calculated difference between the monoisotopic mass and the nominal mass of a molecule or atom.
Mass limit. The experimentally determined m/z value above which ions cannot be detected in a mass spectrometer.
Mass resolution. Calculated either as m/Δm, where m is the m/z value obtained from the spectrum and Δm is the full peak width at half maximum (FWHM), or as the smallest mass difference Δm between two equal-magnitude peaks where the valley between them is a specified fraction of the peak height, usually 10%.
Mass selective axial ejection. An ion trap mass spectrometer technique where mass selective instability is used to eject ions of selected m/z values.
Mathieu stability diagram. A graphical representation, in reduced coordinates, of charged particle motion in a quadrupole mass filter or quadrupole ion trap mass spectrometer.
Mass spectrograph. A type of instrument used to separate ions according to their mass-to-charge ratio (m/z) where the ions are directed onto a focal plane detector such as a photographic plate.
Mass spectrometer. An instrument that measures the mass-to-charge ratio (m/z) and relative abundances of ions in the gas phase.
Mass spectrometry. The scientific discipline that encompasses all aspects of mass spectrometers and the results obtained with these instruments.
Mass spectrometry/mass spectrometry (MS/MS). The study of the decomposition products of a preselected precursor ion.
Mass spectrum. Typically a plot of the signal detected from a collection of ions (y-axis) as a function of the mass-to-charge ratio (m/z, x-axis).
Mattauch–Herzog geometry. Double-focusing mass spectrometer arrangement where a deflection of π/(4√2) radians in a radial electrostatic field is followed by a magnetic deflection of π/2 radians.
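The m/Δm definition in the mass resolution entry can be turned into a short calculation. The sketch below is illustrative only: the function names are hypothetical, the ion at m/z 1000 is an invented example, and the monoisotopic mass shifts used for phosphorylation (HPO3, 79.96633 Da) and sulfation (SO3, 79.95682 Da) are standard values that do not come from this glossary. Treating the required resolving power as m divided by the mass difference is a rough estimate, not a statement about any particular instrument.

```python
def resolving_power(mz, fwhm):
    """Resolving power m / delta_m, with delta_m taken as the peak FWHM."""
    if fwhm <= 0:
        raise ValueError("FWHM must be positive")
    return mz / fwhm


def required_resolving_power(mz, mass_difference):
    """Rough resolving power needed to separate two peaks near mz
    whose masses differ by mass_difference (same units as mz)."""
    if mass_difference == 0:
        raise ValueError("peaks with identical mass cannot be separated")
    return mz / abs(mass_difference)


# Assumed monoisotopic mass shifts (Da); not taken from the glossary.
PHOSPHO_SHIFT = 79.96633   # HPO3
SULFO_SHIFT = 79.95682     # SO3

if __name__ == "__main__":
    # A peak at m/z 1000 that is 0.05 wide at half maximum.
    print(round(resolving_power(1000.0, 0.05)))
    # Hypothetical singly charged ion near m/z 1000: phospho vs. sulfo form.
    print(round(required_resolving_power(1000.0, PHOSPHO_SHIFT - SULFO_SHIFT)))
```

The second call shows why nearly isobaric modifications call for high-resolution analyzers: the smaller the mass difference, the larger the m/Δm that is needed.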
Matrix-assisted laser desorption ionization (MALDI). An ionization process that enables the transfer of compounds in a solid, crystalline phase to the gas phase in an ionized state, thus allowing their measurement by mass spectrometry. The process involves mixing the analyte of interest with a strongly ultraviolet-absorbing organic compound, applying the mixture to a target surface (MALDI plate), allowing it to dry, and then irradiating it with a nitrogen laser (337 nm) or an Nd:YAG laser (266 nm), desorbing and ionizing the analyte.
McLafferty rearrangement. A fragmentation pathway, often associated with the decomposition of an odd-electron (OE) molecular ion produced by electron ionization (EI), in which gamma-hydrogen (γ-H) rearrangement is accompanied by β-cleavage bond dissociation.
Metastable ion. A postsource decay product that was produced outside of the source and within a region prior to the detector.
Microelectrospray. An electrospray interface that operates at flow rates typically less than 1 µL/min.
MOAC. Metal oxide affinity chromatography.
Molecular ion. An ion that is formed without a change in mass by the removal from (positive ions) or addition to (negative ions) a molecule of one or more electrons.
Molecular mass. The mass of 1 mol of a molecular substance (6.0221415(10) × 10²³ molecules).
Monoisotopic mass. The calculated mass of an ion or molecule where the mass of the most abundant isotope of each element was used.
Monosaccharides. From the Greek words monos (single) and sacchar (sugar), these are the most basic units of biologically important carbohydrates. They are the simplest form of sugar and are usually colorless, water-soluble, crystalline solids. Some monosaccharides have a sweet taste. Examples of monosaccharides include glucose (dextrose), fructose (levulose), galactose, xylose, and ribose. Monosaccharides are the building blocks of disaccharides (such as sucrose) and polysaccharides (such as cellulose and starch).
MSn. A process of multistage mass spectrometry/mass spectrometry (MS/MS) experiments where n is the number of product ion stages.
Mucins. A family of high-molecular-weight, heavily glycosylated proteins produced by epithelial tissues.
m/z. A dimensionless quantity formed by dividing the mass number of an ion by its charge number.
Nanoflow/nano-electrospray. An electrospray interface that uses sample flow rates that are usually less than 100 nL/min.
Negative ion. The same as anion.
Neutral loss. The loss of an electrically uncharged species, usually observed during ion dissociation.
Nier–Johnson geometry. Double-focusing mass spectrometer arrangement where a deflection of π/2 radians in a radial electrostatic field analyzer is followed by a magnetic deflection of π/3 radians.
Nitrogen rule. The nitrogen rule for organic compounds states that if a molecular formula has a molecular weight that is an even number, then the compound contains an even number of nitrogen atoms (0 N, 2 N, 4 N, etc.). Likewise, a species with an odd molecular weight contains an odd number of nitrogen atoms (1 N, 3 N, 5 N, etc.).
Nominal mass. The calculated mass of an ion or molecule where the mass of the most abundant isotope of each element, rounded to the nearest integer value, was used.
Nucleic acids. The nucleic acids reside in the nucleus of the cell and are responsible for the storage, expression, and transmission of the genetic information of living species. The two types of nucleic acids are deoxyribonucleic acid (DNA) and ribonucleic acid (RNA).
Nucleotides. The monomeric units that make up the nucleic acids.
Odd-electron rule. Odd-electron ions may form either odd- or even-electron product ions, while even-electron ions form even-electron product ions.
Paul ion trap. A mass analyzer where application of radio frequency voltages between a ring electrode and two end-cap electrodes allows the ejection of ions with an m/z less than a prescribed value and retention of those with higher mass.
Photodissociation. A process where dissociation of an ion into product ions results from the absorption of one or more photons.
Photoionization. The absorption of radiant energy from a UV source, where the incident energy is greater than the first ionization potential for electron loss from the analyte: M + hν → M+• + e−.
Photomultiplier. A device used in detection systems to multiply the current derived from photons. The photons are produced when particles from a conversion dynode strike the surface of a phosphor screen. The cascade amplification of these photons is similar to that of electron amplification.
Positive ion. The same as cation.
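The definitions of monoisotopic mass, nominal mass, mass defect, and the nitrogen rule can be checked against one another with a small calculation. This is a minimal sketch under stated assumptions: the isotope masses in the table are standard values for the most abundant isotopes and are not taken from this glossary, the function names are hypothetical, and the nitrogen rule check is applied to neutral molecules built from the listed elements only. Phosphoserine (C3H8NO6P) and glucose (C6H12O6) serve as example compositions.

```python
# Masses (u) of the most abundant isotope of each element; assumed standard values.
MOST_ABUNDANT_ISOTOPE = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.0030740,
    "O": 15.9949146,
    "P": 30.9737620,
    "S": 31.9720707,
}


def monoisotopic_mass(composition):
    """Sum of most-abundant-isotope masses for a dict like {"C": 3, "H": 8}."""
    return sum(MOST_ABUNDANT_ISOTOPE[el] * n for el, n in composition.items())


def nominal_mass(composition):
    """Same sum, with each isotope mass rounded to the nearest integer first."""
    return sum(round(MOST_ABUNDANT_ISOTOPE[el]) * n for el, n in composition.items())


def mass_defect(composition):
    """Difference between the monoisotopic mass and the nominal mass."""
    return monoisotopic_mass(composition) - nominal_mass(composition)


def obeys_nitrogen_rule(composition):
    """True when nominal mass and nitrogen count are both even or both odd."""
    return nominal_mass(composition) % 2 == composition.get("N", 0) % 2


if __name__ == "__main__":
    phosphoserine = {"C": 3, "H": 8, "N": 1, "O": 6, "P": 1}
    glucose = {"C": 6, "H": 12, "O": 6}
    for name, comp in (("phosphoserine", phosphoserine), ("glucose", glucose)):
        print(name, nominal_mass(comp), f"{monoisotopic_mass(comp):.4f}",
              f"{mass_defect(comp):+.4f}", obeys_nitrogen_rule(comp))
```

Phosphoserine has one nitrogen and an odd nominal mass, while glucose has no nitrogen and an even nominal mass, so both compositions are consistent with the rule as stated in the entry.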
Postsource decay (PSD). The fragmentation of a metastable ion through a decay process that occurs after the extraction of the analyte ions from the source into the first field-free region.
Posttranslational modification (PTM). The chemical modification of a protein after its translation, often one of the later steps in protein biosynthesis.
Precursor ion. An ion that undergoes a reaction (unimolecular dissociation, ion/molecule reaction) that generates product ions.
Precursor ion scan. A scan, usually done with quadrupole mass analyzers, that records the precursor ion(s) associated with predetermined product ions.
Precursor ion spectrum. The mass spectrum recorded from a precursor ion scan.
Product ion. An ion that is formed from the precursor ion that has undergone a reaction such as unimolecular dissociation or ion/molecule reaction.
Product ion scan. A scan, usually done with quadrupole mass analyzers, that records the product ion(s) associated with predetermined precursor ions.
Product ion spectrum. The mass spectrum recorded from a product ion scan.
Prokaryote. A group of organisms that lack a cell nucleus or any other membrane-bound organelles.
Protease (also termed peptidase or proteinase). Any enzyme that conducts proteolysis; that is, it begins protein catabolism by hydrolysis of the peptide bonds that link amino acids together in the polypeptide chain forming the protein.
Proteins. Biological macromolecules second only to the nucleic acids in importance. All living cells contain proteins, and their name is derived from the Greek word proteios, which has the meaning of “first.” There are two broad classifications for proteins related to their structure and functionality: water-insoluble fibrous proteins and water-soluble globular proteins.
Proteomics. The large-scale study of proteins, including their structures and functions. Its subject, the proteome, is often called the protein complement of a system at any one state and time.
Proton affinity (PA). The proton affinity is equal to the negative change in enthalpy of the protonation reaction of the analyte:
$$\mathrm{M + H^{+} \rightarrow MH^{+}}$$
$$\Delta H = -\mathrm{PA}_{\text{molecule}}$$
Protonated molecule. An ion formed by interaction of a proton with a neutral molecule, represented as [M + H]+.
Purine bases. Adenine (A) and guanine (G).
Pyrimidine bases. Cytosine (C), uracil (U), and thymine (T).
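The [M + H]+ notation in the protonated molecule entry extends to the multiply protonated ions mentioned in the electron capture dissociation and electrospray ionization entries. As a minimal sketch, assuming a proton mass of 1.007276 Da (a standard value not given in this glossary) and a hypothetical neutral monoisotopic mass of 1000 Da, the m/z of an [M + nH]n+ ion can be estimated as follows. The function name is an illustrative choice.

```python
PROTON_MASS_DA = 1.007276  # assumed standard value, not from the glossary


def protonated_mz(neutral_mass, n_protons):
    """m/z of an [M + nH]n+ ion for a neutral monoisotopic mass in Da."""
    if n_protons < 1:
        raise ValueError("at least one proton is required")
    return (neutral_mass + n_protons * PROTON_MASS_DA) / n_protons


if __name__ == "__main__":
    # Hypothetical analyte with a neutral monoisotopic mass of 1000 Da.
    for n in (1, 2, 3):
        print(f"[M + {n}H]{n}+ -> m/z {protonated_mz(1000.0, n):.4f}")
```

Each added proton raises the mass slightly but divides it by a larger charge, so higher charge states of the same analyte appear at lower m/z.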
Quadrupole mass spectrometer. A quadrupole mass analyzer is made up of four cylindrical rods that are placed precisely parallel to each other. One set of opposite poles has a direct current (DC) voltage (U) supply connected, while the other set of opposite poles has a radio frequency (RF) voltage (V) connected. Ions are accelerated into the quadrupole by a small potential of about 5 V, and under the influence of the combination of electric fields, the ions follow a complicated trajectory. If the oscillation of an ion in the quadrupole has finite amplitude, the trajectory is stable and the ion passes through. If the amplitude of the oscillation grows without bound, the trajectory is unstable and the ion collides with the rods.
Quistor. An abbreviation of quadrupole ion storage trap, equivalent to the term Paul ion trap.
Radical ion. An ion, either a cation or anion, containing unpaired electrons in its ground state.
Reflectron. An electrostatic mirror used in time-of-flight mass spectrometers to improve mass resolution by ensuring that ions of the same m/z but different kinetic energy arrive at the detector at the same time.
Resolution. The smallest mass difference Δm between two equal-magnitude peaks where the valley between them is a specified fraction of the peak height.
Resolving power. Calculated as m/Δm, where m is the m/z value obtained from the spectrum and Δm is the full peak width at half maximum (FWHM).
Resonance ion ejection. A quadrupole ion trap mode of ion ejection obtained by applying an alternating current (AC) voltage to the end caps of the quadrupole ion trap. Resonance of the ion is induced by applying an AC voltage (usually a few hundred millivolts) across the end caps and then adjusting the qz value to match the secular frequency of the ion to the frequency of the applied AC voltage. This effectively uses the axial secular frequencies of the ions to induce resonant excitation.
Scintillator. See photomultiplier.
SDS-PAGE. Sodium dodecyl sulfate polyacrylamide gel electrophoresis.
Soft ionization. Any type of ionization that transfers analytes into the gas phase with minimal fragmentation, such as electrospray ionization (ESI), matrix-assisted laser desorption ionization (MALDI), and fast atom bombardment (FAB).
Space charge effect. The mutual repulsion of particles of like charge, which limits the current in a charged particle beam and causes beams or packets of charged particles to expand radially over time.
Static field. An electric or magnetic field that does not change in time.
Stable ion. Any ion that does not possess sufficient internal energy to undergo reactive processes such as decay or ion/molecule reactions.
Systems biology. The study of the processes and complex biological organizational behavior using information from its molecular constituents.
Tandem mass spectrometry. The method of sequentially obtaining product ion mass spectra from the dissociation of a precursor ion.
Time-of-flight mass spectrometer. A mass spectrometer that separates ions by m/z in a field-free drift tube after acceleration from a source to a constant kinetic energy.
Time lag focusing. Time lag energy focusing where the ions produced in a field-free region have, after a preset time, a pulse applied to the region to extract the formed ions.
Total ion current. The total sum of the separate ion currents that are produced by the different ions contributing to the mass spectrum.
Transmission. The ratio of the number of ions leaving a region of a mass spectrometer to the number entering that region.
Unified atomic mass unit. A non-SI unit of mass (u), defined as one-twelfth of the mass of one atom of 12C in its ground state and equal to 1.66053886(28) × 10⁻²⁷ kg.
Unimolecular dissociation. Terminology used to describe the decomposition of a precursor ion species that takes place in a mass spectrometer as a result of collision-induced dissociation (CID) or other associated processes.
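The time-of-flight entry states that ions are separated by m/z after acceleration to a constant kinetic energy. A minimal sketch of the underlying arithmetic follows, assuming the idealized relations zeV = ½mv² and t = L/v for a field-free drift tube, which the glossary does not spell out. The 20 kV accelerating potential, 1 m drift length, and function name are hypothetical; the constants are the values quoted in the electron volt and unified atomic mass unit entries.

```python
import math

# Constants as quoted in this glossary.
ELEMENTARY_CHARGE_C = 1.60217653e-19   # coulombs per elementary charge
UNIFIED_AMU_KG = 1.66053886e-27        # kilograms per u


def flight_time_s(mz, accel_volts, drift_length_m):
    """Idealized drift time t = L * sqrt(m / (2 z e V)), in seconds.

    Ignores the time spent in the acceleration region and any initial
    velocity spread (the effects that delayed extraction and the
    reflectron are meant to correct).
    """
    kg_per_coulomb = mz * UNIFIED_AMU_KG / ELEMENTARY_CHARGE_C
    velocity = math.sqrt(2.0 * accel_volts / kg_per_coulomb)
    return drift_length_m / velocity


if __name__ == "__main__":
    # Hypothetical instrument: 20 kV acceleration, 1 m field-free drift tube.
    for mz in (1000.0, 4000.0):
        print(f"m/z {mz:6.0f} -> {flight_time_s(mz, 20000.0, 1.0) * 1e6:6.2f} us")
```

Because the flight time scales with the square root of m/z, an ion with four times the m/z takes twice as long to reach the detector.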
INDEX α-Proteobacterium, 183, 185, 257, 258 Accurate mass, 126, 149, 233, 246 Accurate mass and time (AMT) tag approach, 233 Acetyl chloride, 105, 154 Adenosine triphosphate (ATP), 82, 100, 217, 250, 256, Adenosylhomocysteinase, 227, 228 Aluminum oxide (Al2O3), 105 Amide bond, 35 Amino acids, 231, 244, 251, 253, 294, 314 aspartate, 30, 182, 183, 188, 196, 213, 217, 224 glutamate, 30, 182, 199, 216, 234 histidine, 17, 18, 23, 24, 38, 89, 117, 181, 182, 184, 223, 224 serine, 23, 24, 38, 60, 82, 99, 102, 149, 181, 249 threonine, 23, 60, 82, 99, 149, 181, 249 tyrosine, 18, 23, 81, 82, 84, 87, 89, 94, 99, 166, 181, 224, 249 Amino-acylium ion, 36 Amino-immonium ion, 36 Ammonia, 41 Anaphase-promoting complex (APC), 150 Anaphylatoxic peptide C3a, 259, 260 Angio II (Sar1Thr8), 261, 262, 269, 270, 287, 288, 289, 290
Angiotensin I, 259, 283, 286, 294 Anion exchange, 42 ANOVA, 237 Apoptosis, 99, 150, 151 Arachidonic acid, 150 Arylesterase-related protein, 199, 236, 240 Ataxia telangiectasia mutated (ATM), 147 ATP, see Adenosine triphosphate a-Type ion, 67 Atmospheric conditions, 5 Atmospheric pressure, 3 Bacillus subtilis, 184, 257 Bath gas, 36 Benzene, 55 β (beta)-elimination, 109 Bioinformatics, 20, 29, 31, 48, 51, 53, 138, 160 Biological replicates, 118 Biomarker, 31, 52, 53, 55 Biomolecules, 64 Blast2GO, 167, 173, 232 Bleomycin, 117, 147, 149, 151, 167, 174 Bottom up proteomics, 20 b-Type ion, 65 C18 RP peptide SPE cartridge, 120 Cancer, 52 Candida utilis, 234 Carbohydrate, 59
Proteomics of Biological Systems: Protein Phosphorylation Using Mass Spectrometry Techniques, First Edition. Bryan M. Ham. © 2011 John Wiley & Sons, Inc. Published 2011 by John Wiley & Sons, Inc. 345
Carbohydrate fragmentation, 86 Caulobacter crescentus, 181, 186, 257, 268 CBS domain protein, 237 Cell cycle histidine kinase CckA, 201, 207, 214 Cellulose, 61 Cerebrospinal fluid, 52 CHAPS, 138 Charge-directed SN2, 274, 275 Charged residue model, 9 Charge state, 13 Chemotaxis, 185, 187, 259 Chromatography, 1, 3, 42, 47, 77, 83, 103, 105, 118, 154, 188, 192 Coefficient of variance (CV), 133 Collisionally activated dissociation (CAD), 43 Collision-induced dissociation (CID), 109, 215 Collision gas, 267 Comprehensive microbial resource (CMR), 225, 304 Condensation reaction, 76 c-Type ion, 40 Cytoplasm, 149, 166, 168, 171, 216, 225, 237, 240, 259, 296, 304 Cytoplasmic membrane, 218, 225, 237, 240, 296
1-D SDS-PAGE, 108
Data acquisition, 124
Data-dependent analysis (DDA), 125
Data processing, 3, 6
Decon-2LS, 157, 194
Decoy database, 126, 157, 194, 268
Dehydroalanine, 89, 95, 109, 111, 272
Dehydroaminobutyric acid, 96, 112, 113, 272
Deoxyribonucleic acid (DNA), 99
D-deoxyribose, 16
Diacylglycerol, 100
D-ribose, 16
De novo sequencing, 31
Dithiothreitol (DTT), 120, 154
DivL, 199, 216, 217, 224
DNA damage response (DDR), 147, 150, 152
Doxycycline, 147, 148, 152
Drosophila, 259
Dulbecco’s modified Eagle medium (DMEM), 119
Dynamic range, 42, 48
Dynode, 4
Edman sequencing, 27, 103
EDTA, 104
Electron capture dissociation (ECD), 40, 115, 255
Electron multiplier, 4, 5, 7
Electron transfer dissociation (ETD), 40
Electron volt, 36, 327
Electrospray ionization (ESI), 3, 72, 103, 147, 255
Endopeptidase, 263
Enzyme, 20, 28, 29, 81, 82, 87, 99, 100, 103, 140, 151, 158, 185, 216, 234, 249, 257, 268, 307
Erythrocytes, 59
Escherichia coli, 184, 257
Eukaryote, 59, 99, 100, 118, 126, 147, 185, 233, 249, 257, 258
Exact mass, 89
False discovery rate (FDR), 125, 156, 193, 268
Fe(III) reductase activity, 233
Fetal bovine serum (FBS), 119
Flagellin protein, 235
Fourier transform ion cyclotron resonance (FT-ICR) mass spectrometer, 13, 31, 48, 76, 94, 115
Fragment ion, 43, 63, 76–78, 311
Free energy of hydrolysis, 249, 258
Fructose, 63
Fused silica capillary columns, 14
Galactose, 59
Gas chromatograph, 3
Gas phase fractionation (GPF), 136
Gel-free approach, 30
Gene ontology (GO), 160
Genome, 1, 50, 99, 193, 225, 268
Glucocorticoid receptor (GR)-hsp90, 150
Glucose, 61
Glucose (carbon) rich environment, 225, 237
Glucose (carbon) starved environment, 225, 237
Glu-C proteinase, 257
Glycoproteins, 54, 59, 60, 72, 81, 82
Glycosaminoglycan, 81
Glycosidic bond, 62
Glycosylation, 2, 20, 40, 59, 60, 72, 75, 99
HeLa Tet-ON, 151
Hexose, 76
High performance liquid chromatography (HPLC), 77
Histidine kinase (HK), 198–201, 207, 223, 224, 252
Human genome project, 1
Human International Protein Index (IPI), 126, 149
Hybrid mass spectrometer, 47, 156, 193
Hydroxyl moiety, 100
Iminodiacetic acid (IDA) resin, 291
Immobilized metal affinity chromatography (IMAC), 103, 154, 188
Immunoprecipitation, 103
Institute for Genomic Research, The (TIGR), 225, 304
Institute for Systems Biology, 50
Integral membrane proteins, 225
International Protein Index (IPI), 126
In vitro, 150, 190, 255, 259–262, 265–267, 270, 281, 287, 293, 294
In vivo, 150, 243, 255, 265–267, 293
Iodoacetamide, 120, 154
Ionization, 54
Ionization process, 64
Ionization source, 64
Iron(III)-iminodiacetate (IDA), 104
Iron(III)-nitrilotriacetate (NTA), 104
Isobaric, 76
Isomer, 83
Isotopic peak, 42
Isotope, 103
Kasil chemical frit, 124, 155, 265
Kinase, 82, 99, 100–103, 106, 149, 150, 167, 172–174, 181, 184, 215–217, 224, 239, 243, 250
Labeling
  16O/18O, 103
  2D-methanol, 103
  13C-SILAC, 103
  iTRAQ, 103
Lamin-A/C cytoskeletal protein, 162
Lipidomics, 53
LTQ-Orbitrap, 40, 122, 123, 125, 130, 193
Macrotrap, 123
MALDI time-of-flight (TOF) mass analyzer, 65
Mannose, 59
Mass accuracy, 63
Mass analyzer, 65
Mass spectrometer, 2–7, 11–13, 27, 30
Mass spectrometry, 1
Mass spectrum, 43, 45, 63, 72, 76–78, 89
Mass-to-charge ratio (m/z), 2
Matrix assisted laser desorption/ionization, 73
MES (2-(N-morpholino)ethanesulfonic acid), 264
Metabolites, 50
Metal oxide affinity chromatography (MOAC), 103, 106
Methanolic HCl, 105
Mini-PROTEAN 3 Cell, 122
Mitochondrial associated localization, 233
Mobile proton model, 35
Molecular formula, 38
Molecular weight, 38, 42
Monoisotopic mass, 75, 109, 112, 158
Monoisotopic peak, 77
Monosaccharides, 61, 62, 75
Mucins, 82, 83, 86
MultiAlign, 126, 157, 158
Multidimensional protein identification technology (MUDPIT), 30
Multistage activation (MSA), 193, 267
Mycobacterium smegmatis, 227
m/z, see Mass-to-charge ratio
N-acetylglucosamine, 59
NAD-dependent GDH, 240
NAD-dependent glutamate dehydrogenase, 248
Nanoelectrospray, 11
National Center for Biotechnology Information (NCBI), 31
Negative ion mode, 64
Neutral loss, 16, 83, 89, 93, 109, 112, 113–115, 125
Neutral loss triggered MS3, 270
Niobium oxide (Nb2O5), 105
N-linked glycosylation, 60
N-methylglycine (sarcosine), 287
Nominal mass, see Exact mass
Nuclease sensitive element-binding protein 1, 166
Nucleic acids, 1, 15–19, 22, 108, 119, 122, 126, 154
Nucleocytoplasmic, 166, 175
Nucleoside, 16
Nucleotides, 16, 239
Oligonucleotides, 3
Oligosaccharide, 64
O-linked glycosylation, 60
Outer membrane proteins (OMP), 225, 253
Oxonium ions, 76
32P labeling, 102
Parts per million, 89
Penicillin, 120
Peptide bond, 23
Peptide mass fingerprinting, 27, 140
Periplasm, 225, 227, 238, 240, 296, 305
Phenylphosphate synthase, 259
Phosphatase, 96, 99, 100, 103, 122, 147, 149, 158, 175, 181, 192, 217, 249, 256, 263
Phosphate group, 99
Phosphate neutral loss, 272
Phosphoacceptor, 249, 258
Phosphodonor, 249, 258
Phosphoester bond, 16
Phosphohistidine, 181, 249–251, 253, 257–261, 264–266, 268, 270, 277, 281, 285–287, 291–296, 304–309
Phosphoproteome, 99, 103, 118, 123, 128–133, 143, 147, 151, 160, 184, 188, 192, 196, 204, 205, 216, 225, 243, 265, 296, 308
Phosphoramidate, 181–184, 249, 258–261, 293
Phosphorylated amino acids
  phosphoserine, 100, 181, 250
  phosphotyrosine, 100, 181, 249, 258, 285–287, 311
  phosphohistidine, 181, 249–253, 257–261, 264–295
  phosphothreonine, 100, 181, 187
  phosphoaspartate, 181, 213, 215, 216, 253
  phosphoglutamate, 181, 182, 216
  stabilities of, 251
Phosphorylation, 20
Phosphotyrosine immunoprecipitation, 192
Phosphorus oxychloride, 259, 261, 262
Phosphorylation motifs, 172, 173, 308
Phosphatase, 99, 100, 103, 122, 147, 149, 154, 158, 175, 181, 189, 192, 217, 249, 256
Phosphotransferase, 259, 307
Physiological pH, 23, 249
Polynucleotide, 16
Polysaccharide, 60
Positive ion mode, 8, 64–68, 83, 255
Posttranslational modification (PTM), 1, 54, 60, 81, 95, 99, 181, 249
Potassium phosphoramidate, 259–261, 293
Precursor ion, 66, 75, 76, 89, 111, 115, 272
Product ion, 16, 18, 27, 34–37, 40
Product ion spectrum, 31, 72
Prokaryote, 29, 184, 185, 249, 256, 305
Protein family (Pfam), 216, 224
Protease, 28, 30, 192, 262, 293, 307
Protease inhibitor cocktail, 189
Protein phosphatase type 5 (PP5), 147
Proteins, 1–3, 16, 20, 22–25
Proteomics, 2, 20–23
Pseudomonas putida, 187
Purine bases
  adenine, 16, 17
  guanine, 16, 17
Pyrimidine bases
  cytosine, 16, 17
  uracil, 16, 17
  thymine, 16, 17
QIAShredder, 108
Quadrupole mass spectrometer, 42
Quantitative analysis, 233
Quantitative Western blot, 171
Resolution, 38, 42
Reverse phase (C18), 77
Ribonucleic acid (RNA), 108
Roche Complete Lysis-M, 120
Saccharomyces cerevisiae, 234
Salmonella typhimurium, 187
Sal operon transcriptional repressor SalR, 234
ScanProsite, 307
SDS-PAGE, 27–29, 108, 120, 122, 139, 263, 293
Sector mass analyzer, 65
Sensor histidine kinase, 198, 199, 208, 223, 302
SEQUEST, 32, 125, 126, 128, 135, 147, 156, 193, 216
Serine, 23, 38, 60, 82
Shewanella alga, 233
Signaling pathways, 100–103, 147, 149, 168, 181, 233, 249, 256
Single-point analysis, 118
SpectrumLook, 128, 143, 147, 175, 196, 205, 243, 311
Split flow, 125, 156, 193
Staphylococcus aureus V-8 protease, 30, 263
Starvation survival response, 187, 233
Streptomycin, 120
Streptomyces clavuligerus, 234
Strong cation exchange (SCX), 128
Structural determination, 3
Swiss-Prot, 31, 89, 232
Systems biology, 50, 52
Tandem mass spectrometry, 43, 82, 266
Targeted label-free, 149
Taylor cone, 8
Technical replicates, 118
Tetradentate complex, 104
Thauera aromatica, 258
Thionyl chloride, 105, 107, 121, 154, 190, 262
Threonine, 23, 38, 60, 82, 99, 149, 181, 249, 265
Time course study, 235, 239
Time-of-flight mass spectrometer (TOF/MS), 8
Titanium dioxide (TiO2), 105
TonB-dependent receptors, 205
Top-down proteomics, 42
Total ion chromatogram (TIC), 269
Trifluoroacetic acid (TFA), 105
Trizol, 119
Trypsin, 120
Two-component system, 185, 250, 251, 258
Two-dimensional (2-D) polyacrylamide gel electrophoresis (PAGE), 102
Tyrosine, 23, 25, 38, 81, 87, 94, 102
Tyrosine sulfation, 88
Ultracentrifugation, 108, 122, 126, 127, 147, 154
Undersampling, 118
Urea, 120
Vacuum system, 3
Venn diagrams, 129
Viper, 126, 194
w-type ion, 36
x-type ion, 68
X-type ion, 72
Y box-binding protein 1, 172
y-type ion, 93
Y-type ion, 65
Zirconium dioxide (ZrO2), 105
z-type ion, 115
Z-type ion, 67