M. Tomita T. Nishioka (Eds.) Metabolomics The Frontier of Systems Biology
With 112 Figures, Including 4 in Color
Springer
Masaru Tomita, Ph.D.
Professor and Director, Institute for Advanced Biosciences, Keio University, Tsuruoka 997-0035, Japan
Takaaki Nishioka, Ph.D.
Professor, Graduate School of Agriculture, Kyoto University, Kyoto 606-8502, Japan
This book is based on the Japanese original, M. Tomita, T. Nishioka (Eds.), Metabolome Kenkyu no Saizensen, Springer-Verlag Tokyo, 2003. Library of Congress Control Number: 2005928331 ISBN 4-431-25121-9 Springer-Verlag Tokyo Berlin Heidelberg New York This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in other ways, and storage in data banks. The use of registered names, trademarks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. Product liability: The publishers cannot guarantee the accuracy of any information about dosage and application contained in this book. In every individual case the user must check such information by consulting the relevant literature. Springer is a part of Springer Science+Business Media springeronline.com
© Springer-Verlag Tokyo 2005 Printed in Japan Typesetting: Camera-ready by the editor. Printing and binding: Nikkei Printing, Japan Printed on acid-free paper
Preface The aim of this book is to review metabolomics research. The information is presented in a way that allows the reader to view the subject of metabolomics from a broad perspective. Creative and progressive research on metabolomes began in Japan and Germany in the 1990s and ranged from the development of specialized chemical analytical techniques to the construction of databases and methods for metabolic simulation. The authors have been directly involved in the development of all the subject areas that are discussed in this book, including research related to capillary electrophoresis, liquid chromatography, mass spectrometry, metabolic databases, and metabolic simulation. As the title suggests, the latest cutting-edge research projects are presented here. In addition, a selected group of applied cases, representative of likely future scenarios, is presented. It is our hope that this book will generate further metabolomic research across a broad range of life-science disciplines, and that hitherto unforeseen applications and innovative technologies will arise from such efforts. It is especially important that medical institutions and venture enterprises should actively participate in metabolomic research, thus ensuring that it matures into a discipline offering practical medical benefits. To promote the application of metabolomic research, a number of key issues will require breakthroughs; these include the popularization of chemical analytical techniques, the development of simple but stable specialized analyzers, the parallel processing and miniaturization of such devices, and the advancement of metabolic systems biology. Researchers, technicians, and university students are urged to take on these challenges to advance metabolomic research. We would like to thank the staff of Springer-Verlag, Tokyo, for their help in bringing this book to fruition.
Masaru Tomita Institute for Advanced Biosciences Keio University Takaaki Nishioka Division of Applied Biosciences Graduate School of Agriculture Kyoto University
Contents

Preface
Color Plates

Part I. Introduction
Chapter 1: Overview
    M. Tomita

Part II. Analytical Methods for Metabolome Sciences
Chapter 2: Development and Application of Capillary Electrophoresis-Mass Spectrometry Methods to Metabolomics
    T. Soga
Chapter 3: Application of Electrospray Ionization Mass Spectrometry for Metabolomics
    R. Taguchi
Chapter 4: High-Performance Liquid Chromatography and Liquid Chromatography/Mass Spectrometry Analyses of Metabolites in Microorganisms
    H. Miyano
Chapter 5: Metabolome Profiling of Human Urine with Capillary Gas Chromatography/Mass Spectrometry
    T. Kuhara
Chapter 6: Metabolic Profiling by Fourier-Transform Ion Cyclotron Resonance Mass Spectrometry (FT-ICR-MS) and Electrospray Ionization Quadrupole Time-Of-Flight Mass Spectrometry (ESI-Q-TOF-MS)
    K. Hirayama
Chapter 7: Metabolome Analysis by Capillary Electrophoresis
    L. Jia and S. Terabe
Chapter 8: High-Performance Liquid Chromatography for Metabolomics: High-Efficiency Separations Utilizing Monolithic Silica Columns
    T. Ikegami, H. Kobayashi, H. Kimura, V.V. Tolstikov, O. Fiehn, and N. Tanaka

Part III. Applications of Metabolome Analysis to Biosciences
Chapter 9: Combined Analysis of Metabolome and Transcriptome: Catabolism in Bacillus subtilis
    T. Nishioka, K. Matsuda, and Y. Fujita
Chapter 10: Metabolomics in Arabidopsis thaliana
    K. Saito
Chapter 11: Lipidomics: Metabolic Analysis of Phospholipids
    R. Taguchi
Chapter 12: Chemical Diagnosis of Inborn Errors of Metabolism and Metabolome Analysis of Urine by Capillary Gas Chromatography/Mass Spectrometry
    T. Kuhara

Part IV. Metabolome Informatics
Chapter 13: Introduction to the ARM Database: Database on Chemical Transformations in Metabolism for Tracing Pathways
    M. Arita
Chapter 14: The Genome-Based E-CELL Modeling (GEM) System
    K. Arakawa, Y. Yamada, K. Shinoda, Y. Nakayama, and M. Tomita
Chapter 15: Hybrid Dynamic/Static Method for Large-Scale Simulation of Metabolism and its Implementation to the E-CELL System
    Y. Nakayama

Part V. Metabolomics and Medical Sciences
Chapter 16: Metabolomics and Medical Sciences
    T. Nishioka

Index
Color Plate 3. Three-dimensional picture of B. subtilis cell extract by the two-dimensional separation system. (See p. 103)
Color Plate 4. Hierarchical clustering analysis of transcriptome and metabolome data under nutritionally stressed conditions. (See p. 146)
Part I. Introduction
Chapter 1: Overview Masaru Tomita Institute for Advanced Biosciences, Keio University, 403-1 Daihoji, Tsuruoka, Yamagata 997-0017, Japan
1. Introduction A large number of metabolites, including sugars, organic acids, and amino acids, are present in living organisms. Several thousand such molecules, in addition to macromolecules such as nucleic acids and proteins, are involved in the life processes of organisms. Although some of these substances are externally acquired, most are the products and intermediates of metabolic reactions. Comprehensive analysis of the metabolome, which is the complete set of metabolites in an organism or cell, is crucial to the understanding of cellular function. This large-scale analysis of metabolites is an important addition to extensive studies of DNA sequences (genome) and proteins (proteome). We cannot understand the dynamic behavior of metabolism without in-depth knowledge of the type and quantities of substances that exist, and the conditions under which they are present in living organisms and cells. Metabolome analysis is applicable to various fields of biotechnology in the postgenomic era. However, the comprehensive analysis of metabolites is a relatively recent concept—even the term "metabolome" is new. There are several reasons as to why this is the case. For example, advanced methodology is needed to measure large numbers of metabolites over a short time period and there are many technical challenges. In the past, studies have therefore tended to pinpoint measurements of the levels of individual preselected substances only. Moreover, even if technology allows the large-scale measurement of metabolites, how can such enormous amounts of data be understood? Previously, metabolic systems were only investigated on a limited scale, and there was no concept of measuring all intracellular metabolites. However, the recent fusion of biotechnology with information technology has generated "data driven" or "-omics" biosciences, in which large amounts of data can be collected in a comprehensive manner and consolidated for analysis using computers. 
Metabolomics (metabolome science) is poised to have an important role in the postgenomic era.
This book is a summary of the latest trends in metabolomics and encompasses three major subject areas: metabolome measurement techniques, applications to biosciences, and metabolome informatics. Metabolomics is still an emerging discipline, and this book focuses primarily on research projects in which the editors have taken part.
2. Metabolome Measurement Techniques Metabolomics requires the simultaneous measurement of a large number of metabolites, and mass spectrometry (MS) plays a leading role in this field. A mass spectrometer can rapidly and accurately measure the molecular weights and quantities of many substances. However, although substances within a certain range (for example, molecular weights of 70-500) can be measured in a comprehensive manner using MS, this technique alone cannot distinguish between two or more substances with the same molecular weight. The substances must therefore be separated, for example by chromatography, before the sample is injected into the mass spectrometer. In simple terms, chromatography involves loading a sample onto a prepared column and moving it through the column. The migration rate differs from substance to substance, so if the mass spectrometer is connected to the column outlet, the substances enter the mass spectrometer at different times. This enables two substances having the same molecular weight to be measured separately. Under fixed conditions, the migration time is characteristic of each substance. Therefore, once the migration time of each substance has been determined beforehand, a substance can be identified from its molecular weight and migration time and then quantified. Liquid chromatography (LC) and gas chromatography (GC) are both used as standard methods, and when combined with MS these techniques are known as LC/MS and GC/MS, respectively. Our research group has recently developed an analytical technique utilizing capillary electrophoresis (CE) combined with MS (CE/MS). Most of the metabolites in living organisms are charged substances (anions or cations), and CE/MS is highly suitable for these substances, demonstrating a high resolution with only a small amount of sample.
As neutral (noncharged) substances cannot be measured using CE/MS, we aim to identify all of the intracellular metabolites (the complete metabolome) by combining this technique with LC/MS.
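The identification step described above, matching a peak by both mass and migration time, can be sketched as follows. This is only a minimal illustration: the `Standard` record, the tolerances, and the migration times in the small library are hypothetical values chosen for the example, not measured data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Standard:
    name: str
    mz: float             # m/z of the ion observed for this standard
    migration_min: float  # migration (or retention) time in minutes


def identify(peak_mz, peak_time, standards, mz_tol=0.5, time_tol=0.2):
    """Return the names of standards that match a peak in both m/z and time."""
    return [
        s.name
        for s in standards
        if abs(s.mz - peak_mz) <= mz_tol
        and abs(s.migration_min - peak_time) <= time_tol
    ]


# Hypothetical library: leucine and isoleucine share the same m/z,
# so only the (made-up) migration times tell them apart.
LIBRARY = [
    Standard("isoleucine", 132.1, 10.9),
    Standard("leucine", 132.1, 11.3),
    Standard("arginine", 175.1, 7.4),
]

# Mass alone is ambiguous; mass plus migration time is not.
by_mass_only = identify(132.1, 11.3, LIBRARY, time_tol=float("inf"))
by_mass_and_time = identify(132.1, 11.3, LIBRARY)
print(by_mass_only)
print(by_mass_and_time)
```

With the time tolerance disabled, both isomers match the peak; with it enabled, only one candidate remains, which is the point of coupling a separation step to the mass spectrometer.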
3. Applications to Biosciences Metabolomic information is useful in a wide range of biotechnological fields. For example, samples can be classified on the basis of the metabolic patterns obtained by MS (metabolome profiling). If cancer cells, for instance, are classified and diagnosed using metabolome profiling, it might be possible to develop a specific method of treatment for each different category. Since only the overall pattern is important in metabolome profiling, overlaps of the peaks of two or more substances do not create serious problems; it could be applied even without separation by chromatography. However, the combined use of MS with chromatography or CE allows the identification of substances, which greatly extends the potential applications of the technique. For example, if the profile comparison of two samples reveals a significant difference in a certain peak, the identification of the peak substance allows the clarification of differences in metabolism between the samples. Metabolome comparison between cancer cells and normal cells, for instance, allows the identification of substances that are specific to the former. These might be key substances in the cancer cell and, therefore, potential targets for the treatment of disease. Furthermore, in studies of the effects or toxicity of a drug, the comparison of metabolomes obtained before and after treatment can identify differences in the levels of particular substances. These might be related to metabolic activation or inactivation that is triggered by the drug. Clearly, there is the potential for a wide range of applications of this technique. Metabolome analysis provides important information that can be used to model the dynamic behavior of industrially important metabolic pathways. By measuring quantities of intermediate metabolites in the course of time, dynamics of metabolic pathways can be understood well. 
In addition, by simultaneously monitoring substances other than the metabolites that are known to be involved in the pathway, it is possible to discover the involvement of unexpected substances. Finally, one of the major goals of biochemistry and cell biology is to model intracellular metabolism in its entirety. Our group is working to meet this challenge by making full use of metabolomic analytical techniques and bioinformatics. Our research comprises three elements:
1. The automatic generation of a "draft" model of the entire metabolic pathway, based on genomic information (top-down approach)
2. The identification of all intracellular metabolites, based on metabolome analysis (bottom-up approach)
3. The integration of resources (1) and (2) using bioinformatics
We have developed a computer program, the GEM System, which automatically constructs a model based on genomic information (nucleotide sequences). As the genomic data include a number of genes with unknown functions, the metabolic pathway generated by this top-down approach is clearly incomplete (a draft model). We are therefore producing a list of all the metabolites that exist in the cell using CE/MS and LC/MS (bottom-up approach). We then supplement the draft (incomplete) model with this metabolomic information in order to produce a complete metabolic model.
4. Metabolome Informatics Informatics (information science) is as indispensable for metabolomics as it is for genomics and proteomics. We discuss four examples of current computer software in this book. Databases in which large amounts of metabolomic information are consolidated have an essential role in this field, and many can be accessed online. The metabolic pathways are generally displayed in a graphical manner, in which each element is a clickable map that is linked to other databases. The KEGG database of Kyoto University, Japan, is the best known metabolic database worldwide. As mentioned previously, our group has developed the GEM system, which automatically generates a metabolic model from genomic sequence information obtained in a fully automated manner from public resources, such as the Clusters of Orthologous Groups of Proteins (COG), SWISS-PROT, Kyoto Encyclopedia of Genes and Genomes (KEGG), and European Molecular Biology Laboratory (EMBL) databases. As the model generated by the GEM System follows the rule format of the E-CELL system, it is also possible to run the metabolism simulation directly as it is. When two metabolites are specified, the ARM system predicts the metabolic pathways connecting them and displays the pathways in ranked order. This system makes use of molecular structural information of the metabolites. The "draft" pathway models that are automatically generated by the GEM System are incomplete; however, the ARM system can potentially supplement the missing parts of the pathways. A key goal of metabolomics is to create a comprehensive and accurate simulation of metabolism. The E-CELL system is a software package that can simulate the dynamic behavior of metabolism as a whole, when given the rules for individual metabolic reactions. This system can be used to study the effects of changing the initial value of a metabolite or altering the activity of an enzyme. In addition, this system can be applied in sensitivity
analysis that examines which reactions influence overall metabolism and to what extent. It is also possible to let the system automatically optimize the kinetic parameters, so that the simulation results match the actual experimental data.
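The idea of sensitivity analysis can be illustrated with a toy model. The sketch below is not the E-CELL system or its rule format; it assumes a hypothetical two-step pathway S -> I -> P with first-order rate constants `k1` and `k2`, integrates it with a simple Euler scheme, and estimates how strongly the final amount of product depends on each rate constant.

```python
def simulate(k1, k2, s0=1.0, t_end=5.0, dt=0.001):
    """Euler-integrate S -> I -> P with first-order rate constants k1 and k2.

    Returns the final concentrations (S, I, P). All values are illustrative.
    """
    s, i, p = s0, 0.0, 0.0
    for _ in range(int(round(t_end / dt))):
        v1 = k1 * s  # rate of S -> I
        v2 = k2 * i  # rate of I -> P
        s -= v1 * dt
        i += (v1 - v2) * dt
        p += v2 * dt
    return s, i, p


def sensitivity(name, params, rel_step=0.01):
    """Normalized sensitivity of final P to one parameter, (dP/P) / (dk/k),
    estimated by a central finite difference."""
    up = dict(params)
    up[name] *= 1 + rel_step
    down = dict(params)
    down[name] *= 1 - rel_step
    p_ref = simulate(**params)[2]
    return (simulate(**up)[2] - simulate(**down)[2]) / (2 * rel_step * p_ref)


# Hypothetical parameters: a slow first step followed by a fast second step.
params = {"k1": 0.2, "k2": 5.0}
coeffs = {name: sensitivity(name, params) for name in params}
print(coeffs)
```

In this toy case the slow first step dominates: the coefficient for `k1` is much larger than that for `k2`, which is the kind of ranking a sensitivity analysis is meant to reveal. Fitting kinetic parameters to experimental data would wrap the same `simulate` function in an optimizer, which is omitted here.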
5. Conclusions In April 2001, Keio University established the Institute for Advanced Biosciences, which specializes in Systems Biology, in Tsuruoka City (Yamagata Prefecture), Japan. The ultimate objective of the research at this institute is to construct a computer model of cellular metabolism, and it has given high priority to metabolome research since its opening. As of 2005, the institute is equipped with the following world-class instrumentation for metabolome analysis: 19 CE systems, six LC systems, two GC/MS systems, nine quadrupole MS systems, four ion-trap MS systems, two triple QMS/MS systems, six electrospray ionization time-of-flight mass spectrometers (ESI-TOF-MS), one Q-TOF-MS, and one nuclear magnetic resonance (NMR) spectrometer. The major projects being conducted at the Institute include: the "E. coli Modeling Project" funded by the New Energy and Industrial Technology Development Organization (NEDO) of the Ministry of Economy, Trade and Industry of Japan, with the ultimate aim of designing useful microorganisms; the "Leading Project for Biosimulation" funded by the Ministry of Education, Culture, Sports, Science and Technology (Monbusho), which is performing metabolome analysis and simulation of red blood cells; and the Grants for Scientific Research and Scientific Research of Priority Areas (also funded by the Ministry of Education, Culture, Sports, Science and Technology) and the 21st Century COE program at Keio University entitled "The Understanding and Control of Life's Function via Systems Biology," which are developing the basic technologies for metabolomics and cell simulation. In addition, the Faculty of Medicine at the University of Tokyo, Japan, established the Center for Metabolome in February 2003. This group is focusing primarily on lipid metabolomics using LC/MS technology. Finally, Keio University and its partners invested in the establishment of a "bio-venture" company, Human Metabolome Technologies (HMT) Inc., in Tsuruoka City in July 2003. Utilizing the metabolome technology of the Institute for Advanced Biosciences of Keio University, HMT is now conducting joint research with major food companies to understand the bacterial metabolism of fermentation used in the food industry. Their future plan is to apply this technology to the fields of medical sciences through collaborations with drug companies.
Part II. Analytical Methods for Metabolome Sciences
Chapter 2: Development and Application of Capillary Electrophoresis-Mass Spectrometry Methods to Metabolomics Tomoyoshi Soga Institute for Advanced Biosciences, Keio University, and Human Metabolome Technologies Inc., 403-1 Daihoji, Tsuruoka, Yamagata 997-0017, Japan
1. Introduction High-throughput and comprehensive analysis of intracellular metabolites can reveal the connections within biochemical networks and provide a systems-level understanding of the cell. Metabolomics, or the global analysis of cellular metabolites, has become a powerful new tool for gaining insight into functional biology. Proteins and metabolites are the main effectors of phenotype and thus the functional entities within the cell. Measuring the levels of numerous metabolites within a cell and tracking metabolite changes under different conditions not only provides direct information on metabolic phenotypes but is also complementary to gene expression and proteomic studies [1,2]. Although metabolome analysis is indispensable, very few methods for large-scale metabolite analysis have been developed, unlike for other functional genomic approaches. Recently, new methods for the comprehensive analysis of charged metabolites by capillary electrophoresis-mass spectrometry (CE/MS) have been developed. Since CE/MS enables direct and quantitative analysis of most charged metabolites, the methods have attracted a great deal of attention. In this chapter, the principles, applications, and prospects of this technology are discussed.
2. Strategy
2.1. Problems in Metabolome Analysis Despite its importance, only a limited number of methodologies have been developed for metabolome analysis. This is primarily due to the characteristics of most metabolites, which display high polarity, non-volatility, poor detectability, and overall similar properties. In addition, the fact that over 1000 different metabolic substrates exist in a cell complicates the analysis.
2.2. Recent Analytical Techniques for Metabolome With the increasing interest in metabolomics, several methods for exhaustive metabolome analysis have recently been developed. The pioneering work in large-scale metabolite analysis was originally performed by gas chromatography-mass spectrometry (GC/MS) [3,4]. Analysis by GC/MS offers high resolution, selectivity, and sensitivity, and demonstrates outstanding performance. However, it is somewhat limited by the need for multiple derivatization procedures for each chemical moiety. Moreover, even after derivatization, a considerable number of metabolites are still non-volatile and thus cannot be determined by GC/MS. High-performance liquid chromatography-mass spectrometry (LC/MS) is also a useful method. However, a large number of charged metabolites are too polar to be significantly retained by the reversed-phase columns that are commonly used in LC/MS. Alternative techniques exist, such as ion-exchange chromatography coupled to mass spectrometry, but they are scarcely used because of the lack of appropriate mobile phases compatible with MS detection. Recently, direct infusion analysis approaches using nuclear magnetic resonance (NMR) and mass spectrometry methods such as Fourier-transform ion cyclotron resonance mass spectrometry (FT-ICR-MS) or electrospray ionization-mass spectrometry (ESI-MS) have been developed for metabolome profiling [5-7]. Although these infusion techniques enable the instantaneous acquisition of metabolic snapshots, there are still some important drawbacks. In NMR, sufficient amounts of sample must be prepared, and quantification is difficult.
Infusion techniques using MS lack accuracy and precision for quantification because of the ion suppression effect, and often cannot separate isomers [8,9]. Therefore, the development of quantitative and high-resolution metabolome analysis methods remains one of the most demanding challenges.
2.3. Principle of CE/MS A new approach for the comprehensive and quantitative analysis of charged metabolites by capillary electrophoresis electrospray ionization mass spectrometry (CE/MS) has recently been proposed [10]. Metabolites are first separated by CE and selectively detected using MS by monitoring ions over a large range of m/z values. This method enabled the determination of several hundred metabolic standards, and its utility was demonstrated in the analysis of 1692 metabolites from Bacillus subtilis extracts [10]. The principles of this technique are briefly described below. Most intracellular metabolites bear a charge, and CE/MS is thus a logical choice because of its power to separate charged species. In this marriage of techniques, CE confers rapid analysis and efficient resolution, and MS provides high selectivity and sensitivity. The major advantages of CE/MS are that this methodology exhibits extremely high resolution and that almost any charged species can be detected by MS. Figure 1 shows a simple diagram of CE/MS. In CE, ionic species migrate on the basis of their charge and size; all cations migrate toward the cathode, whereas all anions move in the opposite direction. Therefore, in principle, all charged species could be analyzed using only two CE/MS configurations.
Fig. 1. Basic principle of charged metabolite analysis by capillary electrophoresis-mass spectrometry (CE/MS). When voltage is applied, all cations migrate toward the cathode, whereas all anions move in the opposite direction. All charged species can thus potentially be analyzed using only two CE/MS configurations
3. CE/MS for Metabolome Analysis
3.1. Metabolite Extraction An efficient procedure for extracting metabolites from cells must meet several requirements. First, rapid enzyme inhibition and efficient metabolite extraction are necessary to quantify intracellular metabolites, since the turnover of some metabolites occurs rapidly. Second, samples should be enriched to facilitate detection. Third, simultaneous extraction of both cationic and anionic metabolites is desirable. Finally, metabolites should be dissolved in low-conductivity solutions to achieve maximum performance of CE/MS [11]. A method meeting these requirements for Bacillus subtilis bacteria is illustrated in Fig. 2. A volume of 10 ml of culture medium is passed through a 0.45 µm pore size filter. Residual B. subtilis cells on the filter are washed with Milli-Q water and then plunged into 2 ml of methanol, containing internal standards (methionine sulfone for cations and PIPES for anions), to inactivate enzymes. After a short incubation at room temperature, the methanol solution is withdrawn, and chloroform and Milli-Q water are added to it. The mixture is thoroughly mixed to remove phospholipids liberated from cell membranes, which can adsorb on the capillary wall and reduce CE performance. The separated methanol-water layer is then centrifugally filtered through a Millipore 5-kDa-cutoff filter to remove proteins. The filtrate is lyophilized and dissolved in 20 µl of Milli-Q water before CE/MS analysis. Overall, this procedure can result in a 500-fold enrichment of metabolites [10].
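The 500-fold figure follows directly from the volumes in the procedure: 10 ml of culture is reduced to 20 µl of final solution. A minimal sketch of that arithmetic is given below; the `recovery` parameter is a hypothetical addition for illustration, and the default of 1.0 assumes no metabolite is lost during extraction.

```python
def enrichment_factor(initial_volume_ul, final_volume_ul, recovery=1.0):
    """Concentration gain when metabolites from an initial volume end up
    in a smaller final volume. `recovery` (0-1) is a hypothetical fraction
    of metabolite that survives the extraction steps."""
    return recovery * initial_volume_ul / final_volume_ul


culture_ul = 10 * 1000  # 10 ml of culture, expressed in microlitres
final_ul = 20           # lyophilized filtrate redissolved in 20 µl
print(enrichment_factor(culture_ul, final_ul))
```

Any real losses in filtration, liquid-liquid extraction, or ultrafiltration would lower the effective enrichment, which is why the text states that the procedure "can result in" a 500-fold enrichment.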
3.2. Cationic Metabolite Analysis In cation analysis, MS is coupled to the cathode as shown in Fig. 1. Separations are carried out on a fused silica capillary. To analyze most cations simultaneously, a low-pH solution (1 M formic acid) is used as the electrolyte to confer a positive charge on every cation [12]. In this manner, cationic metabolites can be efficiently separated by CE and then selectively and sensitively detected by MS. Figure 3 shows the mass spectrum of arginine (molecular weight 174.2), which was acquired by CE/MS in scanning mode from m/z 50 to 350. The monoisotopic protonated molecular ion [M+H]+ at m/z 175.1 dominates the mass spectrum, and most other cations show similar results [12]. The smaller peak at m/z 176.1 corresponds to the 13C isotopic peak. Electrospray ionization (ESI) is a soft ionization method, so the protonated molecular ions are dominant for most compounds. Consequently, cationic metabolites were determined as their protonated molecular ions [M+H]+ in the CE/MS method. Figure 4 shows an example of amino acid analysis by CE/MS. Every amino acid can be selectively determined by this method. Since isomers such as leucine and isoleucine cannot be differentiated by mass in MS, they must be separated by CE first. To separate these isomers efficiently, a 100-cm-long capillary was employed [12]. Using this method, a total of 173 cationic metabolite standards such as amino acids, amines, and nucleosides, which are listed in the COMPOUND section of the LIGAND database (http://www.genome.ad.jp/kegg/ligand.html), were successfully and simultaneously determined [10].
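The m/z values quoted above can be checked from elemental formulas. The sketch below computes a monoisotopic mass and the corresponding [M+H]+ value; the helper names are hypothetical, the formula parser handles only simple formulas without brackets, and the atomic masses are standard monoisotopic values rounded to seven decimals.

```python
import re

# Monoisotopic masses (u) of the most abundant isotopes; rounded values.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}
PROTON = 1.00727647  # mass of a proton (u)


def monoisotopic_mass(formula):
    """Monoisotopic mass of a simple formula such as 'C6H14N4O2'."""
    total = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += MONOISOTOPIC[element] * (int(count) if count else 1)
    return total


def protonated_mz(formula):
    """m/z of the singly charged protonated molecular ion [M+H]+."""
    return monoisotopic_mass(formula) + PROTON


arginine = protonated_mz("C6H14N4O2")
leucine = protonated_mz("C6H13NO2")
isoleucine = protonated_mz("C6H13NO2")  # same formula as leucine

print(round(arginine, 1))     # 175.1
print(leucine == isoleucine)  # True
```

The arginine result agrees with the m/z 175.1 peak in the text, and the identical values for leucine and isoleucine show why these isomers have to be resolved by CE before they reach the mass spectrometer.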
Fig. 2. Metabolite extraction procedure from Bacillus subtilis cells. Cells were separated from the culture media using filtration, and then quickly plunged into methanol to inactivate enzymes. After phospholipids and proteins were removed by both liquid-liquid extraction and centrifugal ultrafiltration through a 5-kDa cutoff filter, the filtrate was lyophilized and dissolved in Milli-Q water before CE/MS analysis
Fig. 3. Positive ion mass spectrum of arginine by CE/MS. The protonated molecular ion [M+H]+ (m/z 175) dominates the mass spectrum, and a trace of its 13C isotope ion can also be observed. Very few fragmentation ions were observed in the electrospray ionization mass spectrometry (ESI-MS) (reproduced from Ref. [12] with permission from the American Chemical Society)
Fig. 4. CE/MS electropherograms for a standard mixture (250 µM each) of 19 amino acids. Each amino acid was selectively detected as its protonated molecular ion; traces are shown for individual m/z values between 76 and 241, with the time axis in minutes. Leucine and isoleucine could be resolved in this method (reproduced from Ref. [12] with permission of the American Chemical Society)
Table 1. Quantification of Bacillus subtilis 168 metabolites at r_o.5 phase, and overall reproducibility from metabolite extraction through CE/MS analysis

                                                RSD (n=5) (%)
Compound                  Mole/cell (amol)^a    Peak area    Relative migration time
Gly                       5.8                   20           1.5
L-β-Ala                   0.14                  28           2.7
L-Ala                     24                    13           1.3
GABA                      0.12                  27           1.7
L-Ser                     4.5                   23           0.83
L-Pro                     3.8                   3.0          0.48
L-Val                     4.1                   19           0.83
L-Homoserine              2.1                   2.0          0.84
L-Thr                     13                    6.3          0.33
Creatine                  2.5                   30           1.4
L-Ile                     2.2                   24           0.61
L-Leu                     6.3                   21           0.74
L-Hydroxyproline          0.17                  29           0.13
L-Ornithine               0.27                  60           2.1
L-Asn                     0.43                  57           1.3
L-Asp                     11                    11           0.2
Adenine                   0.04                  63           1.8
Tyramine                  0.37                  25           1.6
Spermidine                0.12                  53           2.9
L-Lys                     0.12                  33           2.0
L-Gln                     190                   8.4          0.49
L-Glu                     350                   4.0          0.47
L-Met                     1.4                   15           0.51
L-His                     0.32                  16           2.0
L-Phe                     1.2                   44           0.37
L-Arg                     1.7                   32           2.1
L-Citrulline              1.6                   8.5          0.55
Tyramine                  0.3                   50           0.32
L-Carnosine               1.1                   50           2.2
Cytidine                  0.12                  26           0.91
Adenosine                 0.06                  55           0.84
Pyruvate                  3.2                   33           0.28
Lactate                   23                    22           0.18
Fumarate                  0.73                  25           0.44
Succinate                 3.2                   6.9          0.31
Malate                    0.16                  16           0.32
2-Oxoglutarate            1.2                   24           0.30
Phosphoenol pyruvate      1.7                   30           0.31
0.07 Dihydroxyacetone 2.5 26 phosphate Glycerol 3-phosphate 3.1 22 0.06 3-Phosphoglycerate 9.1 16 0.20 Citrate 0.36 27 0.69 Erythrose 4-phosphate 1.1 12 2.5 Ribulose 5-phosphate 1.7 30 0.10 Ribose 5-phosphate 25 0.32 0.45 Glucose 1-phosphate 1.1 29 0.14 Fructose 6-phosphate 2.6 38 0.17 Glucose 6-phosphate 1.4 49 0.18 6-Phosphogluconate 0.24 28 0.18 0.18 Fructose 1,6-diphosphate 1.9 51 CMP 0.29 23 0.93 AMP 47 0.78 1.3 GMP 42 0.85 0.14 CDP 52 0.89 0.07 ADP 49 0.43 0.72 GDP 0.93 55 0.05 CTP 0.89 37 0.15 ATP 0.61 36 0.83 GTP 0.63 0.12 49 0.72 NAD 8.9 4.5 NADP 0.64 0.26 23 NADPH 0.72 53 0.06 Acetyl CoA 36 0.59 0.67
Reproduced from Ref. [10]
RSD, relative standard deviation
^a The quantity of each metabolite per cell was calculated using the number of cells per ml of culture, which was determined as 1.0 × 10^ by colony-forming units on LB plates
3.3. Anionic Metabolite Analysis The metabolites of key pathways for cellular energy production such as glycolysis, the tricarboxylic acid (TCA) and pentose phosphate cycles are almost entirely anionic species, e.g., carboxylic acids, phosphorylated carboxylic acids, and phosphorylated sugars. An enormous number of anionic metabolites also exist in other pathways. Therefore, a methodology suitable for anionic metabolite analysis is also very important. However, the analysis of anions by CE/MS using a fused silica capillary in negative mode, where the capillary inlet is at the cathode and the outlet at the anode (Fig. 5a), has been difficult. Since the CE/MS system does not contain an outlet solution vial (Fig. 5a), the electro-osmotic flow (EOF) toward the cathode (opposite to the MS direction) generates a gap in the liquid phase at the capillary exit, resulting in a current drop (Fig. 5b) [13]. For this reason, few studies have reported anion analysis by CE/MS in such a configuration. To overcome this problem, we designed a way to reverse the EOF toward the anode by employing a SMILE(+) cationic polymer (Polybrene)-coated capillary (Fig. 5c). This technique of EOF reversal enabled successive anion analysis without the deleterious current drop [14,15].
Fig. 5. Schematic of the electro-osmotic flow (EOF) profile in anion analysis by CE/MS in negative mode. a Using a fused silica capillary, the EOF is directed toward the cathode (opposite to the MS direction), resulting in (b) a gap in the liquid phase at the capillary outlet and an associated current drop. c This problem can be overcome by reversing the EOF using a SMILE(+) cationic polymer-coated capillary, which enables successive anion analyses
Fig. 6. CE/MS electropherograms for a standard mixture (100 µM each) of 25 metabolites of the glycolytic, TCA, and pentose phosphate pathways. PEP, phosphoenolpyruvate; DHAP, dihydroxyacetone phosphate; 3PG, 3-phosphoglycerate; E4P, erythrose 4-phosphate; Ru5P, ribulose 5-phosphate; R5P, ribose 5-phosphate; G1P, glucose 1-phosphate; F6P, fructose 6-phosphate; G6P, glucose 6-phosphate; 2,3DPG, 2,3-diphosphoglycerate; F16P, fructose 1,6-diphosphate (reproduced from Ref. [15] with permission from the American Chemical Society)

Figure 6 shows the electropherograms obtained by CE/MS for a standard mixture of 25 anionic metabolites of the glycolytic, TCA, and pentose phosphate pathways. Since the deprotonated molecular ion, [M-H]-, dominated the mass spectrum of each compound, the anions were detected at the m/z of their deprotonated molecular ions [15]. Although the migration times of succinate, malate, 2-oxoglutarate, and phosphoenolpyruvate are very close, these compounds were selectively detected by MS. Even isomers such as ribulose 5-phosphate (Ru5P) and ribose 5-phosphate (R5P), and glucose 1-phosphate (G1P), fructose 6-phosphate (F6P), and glucose 6-phosphate (G6P), were resolved. A total of 124 anionic metabolite standards, including carboxylic acids, phosphorylated carboxylic acids, phosphorylated sugars, and other phosphorylated compounds, were determined by this method [10].
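The detection of anions as [M-H]- ions can be illustrated with a small calculation. The following is a minimal sketch, not part of the original method: the function names and the element table are assumptions introduced here, and the molecular formulae are the standard ones for the free acids. It shows why isomers such as R5P and Ru5P share one m/z value and must therefore be separated by CE rather than by MS.

```python
import re

# Monoisotopic masses (u) of the elements needed for these metabolites.
MONOISOTOPIC = {"C": 12.0, "H": 1.007825, "N": 14.003074,
                "O": 15.994915, "P": 30.973762, "S": 31.972071}
PROTON = 1.007276  # mass of a proton (u)

def neutral_mass(formula: str) -> float:
    """Monoisotopic mass of a simple formula such as 'C6H8O7' (no brackets)."""
    total = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += MONOISOTOPIC[element] * (int(count) if count else 1)
    return total

def mz_deprotonated(formula: str) -> float:
    """m/z of the singly charged deprotonated ion [M-H]-."""
    return neutral_mass(formula) - PROTON

# Hypothetical helper table: name -> molecular formula of the free acid.
FORMULAE = {
    "citrate": "C6H8O7",
    "ribose 5-phosphate": "C5H11O8P",
    "ribulose 5-phosphate": "C5H11O8P",   # isomer of R5P, same m/z
    "glucose 6-phosphate": "C6H13O9P",
}

for name, formula in FORMULAE.items():
    print(f"{name:22s} [M-H]- m/z = {mz_deprotonated(formula):.4f}")
```

The nominal values obtained (191 for citrate, 229 for the pentose phosphates, 259 for the hexose phosphates) correspond to the selected ion channels used for these compounds.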
3.4. Nucleotide- and Coenzyme A (CoA)-Related Compound Analysis
With the anion analysis configuration described above, multivalent ions (e.g., nucleotides and CoA derivatives) adsorb significantly onto the cationic-coated capillary (Fig. 7a) [16]. To prevent adsorption and allow precise quantification, a pressure-assisted CE/MS technique using a noncharged polymer-coated capillary was developed, in which the applied pressure counteracts the EOF (Fig. 7b) [16]. Although the migration times of most of the compounds were very close, 41 different nucleotides, including cyclic nucleotides, deoxynucleotides, and CoA derivatives, were determined simultaneously (Fig. 8) [10]. Altogether, the three different CE/MS configurations allowed the analysis of a total of 352 metabolite standards [10].
Fig. 7. Schematic of the pressure-assisted CE/MS system for the analysis of nucleotides and coenzyme A (CoA)-related compounds. a Multivalent anions such as nucleotides and CoA-related compounds readily adsorb onto the SMILE(+) cationic polymer-coated capillary that was used for anion analysis by CE/MS. b To avoid adsorption, a noncharged polymer-coated capillary was employed instead, and a constant flow of mobile phase toward the mass spectrometer was driven by applying air pressure to the capillary inlet vial to prevent a current drop
Fig. 8. Analysis of a standard mixture (100 µM each) of 41 nucleotides and CoA compounds by pressure-assisted CE/MS. The migration times of a number of nucleotides were close, but they could be easily differentiated by MS (reproduced from Ref. [10] with permission from the American Chemical Society)
3.5. Comprehensive Metabolome Analysis by CE/MS and Application to B. subtilis Extracts
In this section, comprehensive methods for metabolome analysis using CE/MS are described. As mentioned before, CE/MS methods permit the infusion of any charged species into the MS. Monitoring the eluting ions over a wide range of m/z values by MS therefore enables the comprehensive analysis of charged metabolites. The CE/MS methods described above were applied to the analysis of metabolites with m/z values between 70 and 1000 in B. subtilis extracts. Since a large number of metabolites are present at different concentrations in B. subtilis, it was necessary to limit the range in single ion monitoring (SIM) mode to a window of 30 different m/z values to maximize detection sensitivity. This narrow m/z scanning technique gave a several-fold increase in sensitivity and allowed the detection of a large number of metabolites. To cover the necessary mass range (70-1000), each sample was analyzed successively 33 times using an automatic injection sequence while varying the m/z monitoring range between 70 and 1000 in both cation and anion modes [10]. Figure 9 shows the results obtained by this method for cations extracted from exponentially growing B. subtilis cells (T-0.5) in the 101-130 m/z range. The peaks were assigned by comparing the molecular weight and migration time of each component with those of metabolite standards. Among the 62 peaks found in the 101-130 m/z range, we could identify 18 metabolites, such as cadaverine, GABA, N,N-dimethylglycine, diethanolamine, and serine. Although the complete analysis took over 16 hours, the whole procedure is highly automated and could reveal up to 1053 cationic metabolites from exponentially growing B. subtilis cells, including 70 clearly identified ions. For anions, the CE/MS method yielded 637 anions, including 78 important metabolites involved in glycolysis, the TCA cycle, and the pentose phosphate pathway.
Finally, several nucleotides and CoA-related compounds were detected using the third CE/MS method. Overall, a total of 1692 metabolites, including 150 positively identified compounds, were determined in extracts of exponentially growing B. subtilis cells using the three different CE/MS methods [10]. The sensitivity of these methods was extremely high, revealing the presence of as little as 40 zeptomoles of adenine and 350 attomoles of glutamate per cell. The relative standard deviation (%RSD) (n=5) of metabolite quantification over the whole procedure, from metabolite extraction through CE/MS analysis, was generally between 2% and 40% for the peak areas of 63 identified metabolites, except for the smallest peaks, where the %RSD was larger. For relative migration time, the %RSD for identified metabolites was better than 2.9% (Table 1). These results indicate that relative migration times can help to characterize unidentified peaks.
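The windowed SIM acquisition described above can be sketched as a simple schedule generator. This is an illustration under stated assumptions only: the function name is hypothetical, and the choice of starting the first window at m/z 71 is made here so that the windows line up with the 101-130 window shown in Fig. 9. The actual injection sequence of the study, which used 33 runs, is not reproduced by this scheme.

```python
def sim_windows(mz_start: int = 71, mz_end: int = 1000, width: int = 30):
    """Split [mz_start, mz_end] into consecutive windows of `width` integer m/z values.

    Each tuple is (lowest m/z, highest m/z) monitored in one injection.
    """
    windows = []
    low = mz_start
    while low <= mz_end:
        high = min(low + width - 1, mz_end)
        windows.append((low, high))
        low = high + 1
    return windows

windows = sim_windows()
for low, high in windows[:3]:
    print(f"injection monitors m/z {low}-{high}")
```

Each injection monitors only 30 channels, which is the trade-off described in the text: more injections per sample in exchange for a longer dwell time per ion and hence higher sensitivity.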
Fig. 9. Selective ion electropherograms for cationic metabolites in exponentially growing (T-0.5) B. subtilis 168 cells in the range of 101 to 130 m/z. The numbers on top of unidentified peaks are relative migration times normalized with methionine sulfone (internal standard) (reproduced from Ref. [10] with permission from the American Chemical Society)
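The relative migration times shown in Fig. 9 and the %RSD values quoted in the text follow from two short formulas. The sketch below is illustrative: the function names and the numerical values are hypothetical, and the use of the sample standard deviation (n-1 in the denominator) is an assumption, since the text does not state which form was used.

```python
from statistics import mean, stdev

def relative_migration_time(t_peak: float, t_internal_standard: float) -> float:
    """Migration time of a peak divided by that of the internal standard."""
    return t_peak / t_internal_standard

def percent_rsd(values) -> float:
    """Relative standard deviation in percent (sample standard deviation / mean)."""
    return 100.0 * stdev(values) / mean(values)

# Hypothetical replicate runs (n=5): migration times in minutes.
peak_times = [6.10, 6.25, 6.05, 6.30, 6.18]
standard_times = [12.20, 12.52, 12.08, 12.61, 12.35]

relative_times = [relative_migration_time(p, s)
                  for p, s in zip(peak_times, standard_times)]
print(f"%RSD of raw migration time:      {percent_rsd(peak_times):.2f}")
print(f"%RSD of relative migration time: {percent_rsd(relative_times):.2f}")
```

Normalizing to the internal standard removes run-to-run drift that affects both peaks equally, which is why relative migration times are more reproducible than raw ones and can be used to track unidentified peaks.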
3.6. Comparative Metabolome Analysis During B. subtilis Sporulation
Nutritional limitation leads Bacillus species such as B. subtilis and Bacillus anthracis to produce a dormant, environmentally resistant spore [17]. This phenomenon is universally accepted as a basic model of bacterial differentiation. The complex morphological changes that occur during sporulation are thought to be tightly controlled by metabolic networks [18]. However, no comprehensive approach has been used to demonstrate alterations in metabolite profiles. We therefore took advantage of the CE/MS methods described above to perform metabolite profiling in B. subtilis cells at different time points before and during spore formation. The levels of both cationic and anionic metabolites at T-0.5 (0.5 h before T0), T0 (corresponding to the end of exponential growth), and T2 (2 h after T0) were measured by CE/MS, and the changes in metabolite levels were compared (Fig. 10) [10].

B. subtilis cells undergo sporulation under conditions of glucose deprivation [19]. Interestingly, the levels of most metabolites in the glycolytic, pentose phosphate, and TCA pathways decreased markedly in the early stage of sporulation (Fig. 10a). In particular, the level of fructose 1,6-diphosphate (F16P) dropped rapidly, by more than 100-fold, during sporulation. Since F16P is a key factor in catabolite repression mediated by the transcriptional factors CcpA and CcpC, it is possible that the decrease in F16P results in suppression of catabolite control and the subsequent expression of catabolic genes involved in sporulation [18]. Sonenshein and coworkers previously showed that all inactivating mutations in the B. subtilis TCA cycle genes cause a defect in sporulation, suggesting that activation of the TCA cycle is indispensable for sporulation [18]. In our experiments, both cis-aconitate and isocitrate, intermediates in the TCA cycle, were shown to accumulate at T0 (Fig. 10a). Subsequently, the concentrations of these metabolites, malate, and 2-oxoglutarate decreased, while those of acetyl CoA and succinyl CoA increased at T2 (Fig. 10b) [10].
These findings are in good agreement with previously reported changes in enzyme and metabolite levels in the TCA pathway, demonstrating the power of our CE/MS methods to monitor these metabolic responses simultaneously [20]. Transcriptional alterations of gene expression during sporulation have previously been measured using DNA array techniques [21,22]. The expression of most genes involved in these metabolic pathways was found to decrease during sporulation. On the other hand, the levels of several metabolites, such as cis-aconitate, isocitrate, CoA, acetyl CoA, succinyl CoA, lysine, and β-alanine, were found to increase in our study. These results suggest that the metabolic consequences of gene expression changes cannot always be correctly predicted from transcriptome analysis, most likely because metabolism may be regulated at other levels, such as posttranscriptional control and/or modification of enzyme activity. Further metabolome research will thus be necessary to better characterize these complex biological phenomena.
Fig. 10. Metabolic profiling upon onset of sporulation, based on the simultaneous analysis of charged metabolites by CE/MS. See Color Plate 1. a Changes in metabolite levels during the late logarithmic growth phase (T0 vs T-0.5). b Changes in metabolite levels during the early stage of sporulation (T2 vs T-0.5). Magenta and red boxes indicate metabolites whose levels increased 2- to 10-fold and more than 10-fold, respectively. Light blue and indigo boxes indicate metabolites whose levels decreased to 0.1-0.5 and less than 0.1 of the original level, respectively. White boxes represent metabolites whose levels remained approximately the same. Black boxes with white lettering indicate undetected metabolites. The B. subtilis metabolic map was constructed based on the ARM database [24] (see www.metabolome.jp/arm.html) (reproduced from Ref. [10] with permission from the American Chemical Society)
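The color coding of Fig. 10 amounts to binning the ratio of two metabolite levels. The sketch below restates the legend as a function; the function name, the handling of the exact boundary values, and the treatment of a zero or missing level as "undetected" are assumptions made here for illustration.

```python
from typing import Optional

def classify_change(level_after: Optional[float],
                    level_before: Optional[float]) -> str:
    """Map the ratio level_after / level_before to the box colors of Fig. 10."""
    if not level_after or not level_before:   # None or 0: treated as undetected
        return "black"
    ratio = level_after / level_before
    if ratio > 10:
        return "red"          # increased more than 10-fold
    if ratio >= 2:
        return "magenta"      # increased 2- to 10-fold
    if ratio < 0.1:
        return "indigo"       # decreased to less than 0.1 of the original level
    if ratio <= 0.5:
        return "light blue"   # decreased to 0.1-0.5 of the original level
    return "white"            # approximately unchanged

# Hypothetical levels in arbitrary units: a drop of more than 100-fold,
# as reported for F16P, falls in the indigo class.
print(classify_change(0.8, 100.0))
```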
4. Conclusion
The CE/MS techniques described here enabled the comprehensive, direct, and sensitive analysis of charged species, and revealed the presence of 1692 compounds in B. subtilis cells, including 150 that could be positively identified. The methods have been applied not only to bacteria but also, more recently, to plant and mammalian cells, yielding quantitative values for a considerable number of metabolites [23]. However, about 90% of the metabolites detected in complex extracts could not yet be identified. It is thus imperative to develop powerful methods that will allow the identification of uncharacterized compounds. Toward this goal, CE-time-of-flight mass spectrometry (CE/TOF-MS) might help to determine the chemical formulae of unknown compounds, and CE/MS/MS could provide structural information. Since the proposed CE/MS methods greatly facilitate the global determination of charged species, they can be used as universal tools for quantitative metabolome analysis. Metabolome data, together with transcriptome and proteome analysis, will provide important new information to elucidate the biological functions of uncharacterized cellular components.
References
1. Ideker T, Thorsson V, Ranish JA, Christmas R, Buhler J, Eng JK, Bumgarner R, Goodlett DR, Aebersold R, Hood L (2001) Integrated genomic and proteomic analyses of a systematically perturbed metabolic network. Science 292:929-934
2. Raamsdonk LM, Teusink B, Broadhurst D, Zhang N, Hayes A, Walsh MC, Berden JA, Brindle KM, Kell DB, Rowland JJ, Westerhoff HV, van Dam K, Oliver SG (2001) A functional genomics strategy that uses metabolome data to reveal the phenotype of silent mutations. Nat Biotechnol 19:45-50
3. Fiehn O, Kopka J, Dormann P, Altmann T, Trethewey RN, Willmitzer L (2000) Metabolite profiling for plant functional genomics. Nat Biotechnol 18:1157-1161
4. Fiehn O, Kopka J, Trethewey RN, Willmitzer L (2000) Identification of uncommon plant metabolites based on calculation of elemental compositions using gas chromatography and quadrupole mass spectrometry. Anal Chem 72:3573-3580
5. Reo NV (2002) NMR-based metabolomics. Drug Chem Toxicol 25:375-382
6. Aharoni A, Ric de Vos CH, Verhoeven HA, Maliepaard CA, Kruppa G, Bino R, Goodenowe DB (2002) Nontargeted metabolome analysis by use of Fourier Transform Ion Cyclotron Mass Spectrometry. OMICS 6:217-234
7. Castrillo JI, Hayes A, Mohammed S, Gaskell SJ, Oliver SG (2003) An optimized protocol for metabolome analysis in yeast using direct infusion electrospray mass spectrometry. Phytochemistry 62:929-937
8. Stenson AC, Landing WM, Marshall AG, Cooper WT (2002) Ionization and fragmentation of humic substances in electrospray ionization Fourier transform-ion cyclotron resonance mass spectrometry. Anal Chem 74:4397-4409
9. Avery MJ (2003) Ionization and fragmentation of humic substances in electrospray ionization Fourier transform-ion cyclotron resonance mass spectrometry. Rapid Commun Mass Spectrom 17:197-201
10. Soga T, Ohashi Y, Ueno Y, Naraoka H, Tomita M, Nishioka T (2003) Quantitative metabolome analysis using capillary electrophoresis mass spectrometry. J Proteome Res 2:488-494
11. Li SFY (1993) Capillary electrophoresis: principles, practice and applications. J Chromatogr Library vol. 52. Elsevier, Amsterdam
12. Soga T, Heiger DN (2000) Amino acid analysis by capillary electrophoresis electrospray ionization mass spectrometry. Anal Chem 72:1236-1241
13. Lukacs KD, Jorgenson JW (1987) Capillary zone electrophoresis: effect of physical parameters on separation efficiency and quantitation. J High Res Chromatogr 10:622-624
14. Katayama H, Ishihama Y, Asakawa N (1998) Stable cationic capillary coating with successive multiple ionic polymer layers for capillary electrophoresis. Anal Chem 70:5272-5277
15. Soga T, Ueno Y, Naraoka H, Ohashi Y, Tomita M, Nishioka T (2002) Simultaneous determination of anionic intermediates for Bacillus subtilis metabolic pathways by capillary electrophoresis electrospray ionization mass spectrometry. Anal Chem 74:2233-2239
16. Soga T, Ueno Y, Naraoka H, Matsuda K, Tomita M, Nishioka T (2002) Pressure-assisted capillary electrophoresis electrospray ionization mass spectrometry for analysis of multivalent anions. Anal Chem 74:6224-6229
17. Errington J (1993) Bacillus subtilis sporulation: regulation of gene expression and control of morphogenesis. Microbiol Rev 57:1-33
18. Sonenshein AL, Hoch JA, Losick R (2002) Bacillus subtilis and its closest relatives. ASM Press, Washington DC, pp 129-162
19. Grossman AD (1995) Genetic networks controlling the initiation of sporulation and the development of genetic competence in Bacillus subtilis. Annu Rev Genet 29:477-508
20. Uratani-Wong B, Lopez JM, Freese E (1981) Induction of citric acid cycle enzymes during initiation of sporulation by guanine nucleotide deprivation. J Bacteriol 146:337-344
21. Fawcett P, Eichenberger P, Losick R, Youngman P (2000) The transcriptional profile of early to middle sporulation in Bacillus subtilis. Proc Natl Acad Sci USA 97:8063-8068
22. Britton RA, Eichenberger P, Gonzalez-Pastor JE, Fawcett P, Monson R, Losick R, Grossman AD (2002) Genome-wide analysis of the stationary-phase sigma factor (sigma-H) regulon of Bacillus subtilis. J Bacteriol 184:4881-4890
23. Sato S, Soga T, Nishioka T, Tomita M (2004) Simultaneous determination of main metabolites in rice leaf using capillary electrophoresis mass spectrometry and capillary electrophoresis diode array detector. Plant J 40:151-163
24. Arita M (2003) In silico atomic tracing by substrate-product relationships in Escherichia coli intermediary metabolism. Genome Res 13:2455-2466
Chapter 3: Application of Electrospray Ionization Mass Spectrometry for Metabolomics
Ryo Taguchi
Department of Metabolome, Graduate School of Medicine, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-0033, Japan
1. Introduction
In parallel with current progress in genomics and proteomics, metabolomics is recognized as very important for post-genome studies (Fig. 1). Mass spectrometry (MS) is considered one of the most important tools in metabolomics. Recent advances in mass spectrometry have made comprehensive analyses of metabolites possible and, in particular, help to elucidate the precise functions of individual proteins that are differentially expressed in cells. To understand the actual physiological and biological functions of individual proteins, metabolomics is essential in addition to genomics, transcriptomics, and proteomics, and comprehensive profiling of the metabolites in cells is indispensable in this process. Metabolomics by mass spectrometry is very useful for identifying the actual substrates of enzyme proteins, the low-molecular-weight ligands of receptor proteins, and the low-molecular-weight metabolites bound by carrier proteins. Another aim of metabolomics is to identify biological metabolites from MS data and to obtain profiles of how these metabolites change under specific circumstances. Such profiling can lead to the elucidation of an unknown pathway or of the exact substrate specificity of a new enzyme protein (Fig. 1). At times, even a new hypothesis can be verified in this way. Recently, combining studies of genetic alterations with metabolome studies has been recognized as very effective for linkage analyses of metabolic pathways.

Fig. 1. Basic techniques used in genome, transcriptome, proteome, and metabolome studies. Mass spectrometry (MS) has become one of the most popular methods in proteomics and metabolomics

The discovery of soft ionization in mass spectrometry has induced paradigm changes in the applications of mass spectrometry in research [1]. For example, a new discovery about metabolites, or about their specific alteration in a specific biological phenomenon, makes it possible to deduce a specific hypothesis about the metabolites that supports the observed facts. Thus, comprehensive analyses of metabolites under genetically, environmentally, or physiologically different conditions are very important (Fig. 2).

Electrospray ionization (ESI) [1] and matrix-assisted laser desorption/ionization (MALDI) are very mild ionization methods compared with earlier ionization methods. The metabolites targeted in metabolome studies typically have quite common molecular structures, and the structural and metabolic relationships between them are well studied. We can therefore readily infer their metabolic linkage from existing knowledge. Under these circumstances, comprehensive mass spectrometric analysis of metabolites can yield effective results for elucidating new functions of enzyme proteins, including their substrate specificities. ESI-MS allows individual metabolites in a mixture to be analyzed effectively [2-10]. Further, connecting a high-performance liquid chromatography (HPLC) system to a mass spectrometer enables several hundred to several thousand metabolites to be identified at once [10]. By using Fourier-transform ion cyclotron resonance mass spectrometry (FT-ICR-MS), more than several hundred different metabolites in a mixture eluted at the same retention time can be effectively and separately identified owing to its high resolution and accurate mass [4]. By using MS/MS-type instruments, individual metabolites of a specific m/z value in the mixture can be identified. LC/ESI-MS and MS/MS are very powerful methods both in basic life science and in applied research such as drug discovery [6]. Features of LC/ESI-MS in metabolomics are that the data from separation by LC and separation by the mass spectrometer can be used in pseudo-two-dimensional analyses, and that databases and search tools such as those already popular in genomics and proteomics can be constructed.

Fig. 2. Strategies in metabolomics. Linkage analyses of the expression of proteins and the metabolome are very important in functional studies of enzyme proteins

Techniques in this field are progressing very rapidly along with the progress of soft ionization methods such as MALDI and ESI. Recent advances have led to many hybrid systems involving ion trap MS (ITMS) and time-of-flight MS (TOF-MS). The most important feature of these ionization techniques is that individual naturally occurring metabolites can be ionized without fragmentation [1]. Further, they enable very sensitive measurements of molecular amounts at the pico- or femtomolar level. Thus, they are suitable for determining very small amounts of biological metabolites.

With conventional ionization methods such as electron ionization (EI) and chemical ionization (CI), it is very difficult to acquire molecular-related ions without fragmentation. The fragment pattern of each metabolite is basically used as the criterion for identification. For this reason, these methods were used almost exclusively for the identification of a purified single metabolite. In the case of mixtures, gas chromatography/mass spectrometry (GC/MS) [5] was used after derivatization to improve separation and sensitivity. Thermospray ionization and atmospheric pressure chemical ionization (APCI) have also been used in combination with HPLC separation. However, GC/MS was not applicable to metabolites that are difficult to ionize. Fast atom bombardment (FAB) ionization was reported as a partially effective method for obtaining molecular-related ions until ESI and MALDI came into common use.

ESI and MALDI enable researchers to detect very small amounts of biological metabolites, and as a result of their development, mass spectrometry has become an essential tool for biological researchers. Until very recently, mass spectrometric methods were mainly regarded as tools for confirming newly synthesized compounds, and especially as tools for experienced analysts. Within the past ten years, many improvements such as TOF-MS and ITMS have been made, and the use of these MS techniques has spread remarkably quickly.

MALDI is essentially used as an off-line method. ESI, on the other hand, can be used in a flow system and is easily combined with on-line separation methods such as HPLC or capillary electrophoresis (CE) [8]. The sensitivity of detection by ESI-MS depends mainly on the concentration of metabolites in the sample solution. Thus, to obtain the highest sensitivity it is very important to use a low flow rate with a small-diameter column, and for this purpose a capillary or nano LC system combined with ESI-MS has been used. For proteome studies, MALDI systems are also very popular in combination with one- or two-dimensional gel electrophoresis. Recently, MALDI-TOF-MS and MALDI-IT-TOF-MS have been applied to metabolic studies in combination with separations on polymer-based or other support membranes. These mass spectrometers allow high-throughput structural identification of individual metabolites by their unique MS/MS methods.
Table 1. Types of mass spectrometer used with an ESI source

Type of mass spectrometer        Features
Triple quadrupole MS             Multiple LC/MS/MS mode
Quadrupole ion-trap MS           Data-dependent scan, MS^n
Quadrupole time-of-flight MS     High resolution and high accuracy
FT-ICR-MS                        Extremely high resolution and high accuracy, MS^n

ESI, electrospray ionization; LC, liquid chromatography; FT-ICR-MS, Fourier-transform ion cyclotron resonance mass spectrometry
2. Mass Spectrometers Used with ESI
Table 1 lists several features of the individual types of mass spectrometer that can be used in combination with ESI.

2.1. Single-stage Quadrupole Mass Spectrometry
This instrument is essentially used with HPLC because it lacks an MS/MS measurement process. In this system, fast switching between positive and negative ion modes is very important. Fragment ions of the molecular-related ions observed at each retention time can be obtained by in-source collision, by switching from low- to high-energy ionization conditions.
2.2. Triple-Stage Quadrupole MS (Tandem Mass, MS/MS)
With triple-stage quadrupole MS (Fig. 3), surveys of precursor ions and fragment ions are easily obtained simultaneously. Analyses such as product ion scanning, precursor ion scanning, neutral loss scanning, and multiple reaction monitoring (MRM) are possible. Among these methods, precursor ion scanning and neutral loss scanning have begun to be utilized in metabolomics and will become much more popular in the near future. MRM, also called selected reaction monitoring (SRM), is commonly used in combination with HPLC in pharmaceutical studies.
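The four scan modes named above differ only in which of the two mass filters is fixed and which is scanned. The sketch below expresses them as filters over a list of precursor/fragment pairs. It is a conceptual illustration rather than instrument control code: the type and function names are hypothetical, and the m/z values are invented for the example and do not refer to real compounds.

```python
from typing import List, NamedTuple

class Transition(NamedTuple):
    precursor_mz: float   # ion selected in the first quadrupole
    fragment_mz: float    # ion detected after collision

def product_ion_scan(data: List[Transition], precursor_mz: float,
                     tol: float = 0.5) -> List[float]:
    """Fix the precursor, report all of its fragments."""
    return sorted(t.fragment_mz for t in data
                  if abs(t.precursor_mz - precursor_mz) <= tol)

def precursor_ion_scan(data: List[Transition], fragment_mz: float,
                       tol: float = 0.5) -> List[float]:
    """Fix the fragment, report all precursors that produce it."""
    return sorted(t.precursor_mz for t in data
                  if abs(t.fragment_mz - fragment_mz) <= tol)

def neutral_loss_scan(data: List[Transition], loss: float,
                      tol: float = 0.5) -> List[float]:
    """Report precursors that lose a neutral fragment of the given mass."""
    return sorted(t.precursor_mz for t in data
                  if abs((t.precursor_mz - t.fragment_mz) - loss) <= tol)

def mrm(data: List[Transition], precursor_mz: float, fragment_mz: float,
        tol: float = 0.5) -> bool:
    """Fix both precursor and fragment: is this one transition observed?"""
    return any(abs(t.precursor_mz - precursor_mz) <= tol
               and abs(t.fragment_mz - fragment_mz) <= tol for t in data)

# Invented example data.
DATA = [Transition(500.0, 184.0), Transition(520.0, 184.0),
        Transition(500.0, 359.0), Transition(600.0, 459.0)]

print(product_ion_scan(DATA, 500.0))    # fragments of the m/z 500 precursor
print(precursor_ion_scan(DATA, 184.0))  # precursors sharing the m/z 184 fragment
print(neutral_loss_scan(DATA, 141.0))   # precursors losing 141 u
```

Precursor ion and neutral loss scans pick out every member of a compound class that shares a characteristic fragment or loss, which is why they are attractive for metabolite profiling, whereas MRM targets one known transition for quantification.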
Fig. 3. A triple-staged quadrupole MS. A figure modeled by Quattro II (from a catalog by Micromass)
2.3. Quadrupole Ion-Trap MS
Quadrupole ion-trap MS (Fig. 4) is very useful for structural analysis because MS/MS/MS is available in addition to MS/MS. Data-dependent MS/MS acquisition is a very popular and effective method for shotgun proteomics. In metabolomics as well, data-dependent MS/MS surveys will become more popular in the near future.

2.4. Single-stage TOF-MS
High resolution, high accuracy, and high-speed acquisition are the features of ESI-TOF-MS. However, the relatively slow switching between positive and negative modes is a weak point of this type of MS. Recently, some venture companies in metabolomics have tried using two single-stage TOF-MS systems with four HPLCs, monitoring positive and negative ions simultaneously. In their system, the eluates from the four HPLC systems were introduced into an ESI source by a multiplexing system that switched between them at intervals of one second for high-throughput screening.
Fig. 4. A quadrupole ion-trap MS. A figure modeled by Esquire 3000 (from a catalog by Bruker)
2.5. Quadrupole Hybrid TOF-MS
Quadrupole TOF-MS (Fig. 5) is a hybrid of quadrupole MS and TOF-MS. In this system, molecular-related ions are first detected by TOF-MS. Then, to obtain MS/MS data, the precursor ions are selected by the quadrupole and their fragment ions are detected in the TOF analyzer. Mass accuracy and resolution are very high for both molecular-related ions and fragments. An especially high S/N ratio in the spectrum of precursor ions is the main feature of this instrument.

2.6. FT-ICR-MS
With FT-ICR-MS, more than 1000 substances can be analyzed separately in a single flow injection, without separation by HPLC, owing to its high resolution. Specifically, two metabolites that differ by as little as 0.01 amu can be effectively separated and identified. Theoretically, more than 100 different substances can be detected within 1 amu, meaning that more than 10 000 peaks can exist within 100 amu. This type of MS has begun to be used in metabolomics because the elemental composition of each metabolite can be effectively determined with a mass accuracy of better than 2 ppm.
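The mass accuracy quoted for FT-ICR-MS can be made concrete with the definition of the ppm error. The following is a minimal sketch; the function names are hypothetical and the m/z values are round numbers chosen only to make the arithmetic easy to follow.

```python
def ppm_error(measured_mz: float, theoretical_mz: float) -> float:
    """Mass error in parts per million relative to the theoretical m/z."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6

def mass_window(theoretical_mz: float, tol_ppm: float = 2.0):
    """Lower and upper m/z limits of a +/- tol_ppm window."""
    delta = theoretical_mz * tol_ppm * 1e-6
    return theoretical_mz - delta, theoretical_mz + delta

def within_ppm(measured_mz: float, theoretical_mz: float,
               tol_ppm: float = 2.0) -> bool:
    """True if the measured m/z lies inside the tolerance window."""
    return abs(ppm_error(measured_mz, theoretical_mz)) <= tol_ppm

# At m/z 500, a 2 ppm window is only +/- 0.001 wide, so two ions that are
# 0.01 apart (20 ppm) cannot both match the same candidate formula.
low, high = mass_window(500.0)
print(f"2 ppm window at m/z 500: {low:.4f} to {high:.4f}")
print(within_ppm(500.0005, 500.0), within_ppm(500.01, 500.0))
```

A window this narrow excludes most alternative elemental compositions, which is the basis for assigning formulae directly from accurate mass.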
Fig. 5. A quadrupole time-of-flight (TOF) hybrid MS. A figure modeled by Q-TOF (from a catalog by Micromass)
2.7. Other Types of MS
Recently, several different types of hybrid MS have become commercially available. These include a hybrid of quadrupole and linear ion trap, a hybrid of ion trap and TOF, and a hybrid of linear ion trap and FT-ICR, each with features suited to different targets, such as highly sensitive identification or structural studies. Moreover, even mass spectrometers of similar types differ considerably in their features. It is thus very important to choose a mass spectrometer that suits the specific research project.
3. Methods of Sample Injection Used in ESI-MS
3.1. Using a Syringe Pump
This system is commonly supplied by MS manufacturers as the basic sample injection system and is used for introducing standard samples for the calibration of MS systems. When a total volume of more than 100 µl is available for the analysis, this method is also applicable to normal sample analysis.

3.2. Flow Injection
This method is essentially used with an injector valve: a sample loaded into a sample loop with a microsyringe is carried to the MS by the flow from a pump system. With this method, less than 1 µl of sample can be effectively analyzed at a flow rate of several microliters or tens of nanoliters per minute. Analysis at nano flow is especially useful for structural studies with MS/MS methods such as precursor ion scanning or neutral loss scanning. In analysis on this small scale, contamination within the connecting line of the flow injection system should be avoided.

3.3. One-Shot Nano Flow Probe
This method uses a disposable nano-chip. The sample is loaded into a nano flow chip and is drawn into the MS by natural aspiration under the low vacuum of the MS system. A flow of 10 to 100 nl is usually obtained. In this system, the problem of contamination is greatly reduced, but each sample must be introduced manually.

3.4. Using LC Systems for Sample Injection
This system is essentially the same as a flow injection system, except that a separation column is placed between the injector and the MS. In a nano flow experiment in particular, it is recommended that the column be placed as close to the MS system as possible. Gradient elution requires a gradient LC system. Quadrupole, ITMS, TOF-MS, and FT-ICR-MS systems can all be combined with LC systems. With ITMS or TOF-MS, high-throughput analysis using a short column of 10-60 mm in length has been applied in proteomics. In ESI analysis, fewer than 30 compounds eluted at the same retention time can be easily and separately analyzed by MS.
In the analysis of small amounts of target metabolites, the most important aim of separation by HPLC is to remove nonvolatile ions and contaminant ions, which are likely to be detected with high ionization efficiency and to cause ion suppression of the target metabolites. In this case, the reproducibility of retention time is less important for identification. Accurate quantitation is very difficult because ionization efficiency is influenced by contaminant ions and their percent contents in the samples
vary easily [3,5]. But even at a semiquantitative level, differences in profiles are informative for forming useful hypotheses.

Columns used for ESI-MS vary widely in size, from conventional to nano. In terms of the ESI mechanism, the optimal flow is 2-10 µl/min. To avoid contamination within the MS system, however, a flow from several µl/min to less than 100 µl/min is desirable. If the flow is higher than this, only a part of the eluent should be introduced into the MS after splitting. To analyze samples in proteomics or metabolomics, at least capillary or nano LC is required. Effective analysis with a nano LC/MS system requires specialized techniques at many stages of the system, such as the pump system, the column, the connection with the MS, and the nano spray tip. Many innovations have been reported in these areas.

The selection of columns for the separation of metabolites depends on the chemical features of each metabolite. For the analysis of peptides and many drugs, the C18 reversed-phase column has been the most popular. In this case, most nonvolatile salts are eluted at an early stage of elution without any adsorption. In the analysis of lipids we commonly use a normal phase column. In such a case, or when using an ion exchange column, both a volatile acid and a volatile base should be selected.
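The splitting requirement described above can be expressed as a small calculation. This is a sketch under stated assumptions: the 100 µl/min ceiling is taken from the upper bound of the desirable range given in the text, while the function name and the 1000 µl/min column flow in the example are hypothetical.

```python
from typing import Tuple


def split_for_esi(column_flow_ul_min: float,
                  max_ms_flow_ul_min: float = 100.0) -> Tuple[float, float]:
    """Return (fraction of eluent sent to the MS, flow diverted to waste).

    If the column flow is already at or below the ceiling, no split is
    needed and the whole eluent goes to the MS.
    """
    if column_flow_ul_min <= 0 or max_ms_flow_ul_min <= 0:
        raise ValueError("flows must be positive")
    if column_flow_ul_min <= max_ms_flow_ul_min:
        return 1.0, 0.0
    fraction_to_ms = max_ms_flow_ul_min / column_flow_ul_min
    waste_flow = column_flow_ul_min - max_ms_flow_ul_min
    return fraction_to_ms, waste_flow


if __name__ == "__main__":
    # Hypothetical conventional column running at 1000 µl/min.
    fraction, waste = split_for_esi(1000.0)
    print(f"Send {fraction:.0%} of the eluent to the MS, {waste:.0f} µl/min to waste")
```

Because a split discards most of the sample, capillary or nano LC, which needs no split, is the better choice when sample amounts are limited.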
4. Conclusion
Focused analysis of metabolites that have first been roughly separated according to their solubility in the extraction solvent is very effective, because LC or CE conditions suited to the chemical properties of those metabolites can then be selected more easily for further separation. This is true for all metabolomics studies, and especially for the detection of minor components. By using differences in solubility, metabolites can be categorized into groups of different chemical and physiological nature. For each such category of metabolites, a specialized field of metabolomics can exist, such as the peptidome, glycome, and lipidome. It is very important that new analytical strategies and databases are created for each of these categories of metabolites.
References
1. Fenn JB, Mann M, Meng CK, Wong SF, Whitehouse CM (1989) Electrospray ionization for mass spectrometry of large biomolecules. Science 246:64-71
2. Mashego MR, Wu L, Van Dam JC, Ras C, Vinke JL, Van Winden WA, Van Gulik WM, Heijnen JJ (2004) MIRACLE: Mass isotopomer ratio analysis of U-13C-labeled extracts. A new method for accurate quantification of changes in concentrations of intracellular metabolites. Biotechnol Bioeng 85(6):620-628
3. Lafaye A, Junot C, Ramounet-Le Gall B, Fritsch P, Tabet JC, Ezan E (2003) Metabolite profiling in rat urine by liquid chromatography/electrospray ion trap mass spectrometry. Application to the study of heavy metal toxicity. Rapid Commun Mass Spectrom 17(22):2541-2549
4. Aharoni A, Ric de Vos CH, Verhoeven HA, Maliepaard CA, Kruppa G, Bino R, Goodenowe DB (2002) Nontargeted metabolome analysis by use of Fourier Transform Ion Cyclotron Mass Spectrometry. OMICS 6(3):217-234
5. Katz JE, Dumlao DS, Clarke S, Hau J (2004) A new technique (COMSPARI) to facilitate the identification of minor compounds in complex mixtures by GC/MS and LC/MS: tools for the visualization of matched datasets. J Am Soc Mass Spectrom 15(4):580-584
6. Strauss AW (2004) Tandem mass spectrometry in discovery of disorders of the metabolome. J Clin Invest 113(3):354-356
7. Castrillo JI, Hayes A, Mohammed S, Gaskell SJ, Oliver SG (2003) An optimized protocol for metabolome analysis in yeast using direct infusion electrospray mass spectrometry. Phytochemistry 62(6):929-937
8. Soga T, Ohashi Y, Ueno Y, Naraoka H, Tomita M, Nishioka T (2003) Quantitative metabolome analysis using capillary electrophoresis mass spectrometry. J Proteome Res 2(5):488-494
9. Jiao Z, Baba T, Mori H, Shimizu K (2003) Analysis of metabolic and physiological responses to gnd knockout in Escherichia coli by using C-13 tracer experiment and enzyme activity measurement. FEMS Microbiol Lett 220(2):295-301
10. Taguchi R, Hayakawa J, Takeuchi Y, Ishida M (2000) Two-dimensional analysis of phospholipids by capillary liquid chromatography/electrospray ionization mass spectrometry. J Mass Spectrom 35(8):953-966
Chapter 4: High-Performance Liquid Chromatography and Liquid Chromatography/Mass Spectrometry Analyses of Metabolites in Microorganisms
Hiroshi Miyano
Institute of Life Sciences, Ajinomoto Co. Inc., 1-1 Suzuki-cho, Kawasaki-ku, Kawasaki, Kanagawa 210-8681, Japan
1. Introduction
The metabolome is defined as the quantitative complement of all of the low molecular weight endogenous metabolites and their intermediates, including substrates, products, and regulatory factors, in a particular physiological or developmental state. Metabolomics is one of the important research areas of the post-genome era, like proteomics and transcriptomics, because endogenous metabolites are the products of intracellular regulation. The concentrations of intracellular metabolites thus reflect the final response of a biological system to inheritable genetic modifications or environmental variations.

A prevailing approach in metabolomics is the determination of identified intracellular metabolites, including amino acids, intermediates in the tricarboxylic acid cycle (TCA cycle), intermediates in glycolysis, intermediates in the pentose phosphate pathway, nucleotides, and other endogenous metabolites. The approach also includes the isotopic distribution analysis of metabolites in cells cultured with a substrate bearing a stable isotope, e.g., 13C-labeled glucose, to interpret the complex time-related concentrations, activities, and fluxes of intracellular metabolites. The results are useful for metabolic engineering.

Another important approach in metabolomics is metabolic profiling, in which all of the responses of intracellular metabolites obtained by nuclear magnetic resonance (NMR) and Fourier-transform ion cyclotron resonance mass spectrometry (FT-ICR-MS) are observed [1]. The responses derived from both identified and unidentified metabolites are compared statistically by multivariate analysis. Unidentified metabolites singled out by the statistical analysis are then identified as key metabolites.
Genetic research developed dramatically in the 1990s because of revolutionary progress in analytical methods and improvements in analytical equipment. Comprehensive protein analyses have also advanced significantly through progress in analytical technology, such as mass spectrometry (MS), and improvements in protein databases. In addition, NMR techniques and protein molecular modeling have assisted structural-functional analyses of proteins. Analyses of intracellular metabolites, however, have not developed to the same extent, although many specific analytical methods for certain metabolites were developed over the last century. One reason is that metabolites are not directly related to the central dogma, from DNA to protein via mRNA. Another reason is that the endogenous metabolites in cells, such as amino acids, organic acids, sugars, sugar phosphates, nucleotides, lipids, coenzymes, and inorganic ions, have a multitude of properties and functional groups. Furthermore, intracellular metabolites span a wide range of concentrations, and few methods exist for the simultaneous analysis of multiple compounds.

This chapter describes the analyses of amino acids, organic acids including intermediates in the TCA cycle, and phosphate esters, including intermediates in glycolysis and in the pentose phosphate pathway, by high-performance liquid chromatography (HPLC) and LC/MS, which provide the optimal equipment for high-performance separation analysis. Many HPLC applications for biological analyses and drug kinetics have been reported, because the advantage of the HPLC method is that researchers can select an appropriate method from many different separation and detection modes. HPLC is better suited to structure-related separation and detection than other methods.
Methods for the inactivation of metabolism and the extraction of metabolites from cells are also reported, because the preparation procedures before measurement are important for analyses of time-related concentrations and activities of intracellular metabolites.
2. Inactivation of Metabolism and Extraction of Metabolites from Cells
The measurement of intracellular metabolites is divided into three steps: (1) inactivation of metabolism in cells, (2) extraction of metabolites from cells, and (3) measurement of metabolites. Researchers have to consider both the biological and the chemical stability of individual metabolites.
Metabolic reactions in microorganisms, especially catabolic pathways and energy metabolism reactions, have high turnover rates. Cytosolic glucose is converted at approximately 1 mM s⁻¹ and cytosolic ATP at approximately 1.5 mM s⁻¹ [2]. The turnover times reported for amino acids are also in the range of seconds [3]. For reliable measurement of the intracellular concentrations of metabolites, the metabolism in the sampled cells must be inactivated rapidly relative to the metabolic reaction rates, to avoid uncontrolled reactions, and the sampling rate must be high enough to follow rapid and dynamic metabolic reactions [2]. Buchholz et al. developed equipment for routine rapid sampling and inactivation of metabolism that allows 4-5 samples to be obtained per second, continuously during the fermentation process [4]. Incubation mixtures including microorganisms are sprayed into cold solvent at a temperature of -20°C to -50°C, which immediately inactivates metabolism. Other groups have also developed automated sampling devices coupled to fermentation tank reactors [2,5-7]. Extraction of the metabolites from cells was likewise carried out at sub-zero temperatures to prevent reactivation of enzymes.

Hans et al. demonstrated the necessity of immediately terminating metabolic activities for the analysis of intracellular amino acids. They compared the analytical values of amino acids in exponentially growing yeast cells that were quenched in cold methanol (below -20°C) before cell harvest by centrifugation and subsequent extraction with those of cells cooled in cold water (+4°C) [3]. Significant concentration differences were observed between the amino acids prepared in cold methanol and those prepared in cold water (Fig. 1).

Fig. 1. Comparison of observed intracellular amino acid contents of yeast cells that were extracted after metabolism was inactivated at different temperatures. The vertical axis (in %, scaled from -50 to 100) is the ratio of the observed value of individual amino acid concentrations prepared in cold water (+4°C) to those prepared in cold methanol (-20°C). Data adapted from Hans et al. [3]
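Why sampling and inactivation must be fast compared with the conversion rates quoted above can be illustrated with a short calculation. This is a minimal sketch: the conversion rate of 1.5 mM s⁻¹ for ATP comes from the text, whereas the pool size of 3 mM and the delay of 0.2 s are hypothetical values chosen only for illustration, and the function names are not from the original work.

```python
def turnover_time_s(pool_mM: float, flux_mM_per_s: float) -> float:
    """Time needed to convert an amount equal to the whole pool once."""
    if pool_mM <= 0 or flux_mM_per_s <= 0:
        raise ValueError("pool and flux must be positive")
    return pool_mM / flux_mM_per_s


def max_fraction_converted(pool_mM: float, flux_mM_per_s: float,
                           delay_s: float) -> float:
    """Upper bound on the pool fraction converted before inactivation.

    Assumes the flux stays constant during the delay and that the pool
    is not replenished, which is a worst-case simplification.
    """
    if delay_s < 0:
        raise ValueError("delay must not be negative")
    return min(1.0, delay_s / turnover_time_s(pool_mM, flux_mM_per_s))


if __name__ == "__main__":
    pool = 3.0   # mM, hypothetical ATP pool size
    flux = 1.5   # mM/s, ATP conversion rate quoted in the text
    print(f"Turnover time: {turnover_time_s(pool, flux):.1f} s")
    print(f"Fraction at risk after 0.2 s: "
          f"{max_fraction_converted(pool, flux, 0.2):.0%}")
```

Under these assumptions even a sub-second delay puts a noticeable part of the pool at risk, which is consistent with the emphasis on devices that take several samples per second.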
Several metabolites are known to be chemically unstable in acidic or basic solutions. Adenine nucleotides, including adenosine triphosphate (ATP), are unstable in acidic solutions at room temperature. The reduced form of nicotinamide adenine dinucleotide is unstable under acidic conditions, whereas the oxidized form is degraded under basic conditions [6]. β-Keto acids are nonenzymatically degraded to the decarboxylated form under both acidic and basic conditions. Oxaloacetic acid in the TCA cycle, for instance, is easily converted to pyruvic acid, which is the final metabolite in glycolysis and is transformed to acetyl CoA, a TCA cycle substrate, by pyruvate dehydrogenase. Although amino acids are known to be comparatively stable metabolites, tryptophan is easily decomposed under basic conditions and glutamine is cyclized by intramolecular condensation.

Researchers should not select a procedure for inactivation of metabolism and extraction from cells only on the basis of the largest number of intracellular metabolites that can be collected. It is important to select procedures that protect the target metabolites from catabolism and chemical degradation, taking the properties of the compounds into account. Maharjan and Ferenci recently reported the influence of extraction methodology on the metabolome profiling of Escherichia coli [8].
3. Free Amino Acid Analysis
3.1. History of Amino Acid Analysis
High-performance liquid chromatography has been widely used for quantitative analyses of amino acids. The ninhydrin reagent is used in systems based on the principle of postcolumn derivatization. Stein, Moore, and Spackman developed an amino acid analyzer for the colorimetric determination of amino acids: the amino acids are separated stepwise on a cation exchange resin while the pH of the citric acid buffer solution is raised, and are then detected by the purple color produced with ninhydrin, a reagent introduced by Ruhemann in 1911. With the progressive improvement of HPLC pumping performance and the development of more effective column resins, the analyzer is now capable of assaying not only protein hydrolysates but also amino acids in biological fluids. Apart from the ninhydrin method, a variety of other techniques for converting amino acids to sensitively detectable fluorescent derivatives (derivatization) have been developed since the late 1970s [9,10]. Sensitive methods for the detection of intracellular amino acids are useful because
smaller numbers of cells are preferable for the rapid inactivation of metabolism and the efficient extraction of metabolites. The reagents used to produce fluorescent derivatives include ortho-phthalaldehyde (OPA) (λex 340-345 nm, λem 455 nm), 6-aminoquinolyl-N-hydroxysuccinimidyl carbamate (AQC) (λex 245 nm, λem 395 nm), fluorescein isothiocyanate (FITC) (λex 490 nm, λem 515 nm), and 4-fluoro-7-nitrobenzofurazan (NBD-F) (λex 470 nm, λem 530 nm). The detection sensitivity ranges from sub-picomole to femtomole. To detect smaller amounts of fluorescent amino acid derivatives, the fluorescence excitation wavelength is brought closer to that of a laser light source.

3.2. Free Intracellular Amino Acid Analysis
Quantitative analyses of free intracellular amino acids have been reported in a few papers [3,11-16]. Hans et al. determined the dynamics of intracellular amino acid pools in different growth phases by a precolumn fluorescent derivatization method with AQC [3], and those during autonomous oscillations of Saccharomyces cerevisiae ATCC 32167 in batch cultures by a precolumn fluorescent derivatization method with OPA [16]. After separation of the cells from the medium and inactivation of metabolism at temperatures below freezing, the intracellular amino acids were extracted with either boiling buffered ethanol (75% ethanol/0.25 M 4-(2-hydroxyethyl)-1-piperazineethanesulfonic acid (HEPES; pH 7.5)) or boiling water. Extraction with boiling water is preferable to extraction with boiling buffered ethanol for the precolumn derivatization method, because salts such as HEPES adversely affect the derivatization rate, the yield, and the reproducibility of the HPLC retention time. Although AQC and OPA do not differ significantly as derivatization reagents for primary amino acids, these authors reported that histidine/glutamine and serine/asparagine derivatized with AQC could not be separated.

We determined the free intracellular amino acids in E.
coli K12 MG1655 by fluorescent detection and mass detection after derivatization and column separation. The cultivation medium consisted of 8 g of Bacto-Tryptone, 5 g of Bacto-Yeast Extract, and 1.25 g of sodium chloride in 250 ml of distilled water, adjusted to pH 7 with a 1 M sodium hydroxide solution. Samples were collected after a 4-h incubation (at log phase) and after a 7-h incubation (at stationary phase). A 20-ml sample of the culture broth was drawn from the culture flask and rapidly mixed with 30 ml of 60% methanol containing 70 mM HEPES, which had been pre-cooled to -70°C (inactivation step). The aliquots were centrifuged at -20°C, and the supernatant was discarded. The pellets were resuspended in 1 ml of 50%
methanol, and the intracellular metabolites were extracted from the cells by a freeze-thaw method [17]. The aliquots were extracted with 0.25 ml of chloroform, and the hydrophilic metabolites in the aqueous phase, including free intracellular amino acids, were collected (extraction step). Extracts were lyophilized after ultrafiltration (10 kDa) and were stored at -78°C until analysis. An extraction procedure that prevents salt from being carried over into the sample is also useful for the precolumn derivatization HPLC method.

Figure 2 shows the chromatogram of the free intracellular amino acids extracted from E. coli, derivatized with NBD-F and detected by fluorescence with an excitation wavelength of 470 nm and an emission wavelength of 530 nm. Tagging biomolecules with benzofurazan reagents such as NBD-F avoids the interfering fluorescence around 400 nm emitted by biological substances and thus provides selective and sensitive detection [9,18,19]. Figure 3 shows the LC/MS chromatograms of the same sample detected in the single ion monitoring (SIM) mode. Since NBD-F reacts specifically with an amino group to produce a fluorescent adduct, almost all amines, including free intracellular amino acids, are detected on the chromatogram (Fig. 2). Although this method is not comprehensively applicable to all metabolites, it is useful for detecting compounds with amine functional groups. In contrast, individual amino acids can be detected separately by LC/MS, which is widely used in pharmacokinetics because of its higher selectivity. LC/MS with precolumn derivatization will be quite useful for the quantification of known amino acids. The concentrations of the intracellular amino acids in E. coli K12 at log phase and stationary phase are summarized in Table 1. It should be noted that the values are not absolute, because the intracellular concentrations of metabolites vary widely with the culture conditions.
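The composition of the mixture formed in the inactivation step described above (20 ml of culture broth mixed with 30 ml of 60% methanol containing 70 mM HEPES) can be estimated with a simple dilution calculation. This is a rough sketch that assumes volumes are additive, which is only approximately true for methanol-water mixtures, and that the broth itself contributes no methanol or HEPES; the function name is hypothetical.

```python
def diluted_concentration(stock_conc: float, stock_volume_ml: float,
                          added_volume_ml: float) -> float:
    """Concentration after mixing a stock with a diluting volume.

    Assumes additive volumes and that the added liquid contains none
    of the component of interest. Units of the result follow the
    units of stock_conc (for example % v/v or mM).
    """
    total_ml = stock_volume_ml + added_volume_ml
    if total_ml <= 0:
        raise ValueError("total volume must be positive")
    return stock_conc * stock_volume_ml / total_ml


if __name__ == "__main__":
    quench_ml, broth_ml = 30.0, 20.0
    methanol_pct = diluted_concentration(60.0, quench_ml, broth_ml)
    hepes_mM = diluted_concentration(70.0, quench_ml, broth_ml)
    print(f"Methanol after mixing: about {methanol_pct:.0f}%")
    print(f"HEPES after mixing: about {hepes_mM:.0f} mM")
```

The same helper can be used to check how a different broth-to-quench ratio would change the final methanol content, which affects how far the mixture can be cooled without freezing.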
Fig. 2. High-performance liquid chromatography (HPLC) chromatogram of Escherichia coli intracellular amino acids at log phase, derivatized with 4-fluoro-7-nitrobenzofurazan (NBD-F), with fluorescent detection
Fig. 3. Liquid chromatography/mass spectrometry (LC/MS) chromatogram of E. coli intracellular amino acids at log phase, derivatized with NBD-F, by single ion monitoring mode (mass spectrometer: PE Sciex API365)
4. Analysis of TCA Cycle Intermediates
As with amines, many different organic acids are present in cells. In particular, the organic acids in the TCA cycle (citric acid cycle, Krebs cycle) are essential to cellular activity. The TCA cycle plays two important roles in cells. One is catabolism, which is concerned with energy production. Adenosine triphosphate is generated in the process of the oxidation of 2-oxoglutaric acid to succinic acid. The other is anabolism, in which the intermediates in the TCA cycle serve as precursors for amino acid biosynthesis. Thus, determination of the intermediates in the TCA cycle is also important in physiological and biochemical studies. In particular, changes in the flux and in the absolute levels of intermediates in the TCA cycle may reflect defects in the substrate flow into mitochondria or in the discharge of products, and changes in energy metabolism and in the cellular redox state. Intermediates in the TCA cycle include citric acid, isocitric acid, 2-oxoglutaric acid, succinic acid, fumaric acid, malic acid, and oxaloacetic acid. Pyruvic acid should
also be monitored, because it is a key intermediate between the glycolytic pathway and the TCA cycle.

Hydrophilic organic acids, including intermediates in the TCA cycle, can be determined by HPLC. The intracellular concentrations of TCA cycle intermediates in a microbial cell extract can be analyzed by anion-exchange HPLC with a basic solution as the eluent [20,21]. Groussac and colleagues separated organic acids on an IonPac AS11 (Dionex) column using 0.5-35 mM NaOH for elution [21]. Five organic acids in the TCA cycle, namely succinic acid, malic acid, fumaric acid, 2-oxoglutaric acid, and citric acid, were identified within 20 min. Detection limits in the range of milligrams per liter were achieved by conductivity detection. They applied the method to a dynamic analysis of TCA cycle intermediates in S. cerevisiae in response to a glucose pulse.

As in the amino acid analysis described above, derivatization of a carboxyl group with a fluorescent reagent before column separation is also useful for selective and sensitive detection [9]. The 4-(N,N-dimethylaminosulfonyl)-7-piperazino-2,1,3-benzoxadiazole (DBD-PZ) reagent, which has a benzofurazan skeleton like NBD-F, emits strong fluorescence at emission wavelengths around 560 nm [22]. Each intermediate in the TCA cycle has two or more carboxyl groups per molecule: 2-oxoglutaric acid, succinic acid, fumaric acid, malic acid, and oxaloacetic acid have two carboxyl groups, and isocitric acid and citric acid have three. It therefore seemed that multiple derivatives could be generated from each intermediate in the TCA cycle by the reaction with DBD-PZ. Because it is preferable to detect each analyte as a single peak on the chromatogram, the parameters investigated were the types of condensing reagents and base catalysts, the concentrations of the condensing reagents, base catalysts, and DBD-PZ, the reaction time, the reaction temperature, and the dissolving solvent.
The structures of these DBD-PZ derivatives were confirmed by LC/MS, which showed that all of the carboxyl groups were completely labeled with DBD-PZ under the optimal conditions, except in the case of oxaloacetic acid, which was converted to pyruvic acid during derivatization [23]. The limits of fluorescence detection for all adducts were between 2 and 100 fmol at a signal-to-noise ratio of 3. Among the organic acids examined, the citric acid derivative with DBD-PZ showed the lowest detection limit, 2 fmol, indicating that the method has the merit of high sensitivity. Figure 4 shows the chromatogram of free intracellular organic acids in E. coli K12 MG1655 obtained by fluorescent detection after derivatization with DBD-PZ and column separation. Intracellular organic acids, including intermediates in the TCA cycle, are detected on the chromatogram because DBD-PZ specifically reacts with carboxyl groups to produce the fluorescent
adducts. Although the method is not comprehensive for all metabolites, it is useful for detecting compounds with carboxyl functional groups.

Fig. 4. HPLC chromatogram of E. coli intracellular organic acids at log phase, derivatized with 4-(N,N-dimethylaminosulfonyl)-7-piperazino-2,1,3-benzoxadiazole, with fluorescent detection (time axis: 0-50 min)

Figure 5 shows the mass chromatograms of the TCA cycle intermediates that were derivatized with DBD-PZ. Detection was carried out in the selected reaction monitoring (SRM) mode, using a triple-quadrupole mass spectrometer. The precursor-product transitions were determined on the basis of the mass spectra of the derivatives. The details are described in Fig. 5.

Fig. 5. LC/MS/MS chromatogram of E. coli intracellular intermediates in the tricarboxylic acid (TCA) cycle at log phase, derivatized with DBD-PZ, by selected reaction-monitoring mode (mass spectrometer: PE Sciex API365). Extracted m/z values of the first mass and the second mass (first MS/second MS) are given on the right side of each chromatogram of the intermediate:
citric acid/isocitric acid: 1073.3/761.0
2-oxoglutaric acid: 733.1/670.0
succinic acid: 705.1/394.0
fumaric acid: 703.1/640.2
malic acid: 721.1/410.0
pyruvic acid/oxaloacetic acid: 382.1/336.8

The LC/MS method combined with precolumn derivatization will be quite useful for the selective detection of not only TCA cycle intermediates but also other known organic acids. The intracellular concentrations of TCA cycle intermediates in E. coli K12 at log phase and stationary phase are summarized in Table 1. As with the amino acids, it should be noted that the values are not absolute, because the intracellular concentrations of the metabolites change greatly with the culture conditions.

Table 1. Intracellular concentrations of amino acids and intermediates in the tricarboxylic acid (TCA) cycle in Escherichia coli K12
Intracellular concentrations^a
Concentration classes, from highest to lowest: above 1 mM; above 100 µM; above 10 µM; above 1 µM. Within each column, the groups are listed from the highest class downward.

Amino acids, log phase:
  Lysine
  Glycine, alanine, γ-aminobutyric acid, ornithine, glutamic acid, histidine, phenylalanine, arginine
  Serine, proline, valine, threonine, isoleucine, leucine, aspartic acid, glutamine

Amino acids, stationary phase:
  Lysine
  Alanine, γ-aminobutyric acid, glutamic acid, histidine
  Glycine, valine, isoleucine, leucine, ornithine, aspartic acid, glutamine, methionine, phenylalanine, arginine
  Serine, proline, threonine

Intermediates in TCA cycle, log phase:
  Citric acid
  Succinic acid
  Fumaric acid, isocitric acid, malic acid, 2-oxoglutaric acid

Intermediates in TCA cycle, stationary phase:
  Citric acid, fumaric acid, isocitric acid, malic acid, 2-oxoglutaric acid
  Succinic acid

^a The values are not absolute, because the intracellular concentrations of metabolites vary greatly with the culture conditions. Culture conditions are described in the text
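The first-MS values listed for Fig. 5 can be cross-checked against the statement that every carboxyl group is labeled with DBD-PZ. The sketch below rests on several assumptions that are not stated in the chapter: the elemental composition C12H17N5O3S for DBD-PZ is derived here from the reagent name, each label is assumed to form an amide bond with loss of one water molecule, the ions are assumed to be singly protonated, and average atomic masses are used. All names in the code are illustrative.

```python
# Average atomic masses (g/mol), rounded standard values.
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999, "S": 32.06}
PROTON = 1.00728  # mass added for a singly protonated ion

# Assumed composition of DBD-PZ, derived from its chemical name.
DBD_PZ = {"C": 12, "H": 17, "N": 5, "O": 3, "S": 1}
WATER = {"H": 2, "O": 1}

# (elemental composition, number of carboxyl groups)
ACIDS = {
    "citric acid": ({"C": 6, "H": 8, "O": 7}, 3),
    "2-oxoglutaric acid": ({"C": 5, "H": 6, "O": 5}, 2),
    "succinic acid": ({"C": 4, "H": 6, "O": 4}, 2),
    "fumaric acid": ({"C": 4, "H": 4, "O": 4}, 2),
    "malic acid": ({"C": 4, "H": 6, "O": 5}, 2),
    "pyruvic acid": ({"C": 3, "H": 4, "O": 3}, 1),
}


def mass(formula: dict) -> float:
    """Sum of average atomic masses for an elemental composition."""
    return sum(ATOMIC_MASS[element] * count for element, count in formula.items())


def labeled_precursor_mz(acid_formula: dict, n_carboxyl: int) -> float:
    """Estimated m/z of the [M+H]+ ion of a fully labeled acid.

    Each carboxyl group gains one DBD-PZ and loses one water.
    """
    neutral = mass(acid_formula) + n_carboxyl * (mass(DBD_PZ) - mass(WATER))
    return neutral + PROTON


if __name__ == "__main__":
    for name, (formula, n) in ACIDS.items():
        print(f"{name}: estimated precursor m/z {labeled_precursor_mz(formula, n):.1f}")
```

Under these assumptions the estimates should fall within roughly one m/z unit of the first-MS values given for Fig. 5; small differences are expected from the use of average rather than monoisotopic masses and from instrument calibration.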
In carboxyl group derivatization, the carboxyl group of an organic acid reacts with the amino group of the reagent in a condensation reaction. A large amount of amino acids would interfere with the condensation of an organic acid with the reagent, because the carboxyl group of an amino acid could react with the amino group of another amino acid. Precolumn derivatization methods for organic acids therefore have to accommodate the possible presence of a large amount of amino acids, especially when TCA cycle intermediates are determined in industrial microorganisms used for amino acid production.
5. Analysis of Glycolysis and Pentose Phosphate Pathway Intermediates
Glycolysis (Embden-Meyerhof pathway, Embden-Meyerhof-Parnas pathway) is the basic energy metabolic system for almost all organisms. In glycolysis, the conversion of one mole of glucose to two moles of pyruvic acid, which is a TCA cycle substrate, is accompanied by the net production of two moles each of ATP and the reduced form of nicotinamide adenine dinucleotide (NADH). The pentose phosphate pathway is also important in glucose metabolism, generating NADPH (reduced NAD phosphate) for biosynthetic reactions and pentose sugars for nucleotide biosynthesis. The intermediates in both pathways are phosphate esters of sugars or sugar alcohols. (Hereafter, they are collectively called sugar phosphates.)

Buchholz et al. developed a novel LC/MS method for the quantification of intracellular concentrations of phosphate esters, using a cyclodextrin-bonded phase column eluted with aqueous ammonium acetate and methanol [24]. Intracellular intermediates such as glycolysis intermediates, nucleotides, and cofactors in E. coli K12 could be quantified, with detection limits from 0.02 to 0.50 mM. Although isobaric substances, such as glucose-6-phosphate/fructose-6-phosphate and 3-phosphoglycerate/2-phosphoglycerate, were not separated under these conditions, these intermediates can be separated on a porous graphitic carbon column.

Sugar phosphates can also be analyzed by anion-exchange chromatography, as with organic acids [20,21]. Conductometry or pulsed amperometry is generally used for the detection, because sugar phosphates lack a characteristic ultraviolet absorption and there are no specific reagents for phosphate esters. An anion-exchange chromatograph/mass spectrometer can successfully measure sugar phosphates eluted in a solution of sodium hydroxide, with postcolumn removal of sodium ions by a commercially available ion suppressor before the sample is introduced into the mass spectrometer [25,26]. The mechanism of the suppression is illustrated in Fig. 6. The flow channel of the eluent lies between two cation-exchange membranes, and the regenerant solution flows along the other side of each membrane. Electrodes for electrolysis are fixed outside the flow channels of the regenerant solution, and hydrogen ions are generated from water at the positive electrode. The suppressor behaves like a cation exchanger and replaces the sodium counterions with hydronium ions. Thus, when the analytes leave the suppressor, they are in a water solution. The analysis of sugar phosphates by an anion-exchange chromatograph/mass spectrometer with an anion suppressor combines the high separation ability of anion-exchange chromatography with the high selectivity of mass spectrometry.

Sugar phosphates are detected as molecular ions [M-H]-. After collision activation, the most common first fragmentation step is the removal of the sugar or alcohol moiety, and thus the specific daughter fragment ion [H2PO4]- is observed. This characteristic cleavage can be used for the selective detection of sugar phosphates by LC/MS/MS. The chromatograms of E. coli K12 extracts obtained by anion-exchange chromatography-suppressor-MS/MS are shown in Fig. 7, which also includes a chromatogram with pulsed amperometric detection as a reference. The selected reaction monitoring mode was used for the detection of sugar phosphates. Molecular ions [M-H]- were selected by the first MS, while m/z 97 [H2PO4]- was detected by the second MS. The detection limits are 0.1 to 5 µM. The comparison shows that pulsed amperometric detection is useful for the metabolic profiling of intermediates with phosphate esters, while LC/MS/MS is quite useful for the specific detection of sugar phosphates in glycolysis and the pentose phosphate pathway.
Fig. 6. Inner structure and anion exchange mechanism within suppressor
Fig. 7. LC/MS/MS chromatogram of E. coli intracellular sugar phosphates, by selected reaction-monitoring mode (mass spectrometer: PE Sciex API365). Panels: MS/MS (339.1/97.1), MS/MS (259.1/97.1), and IPAD; time axis: 10-30 min. Extracted m/z=259 of the first mass and extracted m/z=97 of the second mass are detected as galactose-1-phosphate, glucose-1-phosphate, galactose-6-phosphate, glucose-6-phosphate, fructose-6-phosphate, and mannose-6-phosphate by order of elution. Extracted m/z=339 of the first mass and extracted m/z=97 of the second mass are detected as fructose-1,6-bisphosphate. IPAD, integrated pulsed amperometric detection
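The selected reaction monitoring transitions quoted for Fig. 7 can be reproduced from elemental compositions. The sketch below is an illustration, not the procedure used in the chapter: it assumes the standard compositions C6H13O9P for hexose monophosphates and C6H14O12P2 for fructose-1,6-bisphosphate, takes the product ion as deprotonated phosphoric acid, and uses monoisotopic masses; the function and variable names are hypothetical.

```python
# Monoisotopic masses (u) of the most abundant isotopes.
MONO_MASS = {"C": 12.000000, "H": 1.007825, "O": 15.994915, "P": 30.973762}
PROTON = 1.007276  # removing H+ leaves the [M-H]- ion

HEXOSE_PHOSPHATE = {"C": 6, "H": 13, "O": 9, "P": 1}           # e.g. glucose-6-phosphate
FRUCTOSE_BISPHOSPHATE = {"C": 6, "H": 14, "O": 12, "P": 2}     # fructose-1,6-bisphosphate
PHOSPHORIC_ACID = {"H": 3, "O": 4, "P": 1}                     # gives [H2PO4]-


def deprotonated_mz(formula: dict) -> float:
    """m/z of the singly charged [M-H]- ion for a neutral composition."""
    neutral = sum(MONO_MASS[element] * count for element, count in formula.items())
    return neutral - PROTON


def srm_transition(precursor_formula: dict) -> tuple:
    """Return (first MS m/z, second MS m/z) for a sugar phosphate."""
    return deprotonated_mz(precursor_formula), deprotonated_mz(PHOSPHORIC_ACID)


if __name__ == "__main__":
    for name, formula in [("hexose phosphate", HEXOSE_PHOSPHATE),
                          ("fructose-1,6-bisphosphate", FRUCTOSE_BISPHOSPHATE)]:
        q1, q3 = srm_transition(formula)
        print(f"{name}: {q1:.1f} -> {q3:.1f}")
```

Because all hexose monophosphates share one composition, they share one transition as well, which is why the chromatographic separation in Fig. 7 is still needed to tell them apart.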
6. Conclusion
This chapter has described analytical methods for metabolites by HPLC and LC/MS, organized by functional group. Amino acids and organic acids can be detected by using derivatization reagents specific for amino groups and carboxyl groups, respectively. Sugar phosphates are detected by selected reaction monitoring, using the characteristic cleavage between the phosphate and the sugar moiety, or by pulsed amperometry using a reducing sugar.

It is quite difficult to perform comprehensive and simultaneous analyses of a variety of intracellular metabolites by HPLC, as it is by other methods. We propose that the intracellular metabolites should first be classified by functional groups. A hundred amines, several dozen organic acids, and several dozen phosphate esters can be observed as peaks by HPLC. Combining all of these profiles results in a comprehensive analysis of the intracellular metabolites.
H. Miyano
The HPLC method based on functional groups is also useful for the determination of intracellular metabolites, especially by LC/MS/MS. The methods are applicable for the specific determination of amino acids, TCA cycle intermediates, and glycolysis and pentose phosphate pathway intermediates, which are the most fundamental metabolites and intermediates in cells. Integration of the method and instrument development are essential for a significant advancement of metabolomics, and we believe that the methods mentioned in this chapter provide useful tips toward this end.
References
1. Hirayama K (2005) Metabolome profiling by using FT-ICR-MS and ESI-Q-TOFMS. In: Tomita M, Nishioka T (eds) Metabolomics: The Frontier of Systems Biology. Springer, Tokyo, pp 75-90
2. Schaefer U, Boos W, Takors R, Weuster-Botz D (1999) Automated sampling device for monitoring intracellular metabolite dynamics. Anal Biochem 270:88-96
3. Hans M, Heinzle E, Wittmann C (2001) Quantification of intracellular amino acids in batch-cultures of Saccharomyces cerevisiae. Appl Microbiol Biotechnol 56:776-779
4. Buchholz A, Hurlebaus J, Wandrey C, Takors R (2002) Metabolomics: quantification of intracellular metabolite dynamics. Biomol Eng 19:5-15
5. Theobald U, Mailinger W, Reuss M, Rizzi M (1993) In vivo analysis of glucose-induced fast changes in yeast adenine nucleotide pool applying a rapid sampling technique. Anal Biochem 214:31-37
6. Theobald U, Mailinger W, Baltes M, Rizzi M, Reuss M (1997) In vivo analysis of metabolic dynamics in Saccharomyces cerevisiae: I. Experimental observations. Biotechnol Bioeng 55:305-316
7. Weuster-Botz D (1997) Sampling tube device for monitoring intracellular metabolite dynamics. Anal Biochem 246:225-233
8. Maharjan RP, Ferenci T (2003) Global metabolite analysis: the influence of extraction methodology on metabolome profiles of Escherichia coli. Anal Biochem 313:145-154
9. Uchiyama S, Santa T, Okiyama N, Fukushima T, Imai K (2001) Fluorogenic and fluorescent labeling reagents with a benzofurazan skeleton. Biomed Chromatogr 15:295-318
10. Fukushima T, Usui N, Santa T, Imai K (2003) Recent progress in derivatization methods for LC and CE analysis. J Pharm Biomed Anal 30:1655-1687
11. Ohsumi Y, Kitamoto K, Anraku Y (1988) Changes induced in the permeability barrier of the yeast plasma membrane by cupric ion. J Bacteriol 170:2676-2682
12. Amezaga M, Davidson I, McLaggan D, Verbeul A, Abee T, Booth I (1995) The role of peptide metabolism in the growth of Listeria monocytogenes ATCC 23074 at high osmolarity. Microbiology 141:41-49
13. Martinez-Force E, Benitez T (1995) Effects of varying media, temperature, and growth rates on the intracellular concentrations of yeast amino acids. Biotechnol Prog 11:386-392
14. Gent D, Slaughter J (1998) Intracellular distribution of amino acids in an slp1 vacuole-deficient mutant of the yeast Saccharomyces cerevisiae. J Appl Microbiol 84:752-758
15. Roe A, McLaggan D, Davidson I, O'Byrne C, Booth I (1998) Perturbation of anion balance during inhibition of growth of Escherichia coli by weak acids. J Bacteriol 180:767-772
16. Hans M, Heinzle E, Wittmann C (2003) Free intracellular amino acid pools during autonomous oscillations in Saccharomyces cerevisiae. Biotechnol Bioeng 82:143-151
17. de Koning W, van Dam K (1992) A method for the determination of changes of glycolytic metabolites in yeast on a subsecond time scale using extraction at neutral pH. Anal Biochem 204:118-123
18. Watanabe Y, Imai K (1981) High-performance liquid chromatography and sensitive detection of amino acids derivatized with 7-fluoro-4-nitrobenzo-2-oxa-1,3-diazole. Anal Biochem 116:471-472
19. Watanabe Y, Imai K (1984) Sensitive detection of amino acids in human serum and dried blood disc of 3 mm diameter for diagnosis of inborn errors of metabolism. J Chromatogr 309:279-286
20. Bhattacharya M, Fuhrman L, Ingram A, Nickerson K, Conway T (1995) Single-run separation and detection of multiple metabolic intermediates by anion-exchange high-performance liquid chromatography and application to cell pool extracts prepared from Escherichia coli. Anal Biochem 232:98-106
21. Groussac E, Ortiz M, François J (2000) Improved protocols for quantitative determination of metabolites from biological samples using high performance ionic-exchange chromatography with conductimetric and pulsed amperometric detection. Enzyme Microb Technol 26:715-723
22. Toyo'oka T, Ishibashi M, Takeda Y, Nakashima K, Akiyama S, Uzu S, Imai K (1991) Precolumn fluorescence tagging reagent for carboxylic acids in high-performance liquid chromatography: 4-substituted-7-aminoalkylamino-2,1,3-benzoxadiazoles. J Chromatogr 588:61-71
23. Kubota K, Fukushima T, Miyano H, Hirayama K, Imai K (2002) HPLC-fluorescence determination method for carboxylic acids related to TCA cycle as a tool for metabolome. 26th International symposium on high performance liquid phase separations related techniques (HPLC2002) (Montreal) Abstracts, p 55
24. Buchholz A, Takors R, Wandrey C (2001) Quantification of intracellular metabolites in Escherichia coli K12 using liquid chromatographic-electrospray ionization tandem mass spectrometric techniques. Anal Biochem 295:129-137
25. Conboy J, Henion J (1992) High-performance anion-exchange chromatography coupled with mass spectrometry for the determination of carbohydrates. Biol Mass Spectrom 21:397-407
26. Gardner M, Voyksner R, Haney C (2000) Analysis of pesticides by LC-electrospray-MS with postcolumn removal of nonvolatile buffers. Anal Chem 72:4659-4666
Chapter 5: Metabolome Profiling of Human Urine with Capillary Gas Chromatography/Mass Spectrometry
Tomiko Kuhara
Division of Human Genetics, Medical Research Institute, Kanazawa Medical University, 1-1 Daigaku, Uchinada-machi, Kahoku-gun, Ishikawa 920-0293, Japan
1. Urine Can Provide Considerable Biological Information
Metabolites are end products of cellular processes, and their levels reflect the response of biological systems at the systems level. Metabolic profiling is a high-throughput approach to measuring and interpreting complex metabolic parameters in biosamples such as urine, blood, cells, and tissues. Human urine contains many classes of compounds, including organic acids, amino acids, purines, pyrimidines, sugars, sugar alcohols, sugar acids, amines, and other compounds, at a variety of concentrations. Measuring changes in metabolite concentrations is a powerful approach for assessing gene function. In the urine of a patient with a deficiency of an enzyme or its cofactor, the enzyme's substrate accumulates and/or there is a marked increase in metabolites that are formed secondarily via side paths, owing to the accumulation of the substrate. In some cases, instead of the substrate or its secondary metabolites, the level of the substrate precursor increases, owing to the derepression of end-product inhibition. Therefore, human urine can provide the necessary evidence to diagnose inborn errors of metabolism (IEMs). Besides IEMs, acquired metabolic disorders can be detected by metabolome analysis.
2. Urine is Superior to Blood
Urine is superior to blood for metabolic profiling. We previously reported the use of urine to identify individuals with methylmalonic acidemia or
T. Kuhara
propionic acidemia [1]. Chamberlin and Sweeley reported that urine on filter paper is generally more useful for making diagnoses than a blood-spot on filter paper, except for diseases where very hydrophobic compounds accumulate [2]. We also compared capillary gas chromatography-mass spectrometry (GC/MS) analysis using urine on filter paper with results using serum or blood on filter paper. As shown in Fig. 1, urine from a patient with ornithine transcarbamylase deficiency showed a marked increase in orotate and uracil, but serum did not. Urine is also preferable to blood for screening for pyrimidine degradation disorders [3,4]. For IEMs that cause anuria due to kidney dysfunction, serum or plasma is used after the onset of the dysfunction. Special attention is required for sample preparation for the chemical diagnosis of primary hyperoxaluria types I and II, and for the monitoring after liver transplantation in type I [5].
3. Timing of Sampling is Important for the Chemical Diagnosis of Some IEMs but Not All
Metabolite levels reflect the response of biological systems. Most defects in gluconeogenesis and fatty acid β-oxidation are easily detected during fasting. The gluconeogenesis disorders glucose-6-phosphatase deficiency, fructose-1,6-diphosphatase deficiency, and pyruvate carboxylase deficiency can only be chemically diagnosed during fasting (see the fructose-1,6-diphosphatase deficiency example in Fig. 2). Fructose-1,6-diphosphatase (D-fructose-1,6-diphosphate 1-phosphohydrolase; EC 3.1.3.11) is a key enzyme of gluconeogenesis. Deficiency of fructose-1,6-diphosphatase (MIM 229700), originally described in 1970, therefore causes severe lactic acidemia, hypoglycemia, and increased glycerol excretion during fasting [6-8]. During remissions, the urinary metabolic profiles appear normal compared with control samples. Figure 2 shows the total ion chromatograms (TIC) of trimethylsilyl (TMS) derivatives of metabolites in urine from a patient with fructose-1,6-diphosphatase deficiency, prepared by the simplified urease-pretreatment procedure, during an episode (upper panel) and a remission (lower panel). During a hypoglycemic episode, the metabolic profile changes dramatically: lactate, glycerol, and glycerol-3-phosphate levels all markedly increase in the urine. Lactate and glycerol are, however, not specific markers, as the former increases under a variety of disease conditions and the latter as a result of glycerol infusion treatment (see Section 2.2 in Chapter 13 for details). Glycerol also increases markedly in glycerol kinase deficiency.
Human Urine Metabolome Profiling
Fig. 1. Mass chromatograms of trimethylsilyl (TMS) derivatives of metabolites in urine (upper) and serum (lower) from a patient with ornithine transcarbamylase deficiency. Both samples were prepared by the simplified urease pretreatment. The ions of m/z 241, [M-15]⁺ at 6.5 min and of m/z 254, [M-HCOOTMS]⁺ at 9.43 min are due to uracil (di-TMS) and orotate (tri-TMS), respectively. The ion of m/z 327 [M-15]⁺ at 11.8 min is due to n-heptadecanoate (mono-TMS) used as an internal standard (IS): 0.1 ml urine or serum was spiked with 50 nmol n-heptadecanoate. TIC, total ion current chromatogram. Horizontal axis: time (min)
Fig. 2. Total ion chromatograms (TIC) of trimethylsilyl derivatives of metabolites from the urine of a patient with a fructose-1,6-diphosphatase deficiency. The samples were prepared using the simplified urease pretreatment and were obtained during a hypoglycemic episode (upper) and during a remission (lower). The ions targeted were m/z 231 for 2,2-dimethylsuccinate (2,2-DMS, IS1), m/z 229 for 2-hydroxyundecanoate (2HUD, IS2), m/z 329 for creatinine, m/z 357 for glycerol-3-phosphate (G-3-P), m/z 205 for glycerol, and m/z 191 for lactate (Lac). During the episode, 3-hydroxybutyrate (βHB) also increased. Glycerol-3-phosphate was markedly reduced but still detectable in the samples from the patient during remission and from the control, even in the scanning mode. Horizontal axis: time (min)
4. Metabolic Profiling of Organic Acids
The profiling of urinary organic acids by GC/MS is used for diagnosing organic acidemias. Since the discovery of isovaleric acidemia in 1966, IEMs classified as organic acidemias, in which organic acids accumulate in the urine, have been discovered by GC/MS [9]. Because of its high chromatographic performance and its sensitive and specific identification and quantification, GC/MS is indispensable for the chemical diagnosis of these disorders. For GC/MS analysis, urinary organic acids are extracted with ethyl ether and/or ethyl acetate under acidic conditions, with or without added sodium chloride, and are then dehydrated with sodium sulfate and evaporated to dryness; the residues are derivatized to increase their volatility and therefore their suitability for GC/MS analysis. The derivatization and/or silylation is performed with or without prior oximation [10-13]. Some polar acids are useful for making diagnoses: orotate is the most useful target for the screening of six primary hyperammonemias and orotic aciduria; methylcitrate is the key target for propionic acidemia; and glycerol-3-phosphate is the target for diagnosing fructose-1,6-diphosphatase deficiency [14]. To measure these polar acids quantitatively, they were extracted by DEAE-Sephadex ion exchange [13,15-17]. Glycerol-3-phosphate is poorly recovered by solvent extraction. Quantitative analysis is difficult without the respective stable isotope-labeled internal standard (IS). Extraction with DEAE-Sephadex significantly improves the recovery of glycerol-3-phosphate, but this procedure takes several hours and also extracts inorganic acids such as phosphate or sulfate, which interfere with the subsequent GC/MS analysis. Furthermore, glycerol is not recovered by DEAE-Sephadex extraction [16,18]. Metabolic profiling of the organic acids in urine is not a novel technology. The procedure described by Hoffmann et al. is more quantitative but not widely applied, because it requires extensive sample preparation [19]. Most often, laboratories measure organic acids neither quantitatively nor semiquantitatively in terms of IS equivalents, because measurement varies from laboratory to laboratory. Currently, errors in quantitative results as great as 50% are acceptable for the diagnosis of inherited disorders, but in follow-up, the error for organic acids of clinical interest should be <20% [20]. However, lower analytical errors are desirable for comparing the excretion values reported from different laboratories, for monitoring therapy, for some differential diagnoses, and for the diagnosis of patients who have moderate hyperexcretions or are not in an acute episode. Moreover, the knowledge of essential biological variables such as the normal excretion
concentrations and their variation according to age, genetic factors, and nutritional status also depends on better analytical precision and accuracy. Duez et al. reported that the tuning of the mass detector strongly affects the calibration factors, which are critical to achieve quantitative results, and proposed a practical procedure for reproducible tuning [21]. They used 2-ketocaproic acid to quantify urinary keto acids in their organic acid profiling; for other acids they recommend the use of a polar acid, tropate. We used 3-hydroxymyristate and n-heptadecanoate in our earlier organic acid profiling.
5. Metabolome Profiling
5.1. Introduction
It was impossible to simultaneously analyze and quantify organic acids, purines, pyrimidines, amino acids, sugars, polyols, amines, and other compounds using a single-step fractionation procedure until urease pretreatment, silylation, and GC/MS became available in the early 1990s. Shoemaker and Elliott reported simultaneous analysis of urinary organic acids, amino acids, and sugars [22]. They degraded the excess urea in urine samples with urease and developed a method for preparing urinary metabolites for derivatization without solvent extraction or column clean-up. Removing urea, which is the major organic constituent, from the samples renders the minor metabolites in urine samples accessible to derivatizing agents. A total of 103 compounds was quantitatively measured relative to endogenous urinary creatinine [22]. Their procedure, however, takes several hours, requires skilled technicians, and is less practical for multiple-sample analysis. We drastically simplified their procedure [23], based on our experience of more than two decades with the chemical diagnosis of IEMs using GC/MS [24-33], and devised a procedure for multiple-sample analysis with a potential use in neonatal screening. Our procedure takes 1 h for the pretreatment of one sample or 3 h for a batch of 30 samples, plus 15 min (at intervals of 30 min) per sample for the GC/MS measurement, as shown in Table 1. In the simplified urease pretreatment procedure, the recovery of polar acids is very high, and the diagnosis of diseases becomes remarkably rapid, accurate, and easy, as has been reported [14,23,34,35]. Figure 3 shows an example of analysis.
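The timing figures quoted above can be turned into a rough throughput estimate. This is only a back-of-the-envelope sketch under stated assumptions: the function `batch_hours` and its parameters are hypothetical, pretreatment and GC/MS measurement are assumed not to overlap, and a single instrument is assumed to inject one sample every 30 min.

```python
import math


def batch_hours(n_samples, batch_size=30, batch_pretreat_h=3.0,
                single_pretreat_h=1.0, gcms_interval_min=30.0):
    """Rough wall-clock estimate (hours) for pretreatment plus sequential GC/MS runs.

    Assumes (hypothetically) that pretreatment finishes before the first
    injection and that one instrument injects a sample every 30 min.
    """
    if n_samples <= 0:
        return 0.0
    if n_samples == 1:
        pretreat = single_pretreat_h
    else:
        # Each started batch of up to 30 samples costs 3 h of pretreatment.
        pretreat = math.ceil(n_samples / batch_size) * batch_pretreat_h
    gcms = n_samples * gcms_interval_min / 60.0
    return pretreat + gcms


print(batch_hours(1))   # 1.5
print(batch_hours(30))  # 18.0
```

Under these assumptions the GC/MS queue, not the pretreatment, dominates the turnaround for a full batch.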
Table 1. Pretreatment of urine and GC/MS measurement

Fractionation procedure                 Pretreatment (hours)   GC/MS measurement (minutes)
Simultaneous analysis of metabolites
  The present method                    1 (a)                  15
  Shoemaker's method                    5                      60
Organic acids analysis
  Solvent extraction                    3-5                    30-60
  Sweetman's method                     10                     30-60
  DEAE-Sephadex                         14                     30-60

GC/MS, gas chromatography/mass spectrometry
(a) 3 h for a batch of 30 samples
Table 2. Accuracy of diagnostic screenings of phenylketonuria

Method 1. Point analysis: BIA
  Sample: Blood
  Target: One
  Phenylketonuria: Phe
  Positive cases should be further examined for: PL, PP, OHPA by method 2 or 4

Method 2. Line analysis: GC/MS (organic acids)
  Sample: Urine
  Target: A series of organic acids
  Phenylketonuria: PL, PP, OHPA
  Positive cases should be further examined for: Phe by method 4 or AA analysis

Method 3. Line analysis: MS/MS
  Sample: Blood
  Target: A series of acylcarnitines and amino acids
  Phenylketonuria: Phe
  Positive cases should be further examined for: PL, PP, OHPA by method 2 or 4

Method 4. Planar analysis: GC/MS (urease pretreatment procedure)
  Sample: Urine
  Target: Multiple categories of compounds
  Phenylketonuria: Phe, PL, PP, OHPA
  Positive cases should be further examined for: No examination necessary

Amino acids analysis by HPLC or other chromatography should be classified as method 2, but these methods are not listed here because the specificity is lower than that of mass spectrometry. BIA, bacteria inhibition assay
Fig. 3. Total ion chromatograms (TIC) of trimethylsilyl derivatives of metabolites from the urine of a patient with ornithine transcarbamylase deficiency. The samples were processed using the simplified urease pretreatment (a) or the solvent extraction method (b). The ions targeted were m/z 327 for IS3 (external standard heptadecanoate; HDA), m/z 329 for creatinine, m/z 241 and m/z 243 for uracil (cold and labeled), and m/z 254 and m/z 256 for orotate (cold and labeled). Horizontal axis: time (min)
In the procedure, the metabolites characteristic of each IEM are targeted. Table 2 compares methods for diagnostic screening of phenylketonuria. In phenylketonuria, phenylalanine is not metabolized to tyrosine but degraded to aromatic acids due to phenylalanine hydroxylase deficiency. The urease pretreatment enables almost definitive chemical diagnosis of the disease,
because it permits the simultaneous measurement of amino acids, including phenylalanine, and organic acids, including aromatic acids [23]. In Methods 1-3 in Table 2, a secondary or additional examination is required to make a diagnosis.
5.2. Filter Paper Urine or Liquid Urine
Urine is collected into a clean plastic or glass bottle. After collection it is deep-frozen as soon as possible and kept at -20°C or below until analysis. Liquid urine should always be transported in a deep-frozen state. To prepare dried urine on filter paper, urine is poured onto a 3x8-cm piece of absorbent filter paper (UA-5 from Toyo Roshi, Tokyo, Japan, for example), then dried in room air. Dried urine on filter paper can be sent via mail or air-mail to a specialized laboratory. The paper is kept in a disposable tube and the soluble urinary components are eluted with 1 ml of distilled water. A volume of 0.7 ml of eluate is recovered and then processed as described below for liquid urine.
5.3. Sample Preparation
The sample preparation of liquid urine specimens or the eluates from the filter paper urine [23,36] is different from conventional solvent extraction (see Fig. 4). It includes no fractionation and requires urease pretreatment. A volume of 100 µl of urine is used, but, depending on the concentration of creatinine, urine volumes of 10 to 50 µl are often preferred, and volumes of 200 µl are very rarely needed. Thus, the size of the absorbent filter paper and the scale of the following elution can be reduced by half. The urine is incubated with 30 units of urease at 37°C for 10 min to remove excess urea in the urine. For accurate measurement, the urine is spiked with fixed amounts of stable isotope-labeled compounds as internal standards (ISs) at 100 nmol for creatinine, 4 nmol each for uracil and orotate, 5 nmol for methylcitrate, 10 nmol each for methionine, homocystine, leucine, phenylalanine, tyrosine, and cystine, and 50 nmol each for glycine and lysine.
Twenty-five nanomoles each of 2,2-dimethylsuccinate and 2-hydroxyundecanoate, as ISs, and heptadecanoate, are also added. After the proteins are precipitated with 0.9 ml of ethanol and removed by centrifugation, the supernatant is evaporated to dryness under mildly reduced pressure (100 cmHg) below 30°C; the residue is then trimethylsilylated by adding 100 µl of a mixture of bistrimethylsilyltrifluoroacetamide (BSTFA) and trimethylchlorosilane (TMCS) (10:1, v/v) and heating at 80°C for 30 min.
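The spiking scheme above lends itself to a simple stable isotope dilution estimate. The sketch below is illustrative only: the spiked amounts are those listed in the text, but the function `amount_nmol`, the peak areas, and the assumptions of equal detector response for labeled and unlabeled forms and negligible isotopic overlap are not taken from the chapter.

```python
# Spiked amounts (nmol) of the stable isotope-labeled internal standards,
# as listed in the sample preparation text.
SPIKE_NMOL = {
    "creatinine": 100, "uracil": 4, "orotate": 4, "methylcitrate": 5,
    "methionine": 10, "homocystine": 10, "leucine": 10, "phenylalanine": 10,
    "tyrosine": 10, "cystine": 10, "glycine": 50, "lysine": 50,
}


def amount_nmol(analyte, area_unlabeled, area_labeled):
    """Estimate the endogenous amount from the unlabeled/labeled peak-area ratio.

    Assumes equal response for both forms and no isotopic cross-contribution;
    both are simplifications made for this sketch.
    """
    if area_labeled <= 0:
        raise ValueError("labeled internal standard peak not detected")
    return SPIKE_NMOL[analyte] * area_unlabeled / area_labeled


# Hypothetical peak areas, for illustration only.
print(amount_nmol("orotate", area_unlabeled=250_000, area_labeled=100_000))  # 10.0
```

Because the labeled standard is added before deproteinization and derivatization, losses in those steps affect both forms alike and cancel in the ratio.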
Fig. 4. Sample preparations for urinary metabolic profiling. Left, Simplified urease-pretreatment [23]. Right, Organic solvent extraction, conventional method
[Flowchart boxes. New metabolome profiling: urine; urease pretreatment; addition of internal standards; deproteinization of urease with ethanol; evaporation of the supernatant; trimethylsilylation; GC/MS of multiple categories of compounds (organic acid, amino acid, purine, pyrimidine, sugar, sugar alcohol, sugar acid, amine, etc.). Conventional metabolic profiling: urine; acidification to pH 1; extraction with ether and/or ethyl acetate; evaporation to dryness; trimethylsilylation; GC/MS of organic acids.]
5.4. Creatinine Determination
For quantitative measurements it is necessary to normalize the metabolite concentration to the concentration of a reference metabolite, because of the different flow rates of urine. Creatinine is generally used as a reference. With the above method of sample preparation, creatinine in urine is recovered quantitatively. However, urinary creatine is almost completely converted to creatinine during the procedure (Fig. 5). Therefore, d3-creatine is used as an IS [22]. By using either d3-creatine or d3-creatinine as an IS, endogenous creatinine plus creatine is estimated [23,36]. We also separately determined creatinine and creatine enzymatically by auto-analyzer (total creatinine is equal to creatine plus creatinine). Trimethylsilylation of creatinine produces its tri-TMS (major) and di-TMS (minor) derivatives, and the ratio of the two is not constant. Therefore, we used d3-creatinine as an IS to quantify endogenous total creatinine but did not use it as the reference metabolite to directly quantify all the metabolites [23,36]. The evaluation based on total creatinine will be helpful when only a limited volume of urine specimen is available and when it is impossible to determine creatinine content. The evaluation of metabolite levels relative to the total creatinine in urine was reported to be especially useful during clinical episodes of patients with metabolic disorders [37].
Fig. 5. Creatine is converted to creatinine during the pretreatment procedure. Therefore, the sum of endogenous creatinine and creatine is measurable, and either d3-creatinine or d3-creatine is used to obtain the value for total creatinine
[Reaction scheme: creatine loses H2O to give creatinine; trimethylsilylation of creatinine, shown also for the CD3-labeled d3-creatinine, gives the di-TMS and tri-TMS derivatives.]

Table 3. Trimethylsilylation

-OH            ->   -OTMS
-COOH          ->   -COOTMS
-NH2           ->   -NHTMS or -N(TMS)2
-NH-           ->   -N(TMS)-
-SH            ->   -STMS
-P-OH          ->   -P-OTMS
-C(=N-OH)-     ->   -C(=N-OTMS)-
-CH2CO-        ->   -CH=C(OTMS)-
-CO-NH-        ->   -C(OTMS)=N- or -CO-N(TMS)-

TMS, -Si(CH3)3
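The mass shifts implied by Table 3 explain the diagnostic ions quoted in the figure captions of this chapter. As a minimal sketch (the helper `tms_derivative_mass` is a hypothetical name, and nominal integer masses are used throughout), each trimethylsilyl group replaces one active hydrogen and therefore adds 73 - 1 = 72 mass units:

```python
TMS_SHIFT = 72   # each Si(CH3)3 group (73 u) replaces one active hydrogen (1 u)
METHYL = 15      # CH3 lost to give the [M-15]+ ion
HCOOTMS = 118    # H + COO + Si(CH3)3 = 1 + 44 + 73


def tms_derivative_mass(nominal_mass, n_tms):
    """Nominal molecular mass of a compound carrying n_tms trimethylsilyl groups."""
    return nominal_mass + TMS_SHIFT * n_tms


uracil_di_tms = tms_derivative_mass(112, 2)        # uracil, C4H4N2O2
orotate_tri_tms = tms_derivative_mass(156, 3)      # orotic acid, C5H4N2O4
heptadecanoate_tms = tms_derivative_mass(270, 1)   # heptadecanoic acid, C17H34O2
creatinine_tri_tms = tms_derivative_mass(113, 3)   # creatinine, C4H7N3O

print(uracil_di_tms - METHYL)       # 241, [M-15]+ of uracil di-TMS
print(orotate_tri_tms - HCOOTMS)    # 254, [M-HCOOTMS]+ of orotate tri-TMS
print(heptadecanoate_tms - METHYL)  # 327, [M-15]+ of heptadecanoate mono-TMS
print(creatinine_tri_tms)           # 329, molecular ion of creatinine tri-TMS
```

The same arithmetic can be used to sanity-check a proposed target ion before adding it to a screening method.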
5.5. Derivatization
For analysis with GC, organic acids must be converted to derivatives that are thermally stable, chemically inert, and volatile at temperatures below about 300°C. Silylating agents including TMS and tert-butyldimethylsilyl donors stabilize a wide variety of compounds for volatilization during GC. Trimethylsilylation has been used for organic acid profiling. The reactions of common functional groups with silylating reagents are shown in Table 3. Among silylation reagents, BSTFA is preferred for organic acid
analysis because of its reactivity, volatility, excellent solvent properties, and availability in pure form [38]. In combination with TMCS, which acts as an acid catalyst, BSTFA fully derivatizes most compounds in 10 min at 60°C. The derivatized samples are stored in vials with Teflon-lined screw caps until analysis. Furthermore, they remain stable for months when kept frozen in sealed glass micropipettes [39]. N-Methyl-N-trimethylsilyl-trifluoroacetamide has also been used as a reagent for trimethylsilylation [22]. In one report, this simplified urease method is termed the urease/direct method, and tert-butyldimethylsilylation has been recommended instead of trimethylsilylation [40]. Polyols and sugars are not fully silylated and give a mixture of derivatives with tert-butyldimethylsilylation. As the tert-butyldimethylsilyl moiety is bulkier than the trimethylsilyl moiety, the tendency not to be fully silylated and to give several derivatives may be higher for tert-butyldimethylsilylation, especially for polyols and sugars. Carbonyl moieties in keto acids are not completely modified to the enol-TMS derivative, but remain partly underivatized. Furthermore, the ratio of the derivatized (enol-TMS) form to the underivatized form varies. Oximation of short-chain keto acids before solvent extraction or ion-exchange chromatography will prevent the loss of these volatile and unstable compounds during the preparation of the acid fraction [39,41]. Oximation occurs only with keto and aldehyde functional groups, and thus does not affect most of the other acids in urine. Therefore, for the measurement of α-keto and β-keto acids, they are oximated prior to trimethylsilylation. However, in screening, oximation is not necessary, except when a single keto acid is the sole indicator of a disease. Capillary gas chromatography (GC) has been employed to analyze metabolite levels in biological samples [42,43].
Only a few methods, however, enable simultaneous analysis by a single GC/MS run [22,23,35,44,45]. Recently, amino and nonamino organic acids were simultaneously analyzed as methyl chloroformate derivatives using GC/MS [46]. The derivatization to the alkyl chloroformates is very rapid and simple, because it is unnecessary to isolate the analytes from biological samples. However, few applications of the method to urine samples have been reported, and the mass spectral library is small at present.
6. A Typical Example of Chemical Diagnosis by Metabolome Profiling: Multiple Cytochrome Deficiency
Fatal infantile mitochondrial myopathy with lactic acidemia and a De Toni-Fanconi-Debre syndrome (MIM 220110) is characterized by floppiness, failure to thrive, and lactic acidosis associated with the biochemical defect of mitochondrial oxidative phosphorylation [47]. Urinary findings consistent with De Toni-Fanconi-Debre syndrome, lactic aciduria, and an increased level of intermediary metabolites of the tricarboxylic acid cycle in a floppy baby should suggest the possibility of this disorder. Urine specimens from a 3-month-old floppy infant with severe lactic acidemia were examined to determine whether the patient had an IEM. The pathological examination had suggested two possibilities: either carnitine deficiency or multiple cytochrome deficiency. In those days, the urinary metabolites analyzed by GC/MS were limited to the organic acids extractable with an organic solvent under acidic conditions; amino acids were analyzed separately by an automatic amino acid analyzer. Gross lactic aciduria, phosphaturia, and ketonuria, and increased excretion of TCA cycle intermediates were observed. In addition, the excretion of fumarate and the ratio of fumarate to succinate were dramatically increased. Amino acids were generally increased. The profile differed from that observed in Hartnup disease, iminoglycinemia, dibasic amino aciduria, or cystinuria, but was similar to that reported earlier by Van Biervliet et al. [47]. On the basis of the profiles of both organic acids and amino acids, the infant was suspected of having a multiple cytochrome deficiency and was later confirmed to have a cytochrome aa3/b deficiency [48]. Three diagnosed cases from two families, including the above case, had the characteristic clinical symptoms, severe lactic aciduria, generalized amino aciduria, ketonuria, glucosuria, and phosphaturia [49]. This typical profile could be identified by a single analysis when the simplified urease pretreatment is applied [44].
7. Evaluation
7.1. More than Two Ions from One Derivative Should Be Used for Quantification
To determine the level of metabolites by GC/MS, it is necessary to use mass chromatography. Because numerous kinds of metabolites are present in metabolome analysis, sufficient separation by capillary GC is required, and in addition more than two ions should be, and can be, used for quantification in GC/MS analysis. The probability that ions of the same m/z overlap is high in low-resolution mass measurement.
For amino acids, trimethylsilylation is not always quantitative. Therefore, a stable isotope-labeled amino acid is used as the IS for each amino acid. One labeled omega-amino acid is used as the representative IS for omega-amino acids. However, stable isotope dilution is not always effective. For example, the use of d3-leucine, which coelutes with phosphate-3TMS, is not always quantitative; in patients with phosphaturia or when phosphate is a contaminant, the huge phosphate peak affects the ionization and interferes with the ions chosen for d3-leucine.
7.2. Creatinine
To obtain quantitative values of the metabolites relative to the total creatinine in urine, the correct total creatinine (t) value should be determined. By considering the contribution of d3-creatinine to creatinine and of creatinine to d3-creatinine, the correction was made as follows. The theoretical intensity of the ion at three mass units higher than molecular ion P, [P+3], is 0.0313, which can be calculated by summing the probabilities of observing the minor isotopes for C13H31N3OSi3. The measured value in the scan was 0.0309. Therefore, the contribution by endogenous derivatized creatinine of C13H31N3OSi3 (mw=329) to the ion at m/z 332 was set as 0.031. The contribution to the ion at m/z 329 due to the exogenous d3-creatinine was observed to be 0.14%. Thus, the value of 0.0014 was set. In the present study, we spiked the sample with 100 nmol d3-creatinine.
A: observed intensity of the ion at m/z 329: $A = t + 0.0014\,\mathrm{IS}$
B: observed intensity of the ion at m/z 332: $B = \mathrm{IS} + 0.031\,t$
$R = A/B = (t + 0.14)/(100 + 0.031\,t)$
$t = (100R - 0.14)/(1 - 0.031R)$
Using this correction, the total creatinine concentrations were equal to the sum of creatinine and creatine determined separately by autoanalyzer for wider ranges. The creatinine value determined by the stable isotope dilution method was used only to determine the total creatinine content in exactly the same urine specimen used to express the levels of other metabolites. In Fig. 6, the mass spectra of trimethylsilyl derivatives of creatinine tri-TMS and creatinine (methyl-d3) tri-TMS are shown.
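The cross-contribution correction of Section 7.2 amounts to solving the intensity ratio R = A/B for the total creatinine. A minimal sketch follows, assuming the cross-contribution factors 0.0014 and 0.031 and the 100 nmol spike quoted in the text; the function and parameter names are hypothetical.

```python
def expected_ratio(total_nmol, is_nmol=100.0, is_to_329=0.0014, analyte_to_332=0.031):
    """Forward model: ratio of the m/z 329 to the m/z 332 intensity."""
    a = total_nmol + is_to_329 * is_nmol      # A: m/z 329
    b = is_nmol + analyte_to_332 * total_nmol  # B: m/z 332
    return a / b


def total_creatinine_nmol(ratio, is_nmol=100.0, is_to_329=0.0014, analyte_to_332=0.031):
    """Invert R = (t + a*IS) / (IS + b*t) to obtain t = IS*(R - a) / (1 - b*R)."""
    denominator = 1.0 - analyte_to_332 * ratio
    if denominator <= 0:
        raise ValueError("ratio too large for the assumed cross-contribution")
    return is_nmol * (ratio - is_to_329) / denominator


# Round trip with a hypothetical sample containing 250 nmol total creatinine.
r = expected_ratio(250.0)
print(round(total_creatinine_nmol(r), 6))  # 250.0
```

With the default 100 nmol spike the inverse reduces to t = (100R - 0.14)/(1 - 0.031R), the form given in the text.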
Fig. 6. Mass spectra of trimethylsilyl derivatives of creatinine tri-TMS and creatinine (methyl-d3) tri-TMS. Horizontal axis: m/z
Table 4. Molecular diagnosis of oxalosis type II and its stability based on the abnormality n for two urine samples from a single case

Target       n (a)
             HR1660.D   HR1660.R   HR1867.D   HR1867.R
Glycerate    8.6        8.4        8.2        8.2
             7.9        1.1        7.5        7.7

a. n is the abnormality, i.e., the level is n×SD above the mean. The data were log10-transformed, and an age-matched control was used. Upper values were calculated per creatinine, lower values per total creatinine (from Kuhara et al. 2002). Simultaneous analysis of glycolate and oxalate showed that the former did not increase at all and the latter increased only moderately (mean + 2SD). The lack of an increase in glycolate excludes the possibility of hyperoxaluria type I. In D-glyceric aciduria, neither oxalate nor glycolate increases, and stone formation does not occur. The D-form of glycerate is the important entity to be measured in order to detect D-glyceric aciduria. (See text for details.)
7.3. Distribution of Healthy Groups, Age-Matched Controls, and Evaluation of Abnormality

Elevated metabolite levels are expressed as being n×SD above the mean of the normal level, where n is called the "abnormality" in our diagnostic procedure. In Table 4, the abnormality n of the indicator for two urine samples from a patient with hyperoxaluria type II is shown. The level of a metabolite is estimated first as the peak area of the ions of the target, such as glycerate tri-TMS, relative to that of 2,2-dimethylsuccinate di-TMS, the early-eluting IS, or 2-hydroxyundecanoate di-TMS, the later-eluting IS [14]. In Fig. 7, the mass spectra of the trimethylsilyl derivatives of the two internal standards, 2,2-dimethylsuccinate and 2-hydroxyundecanoate, are shown. Estimated values were finally expressed relative to the amounts of creatinine or total creatinine obtained from exactly the same urine specimen.

The levels of glycerate for age-matched healthy individuals were not normally distributed. Consequently, the data were log10-transformed before statistical analysis to give a mean of -0.64 and an SD of 0.35 for the creatinine-based results and a mean of -0.83 and an SD of 0.32 for the total creatinine-based results. The glycerate level after log10-transformation in a patient with primary hyperoxaluria type II is the mean plus n×SD, and the abnormality n was defined as follows:

Glycerate level = mean + n × SD
Abnormality n = (glycerate level - mean)/SD
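The definition of the abnormality n can be illustrated with a short Python sketch. The function name `abnormality` and the example level are hypothetical; only the control means and SDs (log10 scale) are taken from the text, and the units of the untransformed level are assumed to match those of the control data.

```python
import math

# Control statistics for glycerate quoted in the text (log10-transformed data).
PER_CREATININE = (-0.64, 0.35)         # (mean, SD), creatinine-based
PER_TOTAL_CREATININE = (-0.83, 0.32)   # (mean, SD), total creatinine-based


def abnormality(level, mean_log10, sd_log10):
    """Return n = (log10(level) - mean) / SD for an untransformed level."""
    if level <= 0:
        raise ValueError("level must be positive before log10 transformation")
    return (math.log10(level) - mean_log10) / sd_log10


# Hypothetical patient value chosen so that log10(level) = 2.16,
# i.e. exactly 8 SD above the creatinine-based control mean.
example_level = 10 ** 2.16
n = abnormality(example_level, *PER_CREATININE)
```

Working on the log10 scale is what makes n meaningful here, because the untransformed control values were not normally distributed.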
The stability and sensitivity of the three indicators in the metabolome analysis of urine from a patient with primary hyperoxaluria type II at two different sampling times were determined. As shown in Table 4, the reproducibility and sensitivity of the analysis of patients' urine are very high for glycerate. The number n, where the metabolite level is n×SD above the mean, is 7 or 8 for glycerate, so it is easy to determine that the level is highly abnormal, and the abnormality n is quite constant. Primary hyperoxaluria types I and II are simultaneously screened or diagnosed. If glycerate alone increases in urine, and if it is further shown to have the D-form, then D-glyceric aciduria (MIM 220120), in which D-glycerate kinase is deficient, should be considered [50]. The D-form of glycerate is the important entity to be measured in order to detect D-glyceric aciduria. Both direct and indirect methods employ capillary GC/MS; in the direct method, a chiral stationary phase is used for the GC, and in the indirect method, volatile diastereomeric derivatives are separated on conventional achiral stationary phases [51,52].
Fig. 7. Mass spectra of trimethylsilyl derivatives of two internal standards, 2,2-dimethylsuccinate (upper) and 2-hydroxyundecanoate (lower). The 2,2-dimethylsuccinate di-TMS ion at m/z 231 and the 2-hydroxyundecanoate di-TMS ion at m/z 229 were respectively used to measure metabolites that eluted earlier or later on the chromatogram. The ratio of 231 to 275 in the former and 229 to 331 in the latter were also evaluated
Fig. 8. The structures (right) and mass spectra of methylcitrate tetra-TMS (upper) and d3-methylcitrate tetra-TMS (lower)

Fig. 9. Mass spectra of orotate tri-TMS (upper) and 15N2-orotate tri-TMS (lower)
Methylcitrate exists as two diastereoisomers. Figure 8 shows the chemical structures and mass spectra of unlabeled methylcitrate (upper) and deuterium-labeled d3-methylcitrate, which was used as an IS (lower). They had the same retention time. More than two ion peaks should be used for automatic quantification, because urine samples often contain unknown compounds that interfere with the methylcitrate quantification. The ions at m/z 364 and m/z 482 from d3-methylcitrate and the corresponding ions at m/z 361 and m/z 479 from unlabeled methylcitrate were not interfered with by unknown compounds. Routinely, the use of ions including that at m/z 287 was appropriate when the ion at m/z 229 from 2-hydroxyundecanoate (2HUD) di-TMS was used as an IS. The isotope dilution method is more quantitative and gives a higher abnormality value [53].

Orotate is an indicator of primary hyperammonemias and hereditary orotic aciduria. The recovery was low in a method using organic solvent for extraction, but it was markedly improved when we used the simplified urease pretreatment (see Fig. 3). 15N2-labeled orotate was used for more precise chemical diagnosis. In Fig. 9, the mass spectra of orotate tri-TMS and 15N2-orotate tri-TMS are shown. With this stable isotope dilution method, the ions at m/z 254 and m/z 357 are used for endogenous orotate and the ions at m/z 256 and m/z 359 for exogenous orotate (IS). As the difference is only two mass units, the contributions of orotate to 15N2-orotate and of 15N2-orotate to orotate should be considered for the ion pairs at m/z 254 and m/z 256, and at m/z 357 and m/z 359, and corrected as described for creatinine (see Section 7.2). Both corrected values are expressed relative to creatine determined separately.

The stable isotope dilution procedure described here will provide a valuable tool for screening and/or the chemical diagnosis of more than 130 target diseases, and could also be used to monitor patients of different ages.
This procedure is technically practical yet comprehensive from the metabolic point of view. This method should also be useful for the screening, diagnosis, and evaluation of acquired metabolic disorders in humans and animals that are caused by either exogenous factors or interactions between exogenous and genetic factors [23].
References

1. Kuhara T, Shinka T, Inoue Y, Zhen-Wei X, Ohse M, Yoshida I, Inokuchi T, Yamaguchi S, Takayanagi M, Matsumoto I (1999) Pilot study of gas chromatography-mass spectrometric screening of newborn urine for inborn errors of metabolism after treatment with urease. J Chromatogr B 731:141-147
2. Chamberlin B, Sweeley CC (1987) Metabolic profiles of urinary organic acids recovered from absorbent filter paper. Clin Chem 33:572-576
3. Van Gennip AH, Abeling NGGM, Vreken P, Van Kuilenburg ABP (1997) Inborn errors of pyrimidine degradation: clinical, biochemical and molecular aspects. J Inherit Metab Dis 20:203-213
4. Kuhara T, Ohdoi C, Ohse M (2001) Simple gas chromatographic-mass spectrometric procedure for diagnosing pyrimidine degradation defects for prevention of severe anticancer side effects. J Chromatogr B 758:61-74
5. Inoue Y, Masuyama H, Ikawa H, Mitsubuchi H, Kuhara T (2003) Monitoring method for pre- and post-liver transplantation in patients with primary hyperoxaluria type I. J Chromatogr B 792:89-97
6. Baker L, Winegrad AI (1970) Fasting hypoglycaemia and metabolic acidosis associated with deficiency of hepatic fructose-1,6-diphosphatase activity. Lancet 2:13-16
7. Berghe GV (1996) Disorders of gluconeogenesis. J Inherit Metab Dis 19:470-477
8. Dremsek PA, Sacher M, Stögmann W, Gitzelmann R, Bachmann C (1985) Fructose-1,6-diphosphatase deficiency: glycerol excretion during fasting test. Eur J Pediatr 144:203-204
9. Tanaka K, Budd MA, Efron ML, Isselbacher KJ (1966) Isovaleric acidemia: A new genetic defect of leucine metabolism. Proc Natl Acad Sci USA 56:236-242
10. Dalgliesh CE, Horning EC, Horning MG, Knox KL, Yarger K (1966) A gas-liquid-chromatographic procedure for separating a wide range of metabolites occurring in urine or tissue extracts. Biochem J 101:792-810
11. Horning MG (1968) Gas phase analytical methods for the study of urinary acids. In: Szymanski A (ed) Biomedical applications of gas chromatography, vol. 2. Plenum, New York, pp 53-86
12. Goodman SI, Markey SP (1981) Diagnosis of organic acidemias by gas chromatography-mass spectrometry. Liss, New York
13. Chalmers RA, Lawson AM (1982) Organic acids in man. Analytical chemistry, biochemistry and diagnosis of organic acidurias. Chapman and Hall, London
14. Kuhara T (2001) Diagnosis of inborn errors of metabolism using filter paper urine, urease treatment, isotope dilution and gas chromatography-mass spectrometry. J Chromatogr B 758:3-25
15. Jaakonmaki PI, Knox KL, Horning EC, Horning MG (1967) The characterization by gas-liquid chromatography of ethyl ß-D-glucosiduronic acid as a metabolite of ethanol in rat and man. Eur J Pharmacol 1:63-70
16. Chalmers RA, Watts RWE (1972) The quantitative extraction and gas-liquid chromatographic determination of organic acids in urine. Analyst 97:958-967
17. Gates SC, Sweeley CC, Krivit W, De Witt D, Blaisdell BE (1978) Automated metabolic profiling of organic acids in human urine. II. Analysis of urine samples from "healthy" adults, sick children, and children with neuroblastoma. Clin Chem 24:1680-1689
18. Nakai A, Shigematsu Y, Lin YY, Kikawa Y, Sudo M (1993) Urinary sugar phosphates and related organic acids in fructose-1,6-diphosphatase deficiency. J Inherit Metab Dis 16:408-418
19. Hoffmann G, Aramaki S, Blum-Hoffmann E, Nyhan WL, Sweetman L (1989) Quantitative analysis for organic acids in biological samples: batch isolation followed by gas chromatographic-mass spectrometric analysis. Clin Chem 35:587-595
20. Sweetman L (1991) Organic acid analysis. In: Hommes FA (ed) Techniques in diagnostic human biochemical genetics. A laboratory manual. Wiley-Liss, New York, pp 143-176
21. Duez P, Kumps A, Mardens Y (1996) GC-MS profiling of urinary organic acids evaluated as a quantitative method. Clin Chem 42:1609-1615
22. Shoemaker JD, Elliott WH (1991) Automated screening of urine samples for carbohydrates, organic and amino acids after treatment with urease. J Chromatogr 562:125-138
23. Matsumoto I, Kuhara T (1996) A new chemical diagnostic method for inborn errors of metabolism by mass spectrometry. Mass Spectrom Rev 15:43-57
24. Matsumoto I, Shinka T, Kuhara T, Ooura T, Yamamoto H, Hase Y, Aoki H, Issiki G, Tada K (1978) Investigation of unusual metabolites in the urine of a patient with propionic acidemia. In: Frigerio A (ed) Recent developments in mass spectrometry. Biochem Med 1. Plenum, New York, pp 203-216
25. Kuhara T, Matsumoto I (1980) Studies on the urinary acidic metabolites from three patients with methylmalonic aciduria. Biomed Mass Spectrom 7:424-428
26. Matsumoto I (1981) Metabolic profiling of biological materials by a gas chromatography-mass spectrometry-computer system. In: Frigerio A (ed) Recent developments in mass spectrometry. Biochem Med Environ Res 7. Elsevier, Amsterdam, pp 57-68
27. Kuhara T, Shinka T, Matsuo M, Matsumoto I (1982) Increased excretion of lactate, glutarate, 3-hydroxyisovalerate and 3-methylglutaconate during clinical episodes of propionic acidemia. Clin Chim Acta 123:101-109
28. Kuhara T, Shinka T, Inoue Y, Matsumoto M, Yoshino M, Sakaguchi Y, Matsumoto I (1983) Studies of urinary organic acid profiles of a patient with dihydrolipoyl dehydrogenase deficiency. Clin Chim Acta 133:133-140
29. Matsumoto I, Kuhara T (1987) Gas chromatography-mass spectrometry for chemical diagnosis of the inherited metabolic diseases: differential chemical diagnosis of lactic acidosis. Mass Spectrom Rev 6:77-134
30. Matsumoto I, Kuhara T (1993) Inborn errors of amino acid and organic acid metabolism. In: Desiderio DM (ed) Clinical mass spectrometry, vol. 1: Clinical and biomedical applications. Plenum, New York, pp 259-298
31. Matsumoto I (ed) (1993) Advances in chemical diagnosis and treatment of metabolic disorders, vol. 1. Wiley, Chichester, pp 1-162
32. Matsumoto I, Kuhara T, Mamer OA, Sweetman L, Calderhead RG (eds) (1994) Advances in chemical diagnosis and treatment of metabolic disorders, vol. 2. Kanazawa Medical University Press, Kanazawa, pp 1-199
33. Matsumoto I, Sakamoto S, Kuhara T, Sudo M, Yoshino M (eds) (1995) GC/MS practical chemical diagnosis. Soft Science, Tokyo, pp 1-455
34. Kuhara T (2002) Diagnosis and monitoring of inborn errors of metabolism using urease-pretreatment of urine, isotope dilution, and gas chromatography-mass spectrometry. J Chromatogr B 781:497-517
35. Kuhara T (2004) Gas chromatographic-mass spectrometric urinary metabolome analysis to study mutations of inborn errors of metabolism. Mass Spectrom Rev on-line. Accessed Sep 16, 2004.
36. Kuhara T, Matsumoto I (1995) A simultaneous gas chromatographic mass spectrometric analysis urinary metabolites-application to the neonatal mass screening. Proc Jap Soc Biomed Mass Spectrom 20:45-51
37. Davies SEC, Iles RA, Stacey TE, Chalmers RA (1990) Creatine metabolism during metabolic perturbations in patients with organic acidurias. Clin Chim Acta 194:203-217
38. Knapp DR (1979) Handbook of analytical derivatization reactions. Wiley, New York, pp 1-21
39. Gates SC, Dendramis N, Sweeley CC (1978b) Automated metabolic profiling of organic acids in human urine. I. Description of methods. Clin Chem 24:1674-1679
40. Ohie T, Fu X, Iga M, Kimura M, Yamaguchi S (2000) Gas chromatography-mass spectrometry with tert-butyldimethylsilyl derivation: use of the simplified sample preparations and the automated data system to screen for organic acidemias. J Chromatogr B 746:63-73
41. Thompson JA, Markey SP (1975) Quantitative metabolic profiling of urinary organic acids by gas chromatography-mass spectrometry: Comparison of isolation methods. Anal Chem 47:1313-1321
42. Adams MA, Chen Z, Landman P, Colmer TD (1999) Simultaneous determination by capillary gas chromatography of organic acids, sugars, and sugar alcohols in plant tissue extracts as their trimethylsilyl derivatives. Anal Biochem 266:77-84
43. Husek P (1995) Simultaneous profile analysis of plasma amino and organic acids by capillary gas chromatography. J Chromatogr B 669:352-357
44. Ning C, Kuhara T, Inoue Y, Zhang C, Matsumoto M, Shinka T, Furumoto T, Yokota K, Matsumoto I (1996) Gas chromatographic-mass spectrometric metabolic profiling of patients with fatal infantile mitochondrial myopathy: de Toni-Fanconi-Debré syndrome. Acta Paediatr Jpn 38:661-666
45. Roessner U, Luedemann A, Brust D, Fiehn O, Linke T, Willmitzer L, Fernie AR (2001) Metabolic profiling allows comprehensive phenotyping of genetically or environmentally modified plant systems. Plant Cell 13:11-29
46. Villas-Boas SG, Delicado DG, Akesson M, Nielsen J (2003) Simultaneous analysis of amino and nonamino organic acids as methyl chloroformate derivatives using gas chromatography-mass spectrometry. Anal Biochem 322:134-138
47. Van Biervliet JPGM, Bruinvis L, Ketting D, De Bree PK, Van Der Heiden C, Wadman SK (1977) Hereditary mitochondrial myopathy with lactic acidemia, a De Toni-Fanconi-Debré syndrome, and a defective respiratory chain in voluntary striated muscles. Pediatr Res 11:1088-1093
48. Tanaka M, Nishikimi M, Suzuki H, Ozawa T, Okino E, Takahashi H (1986) Multiple cytochrome deficiency and deteriorated mitochondrial polypeptide composition in fatal infantile mitochondrial myopathy and renal dysfunction. Biochem Biophys Res Commun 137:911-916
49. Yokota K, Kuhara T, Matsumoto I (1994) Abnormal metabolism of carbohydrate and fatty acids in mitochondrial disorders. In: Matsumoto I, Kuhara T, Mamer OA, Sweetman L, Calderhead RG (eds) Advances in chemical diagnosis and treatment of metabolic disorders, vol. 2. Kanazawa Medical University Press, Kanazawa, pp 143-152
50. Duran M, Beemer FA, Bruinvis L, Ketting D, Wadman SK (1987) D-Glyceric acidemia: an inborn error associated with fructose metabolism. Pediatr Res 21:502-506
51. Kaunzinger A, Rechner A, Beck T, Mosandl A, Sewell AC, Bohles H (1996) Chiral compounds as indicators of inherited metabolic disease. Simultaneous stereodifferentiation of lactic-, 2-hydroxyglutaric- and glyceric acid by enantioselective cGC. Enantiomer 1:177-182
52. Kamerling JP, Gerwig GJ, Vliegenthart JF (1977) Determination of the configurations of lactic and glyceric acids from human serum and urine by capillary gas-liquid chromatography. J Chromatogr 143:117-123
53. Kuhara T, Ohse M, Inoue Y, Yorifuji T, Sakura N, Mitsubuchi H, Endo F, Ishimatu J (2002) Gas chromatographic-mass spectrometric newborn screening for propionic acidaemia by targeting methylcitrate in dried filter-paper urine samples. J Inherit Metab Dis 25:98-106
Chapter 6: Metabolic Profiling by Fourier-Transform Ion Cyclotron Resonance Mass Spectrometry (FT-ICR-MS) and Electrospray Ionization Quadrupole Time-Of-Flight Mass Spectrometry (ESI-Q-TOF-MS)

Kazuo Hirayama

Institute of Life Sciences, Ajinomoto Co. Inc., 1-1 Suzuki-cho, Kawasaki-ku, Kawasaki, Kanagawa 210-8681, Japan
1. Introduction

Mass spectrometry (MS) is very useful for analyzing metabolites in living organisms. In metabolome analysis, MS is generally used in combination with high-performance liquid chromatography (HPLC) or capillary electrophoresis (CE), as liquid chromatography/mass spectrometry (LC/MS) or capillary electrophoresis/mass spectrometry (CE/MS). Since the advent of metabolomic studies using Fourier-transform ion cyclotron resonance mass spectrometry (FT-ICR-MS) and electrospray ionization quadrupole time-of-flight mass spectrometry (ESI-Q-TOF-MS), research employing FT-ICR-MS has been performed with plant samples, since they are easier to manipulate than microorganisms and animal cells [1]. The advantages and disadvantages of LC/MS and CE/MS, as compared with FT-ICR-MS and ESI-Q-TOF-MS, for metabolomic analyses are as follows.

LC/MS and CE/MS:
1. Sodium and potassium salts, which interfere with the acquisition of a high-quality mass spectrum, can be removed by column chromatography or capillary electrophoresis.
2. Some metabolites are hardly ionized in the ion source when mixed with other metabolites. After separation by LC or CE, they are ionized and can be analyzed by MS.
3. Minor components can be observed in the spectrum because they are separated beforehand.
4. The spectrum is easy to analyze, because each metabolite is separated before its spectrum is acquired.
5. LC and CE separations are time-consuming.
FT-ICR-MS and ESI-Q-TOF-MS:
1. Only samples with low concentrations of inorganic salts can be analyzed.
2. Since multiple components are ionized at the same time, only the spectra of easily ionized compounds are obtained.
3. Since multiple components are ionized at the same time, the detection of minor components is difficult.
4. Since many components appear in the spectrum, it is not easy to analyze.
5. The time required for the measurement is short.

The main advantage of FT-ICR-MS and ESI-Q-TOF-MS analyses is therefore the short measurement time, but considerable limitations remain in comparison with LC/MS and CE/MS. It is important to keep this background in mind when FT-ICR-MS and ESI-Q-TOF-MS are used. Intracellular components were extracted from Escherichia coli K-12 cells, and we now provide an example of metabolomic profiling based on the data actually measured with FT-ICR-MS and ESI-Q-TOF-MS.
2. Procedure for Metabolic Profiling Using FT-ICR-MS and ESI-Q-TOF-MS

The objects of metabolomic profiling are the internal organs and tissues of animals, as well as cultures of microorganisms and animal cells. The purpose of metabolomic profiling is to gain a deeper understanding of metabolism by finding differences between samples in various states, such as different culture conditions and nutrition, and then identifying which metabolites account for those differences. Figure 1 shows the metabolomic profiling procedure using FT-ICR-MS and ESI-Q-TOF-MS.
Sampling → Pretreatment → Mass spectrometry → Profiling data → Identification

Fig. 1. Procedure for metabolomic profiling with Fourier-transform ion cyclotron resonance mass spectrometry (FT-ICR-MS) and electrospray ionization quadrupole time-of-flight mass spectrometry (ESI-Q-TOF-MS)
3. Sample Preparation

3.1. Escherichia coli K-12 Culture Conditions

The E. coli K-12 strain was cultured under the following conditions.
1. 8 g of Bacto-Tryptone, 5 g of Bacto-Yeast Extract, and 1.25 g of NaCl were dissolved in 250 ml of H2O, and the pH was adjusted to 7.0 with 1 N NaOH.
2. 2.5 g of Bacto-Tryptone, 1.25 g of Bacto-Yeast Extract, and 2.5 g of NaCl were dissolved in 250 ml of H2O, and the pH was adjusted to 7.0 with 1 N NaOH.

The bacteria were cultured in 20 ml of medium in a 300-ml flask at 37°C with shaking at 120 times/min. Aliquots of the culture were removed for analysis at 3.8 and 7.2 h after the culture was initiated. The OD590 values of culture medium (1) were 7.3 and 21.0 at 3.8 and 7.2 h, respectively, and those of culture medium (2) were 5.2 and 5.3 at 3.8 and 7.2 h, respectively.
3.2. Sample Treatment

For analyzing metabolites in living organisms, it is necessary to clarify when the sample was collected. In the case of microorganisms, for example, the time from the initiation of the culture and the time since the addition of 13C-glucose are important factors. The most important factor affecting metabolomic analysis is that the amounts of metabolites and metabolic intermediates are always changing, due to enzyme reactions in vivo. Therefore, quick deactivation is necessary to define the time at which the enzyme reactions stopped. Moreover, metabolic intermediates are unstable under acidic or alkaline conditions, or at high temperature. For these reasons, hot ethanol, cold methanol, hot methanol, perchloric acid, alkali, and methanol-chloroform are used for the extraction of metabolites [2]. The preparation procedure for the sample analyzed with FT-ICR-MS and ESI-Q-TOF-MS was as follows.
Quenching

A 20-ml aliquot of the culture was immediately added to 30 ml of 60% MeOH (-78°C) containing 70 mM Hepes and centrifuged for 5 min at -20°C and 10 000 rpm, and the bacterial cells were collected. The obtained pellet was washed twice with 10 ml of 25% MeOH containing 40 mM Hepes. Afterwards, to remove the Hepes buffer solution, the pellet was washed with 10 ml of Milli-Q water.
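As a quick check of the quenching step, the nominal composition of the mixture after the culture is added can be estimated as follows. This is a sketch under two assumptions that are not stated in the text: volumes are treated as additive, and the culture itself contributes no methanol or Hepes. The function name `mixed_concentration` is hypothetical.

```python
def mixed_concentration(stock_conc, stock_volume_ml, sample_volume_ml):
    """Concentration of a solute after adding a sample that contains none of it.

    Assumes additive volumes, which is only approximate for methanol/water.
    """
    return stock_conc * stock_volume_ml / (stock_volume_ml + sample_volume_ml)


# 30 ml of 60% MeOH containing 70 mM Hepes, plus 20 ml of culture.
methanol_percent = mixed_concentration(60.0, 30.0, 20.0)   # % MeOH in the mixture
hepes_mm = mixed_concentration(70.0, 30.0, 20.0)           # mM Hepes in the mixture
```

Under these assumptions, the quenched mixture contains about 36% MeOH and 42 mM Hepes.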
Extraction of intracellular components

The obtained bacterial pellet was suspended in 1 ml of 50% MeOH and subjected to two cycles of freezing and thawing. The extraction was performed on ice, and the suspension was vortexed vigorously every 15 min. A 0.25-ml aliquot of CHCl3 was added, and the mixture was vortexed vigorously and centrifuged for 5 min at -20°C and 10 000 rpm. The upper layer was collected, and the CHCl3 layer was extracted with 0.5 ml of H2O. The extract was passed through an ultrafiltration filter with a 10-kDa cutoff to remove the protein, and was freeze-dried and stored at -78°C until analysis.
4. Instrumentation

4.1. FT-ICR-MS

In FT-ICR-MS, the frequency of the signal from each ion undergoing ion cyclotron motion in a magnetic field is converted to the m/z value. It is well known that the motion of an ion can be manipulated as it travels through a magnetic field. Ion cyclotron resonance takes advantage of this principle not only to manipulate the motion of the ion through a transient trajectory, but also to trap it for an extended period. For an ion of charge number z and mass m [u] rotating in a magnetic field of flux density B [T], the frequency f of the circular motion is given by the following expression.

f = 15.4 zB/m [MHz]

Therefore, if the frequency f is measured, the mass m of the ion can be obtained. To measure the frequency f, a high-frequency wave is applied to the excitation plate shown in Fig. 2. The ions are then accelerated, and ions of the same m/z start to rotate with the same phase. An electrical current is induced whenever the ions approach the detection plate, and a sine-wave signal of frequency f is detected by this plate. In general, because ions with various m/z values exist in the cell, the sine waves induced by these ions overlap, and a complex waveform is observed. The mass spectrum is obtained by separating this waveform into its component frequencies by Fourier transformation and converting each frequency into a mass by the expression above. In this device, a superconducting magnet cooled by liquid helium is generally used. This is because a large flux density B is needed to measure a large m, as understood from the expression above. Figure 3 shows the external appearance of the device. The features of this device are outlined below.
1. The resolving power is extremely high (500 000).
2. The determination of the mass is quite accurate.
3. The ion can be measured without destroying it.
4. MS/MS is possible in the same cell.
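The relation between cyclotron frequency and mass can be tried out with a minimal Python sketch. The coefficient 15.4 is the one given in the text (it corresponds to e/(2πu), about 15.36 MHz·u/T); the function names are hypothetical, and the default flux density of 7 T is an assumption taken from the 7-T instrument listed in Section 5.

```python
COEFF_MHZ = 15.4  # from the text: f [MHz] = 15.4 * z * B [T] / m [u]


def cyclotron_frequency_mhz(mass_u, charge=1, flux_density_t=7.0):
    """Cyclotron frequency (MHz) of an ion of mass m (u) and charge number z."""
    return COEFF_MHZ * charge * flux_density_t / mass_u


def mass_from_frequency(freq_mhz, charge=1, flux_density_t=7.0):
    """Invert the same expression: m (u) from a measured frequency (MHz)."""
    return COEFF_MHZ * charge * flux_density_t / freq_mhz


# A singly charged ion at m/z 308.0920 in a 7-T magnet rotates at about 0.35 MHz.
f_gsh = cyclotron_frequency_mhz(308.0920)
m_back = mass_from_frequency(f_gsh)
```

Because f is inversely proportional to m, heavier ions rotate more slowly at a given B, which is why a large flux density is needed to measure a large m.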
To determine the structure of an ion with a specific m/z after the profile data are measured with this device, MS/MS is performed. Figure 4 shows the scheme of the device. Many metabolites in the sample are introduced into the device to obtain the profile data. These metabolites are ionized by the electrospray ionization method, and their ions enter the cell through the capillary and hexapole. The ions collide with the inert gas Ar, which is introduced into the cell as a pulse, and this causes the ions to fragment. This is called SORI-CID (sustained off-resonance irradiation collision-induced dissociation). The spectrum obtained by SORI-CID is analyzed, and the structure of each ion is presumed. For confirmation of the predicted chemical structure, an authentic sample is also subjected to SORI-CID, and the two data sets should match each other.

Fig. 2. Principles of FT-ICR-MS

Fig. 3. External appearance of FT-ICR-MS

Fig. 4. Scheme of FT-ICR-MS
4.2. ESI-Q-TOF-MS

This is an MS/MS system in which a quadrupole mass spectrometer is connected directly to a time-of-flight mass spectrometer. Ions are generated in the ion source, as shown in Fig. 5, and are separated with the quadrupole. The ion with a specific m/z is selected and introduced into the collision chamber. The selected ion is fragmented into product ions by collision with an inert gas such as Ar or He. The product ions are analyzed with TOF-MS. Figure 6 shows the appearance of the device. The features of this instrument are described below.
1. The resolution is comparatively high (10 000).
2. The mass accuracy is comparatively high.
3. MS/MS can be measured.
Fig. 5. Principles of ESI-Q-TOF-MS
Fig. 6. External appearance of ESI-Q-TOF-MS
5. Conditions

We describe the conditions used to measure the intracellular metabolites of the E. coli K-12 strain by FT-ICR-MS and ESI-Q-TOF-MS.

5.1. FT-ICR-MS

Mass spectrometer: ApexII 7T (Active Shielded) (Bruker Daltonics)
Mass calibration: Mixture of PEG200, PEG400, and PEG600
Ionization: ESI
Measurement ion: Positive ion
Flow rate: 100 µl/h
Ion source temperature: 150°C
Sample: 500-fold dilution with 1% acetic acid/50% methanol
Sample introduction method: Infusion
Range of scanning: m/z 100-500 and m/z 500-1000

5.2. ESI-Q-TOF-MS

5.2.1. Profile Data

Mass spectrometer: Q-Tof-2 (Micromass)
Mass calibration: Mixture of PEG200, PEG400, and PEG600
Ionization: ESI
Measurement ion: Positive ion
Flow rate: 300 µl/h
Ion source temperature: 80°C
Sample: 500-fold dilution with 0.1% formic acid/50% acetonitrile
Sample introduction method: Infusion
Range of scanning: m/z 50-600

5.2.2. Collision-Induced Dissociation (CID) Using Ar Gas

Capillary voltage: 3200 V
Cone voltage: 20 V
Temperature of source block: 80°C
Desolvation gas temperature: 120°C
Collision voltage: 12 V
RF setting: 0.50
Fig. 7. FT-ICR mass spectra (a) and expanded section (b) of the Escherichia coli K-12 culture. After culturing the E. coli for 7.2 h, the sample is obtained through the sampling → quenching → extraction procedure
6. Metabolomic Profiling by the Infusion Method 6.1. Analysis of the Intracellular Metabolites of the £. coli K-12 Strain by FT-ICR-MS After the E. coli K-12 strain is cultured for 7.2 h, the sample is obtained by the procedure of sampling -^ quenching -> extraction. The FT-ICR mass spectrum of the sample and its close-up are shown in Fig. 7a and b, respectively. The m/z 239.1064 and m/z 477.2046 ions originated from the Hepes used for the pretreatment. The theoretical [M+H]"*" value of Hepes is 239.1065, which was calculated by adding 1.0078 (mass of hydrogen) to 238.0987, calculated from the molecular formula of Hepes, C8H18N2O4S. The difference (only 0.0001 u mass ) between the observed and theoretical values proves the high mass accuracy of FT-ICR-MS. Some of the me-
84
K. Hirayama
tabolite ions observed in Fig. 7(b) can be identified because of the high mass accuracy of FT-ICR-MS. Here, we explain the identification procedure of two metabolites that yielded m/z 147.1134 and m/z 308.0920 in Fig. 7b. When the jc-axis is expanded further around the ions of m/z 147.1134 and m/z 308.0920, it is apparent that these two are singly charged ions, because the corresponding isotopes are above the peaks by one mass unit. Tables 1 and 2 show the molecular formulae estimated by using the accurate mass values of m/z 147.1134 and m/z 308.0920, calculated by the data processing software of the mass spectrometer. It is possible to search the metabolites from the estimated chemical formula by using KEGG, on the Internet (http://genome.jp /kegg/). The method is easy. After opening KEGG (URL: http://genome.jp /kegg/) -^ Open KEGG (Table of Contents) -> 1-2. Hierarchical Classification -^ Structure Search —» COMPOUND, the compounds within the KEGG database can be searched by submitting the molecular formula. At this time, because the molecular formula of [M+H]^ is described in the table, it is necessary to search it by using the molecular formula of M. When the compounds in the KEGG database were searched by using the molecular formula of M, corresponding to the three high ranks in Table 1, C6H14N2O2, C4H12N5O, and C2H10N8, it was understood that the compounds that corresponded to C4H12N5O and C2H10N8 did not exist, but five compounds in Fig. 8 existed as C6H14N2O2. Among them, the m/z \A1 A\?>A observed by FT-ICR-MS can be concluded to be the [M+H]"^ of L-lysine (Lys), because it is present in vivo. Moreover, in the three high ranks in Table 2, the compounds that corresponded to C18H13NO4 and C3H17N9O4S2 did not exist, and only reduced GSH (glutathione) of Fig. 9 corresponded to C10H17N3O6S, when the compounds in the KEGG database were searched by using the molecular formula of M. 
Since reduced GSH is abundant in vivo, m/z 308.0920 can be concluded to be the [M+H]+ of reduced GSH. Figure 10 shows the FT-ICR mass spectra in the m/z 250-350 region of the intracellular components of E. coli K-12 after 7.2 h (a) and 3.8 h (b) of cultivation. The amounts of the components clearly differ between 7.2 and 3.8 h. A differential spectrum is effective for clarifying such differences between incubation times. Figure 11 shows the differential spectra from m/z 250 to 290: spectrum (a) is that of 3.8 h minus 7.2 h, and (b) is that of 7.2 h minus 3.8 h.
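The arithmetic behind the theoretical [M+H]+ values quoted in this section can be reproduced with a few lines of code. The following Python sketch is only an illustration and is not the data processing software of the mass spectrometer. The function names are hypothetical, the monoisotopic atomic masses are standard tabulated values, and the hydrogen mass of 1.0078 is the rounded value used in the text, so the last digit may differ by one unit from the values printed in the tables.

```python
import re

# Monoisotopic masses (u) of the most abundant isotopes (standard values).
MONOISOTOPIC = {"C": 12.0, "H": 1.007825, "N": 14.003074,
                "O": 15.994915, "S": 31.972071}
H_MASS = 1.0078  # mass of hydrogen, rounded as in the text


def monoisotopic_mass(formula):
    """Sum the monoisotopic atomic masses of a formula such as 'C8H18N2O4S'."""
    total = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += MONOISOTOPIC[element] * (int(count) if count else 1)
    return total


def protonated_mz(formula):
    """Theoretical m/z of the singly charged [M+H]+ ion, following the text."""
    return monoisotopic_mass(formula) + H_MASS


# Formulae of M and observed FT-ICR-MS values quoted in this section.
for name, formula, observed in [
    ("Hepes", "C8H18N2O4S", 239.1064),
    ("Lys", "C6H14N2O2", 147.1134),
    ("reduced GSH", "C10H17N3O6S", 308.0920),
]:
    theoretical = protonated_mz(formula)
    print(f"{name}: theoretical {theoretical:.4f}, observed {observed:.4f}, "
          f"difference {observed - theoretical:+.4f} u")
```

A database search with the formula of M, as described for KEGG, would start from the same formula strings that are passed to `monoisotopic_mass` here.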
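Table 1. Molecular formulae of [M+H]+ estimated from the accurate mass of m/z 147.1134
Calculated m/z    Molecular formula
147.1134          C6H15N2O2
147.1120          C4H13N5O
147.1107          C2H11N8
147.1174          C11H15
147.1082          C7H17NS
147.1207          C8H19S

Table 2. Molecular formulae of [M+H]+ estimated from the accurate mass of m/z 308.0920
Calculated m/z    Molecular formula
308.0923          C18H14NO4
308.0923          C3H18N9O4S2
308.0916          C10H18N3O6S
308.0916          C9H12N10OS
308.0925          C11H22N3OS3
308.0914          C2H14N9O9
308.0928          C4H16N6O10
308.0912          C9H20N6S3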
Fig. 8. Result of the KEGG database search with the molecular formula C6H14N2O2. Five compounds were identified, but only some of them are shown because of space constraints
Fig. 9. Result of the KEGG database search with the molecular formula C10H17N3O6S
Fig. 10. FT-ICR mass spectra in the m/z 250-350 region of the intracellular components of E. coli K-12. a After 7.2 h of cultivation. b After 3.8 h of cultivation
Fig. 11. Differential spectra from m/z 250 to 290 of the intracellular components of E. coli K-12. Differential spectra of a 3.8 h minus 7.2 h, and b 7.2 h minus 3.8 h
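A differential spectrum such as those in Fig. 11 can be obtained by subtracting the intensities of two peak lists bin by bin. The sketch below is a minimal illustration under simplifying assumptions: peaks are matched by rounding m/z to a fixed number of decimal places, a peak missing from one spectrum counts as zero there, and the peak lists are hypothetical values invented for the example, not data from this study.

```python
def differential_spectrum(spectrum_a, spectrum_b, decimals=2):
    """Return {m/z bin: intensity in a minus intensity in b}.

    Each spectrum is a list of (m/z, relative intensity) pairs. Peaks are
    assigned to bins by rounding m/z to `decimals` places.
    """
    diff = {}
    for mz, intensity in spectrum_a:
        key = round(mz, decimals)
        diff[key] = diff.get(key, 0.0) + intensity
    for mz, intensity in spectrum_b:
        key = round(mz, decimals)
        diff[key] = diff.get(key, 0.0) - intensity
    return dict(sorted(diff.items()))


# Hypothetical peak lists (m/z, relative intensity) for the two culture times.
peaks_3_8_h = [(254.1634, 0.02), (256.2649, 0.08), (269.1095, 0.01)]
peaks_7_2_h = [(256.2641, 0.03), (269.1095, 0.05), (282.2805, 0.09)]

# Positive values mark components that are more abundant at 3.8 h.
for mz, delta in differential_spectrum(peaks_3_8_h, peaks_7_2_h).items():
    print(f"m/z {mz:.2f}: {delta:+.2f}")
```

Swapping the two arguments gives the opposite difference, which corresponds to the relationship between panels a and b of Fig. 11.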
6.2. Analysis of the Intracellular Components of E. coli K-12 by ESI-Q-TOF-MS
After the E. coli K-12 strain has been cultured for 7.2 h, the sample is obtained through the sampling → quenching → extraction procedure. The ESI-Q-TOF mass spectrum of the sample and its close-up are shown in Fig. 12a and b, respectively. The ions at m/z 239.1152, m/z 268.1189, m/z 477.2323, and m/z 506.2333 originate from the Hepes used for the pretreatment. Figures 13a and b are expansions of the m/z 140-160 and m/z 300-320 regions of Fig. 12. Since they are intracellular components of E. coli K-12, m/z 147.1222 and m/z 308.1067 are thought to be Lys and reduced GSH, respectively. When accurate masses of these ions are required, they can be calculated with reference to an ion of known composition in the same spectrum, here the [M+H]+ of Hepes. Adding the mass of H, 1.0078, to the mass calculated from the molecular formula of Hepes, C8H18N2O4S (238.0987), gives 239.1065, which is the theoretical m/z of the [M+H]+ of Hepes. When the observed [M+H]+ values of Lys and reduced GSH are recalculated on the basis of this value, they become 147.1146 and 308.0939, respectively. The [M+H]+ values calculated from the molecular formulae C6H14N2O2 of Lys and C10H17N3O6S of reduced GSH, on the other hand, are 147.1134 and 308.0916, respectively. The m/z 147.1222 and m/z 308.1067 ions observed in the spectrum are therefore presumed to be the [M+H]+ of Lys and reduced GSH, respectively. To confirm this, one only has to measure the collision-induced dissociation (CID) spectrum (colloquially called the MS/MS spectrum) obtained with each [M+H]+ as the precursor ion. The CID spectrum from the [M+H]+ of the authentic Lys sample is shown in Fig. 14a, and the CID spectrum from m/z 147.1222 of the intracellular components of the E. coli K-12 strain is shown in Fig. 14b. Likewise, the CID spectrum from the [M+H]+ of the authentic reduced GSH sample is shown in Fig. 15a, and the CID spectrum from m/z 308.1067 of the intracellular components of the E. coli K-12 strain is shown in Fig. 15b. From the spectra in Figs. 14 and 15, it can be concluded that the two compounds are Lys and reduced GSH, respectively. The ESI-Q-TOF mass spectra in the m/z 100-200 region of the intracellular components of E. coli K-12 after 3.8 h (a) and 7.2 h (b) of cultivation are shown in Fig. 16. For a quantitative comparison of intracellular components, an internal standard substance is generally added to the analyte. In the analysis shown in Fig. 16, even without an internal standard, Lys (m/z 147.1222) appears to be about twice as abundant in the culture at 7.2 h as in that at 3.8 h.
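The benefit of recalculating the masses against the Hepes ion can be made explicit by expressing the deviations from the theoretical values in parts per million. The short Python sketch below uses only the m/z values quoted in this section; it does not reproduce the recalculation itself, whose exact procedure is not described here, and the function name is hypothetical.

```python
def ppm_error(observed, theoretical):
    """Relative mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


# (ion, m/z as observed, m/z after recalculation against Hepes, theoretical m/z)
ions = [
    ("Lys", 147.1222, 147.1146, 147.1134),
    ("reduced GSH", 308.1067, 308.0939, 308.0916),
]

for name, raw, recalculated, theoretical in ions:
    print(f"{name}: {ppm_error(raw, theoretical):.1f} ppm as observed, "
          f"{ppm_error(recalculated, theoretical):.1f} ppm after recalculation")
```

With these numbers the error falls from several tens of ppm to below 10 ppm, which is still too large for an unambiguous formula assignment and is consistent with the recommendation to confirm the identity by CID.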
Fig. 12. ESI-TOF mass spectrum (a) and its expanded version (b) of the E. coli K-12 culture. After 7.2 h of culture, the sample was obtained through the sampling → quenching → extraction procedure
Fig. 13. ESI-TOF mass spectra of the intracellular components of E. coli K-12. After 7.2 h of culture, the sample was obtained through the sampling → quenching → extraction procedure. a m/z 140-160. b m/z 300-320
Fig. 14. a Collision-induced dissociation (CID) spectrum from the [M+H]+ of Lys and b that from m/z 147.1222 of the intracellular components of E. coli K-12
Fig. 15. a CID spectrum from the [M+H]+ of reduced glutathione (GSH) and b that from m/z 308.1067 of the intracellular components of E. coli K-12
Fig. 16. ESI-TOF mass spectra in the m/z 100-200 region of the intracellular components of E. coli K-12. a After 3.8 h of cultivation. b After 7.2 h of cultivation
7. Conclusion
Metabolic profiling by FT-ICR-MS and ESI-Q-TOF-MS is an effective way to capture a rough picture of metabolism. When an individual metabolite is to be identified from its accurate mass by FT-ICR-MS, the instrument should first be calibrated carefully and its mass accuracy confirmed with an authentic sample. When ESI-Q-TOF-MS is used, the CID spectrum should also be measured and compared with that of the authentic sample.
References
1. Aharoni A, Ric De Vos CH, Verhoeven HA, Maliepaard CA, Kruppa G, Bino R, Goodenowe DB (2002) Nontargeted metabolome analysis by use of Fourier transform ion cyclotron mass spectrometry. J Integr Biol 6:217-234
2. Maharjan R, Ferenci T (2003) Global metabolite analysis: the influence of extraction methodology on metabolome profiles of Escherichia coli. Anal Biochem 313:145-154
Chapter 7: Metabolome Analysis by Capillary Electrophoresis
Li Jia and Shigeru Terabe
Graduate School of Material Science, University of Hyogo, Kamigori, Hyogo 678-1297, Japan
1. Introduction
Separations by capillary electrophoresis (CE) are based on differences in the electrophoretic mobilities of ions in electrophoretic media inside narrow-bore capillaries (less than 100 µm i.d.). The ability of CE to achieve high separation efficiencies was highlighted in the early 1980s, and the late 1980s and early 1990s saw the advent of commercially available equipment. In the late 1990s, the range of separation mechanisms broadened, instrumentation was developed to address practitioners' demands, and CE found applications in a wide range of fields. Compared to high-performance liquid chromatography (HPLC), CE offers a number of advantages, which include reduced method development time, reduced running costs, almost no solvent consumption, and up to two orders of magnitude higher separation efficiency. The principal disadvantage of CE is its relatively low concentration sensitivity with conventional absorbance detectors. However, several approaches have now been developed to overcome this difficulty, including modified capillary dimensions and on-line sample preconcentration techniques [stacking, sweeping, transient isotachophoresis (tITP), dynamic pH junction, and dynamic pH junction-sweeping]. There are several separation modes in CE, namely capillary zone electrophoresis (CZE), capillary gel electrophoresis, micellar electrokinetic chromatography (MEKC), capillary electrochromatography, capillary isoelectric focusing, and capillary isotachophoresis. Of these, CZE and MEKC are the most popular and the most suitable for metabolome analysis, since the metabolome usually consists of small molecules. Hence, in this chapter we introduce just these two modes of CE and their applications in metabolome analysis. For those who want to learn CE, a textbook is cited [1].
2. Instrumentation
Figure 1 shows a schematic diagram of the basic instrumental set-up for CE, which consists of an injection system, a high-voltage power supply, two buffer reservoirs, a capillary, and a detector. Commercial CE instruments are additionally equipped with an autosampler allowing serial analyses, capillary thermostating, and a computer for instrument control and data acquisition. Different modes of CE separation can be performed with the same instrument. Typical voltages are in the range of 5-30 kV, which result in currents in the range of 10-100 µA. The capillary is a key element of the CE separation. Cylindrical polyimide-coated fused-silica capillaries with a narrow inner diameter (50-75 µm) are most often used today. The external polyimide coating increases the mechanical strength of the capillary, as bare fused silica is extremely fragile. The widespread use of fused silica is due to its intrinsic properties, which include transparency over a wide range of wavelengths and high thermal conductance. The narrow capillary diameter facilitates the dissipation of Joule heat. Electro-osmotic flow (EOF) is generated inside the capillary when a voltage is applied between the two ends of a capillary filled with a running solution. It originates from the negative charges produced by ionization of the silanol groups on the inner wall of the capillary. A key feature of EOF is its flat flow profile, which minimizes zone broadening and leads to high separation efficiencies. The strength of the EOF depends on several factors: the surface charge on the capillary wall, the viscosity and permittivity of the solution, and the temperature. It should be noted that the surface charge is significantly affected by pH: EOF is strong under alkaline or neutral conditions, reduced below pH 5, and almost suppressed below pH 2.
Sample injection is performed by temporarily replacing one of the buffer reservoirs with a sample vial. Typical injection volumes range from picoliters to nanoliters. There are two commonly used injection methods for CE: hydrodynamic and electrokinetic. Hydrodynamic injection is accomplished by applying a pressure difference between the two ends of the capillary. The amount of sample injected can be manipulated by varying the injection time and the pressure difference. A major limitation of hydrodynamic injection is that it is not suitable for highly viscous samples. Electrokinetic injection is performed by applying a voltage at the sample vial for a certain period of time, which transports the sample into the capillary by electromigration, with contributions from both the electrophoretic migration of the sample ions and the electro-osmotic flow. The amount of sample injected can be controlled by varying the injection time and the applied voltage. Two biases occur in electrokinetic injection. One is discrimination among the injected sample components due to the mobility differences of the analytes. The other is a change in the absolute amount injected into the capillary due to differences in the conductivity of the sample solution. With some modifications, most HPLC detection modes can be applied to CE. Among them, on-column UV absorption detection, fluorescence detection, and mass spectrometry (MS) are very useful for metabolome analysis. Capillary electrophoresis/MS is not described in this chapter (see Chapter 2). On-column UV absorbance is currently the most widely accepted detection method owing to its relatively universal detection capability, simple adaptation, and low cost. The capillary itself acts as the on-column detector cell, which is made by removing the protective polyimide coating from a small section of the fused-silica capillary. One of the main issues with UV absorbance detection is insufficient sensitivity, owing to the small inside diameter of the capillary and the low injection volume. Generally, the concentration detection limits are of the order of 10~^ M for most analytes with chromophores. There are two ways to enhance performance in absorption detection. One is to increase the optical path length; Z-shaped and bubble cells are commercially available extended-path-length absorbance detector cells. The other is on-column preconcentration, which allows the injection volume to be increased and is discussed below. The majority of instruments also offer UV diode array detectors, which are beneficial for the identification of unknown compounds and the examination of peak purity because they provide spectral information. Laser-induced fluorescence (LIF) detection is a highly sensitive detection method in CE.
The concentration detection limits of LIF detection are of the order of 10'^ M for analytes with fluorophores. Unfortunately, laser sources are expensive and the available excitation wavelengths are rather limited. Moreover, very few compounds are natively fluorescent. Hence, pre- or postcolumn derivatization with a suitable fluorophore is needed to extend the application of LIF. Indirect detection can be employed for compounds that neither absorb UV light nor fluoresce. In indirect detection, the background electrolyte contains a UV-absorbing or fluorescent constituent that provides a stable baseline signal. As analyte zones migrate through the detector, they displace this constituent and reduce the background signal, typically giving negative peaks. The sensitivity of indirect detection is slightly lower than that of its direct detection counterpart, and the linear dynamic range is also narrower.
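For the hydrodynamic injection described earlier in this section, the dependence of the injected amount on the injection time and the pressure difference can be estimated with the Hagen-Poiseuille equation, V = ΔP π d^4 t / (128 η L). The Python sketch below is an illustration under stated assumptions rather than a procedure from this chapter: the function name is hypothetical, the solution is assumed to have the viscosity of water at 25°C (about 0.89 mPa s), and the example uses a 50 µm i.d. × 56 cm capillary with a 10 s injection at 5 kPa, the capillary and injection conditions quoted later for Fig. 7.

```python
import math


def hydrodynamic_injection(delta_p_pa, time_s, inner_diameter_m, length_m,
                           viscosity_pa_s=0.89e-3):
    """Injected volume (m^3) and plug length (m) from Hagen-Poiseuille flow.

    viscosity_pa_s defaults to water at 25 degrees C, which is an assumption;
    running solutions with additives may be noticeably more viscous.
    """
    volume = (delta_p_pa * math.pi * inner_diameter_m ** 4 * time_s
              / (128 * viscosity_pa_s * length_m))
    cross_section = math.pi * inner_diameter_m ** 2 / 4
    return volume, volume / cross_section


volume, plug_length = hydrodynamic_injection(5e3, 10, 50e-6, 0.56)
print(f"injected volume: {volume * 1e12:.1f} nL")      # 1 m^3 = 1e12 nL
print(f"plug length:     {plug_length * 1e3:.1f} mm")
```

Under these assumptions the injected volume is in the nanoliter range, in line with the typical volumes mentioned above, and it scales linearly with both injection time and pressure difference.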
Fig. 1. Schematic of capillary electrophoresis instruments. 1, coolant; 2, cassette; 3, high-voltage (HV) power supply; 4, capillary; 5, electrodes; 6, detector; 7, reservoirs; 8, carousel for sample solutions and running buffers
3. Separation Modes and Principles
3.1. Capillary Zone Electrophoresis
Capillary zone electrophoresis is the simplest and most widely used separation mode in CE. Figure 2 depicts the principle of CZE separation. The separation is based on differences in the electrophoretic mobilities of the solutes, which are characteristic properties of an analyte ion in a given medium at a given temperature. Therefore, only charged compounds or ions can be separated by this method. An uncoated fused-silica capillary is typically used. Separation is optimized by choosing an electrolyte system with suitable pH, ionic strength, and composition. The pH of the electrolyte solution is the most important separation parameter, since it influences the dissociation of weakly acidic, basic, or zwitterionic analytes. The use of additives such as organic solvents and complexing agents (cyclodextrins, crown ethers) is also an effective way to improve resolution.
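The migration order in CZE follows from the apparent mobility, that is, the sum of the electrophoretic mobility of the analyte and the electro-osmotic mobility. The Python sketch below illustrates this with the usual relation t = L_d L_t / ((µ_ep + µ_eof) V), where L_t is the total capillary length and L_d the length to the detector. It is a simplified illustration: the function name and all mobility values are hypothetical, and the voltage and capillary dimensions are chosen only as plausible examples.

```python
def migration_time_s(mu_ep, mu_eof, voltage_v, total_length_m,
                     length_to_detector_m):
    """Migration time (s) of an analyte with apparent mobility mu_ep + mu_eof.

    Mobilities are in m^2 V^-1 s^-1 and are positive toward the detector.
    """
    field = voltage_v / total_length_m        # electric field strength, V/m
    velocity = (mu_ep + mu_eof) * field       # net velocity, m/s
    if velocity <= 0:
        return float("inf")                   # never reaches the detector
    return length_to_detector_m / velocity


MU_EOF = 6e-8  # hypothetical electro-osmotic mobility under alkaline conditions
analytes = {"cation": 2e-8, "neutral": 0.0, "anion": -2e-8}  # hypothetical

for name, mu_ep in analytes.items():
    t = migration_time_s(mu_ep, MU_EOF, 20e3, 0.56, 0.45)
    print(f"{name}: {t / 60:.2f} min")
```

With a strong EOF toward the detector, cations arrive first, neutral analytes travel together at the EOF velocity, and anions arrive last, which is why neutral compounds cannot be separated from one another by CZE alone.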
Fig. 2. Separation principle of capillary zone electrophoresis. +, cation; -, anion; N, neutral analyte; EOF, electro-osmotic flow
Fig. 3. Separation principle of micellar electrokinetic chromatography (MEKC). S, analyte; ~, anionic surfactant; EOF, electro-osmotic flow
3.2. Micellar Electrokinetic Chromatography
Micellar electrokinetic chromatography was first introduced by Terabe and coworkers in 1984, and it is particularly useful for the separation of small molecules, including neutral analytes. Figure 3 depicts the principle of MEKC separation. The separation is based on the partitioning of analytes between the micellar phase and the aqueous phase. An ionic micellar solution is employed as the separation solution, and under capillary electrophoretic conditions the ionic micelle migrates at a velocity different from that of the bulk solution because it is subject to electrophoretic migration. The micelle corresponds to the stationary phase in chromatography and is often called the pseudostationary phase. The fraction of the analyte incorporated into the micelle migrates at the velocity of the micelle, while the rest of the analyte, free from the micelle, migrates at the EOF velocity. Under neutral or alkaline conditions, the electro-osmotic velocity is higher than the electrophoretic velocity of the micelle in the opposite direction, and hence the micelle also migrates in the same direction as the EOF. When an anionic micelle such as sodium dodecyl sulfate (SDS) is employed, all the neutral analytes migrate toward the cathode because of the strong EOF. The less incorporated (hydrophilic) analytes migrate faster than the more incorporated (hydrophobic) analytes. For ionic compounds, charge-to-size ratio, hydrophobicity, and charge interactions at the surface of the micelles combine to influence the separation. Since MEKC is a chromatographic technique, the separation selectivity is manipulated through chromatographic considerations. The choice of surfactant, the pH and composition of the running solution, and the use of additives are important factors for manipulating selectivity. The chemical structure of the surfactant, in particular that of the polar group, affects selectivity significantly. To resolve highly hydrophobic compounds by MEKC, several modifiers (cyclodextrins, organic solvents, urea, or glucose) have been used to reduce the fraction of the analyte incorporated into the micelle.
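The migration behavior of neutral analytes described above can be summarized by the relation commonly given in the MEKC literature, t_R = t_0 (1 + k) / (1 + (t_0 / t_mc) k), where k is the retention factor, t_0 the migration time of an unretained solute moving with the EOF, and t_mc the migration time of the micelle. This equation is not stated in the chapter itself, and the Python sketch below uses hypothetical values of t_0 and t_mc purely for illustration.

```python
def mekc_migration_time(k, t0, t_mc):
    """Migration time of a neutral analyte with retention factor k in MEKC.

    t0   : migration time of an unretained solute (moves with the EOF)
    t_mc : migration time of the micelle (fully incorporated solute)
    """
    return t0 * (1 + k) / (1 + (t0 / t_mc) * k)


T0, T_MC = 5.0, 20.0  # minutes, hypothetical migration time window

for k in (0.0, 0.5, 1.0, 10.0, 100.0):
    print(f"k = {k:6.1f}: t_R = {mekc_migration_time(k, T0, T_MC):.2f} min")
```

All neutral analytes migrate between t_0 and t_mc: hydrophilic analytes with small k migrate close to t_0, and highly hydrophobic analytes crowd toward t_mc, which is why modifiers that lower k help to resolve them.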
4. On-Line Sample Preconcentration Methods
One approach to improving concentration detection sensitivity is on-line sample preconcentration, which is performed by injecting a large sample volume and electrokinetically focusing the analyte zones prior to separation. To date, five major on-line preconcentration techniques have been reported in CE: field-enhanced sample stacking, sweeping, dynamic pH junction, dynamic pH junction-sweeping, and transient isotachophoresis. The first three are explained here.
4.1. Field-Enhanced Sample Stacking
Field-enhanced sample stacking utilizes the high electric field that develops in the sample zone when the sample solution is prepared in a matrix of low electric conductivity [2]. Since the electrophoretic velocity is proportional to the field strength, analyte ions migrate much faster in the sample zone than in the separation zone and stack at the boundary between the two zones (Fig. 4). Theoretically, the degree of sample stacking is proportional to the ratio of the resistivities of the sample solution and the background solution. However, the concentration efficiency is degraded by a mismatch of the EOF. The local electro-osmotic velocity is also proportional to the field strength and would therefore differ between the two zones, whereas the continuity of the solution requires the bulk electro-osmotic velocity to be constant throughout the capillary. Mixing must therefore occur at the boundary of the two zones. This discrepancy is minimized when the EOF is suppressed. It should be noted that, although neutral analytes are not concentrated by this stacking technique alone, the technique can also be used in MEKC, provided that the neutral analytes are incorporated into the micelle.
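The field enhancement that drives stacking can be estimated by treating the sample plug and the background solution as two resistors in series. The following Python sketch makes this explicit under idealized assumptions (uniform zones, no mixing, no EOF mismatch); the function name and the numerical values are hypothetical and serve only as an illustration.

```python
def zone_fields(voltage_v, length_m, plug_fraction, resistivity_ratio):
    """Field strengths (V/m) in the sample plug and in the background solution.

    plug_fraction     : fraction of the capillary length filled with sample
    resistivity_ratio : resistivity of the sample / resistivity of background
    """
    # Series resistors: the same current flows through both zones, so the
    # field in each zone is proportional to its resistivity.
    e_background = voltage_v / (
        length_m * (resistivity_ratio * plug_fraction + 1 - plug_fraction))
    e_sample = resistivity_ratio * e_background
    return e_sample, e_background


e_sample, e_background = zone_fields(20e3, 0.56, 0.05, 10.0)
print(f"sample zone:     {e_sample / 1e3:.0f} kV/m")
print(f"background zone: {e_background / 1e3:.0f} kV/m")
print(f"ideal velocity (stacking) ratio: {e_sample / e_background:.0f}")
```

In this idealized picture an ion slows down by the resistivity ratio when it crosses the boundary, which is the origin of the statement that the degree of stacking is proportional to that ratio; the EOF mismatch discussed above reduces the enrichment obtained in practice.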
Fig. 4. Schematic of field-enhanced sample stacking. The right-hand side shows the high electrical conductivity (low electric field) zone and the left-hand side the low electrical conductivity (high electric field) zone. Charged ions migrate fast in the left zone and slow down at the boundary
4.2. Dynamic pH Junction
Dynamic pH junction is an efficient preconcentration technique for weakly ionic analytes, provided that the difference in pH between the sample matrix and the background solution causes significant changes in their mobilities [3]. Generally, the sample is prepared in a buffer in which the mobility of the analyte is close to zero (about 1 pH unit away from its pKa). Focusing is hypothesized to be caused by a transient pH titration within the sample zone, which results in rapid focusing of analytes that undergo velocity changes in the selected pH range (Fig. 5). Single or mixed buffer types can be used to generate an appropriate dynamic pH junction. The sample may contain the same buffer as the background solution or a different electrolyte type, so that the pH junction range is optimized for focusing weakly acidic, basic, or zwitterionic analytes (whose mobility is pH dependent) on the basis of their pKa and/or pI. Dynamic pH junction is a powerful technique for metabolome analysis because most metabolites are weakly acidic or basic.
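The pH dependence of mobility that dynamic pH junction exploits can be illustrated for the simplest case, a monoprotic weak acid, whose effective mobility is the mobility of the fully ionized form multiplied by its ionized fraction. The Python sketch below is an illustration only: the function names, the pKa, the pH values, and the ionic mobility are all hypothetical and do not describe a particular metabolite or buffer system from this chapter.

```python
def ionized_fraction_acid(ph, pka):
    """Fraction of a monoprotic weak acid present as the anion."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


def effective_mobility(ph, pka, mu_ion):
    """Effective mobility = mobility of the anion x ionized fraction."""
    return mu_ion * ionized_fraction_acid(ph, pka)


PKA = 7.0        # hypothetical pKa of the analyte
MU_ION = -3e-8   # hypothetical mobility of the fully ionized anion, m^2/(V s)

for label, ph in [("sample matrix", 6.0), ("background solution", 9.0)]:
    alpha = ionized_fraction_acid(ph, PKA)
    mu = effective_mobility(ph, PKA, MU_ION)
    print(f"{label} (pH {ph}): ionized fraction {alpha:.2f}, "
          f"mobility {mu:.2e} m^2/(V s)")
```

One pH unit below the pKa the acid is only about 9% ionized, so it moves slowly in the sample zone, whereas in the more alkaline background solution it is almost fully ionized; this abrupt change in velocity at the pH boundary is what narrows the analyte zone.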
Fig. 5. Schematic illustration of dynamic pH junction. EOF, electro-osmotic flow
4.3. Sweeping
Sweeping is performed by injecting a long plug of a sample solution that has an electric conductivity similar to that of the separation solution but is devoid of the pseudostationary phase, and then applying the voltage with the separation solution in the inlet vial [4]. Sweeping is defined as the picking up and accumulation of analytes by the charged pseudostationary phase that fills or penetrates the sample zone during application of the voltage (Fig. 6). Sweeping is based on a partitioning mechanism, and the concentration efficiency depends on the retention factor of the analyte: the higher the retention factor, the higher the concentration efficiency. An advantage of sweeping is that the sample matrix can contain relatively high concentrations of electrolytes, since low conductivity is not required. Sweeping is also effective in the presence of a strong EOF, although the concentration efficiency is higher when the EOF is suppressed. Both charged and uncharged metabolites that possess high retention factors can be effectively preconcentrated by sweeping. Chemical derivatization of hydrophilic metabolites with a hydrophobic probe can be an effective way to enhance sweeping performance. Different on-line sample preconcentration techniques can also be combined to enhance the concentration efficiency or to expand the range of analytes that are effectively concentrated, e.g., electrokinetic stacking with sweeping, or dynamic pH junction with sweeping.
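The dependence of the sweeping efficiency on the retention factor is often expressed in the sweeping literature by the idealized relation l_sweep = l_inj / (1 + k), where l_inj is the length of the injected sample plug and l_sweep the length of the swept analyte zone. This relation is not given in the chapter and applies to an idealized case (neutral analyte, matched conductivities, no dispersion). The Python sketch below simply evaluates it; the function names, the plug length, and the retention factors are hypothetical.

```python
def swept_zone_length(injected_length, k):
    """Length of the analyte zone after sweeping, in the units of the input."""
    return injected_length / (1 + k)


def ideal_enrichment(k):
    """Ideal narrowing (and concentration) factor for retention factor k."""
    return 1 + k


INJECTED_MM = 50.0  # hypothetical sample plug length in mm

for k in (1, 9, 99, 999):
    print(f"k = {k:4d}: swept zone {swept_zone_length(INJECTED_MM, k):6.2f} mm, "
          f"ideal enrichment {ideal_enrichment(k)}-fold")
```

The calculation shows why derivatization with a hydrophobic probe, which raises k, can improve the preconcentration of hydrophilic metabolites considerably.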
5. Example of Metabolome Analysis by CE
5.1. Analysis of Amino Acids and Amines with LIF Detection
Amino acids are important metabolites in the cell, but most of them do not have strong chromophores. Therefore, derivatization of amino acids with fluorescent or UV probes is required to enhance detection sensitivity. In our laboratory, LIF detection with an argon ion laser (488 nm) as the excitation source was employed, and 4-fluoro-7-nitrobenz-2-oxa-1,3-diazole (NBD-F) was used as the fluorescence-labeling reagent [5]. Because of the high hydrophobicity of the derivatized amino acids, an MEKC method was developed to analyze amino acids in the cell extract of Bacillus subtilis, as shown in Fig. 7. The concentrations of the major amino acids in the cell extract were estimated to range from sub-µM to tens of mM, as shown in Table 1.
Fig. 6. Schematic illustration of sweeping
Fig. 7. Separation of NBD-F derivatized amino acids in the cell extract of B. subtilis. SAH, S-adenosylhomocysteine; GABA, γ-aminobutyric acid. Conditions: capillary, 50 µm i.d. × 56 cm (45 cm to the detector); running solution, 50 mM sodium dodecyl sulfate (SDS)-2 M urea-40 mM Brij 35 in 100 mM borate buffer (pH 9.0); injection, hydrostatic 10 s at 5 kPa; applied voltage, 20 kV; detection, laser-induced fluorescence (LIF) with Ar ion laser (488 nm)
Table 1. Quantitative analysis of some amino acids in Bacillus subtilis cell extract
Amino acid       Concentration/M (S-6 glucose medium)    Concentration/M (S-6 malate medium)
Alanine          3.14X10'                                 1.15X10^
Glutamic acid    1.85X10^                                 2.54X10^
Aspartic acid    4.2x10^                                  9.1x10^
Capillary electrophoresis conditions are given in Fig. 7
5.2. Analysis of Purines by CE with UV Detection
New separation platforms for high-throughput analysis based on multiplexed CE (capillary array format) promise rapid and highly efficient separations, as highlighted by their important role in the rapid DNA sequencing used in the Human Genome Project. In our research group, a multiplexed CE system with UV detection in conjunction with dynamic pH junction was demonstrated as a novel method for the sensitive and high-throughput analysis of purine metabolites [6]. The optimization of purine focusing can be assessed rapidly by systematically altering the properties of the sample matrix, such as the buffer co-ion, pH, and ionic strength, using a 96-capillary array format. The method permits focusing of large sample injection volumes, resulting in more than a 50-fold enhancement in concentration sensitivity compared to conventional injections. The technique also demonstrated excellent intercapillary precision and linearity in terms of normalized migration times and peak areas.
5.3. Analysis of Nucleotides by CE with UV Detection
The pyridine nucleotides (NAD, NADP, NADH, and NADPH) represent a class of coenzymes involved in a number of critical catabolic and anabolic pathways in living organisms. The adenine nucleotides (AMP, ADP, and ATP) also play an important role as physiological signaling molecules that bind to membrane purine receptors. Figure 8 presents electropherograms of B. subtilis cell extracts obtained with glucose and malate as culture media [7]. Nanomolar (nM) detectability of the analytes by CE with UV photometric detection is achieved through effective focusing of a large sample plug (about 10% of the capillary length) using sweeping by borate complexation, as reflected by LODs of about 2x10'^ M. The concentrations of pyridine and adenine nucleotides in a single cell were estimated to be at the millimolar level. The concentrations of the analytes were also found to differ between cell extracts derived from glucose and malate culture media.
5.4. Analysis of Flavins by CE with LIF Detection
The flavins, riboflavin (RF), flavin mononucleotide (FMN), and flavin adenine dinucleotide (FAD), represent an important class of metabolites in the cell and are natively fluorescent. A CE method with LIF detection was developed to analyze trace amounts of flavins in different types of biological samples (including bacterial cell extracts, recombinant protein, pooled human plasma, and urine) using dynamic pH junction-sweeping as an on-line preconcentration technique [8]. Picomolar detectability of flavins by CE-LIF was realized with on-line preconcentration by dynamic pH junction-sweeping (up to 15% of the capillary length used for injection), resulting in an LOD of about 4.0 pM for the flavin coenzymes FAD and FMN. More than a 1200-fold improvement in concentration sensitivity was demonstrated compared to conventional injections. Submicromolar amounts of flavin coenzymes were measured directly in formic acid cell extracts of B. subtilis. Figure 9 shows electropherograms depicting the analysis of flavin coenzymes in cell extracts of B. subtilis. Significant differences in flavin concentration were measured between cell extracts derived from glucose and from malate as the carbon source in the culture media.
Fig. 8. Analysis of nucleotides in the cell extract of Bacillus subtilis. a Glucose and b malate as culture media by capillary electrophoresis (CE) with sweeping by borate complexation. NAD, nicotinamide adenine dinucleotide; NADP, nicotinamide adenine dinucleotide phosphate; AMP, ADP, ATP, adenosine mono-, di-, triphosphate. Experimental conditions: buffer, 150 mM borate; applied potential, 20 kV; injection pressure, 50 mbar, 40 s; capillary temperature, 20°C; detection, 200 nm; fused-silica capillary, 56.0 cm (48.6 cm effective length) × 50 µm i.d. (from Ref. [7], with permission)
Fig. 9. Electropherograms depicting analysis of submicromolar amounts of flavin coenzymes in cell extracts of B. subtilis by CE-LIF with dynamic pH junction-sweeping using a glucose and b malate as carbon source in culture media. Samples were diluted 25-fold in 75 mM phosphate, pH 6.0, prior to injection. Conditions: BGE, 140 mM borate, 100 mM SDS, 5 mM β-cyclodextrin, pH 8.5; voltage, 15 kV; capillary length, 57 cm; injection, 60 s (15% capillary length). Analyte peak numbering corresponds to 1, flavin mononucleotide (FMN) and 2, flavin adenine dinucleotide (FAD); asterisk represents system peak (from Ref. [8], with permission)
Fig. 10. Three-dimensional picture of B. subtilis cell extract by the two-dimensional separation system. See Color Plate 3. LC, liquid chromatography; CE, capillary electrophoresis (from Ref. [10], with permission)
5.5. Analysis of Carboxylic Acids by CE with Indirect UV Detection The main metabolites in the tricarboxylic acid (TCA) cycle are di- and tricarboxylic acids, most of which have poor absorbance properties. To overcome the difficulty, the indirect UV photometric detection is often employed. In our laboratory, tri- and dicarboxylic acids from TCA cycle as well as carboxylic acid metabolites from other metabolic pathways (e.g., glycolysis, urea cycle, and metabolism of amino compounds) in B. subtilis cells from two different cultures (glucose and malate) were analyzed by CE with indirect detection mode using 2,6-pyridinedicarboxylic acid as a highly UV-absorbing carrier electrolyte [9]. With an electrokinetic injection mode LODs of the analytes in the range of 11-60 pM were achieved. The concentrations of carboxylic acid metabolites in a single bacterium cell were estimated at millimolar and submiUimolar level. Concentrations of the metabolites were determined to be different in cell extracts derived from glucose or malate culture media. 5.6. Comprehensive Analysis of Metabolome by LC/CE TwoDimensional Analysis Recently, our research group developed a two-dimensional separation method for comprehensive analysis B. subtilis metabolites, which hyphenated chromatography and electrophoresis [10]. Bacillus subtilis cell extract was first separated based on its hydrophobicity using a monolithic silicaODS column and a laboratory-assembled micro-LC apparatus operated in the gradient elution mode. The fractions of effluent from the column were collected every minute (2 ijj/vial). After collection, the fractions were dried at room temperature under vacuum, then reconstituted with 10 jul of 50 mM phosphoric acid or 75 mM sodium phosphate (pH 6.0) before CE analysis. The early-eluting fractions were separated by dynamic pH junction CZE based on their charge-to-size ratios, while the late-eluting fractions were separated by sweeping MEKC based on their hydrophobicity. 
The middle fractions were analyzed using both modes of CE. The preconcentration strategies, namely dynamic pH junction and sweeping, were employed to interface the two dimensions and proved beneficial for the detection of metabolites. Some important metabolites in the B. subtilis cell were identified. Figure 10 shows a three-dimensional picture of the B. subtilis cell extract obtained with the two-dimensional separation system. This method shows great potential for resolving complex biological samples containing compounds with different characteristics.
Metabolome Analysis by Capillary Electrophoresis
6. Conclusion

Capillary electrophoresis offers a unique separation platform that is useful for metabolomic studies owing to the advantages described above. In CE, different on-line sample preconcentration approaches have been used to improve the concentration sensitivity. The introduction of a commercial multiplexed CE system provides a convenient platform for rapid method development and high-throughput analysis of the hundreds of metabolites present in a cell. Hence, CE is advantageous for the analysis of specific classes of metabolites in a cell. So far, no single chromatographic or electrophoretic procedure is likely to resolve a complex mixture of cell metabolites in one run. A multidimensional system, which employs two or more orthogonal separation techniques or separation methods with different separation mechanisms, will significantly improve the chances of resolving such a mixture. A multiplexed CE platform would be extremely useful for high-throughput metabolite profiling. Multiplexed CE can be coupled orthogonally to HPLC for multidimensional separations by collecting fractions of complex samples in 96-well microtiter plates with the aid of a microfraction collector. Owing to the complexity of the metabolome, it is impossible to find a universal UV-absorbance or fluorescence-labeling reagent. Mass spectrometry is a rational choice because of its universality, sensitivity, selectivity, and ability to provide structural information on metabolites. Hence, a multidimensional system with MS detection would be a practical alternative for future metabolomic studies, in which CE would play an active role.
References

1. Khaledi MG (ed) (1998) High performance capillary electrophoresis: theory, techniques and applications. Wiley-Interscience, New York
2. Chien R-L, Burgi DS (1992) On-column sample concentration using field amplification in CZE. Anal Chem 64:489A-496A
3. Britz-McKibbin P, Kranack AR, Paprica A, Chen DDY (1998) Quantitative assay for epinephrine in dental anesthetic solutions by capillary electrophoresis. Analyst 123:1461-1463
4. Quirino JP, Terabe S (1998) Exceeding 5000-fold concentration of dilute analytes in micellar electrokinetic chromatography. Science 282:465-468
5. Terabe S, Markuszewski MJ, Inoue N, Otsuka K, Nishioka T (2001) Capillary electrophoretic techniques toward the metabolome analysis. Pure Appl Chem 73:1563-1572
6. Britz-McKibbin P, Nishioka T, Terabe S (2003) Sensitive and high-throughput analyses of purine metabolites by dynamic pH junction multiplexed capillary electrophoresis: a new tool for metabolomic studies. Anal Sci 19:99-104
7. Markuszewski MJ, Britz-McKibbin P, Terabe S, Matsuda K, Nishioka T (2003) Determination of pyridine and adenine nucleotide metabolites in Bacillus subtilis cell extract by sweeping borate complexation capillary electrophoresis. J Chromatogr A 989:293-301
8. Britz-McKibbin P, Markuszewski MJ, Iyanagi T, Matsuda K, Nishioka T, Terabe S (2003) Picomolar analysis of flavins in biological samples by dynamic pH junction-sweeping capillary electrophoresis with laser-induced fluorescence detection. Anal Biochem 313:89-96
9. Markuszewski MJ, Otsuka K, Terabe S, Matsuda K, Nishioka T (2003) Analysis of carboxylic acid metabolites from the tricarboxylic acid cycle in Bacillus subtilis cell extract by capillary electrophoresis using an indirect photometric detection method. J Chromatogr A 1010:113-121
10. Jia L, Liu B, Terabe S, Nishioka T (2004) Two-dimensional separation method for analysis of Bacillus subtilis metabolites via hyphenation of micro-liquid chromatography and capillary electrophoresis. Anal Chem 76:1419-1428
Chapter 8: High-Performance Liquid Chromatography for Metabolomics: High-Efficiency Separations Utilizing Monolithic Silica Columns

Tohru Ikegami¹, Hiroshi Kobayashi¹, Hiroshi Kimura¹, Vladimir V. Tolstikov², Oliver Fiehn², and Nobuo Tanaka¹

¹Department of Polymer Science and Engineering, Kyoto Institute of Technology, Matsugasaki, Sakyo-ku, Kyoto 606-8585, Japan
²Department Lothar Willmitzer, Max Planck Institute of Molecular Plant Physiology, 14424 Potsdam, Germany
1. Introduction

For the separation and identification of metabolites (metabolomics), gas chromatography/mass spectrometry (GC/MS), high-performance liquid chromatography (HPLC)/MS, and capillary electrophoresis (CE)/MS have been the main techniques employed. Among these, HPLC/MS is the most widely applicable to metabolomics. Its chromatographic efficiency, however, is generally lower than that of the other separation techniques, GC/MS and CE/MS. Micro HPLC, which can be easily coupled to MS through an interface, has not been used widely. Recently, significant improvements have been made in the separation capability of HPLC, which will help in the analysis of complex metabolite samples. In this chapter, the use of long capillary columns that give high separation efficiencies in a micro HPLC system, and of multidimensional HPLC that can provide high separation capacity, will be described. Special attention will be paid to examples of high-efficiency HPLC separations made possible by monolithic silica columns having network-type silica skeletons, and by employing LC/electrospray ionization (ESI)-MS and/or comprehensive two-dimensional HPLC systems. Because metabolomics depends on the separation and detection of thousands of small molecules, micro HPLC techniques will become a common method for the separation and identification of metabolites in the near future, although there have so far been only a limited number of applications of micro HPLC to metabolomics studies [1].
T. Ikegami et al.
2. Monolithic Silica Columns for Micro HPLC

In most cases, conventional particle-packed capillary columns have been used for separations by micro HPLC. Recently, monolithic silica capillary columns have been reported to show much higher separation efficiencies than particle-packed columns [2,3]. They consist of network silica skeletons that can be prepared in capillaries by a sol-gel method. Monolithic silica columns that are available at present are listed in Table 1. Here, the features of monolithic silica capillary columns and the optimization of separation conditions will be described.

2.1. The Characteristics of Monolithic Silica Columns

So far, most HPLC separations have been carried out using conventional particle-packed columns. However, the use of monolithic silica columns consisting of network silica skeletons and through-pores was reported recently [4,5]. Monolithic silica capillary columns were reported to provide better separation efficiencies than particle-packed columns, and the use of these columns for proteomics and metabolomics seems attractive. Monolithic silica columns are prepared by acid-catalyzed hydrolysis and polymerization of alkoxysilanes in the presence of water-soluble polymers such as poly(ethylene glycol) [5]. Figure 1a shows a scanning electron microscope (SEM) image of monolithic silica prepared in a test tube, while Fig. 1b-e shows SEM images of monolithic silica columns prepared in fused silica capillaries with 50-200 µm internal diameters [6]. At present, monolithic silica columns with 4.6 mm, 200 µm, 100 µm, and 50 µm internal diameters are commercially available, and the development of columns of other sizes is in progress. When a monolithic silica rod is prepared in a test tube, the material is covered with PEEK resin to prepare an HPLC column, but it is not easy to produce high-efficiency columns from silica rods with diameters of 1-2 mm. Nor is it easy to prepare monolithic silica in fused silica capillaries with internal diameters of 500 µm or larger: because the silica skeletons shrink during preparation, successful bonding of the skeletons to the capillary wall cannot be achieved. It is possible to pack silica particles into a fused silica capillary, but it is difficult to produce high-efficiency, long-lasting columns using 1-2 µm particles.
High-Performance Liquid Chromatography for Metabolomics
Fig. 1. Scanning electron microscope images of monolithic silica prepared by sol-gel methods. a Monolithic silica prepared in a test tube (×2500). b, c Monolithic silica prepared in 50 µm fused silica capillary (×1500); d in 100 µm fused silica capillary (×800); e in 200 µm fused silica capillary tube (×400)
Fig. 2. Plots of separation impedance against linear velocity (u, mm/s) of the mobile phase (80% methanol), calculated for hexylbenzene as a solute. MS, monolithic silica. Columns: 5 µm silica-C18 particles, Mightysil RP18 (open circles); monolithic silica columns in capillary, MS(50)-A (downward closed triangles), MS(50)-B (upward closed triangles), MS(50)-C (closed squares), MS(50)-D (closed diamonds), MS-H(50)-I (downward open triangles), MS-H(50)-II (open squares)
Currently available monolithic capillary columns include organic polymer columns and chemically modified silica columns, and they have the following features. Monolithic polymer columns generally show higher permeability than particle-packed columns, and high efficiency for the separation of macromolecules [7]. In the case of monolithic silica capillary columns, the silica skeletons are covalently bonded to the capillary wall. Thus, frits are not necessary to hold the skeletons in the column, and the column length can be varied in the range of 5-200 cm after preparation. Generally, the silica skeleton sizes are in the range of 1 to 2 µm. As shown in Fig. 1a, monolithic columns have a (through-pore size/skeleton size) ratio of 1-3, which is 3-10 times larger than the (through-pore size/particle size) ratio of 0.25-0.4 for particle-packed columns. Monolithic silica columns produce separation efficiencies similar to those of particle-packed columns at a much lower pressure drop. At the same pressure drop, monolithic columns provide higher separation efficiencies than particle-packed columns. Moreover, owing to the small silica skeleton sizes, relatively high separation efficiencies can be expected at higher linear velocities [8,9]. Taking back-pressure and separation efficiency together, monolithic columns outperform particle-packed columns, although the reproducibility of monolithic silica columns is still a problem in some cases because their preparation involves more complicated steps than the production of particle-packed columns. The separation impedance (E) is given by Eq. (1), where N, ΔP, t0, and η stand for the number of theoretical plates, the column back-pressure, the elution time of an unretained solute, and the viscosity of the mobile phase, respectively. This quantity, the product of the pressure drop per theoretical plate and the time per theoretical plate divided by the viscosity, can be regarded as a measure of the total performance of a column [10]; a smaller E means better performance.

$$E = \frac{t_0\,\Delta P}{N^2\,\eta} = \left(\frac{\Delta P}{N}\right)\left(\frac{t_0}{N}\right)\left(\frac{1}{\eta}\right) \qquad (1)$$
Figure 2 shows plots of separation impedance against the linear velocity of the mobile phase for a particle-packed column and for monolithic silica columns modified with C18 stationary phase. In terms of separation impedance, monolithic silica capillary columns can perform nearly 10 times better than a particle-packed column (open circles) [6].
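As a rough illustration of Eq. (1), the following Python sketch computes the separation impedance from t0, ΔP, N, and η. The function name and all numerical inputs are hypothetical values chosen only to show the arithmetic; they are not data taken from Fig. 2.

```python
def separation_impedance(t0_s, delta_p_pa, n_plates, viscosity_pa_s):
    """Separation impedance E = t0 * dP / (N^2 * eta), Eq. (1).

    With t0 in s, dP in Pa, and eta in Pa*s, E is dimensionless.
    """
    return t0_s * delta_p_pa / (n_plates ** 2 * viscosity_pa_s)


# Hypothetical inputs (not measured values): two columns that generate the
# same number of plates in the same time, one at a tenth of the pressure drop.
ETA = 1.0e-3  # assumed mobile-phase viscosity, Pa*s

e_high_pressure = separation_impedance(t0_s=100.0, delta_p_pa=1.0e7,
                                       n_plates=10_000, viscosity_pa_s=ETA)
e_low_pressure = separation_impedance(t0_s=100.0, delta_p_pa=1.0e6,
                                      n_plates=10_000, viscosity_pa_s=ETA)

print(f"E at 100 bar: {e_high_pressure:.0f}")
print(f"E at  10 bar: {e_low_pressure:.0f}")
```

Because E is linear in ΔP at fixed N and t0, a column that reaches the same plate count at one tenth of the pressure drop has one tenth of the separation impedance.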
Table 1. Column sizes, flow rates, linear velocities, and degrees of sample dilution

Inner diameter   Column volume(a)   Flow rate   t0(a)       Solvent linear     Relative degree
[mm (µm)]        [µl]               [µl/min]    [s/10 cm]   velocity [mm/s]    of dilution(b)
4.6              1660               1000        70          1.4                2100
2.0              314                200         66          1.5                400
1.0              78                 50          66          1.5                100
0.5 (500)        20                 12.5        66          1.5                25
0.3 (300)        7.1                5           59          1.7                9
0.2 (200)        3.1                2           66          1.5                4
0.1 (100)        0.78               0.5         66          1.5                1
0.05 (50)        0.20               0.12        69          1.5                0.25
0.025 (25)       0.05               0.03        69          1.5                0.06

Column types, from the largest to the smallest inner diameter: Conventional, Semimicro, Micro, Microcapillary

(a) Column lengths were 10 cm; total porosity was estimated as 0.70
(b) The column of i.d. 100 µm was considered the standard
Fig. 3. Chromatograms obtained for alkylbenzenes (C6H5(CH2)nH, n=0-6) by C18 monolithic silica capillary columns (a-d), and e particle-packed column (5 µm silica-C18 particles, Mightysil RP18). Panels: (a) MS-FS(50)-A, 50 µm ID, 25 cm; (b) MS-FS(50)-B, 50 µm ID, 45 cm; (c) MS-FS(50)-C, 50 µm ID, 130 cm; (d) MS-FS(50)-C, 50 µm ID, 130 cm; (e) particle-packed, 4.6 mm ID, 15 cm. Horizontal axes: time (min)
2.2. Column Size and Detection Sensitivity

The volume of a silica capillary of 100 µm inner diameter is ca. 0.8 µl per 10 cm column length. In the case of a particle-packed column (total porosity ca. 0.7), the volume of mobile phase in the column (V0) is ca. 0.56 µl, while V0 for monolithic columns is around 0.7 µl. At a flow rate of 0.5 µl/min, t0 for the monolithic silica column will be 84 s, and the linear velocity (velocity of the mobile phase along the column axis) can be estimated as ca. 1.2 mm/s (Table 1). High-performance liquid chromatography systems with a monolithic silica capillary column have the following advantages: (i) small consumption of stationary and mobile phases, (ii) high detection sensitivity for a given amount of sample, (iii) high-speed separation with low pressure drop, and (iv) the possible use of a long column of 1-2 m that can provide around 100 000-200 000 theoretical plates. They also have some disadvantages, namely, the smaller sample capacity of a monolithic silica column compared with particle-packed columns, the skill and knowledge needed to operate a capillary HPLC system to obtain high separation efficiency, and the insufficient supply of good columns and instruments for capillary HPLC. Figure 3 shows chromatograms produced by monolithic silica capillary columns of 25-130 cm length modified with C18 stationary phase [11]. Low column back-pressure is a significant feature of these columns. For a given amount of sample, assuming that band broadening (peak width) and resolution are similar for various HPLC systems, the dilution factors for analytes are proportional to the square of the internal diameter of the column. Sample concentrations after the separation are higher in microcolumns with smaller diameters, and the higher sample concentrations can lead to higher detection sensitivity (Table 1). Because lower flow rates can lead to higher ionization efficiencies and higher detection sensitivity in an LC/ESI-MS system, the development of a high-efficiency micro HPLC system is an important issue for metabolomics studies [12]. To maintain a stable flow of mobile phase, stable ionization, and stable detection in MS, it is necessary to keep an optimum (or minimum) flow rate at the interface of LC and MS. On the other hand, the efficiency of a conventional column (number of theoretical plates) is at its maximum at a mobile phase linear velocity of 0.5-1.0 mm/s, and decreases by 40%-50% at a linear velocity of 4 mm/s. Samples with higher molecular weights, which have smaller diffusion coefficients, show a greater decrease in column efficiency at high linear velocity.
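The column-volume arithmetic at the start of this section can be sketched in Python as follows. The helper names are hypothetical; the inputs (100 µm i.d., 10 cm length, V0 of about 0.7 µl for the monolith, 0.5 µl/min flow rate) are the values quoted in the text, and the dilution relation is the diameter-squared proportionality stated above.

```python
import math


def empty_tube_volume_ul(inner_diameter_mm, length_mm):
    """Geometric volume of the tube; 1 mm^3 equals 1 µl."""
    return math.pi * (inner_diameter_mm / 2.0) ** 2 * length_mm


def hold_up_time_s(v0_ul, flow_ul_per_min):
    """Elution time t0 of an unretained solute, in seconds."""
    return v0_ul / flow_ul_per_min * 60.0


def linear_velocity_mm_s(length_mm, t0_s):
    """Mobile-phase velocity along the column axis, u = L / t0."""
    return length_mm / t0_s


def relative_dilution(inner_diameter_mm, reference_diameter_mm=0.1):
    """Dilution relative to a reference column, proportional to diameter squared."""
    return (inner_diameter_mm / reference_diameter_mm) ** 2


LENGTH_MM = 100.0                               # 10 cm column, as in Table 1
tube_ul = empty_tube_volume_ul(0.1, LENGTH_MM)  # 100 µm i.d. capillary
v0_particle_ul = 0.7 * tube_ul                  # total porosity ca. 0.7
v0_monolith_ul = 0.7                            # value quoted in the text

t0_monolith = hold_up_time_s(v0_monolith_ul, flow_ul_per_min=0.5)
u_monolith = linear_velocity_mm_s(LENGTH_MM, t0_monolith)

print(f"empty tube volume: {tube_ul:.2f} µl")
print(f"V0, particle-packed: {v0_particle_ul:.2f} µl")
print(f"t0, monolith: {t0_monolith:.0f} s, u = {u_monolith:.2f} mm/s")
print(f"dilution of a 4.6 mm column vs. 100 µm: {relative_dilution(4.6):.0f}")
```

The same functions can be applied to the other inner diameters listed in Table 1 to check the tabulated volumes and dilution factors.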
2.3. Column Efficiencies and the Optimization of Separation Conditions [10,13]

The number of theoretical plates N is a measure of the quality of a column and elution conditions, and is given by Eq. (2) from the retention time of a peak (tR) and the peak width at half height (tw1/2 = 2.35σ, where σ is the standard deviation of a Gaussian peak) or the peak width at base (tw = 4σ). Resolution Rs is given by Eq. (4), which includes N, α (Eq. (5), selectivity, the ratio of the retention factors of two adjacent peaks), and k (Eq. (3), retention factor, the distribution ratio of a solute between the stationary and mobile phases, i.e., the ratio of the times (tR − t0) and t0, where t0 is the time the solute spends in the mobile phase and (tR − t0) is the time it spends in the stationary phase). For convenient separation and detection, the k values should be in a range of 2-5.

$$N = \left(\frac{t_R}{\sigma}\right)^2 = 5.54\left(\frac{t_R}{t_{w1/2}}\right)^2 = 16\left(\frac{t_R}{t_w}\right)^2 \qquad (2)$$

$$k = \frac{t_R - t_0}{t_0} \qquad (3)$$

$$R_s = \frac{\sqrt{N}}{4}\,\frac{\alpha - 1}{\alpha}\,\frac{k}{1 + k} \qquad (4)$$

$$\alpha = \frac{k_2}{k_1} \qquad (5)$$

$$\Delta P = \frac{\phi\,\eta\,u\,L}{d_p^2} \qquad (u = L/t_0) \qquad (6)$$
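A minimal Python sketch of Eqs. (2)-(5) is given below. The peak times and width are hypothetical numbers chosen for illustration, and the use of the second peak's retention factor in Eq. (4) is an assumption of this sketch (a common convention), not something stated in the text.

```python
import math


def plates_from_half_height(t_r, w_half):
    """N = 5.54 * (tR / w_half)^2, Eq. (2), width measured at half height."""
    return 5.54 * (t_r / w_half) ** 2


def retention_factor(t_r, t0):
    """k = (tR - t0) / t0, Eq. (3)."""
    return (t_r - t0) / t0


def resolution(n_plates, alpha, k):
    """Rs = (sqrt(N) / 4) * ((alpha - 1) / alpha) * (k / (1 + k)), Eq. (4)."""
    return math.sqrt(n_plates) / 4.0 * (alpha - 1.0) / alpha * k / (1.0 + k)


# Hypothetical pair of adjacent peaks (all times in seconds).
T0 = 60.0
t_r1, t_r2 = 288.0, 300.0
w_half_2 = 5.0

n = plates_from_half_height(t_r2, w_half_2)
k1 = retention_factor(t_r1, T0)
k2 = retention_factor(t_r2, T0)
alpha = k2 / k1                 # selectivity, Eq. (5)
rs = resolution(n, alpha, k2)   # assumption: k of the later-eluting peak

print(f"N = {n:.0f}, k1 = {k1:.2f}, k2 = {k2:.2f}")
print(f"alpha = {alpha:.3f}, Rs = {rs:.2f}")
```

The sketch makes the structure of Eq. (4) visible: doubling N raises Rs only by a factor of √2, whereas a small change in α close to 1 changes the (α − 1)/α term proportionally much more.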
N is nearly inversely proportional to dp, the particle diameter. ΔP is proportional to η, u (linear velocity of the mobile phase), and L (column length), while it is inversely proportional to dp². Thus, a column packed with particles of small diameter gives high separation efficiency (greater N) at the expense of high column back-pressure. Because of this drawback, the approach of obtaining high efficiencies by reducing the particle diameter has a limit: since the pressure limit of a pump system is around 300-400 bar, with a normal operational pressure of 100-200 bar, the limit in particle size is in the range of 1-3 µm. The flow resistance parameter φ in Eq. (6) is usually ca. 2000 for particle-packed columns, while φ values are as low as 200-400 in the case of monolithic silica columns. The interstitial porosity (around the particles or skeletons) and the total porosity of a particle-packed column are ca. 40% and 70%, respectively. On the other hand, those of monolithic silica columns are ca. 60% and 80%, and in the case of monolithic silica capillary columns those values are ca. 80% and 90%, respectively. As shown in Fig. 2, column efficiencies depend on the linear velocity of the mobile phase. Generally, HPLC separations are carried out at a higher flow rate than the optimum flow rate for column efficiency. When a particle-packed column (4.6 mm i.d. × 150 mm length) is used at a flow rate of 1 ml/min, the system gives a t0 value of ca. 110 s, because the volume
of mobile phase in the column is estimated as ca. 1.8 ml. The linear velocity of mobile phase in the colunm is calculated to be ca. 1.4 mm/s. Under usual conditions for HPLC analysis, linear velocities of mobile phases are in a range of 0.5^.0 mm/s, and these values are 1-5 times faster than the optimum flow rate (0.5-1.0 mm/s) for conventional columns packed with 5 ]Lim particles. Li the case of LC/ESI-MS, the use of an optimum flow rate for ionization is desirable for ESI, but it may be different from the optimum flow rate for HPLC separation. A solute band is broadened when it travels outside a column due to parabolic flow profile in a tube as well as due to slow diffusion in the stagnant mobile phase existing in an injector, a detector, or connection tubing. In particular, for solutes of small retention factors which elute in early part of the chromatogram, sample injection into a capillary column of l%-5% of column volume has significant influence on band spreading, mainly caused by sample diffusion at orifice in injector or by dead volume in all connection parts. Minimizing the sizes of connection tubes and detection cell will lead to higher separation efficiencies, but we have to keep in mind that small-sized tubes are easily clogged by particulate matter in mobile phases or samples. Split-flow injection technique is practical and useful for micro HPLC with monolithic silica capillary columns in order to avoid the peak spreading in the injection step [14,15]. This method can give excellent separation efficiencies with very limited band broadening at injector or connecting tubings. Moreover, the use of weak eluents for sample injection is also effective to increase the separation efficiency: in the case of reversed-phase HPLC, a sample solution can be prepared with water-rich solvent (20% lower organic solvent content than mobile phase).
3. Gradient Elution

High-performance liquid chromatography separations in metabolomics are often carried out in gradient mode to separate large numbers of compounds. In gradient elution, the peak volume is reduced (the solute concentration increases compared with isocratic elution), since the latter part of a peak (a solute band) is eluted with a stronger solvent than the former part of the peak. During gradient elution, solute bands hardly move in a mobile phase containing 10%-20% less of the stronger solvent than a mobile phase suitable for isocratic elution of the solutes. This effect is more pronounced for high molecular weight solutes, giving the impression that such solutes suddenly start moving at a certain concentration of the strong eluent [16].
Fig. 4. Replicate injections of an Arabidopsis leaf methanol extract on capillary monolithic C18 columns in positive ionization full-scan mass spectrometry (MS), given as base peak chromatograms (horizontal axes: time, min). The solvent gradient ran from the starting conditions of 0% B to 100% B (CH3CN). Upper panel 0.2×300 mm; middle panel 0.2×600 mm; lower panel 0.2×900 mm column. t0, void volume
3.1. Applications of Monolithic Silica Columns to Metabolomics

Figure 4 shows chromatograms of leaf extracts of Arabidopsis thaliana obtained by LC/ESI-MS using 30-90 cm monolithic silica capillary columns modified with C18 stationary phase under gradient conditions, from aqueous ammonium acetate buffer (pH 5.5) to acetonitrile [17]. A shallow gradient (large tG) with a long column led to better separation. An improvement in detection sensitivity at MS was observed simultaneously (Fig. 5). Figure 5 shows that
some compounds showed an increase in sensitivity with better separation, while others were not affected. The results indicate that the improved separation on the longer columns reduced the ion suppression effect (ion suppression is an undesirable effect of easily ionizable contaminants coexisting with the solute of interest in a droplet of an eluted solute band) by introducing the solute bands separately into the ES ionization interface. In the case of Fig. 4, the peak capacity provided by the long monolithic silica column is still not enough for complete separation, but the result shows that using longer monolithic silica capillary columns is a feasible approach to achieving higher separation efficiency while avoiding the ion suppression effect in an LC/ESI-MS system. This approach results in a longer separation time, but the amount and quality of information obtained from the analysis of metabolites would be better than with conventional LC/MS systems using particle-packed columns. Preparation of monolithic silica columns of 1-2 m length is not difficult, and the use of such a column is a simple way to obtain higher peak capacities in LC/ESI-MS systems.
Fig. 5. Magnification of three positive ionization full-scan MS chromatograms of replicate injections of a single Arabidopsis leaf extract onto a 200-µm-o.d. capillary monolithic C18 column. Left panel 300 mm column length; middle panel 600 mm column length; right panel 900 mm column length. Gray-shadowed peaks represent the relative peak abundance of the extracted ion trace m/z 852.2 in comparison with coeluting base peaks m/z 571.6, 734.2, and 760.3. Relative heights are scaled to the maximum height of 1.41x10^ units
3.2. Optimization of Gradient Elution [18,19]

In the gradient elution method, in which the content of a stronger eluent (mobile phase B) is increased, one can manipulate a separation by controlling the gradient range [φ (B% final) − φ (B% initial)], the gradient time (tG), and the flow rate (F). As in isocratic separations, three parameters, N, α, and k, are to be considered to optimize the separation conditions. In the case of the optimization of gradient conditions, the average retention factor k* (Eq. (7), the retention factor at the middle of a column) should be taken into account. The average retention factor k* is a function of S (the slope of the plot of log k against the content of solvent B in the mobile phase, which is influenced by the system and the mobile phase: S = 4-10 for small molecules of less than 1000 molecular weight [MW], S = ca. 30 for molecules with 10 000 MW, and S = ca. 100 for macromolecules with 100 000 MW), Vm (mobile phase volume in a column, given in µl), Δφ (change in B solvent content in the mobile phase from φ initial to φ final: Δφ = 1.0 for a gradient from 0% B to 100% B), F (µl/min), and tG (min).

$$k^* = \frac{F\,t_G}{1.15\,V_m\,\Delta\phi\,S} \qquad (7)$$
In the case of a linear gradient of acetonitrile from 0% to 100% for a separation system such as that in Fig. 4, estimation of k* for monolithic silica capillary columns of 100 cm and 30 cm length, 200 µm i.d., under the conditions of tG = 100 min and F = 4 µl/min results in k* values of 2.5 and 8.3, respectively. Of the three terms in the equation for resolution, the k term, k/(1+k), favors the shorter column by a factor of 1.25 over the longer column. The 100 cm column, however, provides 1.4 times better resolution than the 30 cm column in total when the difference in N is taken into account: the longer column is expected to produce 3.3 times more theoretical plates than the shorter column. In the case of gradient elution of a macromolecule, a large S value results in a small k* value. As a consequence, a short column becomes relatively advantageous. Since a greater flow rate reduces the number of theoretical plates, it generally leads to a decrease in resolution. A larger flow rate, however, can result in better resolution of macromolecules in some cases, where the larger k* values influence the resolution more than the decrease in column efficiency with increasing F. When one optimizes the separation conditions, (i) k* should be kept around 5, and (ii) the gradient range, Δφ, should be minimized judging from the elution range of solutes in a given chromatogram, followed by (iii) fine tuning of tG, F, and column length, if possible. A longer tG and a longer column lead to better resolution, while a smaller k* and a shorter tG lead to lower resolution and better detection sensitivity, and a smaller F results in
better detection sensitivity. In the range of k* = 5-10, a compromise between an efficient separation and good detection sensitivity is obtained. Selectivity in gradient elution depends on both the nature of the mobile phase and the gradient slope (Δφ/tG), and the optimization of resolution based on the k* parameter involves complex factors. A certain level of flow rate is required in order to provide a smooth gradient of the mobile phase. However, a higher flow rate leads to greater dilution of solute bands, which results in a decrease in detection sensitivity. Although the use of a longer column (increase in Vm) while keeping the k* and F values constant may lead to an increase in separation time and peak widths, an improvement in separation efficiency due to the increase in the number of theoretical plates, and a decrease in ion suppression, are expected. Since solutes are eluted with similar peak widths in linear gradient elution (Δφ/tG = constant), it is very convenient to optimize the peak width for MS analysis by changing tG and the column length.
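The k* estimates quoted in this section for the 100 cm and 30 cm columns can be approximately reproduced with the Python sketch below. The column dimensions, tG, F, and Δφ are taken from the text, and the total porosity of 0.9 is the value given for monolithic silica capillary columns in Sect. 2.3; the value S = 5 and the assumption that N is proportional to column length are choices made for this sketch, not statements from the source.

```python
import math


def mobile_phase_volume_ul(inner_diameter_mm, length_mm, total_porosity):
    """Vm = geometric tube volume times total porosity (1 mm^3 = 1 µl)."""
    return math.pi * (inner_diameter_mm / 2.0) ** 2 * length_mm * total_porosity


def average_retention_factor(flow_ul_min, t_g_min, vm_ul, delta_phi, s):
    """k* = F * tG / (1.15 * Vm * delta_phi * S), Eq. (7)."""
    return flow_ul_min * t_g_min / (1.15 * vm_ul * delta_phi * s)


# From the text: 200 µm i.d., tG = 100 min, F = 4 µl/min, 0-100% B.
# Assumed for this sketch: S = 5 (within the 4-10 range for small molecules).
POROSITY, S = 0.9, 5.0

k_star = {}
for length_cm in (100, 30):
    vm = mobile_phase_volume_ul(0.2, length_cm * 10.0, POROSITY)
    k_star[length_cm] = average_retention_factor(4.0, 100.0, vm, 1.0, S)

# Retention term k/(1+k) of the resolution equation, Eq. (4).
k_term = {length: k / (1.0 + k) for length, k in k_star.items()}
k_term_gain_short = k_term[30] / k_term[100]

# Assumption: N is proportional to column length, so sqrt(N) scales as sqrt(L).
resolution_gain_long = math.sqrt(100 / 30) / k_term_gain_short

print(f"k* (100 cm) = {k_star[100]:.1f}, k* (30 cm) = {k_star[30]:.1f}")
print(f"k-term advantage of the short column: {k_term_gain_short:.2f}")
print(f"overall resolution advantage of the long column: {resolution_gain_long:.2f}")
```

With these inputs the sketch gives k* values close to the 2.5 and 8.3 quoted above, a k-term advantage of about 1.25 for the short column, and an overall resolution advantage of roughly 1.4-1.5 for the long column.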
4. Two-Dimensional HPLC

Peak capacity (PC), given by Eq. (8), indicates separation ability, i.e., how many solutes can potentially be separated by a chromatographic system. The retention times of the first solute and the last solute are given as t1 and tR, respectively, in Eq. (8). Separation methods such as ultrahigh-pressure liquid chromatography (UHPLC) and supercritical fluid chromatography (SFC) can produce PC = ca. 300 per hour [20,21], while a conventional HPLC system gives PC = 100-200 per hour. To achieve a far larger PC using conventional HPLC systems, multidimensional separation systems have been examined. When two chromatographic systems with peak capacities PC1 and PC2 are combined to form a two-dimensional (2D) chromatography system, the PC of the total system can be theoretically estimated as the product of the two PC values, as in Eq. (9) [22]. (Equation (9) holds when the 2D-HPLC separation is carried out ideally; in practice, partial remixing of the solute bands separated in the 1st-D occurs and reduces the total PC.)

$$PC = 1 + \frac{\sqrt{N}}{4}\,\ln\frac{t_R}{t_1} \qquad (8)$$

$$PC_{2D} = PC_1 \cdot PC_2 \qquad (9)$$
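Equations (8) and (9) can be sketched in Python as follows. The plate number, the retention-time window, and the second-dimension peak capacity used here are hypothetical inputs chosen only to illustrate the calculation, and Eq. (9) is applied in its ideal form, without any allowance for remixing between the dimensions.

```python
import math


def peak_capacity(n_plates, t_first, t_last):
    """PC = 1 + (sqrt(N) / 4) * ln(t_last / t_first), Eq. (8)."""
    return 1.0 + math.sqrt(n_plates) / 4.0 * math.log(t_last / t_first)


def peak_capacity_2d(pc_first, pc_second):
    """Ideal two-dimensional peak capacity, Eq. (9)."""
    return pc_first * pc_second


# Hypothetical column: N = 10 000, last peak at 10 times the first peak's time.
pc_1d = peak_capacity(10_000, t_first=1.0, t_last=10.0)

# Hypothetical second dimension with a peak capacity of 17.
pc_total = peak_capacity_2d(pc_1d, 17.0)

print(f"one-dimensional PC: {pc_1d:.1f}")
print(f"ideal two-dimensional PC: {pc_total:.0f}")
```

Because PC grows only with the square root of N and the logarithm of the retention window, multiplying two modest peak capacities is a far more effective route to a large total PC than lengthening a single column.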
In comprehensive 2D-HPLC separations, every fraction obtained from the 1st-D separation is to be separated by 2nd-D HPLC while the next fraction is eluted from the 1st-D. Therefore the 2nd-D column should ideally be eluted at very high speed to match the rate of fractionation at the 1st-D separation. The 2nd-D column should possess a low pressure drop and reasonable efficiency
at high flow rate. In addition to high efficiency and high permeability, the 1st-D and 2nd-D columns must differ adequately in selectivity to effect 2D separations. Ideally the 1st-D and 2nd-D should have orthogonal selectivity or different separation mechanisms [23-25]. Ion-exchange mode and reversed-phase mode, or size-exclusion mode and reversed-phase mode, have often been combined to effect 2D separations of peptide mixtures in proteomics. Fractionation by ion-exchange mode separation is followed by trapping and concentration of solutes, which are finally separated by reversed-phase chromatography in the 2nd-D separation. Since ion-exchange mode cannot produce efficient separation at a high flow rate, it takes a relatively long time to carry out the 2D separation. The peak capacity of 2D-HPLC is mainly afforded by the reversed-phase mode separation. Because a particle-packed column cannot be operated at an adequately high flow rate, various approaches were taken in the past: (i) small columns were employed at the 1st-D compared with the 2nd-D, (ii) the first column was eluted slowly or intermittently, or (iii) two or more sets of chromatographs were used at the 2nd-D. For example, Unger and coworkers used four columns in 2nd-D separations to speed up 2D-HPLC for proteome studies [25]. Even with those efforts, however, truly "two-dimensional" HPLC is hard to achieve because of the mixing of separation modes.

4.1. Reversed-Phase 2D-HPLC

Figures 6-9 show a scheme of reversed-phase 2D-HPLC and its working principle, in which the outlet tubing of the 1st-D column was connected to a loop of the 2nd-D injector to couple a particle-packed 1st-D column with a short monolithic silica 2nd-D column run at a higher flow rate [26].
In this case, the fraction from the 1st-D column is loaded and temporarily kept in a loop of the 2nd-D injector, which results in mixing of separated peaks, but the flow rates of the two HPLC systems can be controlled independently, and the 2nd-D separation can be carried out at a very high flow rate (for example, 10 ml/min) throughout the separation. For the 1st-D, a fluoroalkyl-bonded silica particle-packed column was employed, while a short monolithic silica column modified with C18 alkyl groups was used for the 2nd-D. Fractions from the 1st-D column were injected into the 2nd-D column every 30 s. The eluent from the 1st-D HPLC was loaded into the sample loop of the 2nd-D for 28 s and injected for 2 s, while 30 s was used for separation at the 2nd-D, as shown in Fig. 6a. This simplest 2D-HPLC produced PC = 1000. In this system, ca. 7% of the 1st-D effluent in volume was wasted at the 2nd-D. The maximum possible loss of a component during the 2 s injection time is estimated to be less than 20% even
for the first peak in the 1st-D separation. When two 6-port valves or a 10-port valve is used at the 2nd-D HPLC as in Fig. 6b, all fractions can be separated at the 2nd-D column to provide a comprehensive 2D-HPLC system. If two monolithic silica columns are used for 2nd-D separations, fractionation at the 1st-D every 15 s is possible with a 30 s separation time at the 2nd-D. Monolithic silica columns could be used under very fast flow conditions (10 mm/s linear velocity), and provided PC2nd-D=17 in 30 s. A pentabromobenzyl (PBB) stationary phase, which possesses higher polarizability than the C18 stationary phase, showed a greater difference in selectivity compared with the fluoroalkyl stationary phase. In Fig. 8, the 30-s fraction from the 1st-D separation contains many overlapping peaks that were separated into several peaks in the next 30 s at the 2nd-D. When the 2nd-D chromatograms were plotted against the time axes of the two dimensions, a 2D-chromatogram was obtained as shown in Fig. 9. The 2D-chromatogram exhibited so-called group separation, in which solutes of similar structural features appear as a group. Similar to 2D-GC, 2D-HPLC provides structural information on solutes as a result of the separation of complex mixtures. In this case, at the fluoroalkyl stationary phase (Fig. 7), hydrophobicity and the lower polarizability of solutes are predominant in retention, while at the C18 stationary phase, hydrophobicity and high polarizability of solutes dominate the retention process. The difference in the retentive behavior leads to 2D separation [27,28]. Because of the fast flow rate in the 2nd-D separation, 2D-HPLC consumes a large volume of mobile phase. To reduce the consumption of mobile phase, the development of a sufficiently fast, simple 2D-HPLC using capillary columns is in progress. It will be interesting to test the separation ability of 2D-HPLC systems for complex mixtures in metabolomics, utilizing normal phase mode, size exclusion mode, or ion-exchange mode as well as reversed-phase separation mode.
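The timing figures quoted above can be checked with a short calculation. The following is only an illustrative sketch: it takes the 28 s load and 2 s inject cycle and the flow rates of 0.4 ml/min (1st-D) and 10 ml/min (2nd-D) from the text and the caption of Fig. 9, and it assumes the idealized product rule for 2D peak capacity (total peak capacity equal to the product of the peak capacities of the two dimensions), which real systems only approximate. The function and variable names are chosen here for illustration.

```python
# Back-of-the-envelope figures for the simple 2D-HPLC cycle described in the text.
# Assumption (not from the source): ideal product rule, PC_total = PC_1st * PC_2nd.

def cycle_figures(load_s, inject_s, flow_1st_ml_min, flow_2nd_ml_min):
    """Return wasted fraction, loop fraction volume (µl), and 2nd-D solvent use (ml) per cycle."""
    cycle_s = load_s + inject_s
    # 1st-D effluent that is not collected in the loop while the loop is being injected
    wasted_fraction = inject_s / cycle_s
    # Volume of 1st-D effluent collected in the loop during the load period
    fraction_ul = flow_1st_ml_min * 1000.0 * load_s / 60.0
    # 2nd-D mobile phase consumed during one full cycle
    solvent_ml = flow_2nd_ml_min * cycle_s / 60.0
    return wasted_fraction, fraction_ul, solvent_ml

wasted, fraction_ul, solvent_ml = cycle_figures(28, 2, 0.4, 10.0)
print(f"wasted 1st-D effluent: {wasted:.1%}")
print(f"loop fraction volume: {fraction_ul:.0f} µl")
print(f"2nd-D solvent per 30 s cycle: {solvent_ml:.1f} ml")

# 1st-D peak capacity implied by PC = 1000 and PC2nd-D = 17, if both figures
# refer to the same system and the product rule holds
implied_pc_1st = 1000 / 17
print(f"implied 1st-D peak capacity: {implied_pc_1st:.0f}")
```

Under these assumptions the wasted share is 2/30 of the 1st-D effluent, in line with the "ca. 7%" quoted in the text, and the 2nd-D consumes about 5 ml of mobile phase per 30 s cycle, which illustrates why solvent consumption is a concern.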
[Fig. 6 panels: (a) valve positions Load and Inject; (b) valve positions Loop A Load/Loop B Inject and Loop A Inject/Loop B Load]
Fig. 6. a Tubing connection at the 2nd-D injector of simple 2D-HPLC. b Tubing connection of two six-port valves used as the 2nd-D injector. D, dimension
[Fig. 7 labels: FR, 5 µm silica, particle-packed; PBB, monolith (through-pore 2.4 µm); C18, monolith (through-pore 2.0 µm)]
Fig. 7. Structures of the stationary phases used in two-dimensional high-performance liquid chromatography (2D-HPLC). FR, 5,5,6,6,7,7,7-heptafluoro-4,4-bis(trifluoromethyl)heptyl; PBB, pentabromobenzyl
[Fig. 8 panels a and b: x-axis Time (min); panel b labeled 2nd-D C18]
Fig. 8. Two-dimensional separation of a mixture of hydrocarbons and benzene derivatives in simple 2D-HPLC. a Chromatogram obtained in the 1st-D on the FR column in 60% methanol/water. b Chromatograms obtained in the 2nd-D on the C18 column in 80% methanol/water. The insets a and b are expanded views of panels a and b, respectively
Fig. 9. A contour plot obtained for 2D-HPLC. Fractionation every 30 s at the 1st-D. Flow rate: 0.4 ml/min for the 1st-D and 10 ml/min for the 2nd-D
4.2. Combination of Reversed-Phase HPLC and Other Separation Modes
Since many compounds of similar properties are to be separated in proteomics, a 2D-HPLC, LC/MS system has been employed combining ion-exchange mode and reversed-phase mode, or size-exclusion mode and reversed-phase mode. In the case of metabolomics, a combination of several different separation modes is preferable to separate a variety of substances. That is why several methods have been utilized, including GC/MS, CE/MS, and LC/CE [29-33]. Reversed-phase mode is most often employed in HPLC, where chemically bonded stationary phases (C8, C18, C30, etc.) have the advantages of rapid equilibration with the mobile phase, high separation efficiency, and high reproducibility in gradient separations, based on the hydrophobic properties of the stationary phases. On the other hand, adsorption of solutes onto polar groups (hydrophilic functional groups) is used as the driving force of retention in normal phase chromatography. Since packing materials possessing hydrophilic functional groups bonded to the silica or organic polymer supports can
achieve rapid equilibrium with aqueous mobile phases, they are also useful for gradient elution separation. Such an HPLC mode utilizing the interaction between solutes and hydrophilic functional groups is called hydrophilic interaction chromatography (HILIC) [34,35]. Capillary-type columns for HILIC mode have become available recently. The selectivities of HILIC columns are similar to those of a conventional silica column, but HILIC columns have advantages over silica columns in the recovery of samples and in compatibility with the mobile phases used in reversed-phase mode. Figure 10 shows a comparison of the elution patterns of HILIC mode and reversed-phase mode in the separation of an extract from Arabidopsis thaliana [36]. Since HILIC and reversed-phase mode use the same types of solvent, the two separation modes can potentially be combined to form multidimensional HPLC, although the effect of mobile phase composition on retention is opposite in the two methods.
Fig. 10. Comparison of chromatograms of an Arabidopsis thaliana leaf methanol extract, obtained by hydrophilic interaction chromatography (HILIC) mode (a) and reversed-phase (RP) mode (b). Conditions: (A) TSK Gel Amide 80, 4.6x150 mm, gradient elution from acetonitrile (MeCN) to ammonium acetate buffer (6.5 mM, pH 5.5), MeCN content (%) (time, min) 100 → 100 (5) → 90 (8) → 60 (75) → 0 (80); (B) C18 column 4.6x150 mm, gradient elution from ammonium acetate buffer (6.5 mM, pH 5.5) to MeCN, MeCN content (%) (time, min) 0 → 0 (15) → 95 (40) → 100 (60) → 100 (80)
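The two gradient programs listed in the caption of Fig. 10 run in opposite directions, which can be made concrete by tabulating the MeCN content over time. The sketch below is a minimal illustration that assumes each program starts at 0 min and changes linearly between the listed points; the names `HILIC_PROGRAM`, `RP_PROGRAM`, and `mecn_percent` are introduced here and do not come from the source.

```python
# Gradient programs transcribed from the caption of Fig. 10.
# Each entry is (time in min, MeCN content in %).
# Assumptions: start at t = 0 and linear change between listed points.
HILIC_PROGRAM = [(0, 100), (5, 100), (8, 90), (75, 60), (80, 0)]
RP_PROGRAM = [(0, 0), (15, 0), (40, 95), (60, 100), (80, 100)]

def mecn_percent(program, t_min):
    """Linearly interpolate the MeCN content at t_min; clamp outside the program."""
    if t_min <= program[0][0]:
        return float(program[0][1])
    for (t0, c0), (t1, c1) in zip(program, program[1:]):
        if t_min <= t1:
            return c0 + (c1 - c0) * (t_min - t0) / (t1 - t0)
    return float(program[-1][1])

for t in (0, 10, 40, 80):
    hilic = mecn_percent(HILIC_PROGRAM, t)
    rp = mecn_percent(RP_PROGRAM, t)
    print(f"t = {t:2d} min: HILIC {hilic:5.1f}% MeCN, RP {rp:5.1f}% MeCN")
```

In this tabulation the HILIC run starts at high MeCN content and moves toward the aqueous buffer, while the reversed-phase run does the reverse, which is the practical difficulty in coupling the two modes directly.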
5. Summary
Routine use of micro HPLC will require the development of several important components: the reproducible preparation of high-performance columns, small-volume pumps and gradient systems, and an improved injection system. Subjects to be studied are the development of high-performance monolithic silica columns for a variety of separation modes, multidimensional micro LC systems, and optimization of the interface between LC and MS instruments. Large peak capacities realized by highly efficient micro HPLC systems or multidimensional HPLC will greatly contribute to metabolomics studies when coupled with MS instruments.
References
1. Tomita M, Nishioka T (2003) Frontiers in metabolome sciences. Springer, Tokyo
2. Ishizuka N, Minakuchi H, Nakanishi K, Soga N, Nagayama H, Hosoya K, Tanaka N (2000) Performance of a monolithic silica column in a capillary under pressure-driven and electrodriven conditions. Anal Chem 72:1275-1280
3. Tanaka N, Nagayama H, Kobayashi H, Ikegami T, Hosoya K, Ishizuka N, Minakuchi H, Nakanishi K, Cabrera K, Lubda D (2000) Monolithic silica columns for HPLC, micro-HPLC, and CEC. J High Resolut Chromatogr 23:111-116
4. Minakuchi H, Nakanishi K, Soga N, Ishizuka N, Tanaka N (1996) Octadecylsilylated porous silica rods as separation media for reversed-phase liquid chromatography. Anal Chem 68:3498-3501
5. Tanaka N, Kobayashi H, Nakanishi K, Minakuchi H, Ishizuka N (2001) Monolithic LC columns. Anal Chem 73:420A-429A
6. Motokawa M, Kobayashi H, Ishizuka N, Minakuchi H, Nakanishi K, Jinnai H, Hosoya K, Ikegami T, Tanaka N (2002) Monolithic silica columns with various skeleton sizes and through-pore sizes for capillary liquid chromatography. J Chromatogr A 961:53-63
7. Svec F, Peters EC, Sykora D, Yu C, Frechet JMJ (2000) Monolithic stationary phases for capillary electrochromatography based on synthetic polymers: designs and applications. J High Resolut Chromatogr 23:3-18
8. Minakuchi H, Nakanishi K, Soga N, Ishizuka N, Tanaka N (1997) Effect of skeleton size on the performance of octadecylsilylated continuous porous silica columns in reversed-phase liquid chromatography. J Chromatogr A 762:135-146
9. Minakuchi H, Nakanishi K, Soga N, Ishizuka N, Tanaka N (1998) Effect of domain size on the performance of octadecylsilylated continuous porous silica columns in reversed-phase liquid chromatography. J Chromatogr A 797:121-131
10. Bristow PA, Knox JH (1977) Standardization of test conditions for high performance liquid chromatography columns. Chromatographia 10:279-289
11. Ishizuka N, Kobayashi H, Minakuchi H, Nakanishi K, Hirao K, Hosoya K, Ikegami T, Tanaka N (2002) Monolithic silica columns for high-efficiency separations by high-performance liquid chromatography. J Chromatogr A 960:85-96
12. Schmidt A, Karas M, Dülcks T (2003) Effect of different solution flow rates on analyte ion signals in nano-ESI MS, or: when does ESI turn into nano-ESI. J Am Soc Mass Spectrom 14:492-500
13. Giddings JC (1965) Dynamics of chromatography, part 1, principles and theory. Marcel Dekker, New York
14. Ionsource home page. Introduction to capillary chromatography, 3) a simple system; http://www.ionsource.com/tutorial/capillary/captoc.htm
15. Taniguchi H, Murata Y (2002) The newest protocol of proteomics 9, Capillary HPLC. Cell Tech 21:1332-1343
16. Terabe S, Nishi H, Ando T (1981) Separation of cytochromes c by reversed-phase high-performance liquid chromatography. J Chromatogr 212:295-304
17. Tolstikov VV, Lommen A, Nakanishi K, Tanaka N, Fiehn O (2003) Monolithic silica-based capillary reversed-phase liquid chromatography/electrospray mass spectrometry for plant metabolomics. Anal Chem 75:6737-6740
18. Snyder LR, Glajch JL, Kirkland JJ (1988) Practical HPLC method development. Wiley, New York, Chapters 6 and 9
19. Snyder LR, Stadalius M, Quarry MA (1983) Gradient elution in reversed-phase HPLC separation of macromolecules. Anal Chem 55:1412A-1430A
20. MacNair JE, Patel KD, Jorgenson JW (1999) Ultrahigh-pressure reversed-phase capillary liquid chromatography: isocratic and gradient elution using columns packed with 1.0 µm particles. Anal Chem 71:700-708
21. Shen Y, Lee ML (1998) General equation for peak capacity in column chromatography. Anal Chem 70:3853-3856
22. Giddings JC (1991) Unified separation science. Wiley-Interscience, New York, pp 126-128
23. Bushey MM, Jorgenson JW (1990) Automated instrumentation for comprehensive two-dimensional high-performance liquid chromatography of proteins. Anal Chem 62:161-167
24. Köhne AP, Welsch T (1999) Coupling of a microbore column with a column packed with non-porous particles for fast comprehensive two-dimensional high-performance liquid chromatography. J Chromatogr A 845:463-469
25. Wagner K, Miliotis T, Marko-Varga G, Bischoff R, Unger KK (2002) An automated on-line multidimensional HPLC system for protein and peptide mapping with integrated sample preparation. Anal Chem 74:809-820
26. Tanaka N, Kimura H, Tokuda D, Hosoya K, Ikegami T, Ishizuka N, Minakuchi H, Nakanishi K, Shintani Y, Furuno M, Cabrera K (2004) Simple and comprehensive two-dimensional reversed-phase HPLC using monolithic silica columns. Anal Chem 76:1273-1281
27. Turowski M, Morimoto T, Kimata K, Monde H, Ikegami T, Hosoya K, Tanaka N (2001) Selectivity of stationary phases in reversed-phase liquid chromatography based on the dispersion interactions. J Chromatogr A 911:177-190
28. Turowski M, Yamakawa N, Meiler J, Kimata K, Ikegami T, Hosoya K, Tanaka N, Thornton ER (2003) Deuterium isotope effects on hydrophobic interactions: the importance of dispersion interactions in the hydrophobic phase. J Am Chem Soc 125:13836-13849
29. Fiehn O, Kopka J, Dormann P, Altmann T, Trethewey RN, Willmitzer L (2000) Metabolite profiling for plant functional genomics. Nat Biotechnol 18:1157-1161
30. Fiehn O, Kopka J, Trethewey RN, Willmitzer L (2000) Identification of uncommon plant metabolites based on calculation of elemental compositions using gas chromatography and quadrupole mass spectrometry. Anal Chem 72:3573-3580
31. Soga T, Heiger DN (2000) Amino acid analysis by capillary electrophoresis electrospray ionization mass spectrometry. Anal Chem 72:1236-1241
32. Soga T, Ohashi Y, Ueno Y, Naraoka H, Tomita M, Nishioka T (2003) Quantitative metabolome analysis using capillary electrophoresis mass spectrometry. J Proteome Res 2:488-494
33. Jia L, Liu BF, Terabe S, Nishioka T (2004) Two-dimensional separation method for analysis of Bacillus subtilis metabolites via hyphenation of micro-liquid chromatography and capillary electrophoresis. Anal Chem 76:1419-1428
34. Alpert AJ, Shukla M, Shukla AK, Zieske LR, Yuen SW, Ferguson MAJ, Mehlert A, Pauly M, Orlando R (1994) Hydrophilic-interaction chromatography of complex carbohydrates. J Chromatogr A 676:191-202
35. Yoshida T (1997) Peptide separation in normal phase liquid chromatography. Anal Chem 69:3038-3043
36. Tolstikov VV, Fiehn O (2002) Analysis of highly polar compounds of plant origin: Combination of hydrophilic interaction chromatography and electrospray ion trap mass spectrometry. Anal Biochem 301:298-307
Part III. Applications of Metabolome Analysis to Biosciences
Chapter 9: Combined Analysis of Metabolome and Transcriptome: Catabolism in Bacillus subtilis
Takaaki Nishioka, Keiko Matsuda, and Yasutaro Fujita
Graduate School of Agriculture, Kyoto University, Oiwake-cho, Kitashirakawa, Kyoto 606-8502, Japan
Graduate School of Information Science, Nara Institute of Science and Technology, 8916-5, Takayama-cho, Ikoma, Nara 630-0192, Japan
Department of Biotechnology, Faculty of Life Science, Fukuyama University, 1-3, Gakuen-cho, Fukuyama, Hiroshima 729-0292, Japan
1. Regulation of Metabolism in Microorganisms
Microorganisms regulate metabolism to adapt to frequent changes in nutritional availability and, more generally, in environmental conditions so that they can maintain optimum growth under different conditions. Generally microorganisms use two strategies to modify enzyme activities. The first strategy is local regulation of metabolism by feedback regulation. Catalytic activities of enzymes located upstream on a metabolic pathway are modified by one or more of the metabolites downstream through allosteric interaction. The second strategy is global regulation of metabolism, which simultaneously modifies the catalytic activities of various enzymes on a metabolic pathway network by up- or downregulating the transcription of their genes. Catabolite repressions in Escherichia coli and Bacillus subtilis are well-known examples of global regulation. Local regulation has been well studied because reconstitution in vitro is easily achieved by mixing purified enzymes with metabolites in a test tube. In contrast, global regulation has not been studied often because reconstitution in vitro is difficult; it has been analyzed only on the basis of global changes in metabolites. Studies using DNA microarrays have revealed that catabolite repressions modify the expressions of more than 100 genes coding for enzymes and transcriptional regulatory proteins. However, no study has reported how such changes in gene expressions affect the intracellular amounts of metabolites.
In this chapter, we measure the metabolite profiles of B. subtilis cells growing on five different carbon sources and analyze the biochemical meaning of the differences among the five metabolite profiles. At the same time, by using B. subtilis microarrays we measure the gene expressions in the cells that were growing on malate, which is one of the carbon sources. We show that combining the metabolome with the transcriptome data is more informative than each of the two data sets independently analyzed, and discuss the biochemical mechanism underlying catabolite repression.
Fig. 1. Proposed mechanism of the catabolite repression in Bacillus subtilis. The left part of the figure shows the phosphotransferase system (PTS) of glucose and the other part shows the regulation of gene expressions. When the glucose concentration in the culture medium is high, the expression of the PTS genes, IIABCglc (PtsG), is activated. PtsG phosphorylates glucose to glucose-6-phosphate, which is transported into the cells as the starting metabolite for glucose catabolism. The phosphate used for the phosphorylation of glucose is supplied from phosphoenolpyruvate. The glycolysis of glucose-6-phosphate increases the concentration of fructose-1,6-bisphosphate, which stimulates the ATP-dependent protein kinase, HPr kinase. HPr kinase phosphorylates two proteins, HPr protein and Crh, to give HPr-P and Crh-P, respectively. These phosphorylated proteins are capable of binding with CcpA to form CcpA-HPr-P and CcpA-Crh-P. The complexes can bind to the operator site, cre, which is located upstream of catabolite-suppressed or -activated genes, inhibiting or stimulating their expressions
Fig. 2. Carbon sources have different access to the metabolic pathway network of Bacillus subtilis. The central carbon flow on the Embden-Meyerhof pathway is glycolytic or gluconeogenetic; glucose, fructose, glycerol, and gluconate are glycolytic, whereas malate is gluconeogenetic. The carbon sources are transported from the culture medium to the cells as glucose-6-phosphate (G-6P), fructose-6-phosphate (F-6P), glycerol-6-phosphate (glycerol-6P), gluconate, and malate. These are the "starting metabolites" for the catabolism
2. Proposed Mechanism of Catabolite Repression in B. subtilis
Microorganisms grow on various organic substances as a carbon source. When a mixture of different organic substances is available in the growth medium, they prefer some carbon sources. After consuming the favored carbon sources, they catabolize the others. Their cells repress the enzyme activities that are necessary for catabolizing the other carbon sources. This is called "catabolite repression." Glucose is often the most effective carbohydrate causing repression. In E. coli and B. subtilis, catabolite repressions have been studied by measuring gene expressions with DNA microarrays [1,2]. Proposed mechanisms for the repressions are completely different between the two microorganisms.
In E. coli cells growing on glucose as a carbon source, the intracellular level of cyclic AMP (cAMP) is quite low. This suppresses the expression of enzyme genes that are necessary for catabolizing the other carbon sources. Growing on lactose as a carbon source, the E. coli cells increase cAMP, which forms a complex with its receptor protein, the cAMP receptor protein (CRP). The cAMP-CRP complex binds to the promoter region of the lactose operon, inducing expression of the genes of enzymes that are necessary to catabolize lactose [3,4]. This mechanism, however, is inapplicable to B. subtilis, because no cAMP exists in B. subtilis cells. Fujita et al. [5] proposed a different mechanism for the catabolite repression in B. subtilis (Fig. 1). On glucose or fructose as a carbon source, fructose 1,6-bisphosphate (F16P) in the B. subtilis cells increases and activates HPr kinase to phosphorylate HPr protein. Phosphorylated HPr protein forms a complex with a transcription factor, CcpA. By binding to a catabolite-responsive element, cre, the complex suppresses the gene expressions for enzymes that are necessary for the metabolism of other carbon sources.
3. Experimental Methods
Bacillus subtilis strain 168 was cultured at 37°C on modified 2xSG medium containing either glucose, fructose, gluconate, malate, or glycerol as a carbon source (Fig. 2). The concentration of each carbon source in the culture medium was adjusted to give the same total carbon-atom concentration, 150 mM; concentrations in the culture medium were 25 mM for glucose, fructose, and gluconate, 37.5 mM for malate, and 50 mM for glycerol. Yeast extract was not added to the culture medium, because its addition induced the expressions of more genes on DNA microarray chips prepared for B. subtilis. Bacillus subtilis was cultured and its growth monitored using optical density at 600 nm (OD600). When the growth reached the middle of the logarithmic phase (0.85 OD600) in each culture, we collected B. subtilis cells in 10 ml culture medium on a glass filter and quickly extracted the metabolites in the cells by immersing the glass filter in 1 ml methanol containing methionine sulfone and PIPES as a cationic and an anionic internal standard, respectively. Phospholipids and biopolymers, which are adsorbed on the inner wall of a glass capillary and interfere with analysis, were removed from the methanol extracts by two methods: partitioning with chloroform-water solution and microfiltration. The filtrates from the microfiltration were lyophilized and stored at -80°C. Lyophilized samples were dissolved in water before analysis.
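The carbon-source concentrations quoted above follow directly from dividing the total carbon-atom concentration by the number of carbon atoms per molecule. The short sketch below reproduces that arithmetic; the carbon counts are the standard ones for each compound, and the names `CARBONS` and `source_concentration_mm` are introduced here for illustration.

```python
# Carbon atoms per molecule (standard molecular formulas)
CARBONS = {"glucose": 6, "fructose": 6, "gluconate": 6, "malate": 4, "glycerol": 3}

# Total carbon-atom concentration stated in the text (mM)
TOTAL_CARBON_MM = 150.0

def source_concentration_mm(name):
    """Concentration of a carbon source that supplies TOTAL_CARBON_MM of carbon atoms."""
    return TOTAL_CARBON_MM / CARBONS[name]

for name in CARBONS:
    print(f"{name}: {source_concentration_mm(name):g} mM")
```

The result is 25 mM for the three six-carbon sources, 37.5 mM for malate, and 50 mM for glycerol, matching the values given in the text.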
Each sample was analyzed with three capillary electrophoresis/mass spectrometry (CE/MS) systems that were conditioned for the separations of cationic and anionic metabolites, and mononucleotides [6,7]. We identified metabolites by matching their migration times on CE and m/z values on MS to those measured with a total of 352 metabolic standards. Among the metabolic intermediates of energy metabolism, glyceraldehyde-3-phosphate and oxaloacetate were not measured because they are unstable and decompose during the extraction procedure. The number of cells in a 10-ml sample and volume of a cell were estimated as 4x10^ cells/ml at OD600=1.0 and 2.75x10"^^ l, respectively [8].
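The identification step described above, matching a peak to a library of standards by both migration time and m/z, can be sketched as a simple two-tolerance lookup. Everything specific in the example is hypothetical: the three library entries stand in for the 352 standards, and the tolerances (0.10 min and 0.05 m/z units) are arbitrary illustrative values, not those used by the authors.

```python
from typing import NamedTuple, Optional

class Standard(NamedTuple):
    name: str
    migration_min: float  # migration time on CE (hypothetical value)
    mz: float             # m/z on MS (hypothetical value)

# Hypothetical stand-ins for the library of standards; not measured data.
STANDARDS = [
    Standard("standard_A", 8.20, 90.06),
    Standard("standard_B", 9.75, 148.06),
    Standard("standard_C", 12.40, 175.12),
]

def identify(migration_min, mz, time_tol=0.10, mz_tol=0.05) -> Optional[str]:
    """Return the name of the closest standard within both tolerances, or None."""
    candidates = [
        s for s in STANDARDS
        if abs(s.migration_min - migration_min) <= time_tol
        and abs(s.mz - mz) <= mz_tol
    ]
    if not candidates:
        return None
    # Among candidates, prefer the one closest in migration time
    best = min(candidates, key=lambda s: abs(s.migration_min - migration_min))
    return best.name

print(identify(9.80, 148.05))  # both values fall within tolerance of standard_B
print(identify(9.80, 175.12))  # m/z matches standard_C, migration time does not
```

Requiring agreement on both coordinates is what lets the method distinguish compounds with the same m/z from one another, at the cost of leaving a peak unassigned when only one coordinate matches.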
4. Results
4.1. Growth Curves
We cultured B. subtilis on each of the five different carbon sources: glucose, fructose, gluconate, malate, and glycerol. Glucose and fructose have been reported as carbon sources that suppress the catabolism of the other three carbon sources [5]. Bacillus subtilis cells grew at almost the same rate on the different carbon sources; the mean of doubling times was about 1.4 h. When compared on the basis of the growth curves, the logarithmic phase on glycerol or gluconate was delayed by 1.5 h from that on glucose or fructose. At the middle of the logarithmic phase in each culture, OD600=0.85, we collected the cells on a glass filter and quickly extracted metabolites from the cells on the filter with methanol. The intracellular amount of metabolites in the extract was measured with CE/MS.
4.2. Metabolite Profiles of the B. subtilis Cells Grown on Different Carbon Sources
We measured the intracellular amounts of 88 metabolites in the B. subtilis cells grown on five different carbon sources. The measured metabolites include amino acids and metabolic intermediates in glycolytic and Entner-Doudoroff pathways and tricarboxylic acid (TCA) cycles. Figure 3a shows the metabolite profile in the cells grown on glucose. In the profile, circles show metabolites and their sizes are proportional to the concentrations of metabolites. Metabolites are plotted on the metabolic pathway network of B. subtilis predicted by ARM [9,10]. Intracellular amounts of the metabolic intermediates are low and not accumulated in glycolytic and
Entner-Doudoroff pathways that are a major route of the carbon flux. Citrate and succinate are higher in the TCA cycle. A greater amount of lactate was probably induced under anaerobic culture conditions. Aeration of the culture medium might not have been sufficient near the mid logarithmic phase when the cells consumed oxygen at the maximum rate. Figure 3b shows the metabolite profile observed in the cells grown on fructose. This profile is quite similar to the one observed in the cells grown on glucose. Fructose and glucose are the favored carbon sources for B. subtilis. Figure 3c-e shows the metabolite profiles in the cells grown on gluconate, glycerol, and malate, respectively, whose catabolism was suppressed by glucose and fructose. These three profiles share almost the same features as those of the cells grown on glucose and fructose. Although the five metabolite profiles appeared similar, we found a specific and common difference between the profiles of the suppressed and the suppressing carbon sources. In the three profiles of the suppressed carbon sources, gluconate in Fig. 3c, glycerol-3-phosphate in Fig. 3d, and malate in Fig. 3e have a much greater intracellular amount than in the other profiles. These three metabolites are the starting metabolites in the catabolism of gluconate, glycerol, and malate, respectively. In contrast, the starting metabolites in the two profiles of the suppressing carbon sources are glucose-6-phosphate and fructose-1-phosphate, which have a lower amount (Fig. 3a,b). In other words, starting metabolites were accumulated in the cells growing on carbon sources whose catabolism is suppressed but not on those that repress others. The intracellular amount of a starting metabolite theoretically depends on the difference between two fluxes: the uptake of a carbon source as a starting metabolite and the subsequent transformation of the starting metabolite into other metabolic intermediates.
Accumulation of a starting metabolite indicates that the uptake might be larger than the transformation. In B. subtilis, the rate-limiting step is most likely the intracellular transformation for the repressed carbon sources and the uptake for the repressing carbon sources.
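The flux argument above can be illustrated with a one-pool model in which the intracellular amount changes at a rate equal to uptake minus transformation. The sketch below is a toy model, not the authors' analysis: it assumes a constant uptake flux and a first-order transformation (rate proportional to the pool), with arbitrary units and parameter values chosen only to show the qualitative behavior.

```python
def steady_state_pool(uptake, k_transform):
    """Pool size at which uptake equals first-order transformation (k_transform * pool)."""
    return uptake / k_transform

def simulate_pool(uptake, k_transform, hours, dt=0.001, initial=0.0):
    """Euler integration of d(pool)/dt = uptake - k_transform * pool."""
    pool = initial
    for _ in range(int(round(hours / dt))):
        pool += (uptake - k_transform * pool) * dt
    return pool

# Same uptake, different transformation capacities (arbitrary units, assumed values)
fast = simulate_pool(uptake=1.0, k_transform=2.0, hours=10.0)
slow = simulate_pool(uptake=1.0, k_transform=0.2, hours=100.0)
print(f"fast transformation: pool = {fast:.2f}")
print(f"slow transformation: pool = {slow:.2f}")
```

With identical uptake, the pool settles at a level ten times higher when the transformation rate constant is ten times lower, which is the sense in which accumulation of a starting metabolite points to transformation as the limiting step.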
Fig. 3. Metabolite profiles of Bacillus subtilis cells growing on five different carbon sources. a-e Metabolite profiles of the cells growing on glucose, fructose, glycerol, gluconate, and malate, respectively. Metabolites were extracted from the cells at the maximum growth rate in each culture. Circles are metabolites and are located on the metabolic pathways of glycolysis, pentose phosphate cycle, and tricarboxylic acid (TCA) cycle. The size of each circle corresponds to the intracellular amount of the metabolite. The number shown on each circle is the intracellular concentration of the metabolite in mM
4.3. Physiological Meanings of Similar Metabolite Profiles
The metabolite profiles we observed were those of the B. subtilis cells that were grown at the maximum growth rate. The maximum growth rates observed on different carbon sources were the same, and the metabolite profiles were also similar, independent of the carbon sources. This suggests that the similar profiles might form the biochemical basis of the same maximum growth rates. In other words, B. subtilis cells that were cultured on different carbon sources globally regulated the metabolism to maintain the metabolite profile optimized for the maximum growth rate. The optimized metabolite profile must be equal to the average of the five metabolite profiles we observed. Bacillus subtilis might use the profile as a template for metabolic regulation. Every living organism has its own template because each has specific nutritional requirements and metabolic pathway networks.
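The idea of an average profile serving as a template can be expressed as a per-metabolite mean over the five cultures, with each culture then described by its deviation from that mean. The sketch below uses invented placeholder concentrations for two unnamed metabolites purely to show the calculation; none of the numbers are measured values, and the function names are introduced here.

```python
from statistics import mean

# Placeholder concentrations (mM) for two unnamed metabolites; NOT measured data.
PROFILES = {
    "glucose":   {"metabolite_1": 1.0, "metabolite_2": 0.5},
    "fructose":  {"metabolite_1": 1.0, "metabolite_2": 0.5},
    "gluconate": {"metabolite_1": 1.0, "metabolite_2": 0.5},
    "glycerol":  {"metabolite_1": 1.0, "metabolite_2": 0.5},
    "malate":    {"metabolite_1": 1.0, "metabolite_2": 3.0},
}

def template_profile(profiles):
    """Per-metabolite mean over all cultures."""
    metabolites = next(iter(profiles.values())).keys()
    return {m: mean(p[m] for p in profiles.values()) for m in metabolites}

def deviations(profile, template):
    """Difference between one culture's profile and the template."""
    return {m: profile[m] - template[m] for m in template}

template = template_profile(PROFILES)
print("template:", template)
print("malate deviations:", deviations(PROFILES["malate"], template))
```

In this toy data set one metabolite is identical across cultures and contributes no deviation, while the other stands out in a single culture, mirroring the picture of a shared profile with local perturbations.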
4.4. Fructose-1,6-Bisphosphate (F16P), a Key Metabolite in the Uptake of Glucose and Fructose
Three types of transport systems are known for the uptake of carbon sources in B. subtilis: phosphotransferase systems (PTSs), channels, and active transporters [11]. Glucose and fructose are transported by different PTSs, PtsG and FruA, respectively. Glycerol and gluconate are transported by a channel protein, GlpF, and an active transporter protein, GntP, respectively. The transporter system for malate has not been identified. The two PTSs first phosphorylate glucose and fructose in the culture medium to the corresponding phosphates, glucose-6-phosphate and fructose-1-phosphate, respectively, then transport the phosphorylated sugars into the cell (Fig. 1). The proposed mechanism of the catabolite repression in B. subtilis supposes that the transport of glucose and fructose by PTS and the following catabolism of their phosphates increase the intracellular concentration of fructose-1,6-bisphosphate, F16P. F16P is assumed to be the key metabolic intermediate that induces catabolite repression. The measured amount of F16P in the cells growing on the suppressing carbon sources (Fig. 3a,b) was as low as that of F16P in the cells grown on the suppressed carbon sources (Fig. 3c-e). These profiles are not consistent with the proposed mechanism; F16P is not the sole factor that induces catabolite repression in B. subtilis.
4.5. Combining a Metabolite Profile with a Gene Expression Profile
Fujita et al. cultured B. subtilis on malate and glucose, extracted mRNAs from the cells at the middle logarithmic phase of each culture, OD600=0.8, and measured the gene expressions by using B. subtilis DNA microarrays that contained 4055 protein genes [12]. The complete microarray data of their study are available on the KEGG web site, http://www.genome.jp/kegg/expression. The ratio (malate/glucose) of the gene expressions in the cells growing on malate to those on glucose contains the sum of the difference between the suppressed and the suppressing carbon sources, and that between the gluconeogenesis and the glycolysis. The ratios (malate/glucose) of enzyme genes are overlaid on the metabolite profile in the cells growing on malate (Fig. 4). In Fig. 4, the names of genes are in boxes when their expression increased or decreased more than twofold on malate. The catabolism of malate and glucose shares the Embden-Meyerhof pathway, although their carbon atoms flow in opposite directions; malate is
gluconeogenesis, whereas glucose is glycolysis. In addition, we observed that the metabolite profiles and maximum cell growth rates on the two carbon sources were similar. These facts prompted us to expect that the gene expression profiles would also be similar in cells growing on either malate or glucose. However, the observed gene transcripts were significantly reduced in the Embden-Meyerhof pathway on malate, because the five enzyme genes in the Embden-Meyerhof pathway, eno, pgm, tpiA, pgk, and gapA, form an operon on the B. subtilis genome, whose expressions are regulated by glucose (Fig. 5). In spite of the reduced ratio (malate/glucose) in the central carbon flow pathway, the metabolite profile and maximum cell growth rate on malate remained similar to those on glucose. This suggests that the enzymes in the Embden-Meyerhof pathway might be expressed at levels sufficient for the gluconeogenesis of malate to support the maximum growth rate, because a physiological role of the pathway is to carry the central carbon flow. Among the enzyme reactions in the Embden-Meyerhof pathway in B. subtilis, the sole irreversible reaction is the transformation between glycerate-1,3-diphosphate and glyceraldehyde-3-phosphate. GapA and GapB irreversibly catalyze the reactions in glycolysis and gluconeogenesis, respectively. Because GapB is indispensable for the catabolism of malate, the expression of GapB was 61-fold higher on malate than on glucose. Malate as the starting metabolite could be catabolized by three possible pathways: from malate to pyruvate, from oxaloacetate to phosphoenolpyruvate, and from acetyl-CoA to pyruvate. The ratio (malate/glucose) of all the enzyme genes in the three pathways was higher than twofold. In spite of the upregulation, malate accumulated in the cells, suggesting that the three pathways are rate-limiting in the catabolism of malate.
A bypass for the gluconeogenesis of malate is the Entner-Doudoroff pathway that produces 6-phosphogluconate via 2-dehydro-3-deoxygluconate from pyruvate. The enzyme genes in this pathway were upregulated by 2-4 times on malate. DNA microarrays predicted the major metabolic pathways used for the catabolism of malate. The ratio (malate/glucose) was increased at the enzyme genes within the distance of two or three reaction steps from malate.
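The twofold criterion used to mark genes in Fig. 4 amounts to a symmetric cutoff on the expression ratio (malate/glucose). The sketch below shows one way such a classification could be written. Only the ratio of 61 for gapB comes from the text; the other two entries are hypothetical placeholders, and the function name `classify` is introduced here.

```python
import math

def classify(ratio, threshold=2.0):
    """Label a malate/glucose expression ratio with a symmetric fold-change cutoff."""
    if ratio > threshold:
        return "up"
    if ratio < 1.0 / threshold:
        return "down"
    return "unchanged"

# gapB ratio (61) is quoted in the text; the other ratios are hypothetical placeholders.
ratios = {"gapB": 61.0, "hypothetical_gene_1": 0.3, "hypothetical_gene_2": 1.4}

for gene, ratio in ratios.items():
    print(f"{gene}: log2 ratio = {math.log2(ratio):+.2f}, {classify(ratio)}")
```

Working with the log2 of the ratio makes the cutoff symmetric around zero, so that a twofold increase and a twofold decrease are treated as changes of equal size.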
Fig. 4. Combined profiles of metabolites and gene expression of Bacillus subtilis. See Color Plate 2. The gene expression profile was overlaid on the metabolite profile of the cells grown on malate (Fig. 3e). Letters in italic are the gene names of enzymes. The number in parentheses after each gene name shows the ratio of gene expression in the cells growing on malate to that on glucose; ratio (malate/glucose). Genes in red were upregulated on malate, whereas those in blue were downregulated. Malate increased the gene expression of the enzymes for gluconeogenesis
Fig. 5. Operon composed of the five enzyme genes on Embden-Meyerhof pathway. From the KEGG database
5. Conclusions
The metabolite profiles of B. subtilis cells are similar irrespective of the carbon source, regardless of whether that source suppresses others or is suppressed by others. All the similar profiles were measured at the maximum growth rate, suggesting that B. subtilis has a predetermined metabolite profile optimized for the maximum growth rate. Differences in carbon sources induced local perturbations in the predetermined profile. One such perturbation was the accumulation of the starting metabolites of the suppressed carbon sources. Combined analysis of the metabolite profile and DNA microarrays revealed that the first reaction in the catabolism was rate-limiting when B. subtilis was grown on suppressed carbon sources, although the enzyme genes of the reactions were upregulated. The present analysis suggests that a decrease or increase in the gene expression of an enzyme does not always result in the accumulation or depletion of its substrates or products, because of the multiplicity of metabolic pathway networks. Metabolome and transcriptome data, which supplement each other, provide much information for studying the global regulation of metabolism.
References
1. Yoshida K, Kobayashi K, Miwa Y, Kang CM, Matsunaga M, Yamaguchi H, Tojo S, Yamamoto M, Nishi R, Ogasawara N, Nakayama T, Fujita Y (2001) Combined transcriptome and proteome analysis as a powerful approach to study genes under glucose repression in Bacillus subtilis. Nucleic Acids Res 29:683-692
2. Zheng D, Constantinidou C, Hobman JL, Minchin SD (2004) Identification of the CRP regulon using in vitro and in vivo transcriptional profiling. Nucleic Acids Res 32:5874-5893
3. Kolb A, Busby S, Buc H, Garges S, Adhya S (1993) Transcriptional regulation by cAMP and its receptor protein. Annu Rev Biochem 62:749-795
4. Saier M, Ramseier T, Reizer J (1996) Regulation of carbon utilization. In: Neidhardt F, Curtiss R, Ingraham J, et al (eds) Escherichia coli and Salmonella: cellular and molecular biology, vol. 1. ASM Press, Washington, DC, pp 1325-1343
5. Fujita Y, Miwa Y, Galinier A, Deutscher J (1995) Specific recognition of the Bacillus subtilis gnt cis-acting catabolite-responsive element by a protein complex formed between CcpA and seryl-phosphorylated HPr. Mol Microbiol 17:953-960
6. Soga T, Ueno Y, Naraoka H, Matsuda K, Tomita M, Nishioka T (2002) Pressure-assisted capillary electrophoresis electrospray ionization mass spectrometry for analysis of multivalent anions. Anal Chem 74:6224-6229
7. Soga T, Ueno Y, Naraoka H, Ohashi Y, Tomita M, Nishioka T (2002b) Simultaneous determination of anionic intermediates for Bacillus subtilis metabolic pathways by capillary electrophoresis electrospray ionization mass spectrometry. Anal Chem 74:2233-2239
8. Fujita Y, Freese E (1979) Purification and properties of fructose-1,6-bisphosphatase of Bacillus subtilis. J Biol Chem 254:5340-5349
9. Arita M (2003) In silico atomic tracing by substrate-product relationships in Escherichia coli intermediary metabolism. Genome Res 13:2455-2466
10. Arita M (2004) The metabolic world of Escherichia coli is not small. Proc Natl Acad Sci USA 101:1543-1547
11. Deutscher J, Galinier A, Martin-Verstraete I (2002) Carbohydrate uptake and metabolism. In: Sonenshein A, Hoch J, Losick R (eds) Bacillus subtilis and its closest relatives from genes to cells. ASM Press, Washington DC, pp 129-162
12. Doan T, Servant P, Tojo S, Yamaguchi H, Lerondel G, Yoshida K-I, Fujita Y, Aymerich S (2003) The Bacillus subtilis ywkA gene encodes a malic enzyme and its transcription is activated by the YufL/YufM two-component system in response to malate. Microbiology 149:2331-2343
Chapter 10: Metabolomics in Arabidopsis thaliana
Kazuki Saito
Graduate School of Pharmaceutical Sciences, Chiba University, CREST of Japan Science and Technology Agency, 1-33 Yayoi-cho, Inage-ku, Chiba 263-8522, Japan and RIKEN Plant Science Center, Tsurumi-ku, Yokohama 230-0045, Japan
1. Introduction
Plant science has entered the "post-genome (sequence) era" in earnest with the completion of the whole genome sequences of Arabidopsis thaliana and of rice. Plant metabolomics has recently emerged as an important field of post-genome sciences. Even if there is no visible change in cells and individual plants, metabolomics, which allows phenotyping by exhaustive metabolic profiling, can show precisely how the cells respond as a system. A. thaliana, a model plant for modern plant science, offers an advantage because the resources related to its genome sequence can be fully applied. In this chapter, metabolomics studies with A. thaliana are described. Several reviews and commentary articles can be consulted for more detailed discussion [1-8].
2. The Impact of Metabolomics Study with a Model Plant for Genomics
Understanding an entity as a system through its comprehensive analysis is characteristic of post-genome (sequence) science. In other words, the fields of genomics (all genome sequences), transcriptomics (all cellular transcripts), proteomics (all cellular proteins), and metabolomics (all cellular low-molecular-weight metabolites) comprise post-genome science. The general idea of "metabolomics" or "metabolome" was first defined several years ago in the field of microbiology [9], and its importance in plant science was pointed out immediately after that [1]. However, until recently it has been overlooked in comparison with the other so-called "-ome sciences." For example, in a PubMed search in June 2004, the number of "genome or genomics" related articles amounted to 97 420; "transcriptome or transcriptomics," 820; "proteome or proteomics," 5393; however, the number for "metabolome or metabolomics" totaled only 163. But, since the utility of crops, medicinal plants, industrial plants, etc., is due to the variety and producibility of their metabolites, the importance of metabolomics should be emphasized more for plants than for microorganisms and animals. The number of metabolites from the plant kingdom has been estimated at 200 000 or even more, up to 1 000 000 [7,10]. These numbers are significantly greater than those of microorganisms and animals. Metabolomics is based upon the nontargeted comprehensive analysis of total metabolites, while the usual "metabolite analysis" or "phytochemical analysis" is based upon the targeted analysis of a particular group of compounds. In addition, metabolomics is integrated with other -ome sciences with the aid of bioinformatics, as shown in Fig. 1. In other words, the whole metabolic change can be related to genome, transcriptome, and protein functions for a better understanding of an entity as a system. Although the term "metabolite profiling" or "metabolic profiling" has been used for many years to refer to comprehensive profiling of metabolic change, metabolomics differs from classical "metabolite profiling" in that it is integrated with other genome sciences. The following four classifications are defined by Fiehn as strictly relevant to research related to metabolomics [2]:
- Target analysis
- Metabolic profiling
- Metabolomics
- Metabolic fingerprinting
3. The Elements Composing Metabolomics
Metabolomics consists of chemical analysis of metabolites and in silico analysis of the data. Chemical analysis is the basis of metabolome analysis, and it is used for the comprehensive qualitative and quantitative analysis of all metabolites. To do this, proper extraction methods that do not change the metabolite profile and proper data acquisition technology are required. At present, mass spectrometry (MS) and nuclear magnetic resonance spectrometry (NMR) are mainly used for metabolome analysis. Hyphenated techniques that couple MS with several separation methods are often used, e.g., gas chromatography-mass spectrometry (GC/MS), high-performance liquid chromatography-mass spectrometry (LC/MS), and recently, capillary electrophoresis-mass spectrometry (CE/MS). Analytical methods without preseparation by chromatography, for example, Fourier-transform ion cyclotron resonance mass spectrometry (FT-ICR-MS) and time-of-flight mass spectrometry (TOF-MS), are also in practical use. Nuclear magnetic resonance analysis of the crude cell extract and on-line NMR analysis by the LC-stopped flow method have also been reported. For data analysis, multivariate analyses such as principal component analysis (PCA) and hierarchical cluster analysis (HCA), as well as self-organization mapping (SOM), are carried out for data mining. Further, integration of metabolome data with other -ome data is performed to identify gene/protein function and eventually leads to metabolic and cellular simulation in silico. Metabolite profiling can be applied to the identification of mutants and gene functions. This technology, called "chemical bio-panning," is a branch field of metabolomics.
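As an illustration of the multivariate step, the following is a minimal pure-Python sketch of principal component analysis that extracts only the first principal component by power iteration on the covariance matrix. The function name, the sample values, and the leaf/root labels are hypothetical and chosen only for illustration; a real analysis would normally use an established numerical library and many more metabolites.

```python
import math


def first_principal_component(rows, iterations=200):
    """Return (direction, scores) for the first principal component.

    rows: list of samples, each a list of metabolite intensities.
    Power iteration on the sample covariance matrix; a sketch, not a
    replacement for a full PCA implementation.
    """
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(p)]
           for i in range(p)]
    # Asymmetric start vector, to reduce the chance of starting orthogonal to PC1.
    v = [1.0 / (j + 1) for j in range(p)]
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    scores = [sum(c[j] * v[j] for j in range(p)) for c in centered]
    return v, scores


# Hypothetical intensities of three metabolites in two leaf and two root samples.
samples = [
    [10.0, 1.0, 5.0],   # leaf 1
    [11.0, 1.2, 5.1],   # leaf 2
    [2.0, 8.0, 5.0],    # root 1
    [2.5, 7.5, 4.9],    # root 2
]
direction, scores = first_principal_component(samples)
print(direction, scores)
```

In this toy data set the two leaf samples receive scores of one sign and the two root samples scores of the opposite sign, which is the kind of separation a PCA score plot is used to reveal.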
Fig. 1. Outline of post-genome sciences. Genomics, transcriptomics, proteomics, and metabolomics are integrated with the aid of bioinformatics. Complexity increases along the line from genomics to metabolomics
Table 1. Genome-related resources of Arabidopsis thaliana

Resource                                   Number
Annotated genes                            28 974
EST clones                                 178 000
Full-length cDNA clones                    27 000
T-DNA/transposon inserted mutant lines     140 000
Single nucleotide polymorphisms            57 000

Up to May 2003. EST, expressed sequence tag
4. Genome-Related Resources of Arabidopsis thaliana
Arabidopsis thaliana is highly suitable for carrying out research effectively, because its genome-related resources are well developed. The resources related to the Arabidopsis genome that are useful for metabolomics research are summarized in Table 1. Moreover, AraCyc (http://www.arabidopsis.org/tools/aracyc/), a database of the metabolic pathways of A. thaliana, is useful for metabolomics research on A. thaliana [11]. By June 2004, 186 metabolic pathways and 1161 reactions had been covered, and 53% of these reactions were annotated with enzymes and genes. This database will be improved from time to time.
5. Examples of Metabolomics Research of Arabidopsis
We are pursuing research on the comprehensive analysis of the gene expression-metabolic pathway network in plant cells. The response to nutritional stress of sulfur and nitrogen and the effects of a single Myb transcription factor have been investigated through transcriptome and metabolome analyses.
5.1. Change of Metabolome and Transcriptome by Nutrition Stresses
Arabidopsis plants were grown by hydroponic culture under nutritional stress of sulfur and/or nitrogen for 3 weeks (long-term starvation experiments). The plants were also grown on agar plates with sufficient nutrition for 3 weeks and then subjected to sulfur starvation or addition of O-acetylserine (OAS), a key intermediary metabolite of sulfur assimilation, for 2 days (short-term experiments) (Fig. 2). Transcriptome analysis with DNA microarrays was carried out for the leaves and roots of about 14 samples. With the same samples, metabolome analysis was carried out by combining nontargeted comprehensive FT-MS analysis with targeted analysis by capillary electrophoresis and HPLC for amino acids [12-15]. Data mining was conducted by pairwise one-to-one correlation, hierarchical cluster analysis, and self-organization mapping. The clusters of the experimental groups obtained by metabolomics corresponded to those obtained by transcriptomics (Fig. 3). In other words, both metabolome and transcriptome were influenced most predominantly by the difference of organ (leaf and root), followed by the difference of cultivation conditions (long-term hydroponic culture or short-term agar plate culture) and then by the treatments (sulfur and nitrogen starvation). Furthermore, it was shown that addition of OAS caused a transcriptome change nearly identical to that caused by sulfur deficiency, even under the sulfur-sufficient condition. Metabolome changes also showed similar trends, though less evidently than the transcriptome. These results indicated that OAS acts as a positive regulatory factor for transcriptome and metabolome, in addition to being a key metabolic intermediate in the sulfur assimilation pathway. Self-organization mapping of transcriptome and metabolome indicated the groups of genes and metabolites whose patterns of change are similar under given conditions. For example, the genes involved in photosynthesis form a cluster, and the genes involved in the pentose phosphate pathway form another cluster. In the same way, a group of metabolites or ion peaks of FT-MS exhibits similar patterns of change. The glucosinolate-related metabolites and the genes involved in glucosinolate metabolism are integrated in a metabolic map.
Further experiments with time-course changes show the coordinated modulation of glucosinolates together with their degradation products and the expression of genes involved in glucosinolate metabolism. These integrated analyses of metabolome and transcriptome allowed us to identify unknown gene functions in Arabidopsis; for example, see references [16,17]. Taken together, the networks of auxin and methyl jasmonate signaling, in addition to glucosinolate metabolism, were modulated by sulfur starvation [15].
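The pairwise one-to-one correlation used for data mining can be sketched as follows. This is a minimal illustration assuming Pearson correlation and an absolute cutoff of 0.9; the chapter does not state which correlation measure or cutoff was used, and all gene names, metabolite names, and values below are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equally long profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlated_pairs(genes, metabolites, cutoff=0.9):
    """List (gene, metabolite, r) for pairs with |r| >= cutoff (assumed cutoff)."""
    pairs = []
    for gene, expression in genes.items():
        for metabolite, level in metabolites.items():
            r = pearson(expression, level)
            if abs(r) >= cutoff:
                pairs.append((gene, metabolite, round(r, 3)))
    return pairs


# Hypothetical profiles across four experimental conditions.
genes = {"gene_A": [1.0, 2.0, 3.0, 4.0], "gene_B": [4.0, 1.0, 3.0, 2.0]}
metabolites = {"met_1": [2.0, 4.0, 6.0, 8.5], "met_2": [5.0, 5.1, 4.9, 5.0]}
pairs = correlated_pairs(genes, metabolites)
print(pairs)
```

Here only gene_A and met_1 change in parallel across the conditions, so they form the single reported pair; such pairs are candidates for a shared pathway or regulatory link, not proof of one.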
Fig. 2. Outline of nutritional stress experiments in Arabidopsis. Long-term starvation experiments: Arabidopsis plants were grown by hydroponic culture under nutritional stress of sulfur and/or nitrogen for 3 weeks. Short-term experiments: the plants were also grown on agar plates with sufficient nutrition for 3 weeks and then subjected to sulfur starvation or OAS addition for 2 days. OAS, O-acetylserine
Fig. 3. Hierarchical clustering analysis of transcriptome and metabolome data under nutritionally stressed conditions. See Color Plate 4. Names of experimental groups are listed in Fig. 2
5.2. Secondary Metabolism Controlled by PAP1 Transcriptional Factor
It is known that anthocyanins are highly accumulated when the PAP1 gene encoding a Myb-like transcription factor is constitutively expressed [18]. DNA microarray and metabolome analyses of the activation-tagged line and the cDNA-overexpressing line clarified the holistic effects of ectopic expression of a single Myb-like factor [19]. Metabolic profiling of flavonoids by LC/MS was combined with comprehensive nontargeted analysis by FT-MS. The effect of PAP1 gene expression was specific to increasing the accumulation of anthocyanins. Several new anthocyanins were tentatively identified from PAP1-overexpressing lines (Fig. 4). Expression of the genes involved in anthocyanin production was upregulated in parallel with these changes in anthocyanin metabolites. The particular gene whose expression was induced by PAP1 among paralogous members of a gene family was presumed to be actually responsible for the production of anthocyanins (Fig. 5).

Fig. 4. High-performance liquid chromatography/photodiode array/mass spectrometry (HPLC-PDA-MS) profile of the anthocyanin fraction of the wild-type (a) and pap1-D mutant (b) Arabidopsis. 1, cyanidin-3-Glc-(Xyl)-(coumaroyl-Glc)-5-Glc; 2, cyanidin-3-Glc-(Xyl)-(coumaroyl-Glc)-5-Glc-(malonyl); 3, cyanidin-3-Glc-(Xyl-sinapoyl)-(coumaroyl-Glc)-5-Glc; 4, cyanidin-3-Glc-(Xyl-sinapoyl)-(coumaroyl-Glc)-5-Glc-(malonyl); 5, cyanidin-3-Glc-(Xyl)-(coumaroyl)-5-Glc-(malonyl); 6, cyanidin-3-Glc-(Xyl-sinapoyl)-(coumaroyl)-5-Glc-(malonyl) (Tohge et al. [19])
Fig. 5. Intercalation of transcriptome and metabolome data on the biosynthetic pathway of flavonoids. Arrows indicate the paralogous genes in Arabidopsis. Black arrow indicates the upregulated genes in pap1-D mutant, and dashed arrow indicates the downregulated genes. The levels of cyanidin derivatives and quercetin derivatives increased, and that of kaempferol derivatives decreased. PAL, phenylalanine ammonia lyase; C4H, cinnamate 4-hydroxylase; 4CL, 4-coumarate coenzyme A ligase; CHS, chalcone synthase; CHI, chalcone isomerase; F3H, flavanone 3-hydroxylase; FLS, flavonol synthase; FGT, flavonoid glycosyltransferase; F3'H, flavonoid 3'-hydroxylase; DFR, dihydroflavonol reductase; ANS, anthocyanidin synthase; AAT, anthocyanin acyltransferase; GST, glutathione-S-transferase

These assumptions allowed the tentative identification of specific genes responsible for the biosynthesis of anthocyanin. For example, 110 glycosyltransferase-family genes, 50 acyltransferase-family genes, and 30 glutathione-S-transferase-family genes are encoded in the genome of A. thaliana. By integrating metabolome and transcriptome data on some reactions, we were able to comprehensively predict the function of the related genes. Thereafter, some functions were experimentally identified by classical analysis of T-DNA knock-out mutant lines and by in vitro enzymatic study using recombinant proteins. Not only the genes coding for the biosynthetic enzymes, but also the genes for transporter-like proteins and regulatory proteins were predicted to be involved from the results of PAP1 gene overexpression. These results show the usefulness of integrated analysis of metabolome with transcriptome for functional genomics of Arabidopsis [16]. The protocol of integrating metabolome and transcriptome for gene identification is applicable not only to Arabidopsis but also to useful exotic plants. Perilla frutescens, a herbal plant producing anthocyanin, and Ophiorrhiza pumila, a plant producing the antitumor alkaloid camptothecin, are being investigated by comprehensive metabolite profiling with both targeted and nontargeted analysis and by extensive gene expression profiling using polymerase chain reaction-select differential screening, expressed sequence tag analysis, and cDNA differential display [20,21].
5.3. The Metabolome Analysis of Arabidopsis Ecotypes
The group led by L. Willmitzer in the Max-Planck Institute of Plant Molecular Physiology in Germany has conducted pioneering work on plant metabolomics. They used GC/MS for nontargeted analysis of hydrophilic compounds, amino acids, organic acids, sugars, sugar alcohols, and amines, after methoxylation and silylation. About 330 metabolites were detected in A. thaliana leaf extracts and about half of them were identified [22-24]. Comparison of four Arabidopsis genotypes (two homozygous ecotypes, Col-2 and C24, and a mutant of each ecotype) indicated that each genotype possessed a distinct metabolic phenotype. Principal component analysis of those four metabolic phenotypes indicated that the metabolic phenotypes of the two ecotypes were more divergent than those of the mutant and their parent ecotypes [25]. Fiehn and coworkers are further investigating several different ecotypes and their recombinant inbred lines [26].
5.4. Secondary Metabolites by Capillary LC/ESI-MS
Capillary LC/ESI quadrupole time-of-flight mass spectrometry was applied to comprehensive analysis of Arabidopsis extracts [27]. About 2000 mass signals (1400 in leaves and 800 in roots) were detected, and most of them were presumed to be secondary products. As a case study, this analytical method was applied to the tt4 mutant lacking a functional chalcone synthase. In addition to the expected disappearance of kaempferol and its glycosides in the mutant, the levels of some indole-derived metabolites were reduced, suggesting the connection of the flavonoid metabolism to the indole metabolism such as auxin.
5.5. ¹H-NMR for Arabidopsis Metabolomics
¹H-NMR spectroscopy has been used for metabolomics of Arabidopsis. The lyophilized tissue of Arabidopsis was extracted with deuterated H2O-MeOH and directly analyzed by ¹H-NMR for metabolic fingerprinting [28]. Nine ecotypes of Arabidopsis were investigated for clustering by principal component analysis. Response to salt stress in Arabidopsis was analyzed by ¹H-NMR fingerprinting [29].
6. Chemical Bio-panning with Arabidopsis T-DNA Mutants
Nontargeted metabolite profiling can be applied to screen mutants in which accumulation of particular metabolites is expected. This is referred to as "chemical bio-panning" [30]. The screening of Arabidopsis mutants exhibiting different accumulation patterns of flavonoids led to the selection of interesting loss-of-function mutants in 1991 [31]. However, it was not easy to identify the gene(s) responsible for a mutation caused by chemical mutagenesis. Nowadays, gain-of-function Arabidopsis mutant lines are available through T-DNA activation technology. Combining T-DNA activation-tagged mutants of Arabidopsis with this screening program allows rapid identification of genes involved in particular metabolic pathways. Screening can be carried out by nontargeted chemical analysis using GC/MS, HPLC, and CE, or by targeted analysis of a specific group of compounds such as anthocyanins and thiol compounds (easy to detect by fluorescent HPLC). Positive selection against growth inhibitors (heavy metals, metabolite analogues, herbicides, etc.) is also useful for large-scale screening.
Acknowledgments
The author would like to thank his colleagues who shared unpublished data. The author's original studies described in this chapter were supported in part by Grants-in-Aid for Scientific Research from the Ministry of Education, Culture, Sports, Science and Technology, Japan, and by CREST of Japan Science and Technology Agency (JST), Japan.
References
1. Trethewey RN, Krotzky AJ, Willmitzer L (1999) Metabolic profiling: a Rosetta stone for genomics? Curr Opin Plant Biol 2:83-85
2. Fiehn O (2002) Metabolomics – the link between genotypes and phenotypes. Plant Mol Biol 48:155-171
3. Weckwerth W, Fiehn O (2002) Can we discover novel pathways using metabolomics analysis? Curr Opin Biotech 13:156-160
4. Roessner U, Willmitzer L, Fernie AR (2002) Metabolic profiling and biochemical phenotyping of plant systems. Plant Cell Rep 21:189-196
5. Fernie AR (2003) Metabolome characterization of plant system analysis. Funct Plant Biol 30:111-120
6. Sumner LW, Mendes P, Dixon RA (2003) Plant metabolomics: large-scale phytochemistry in the functional genomics era. Phytochemistry 62:817-836
7. Trethewey R (2004) Metabolite profiling as an aid to metabolic engineering in plants. Curr Opin Plant Biol 7:196-201
8. Kopka J, Fernie A, Weckwerth W, Gibon Y, Stitt M (2004) Metabolite profiling in plant biology: platforms and destinations. Genome Biol 5:109
9. Tweeddale H, Notley-McRobb L, Ferenci T (1998) Effect of slow growth on metabolism of Escherichia coli, as revealed by global metabolite pool ("Metabolome") analysis. J Bacteriol 180:5109-5116
10. Dixon RA, Strack D (2003) Phytochemistry meets genome analysis, and beyond. Phytochemistry 62:815-816
11. Mueller LA, Zhang P, Rhee SY (2003) AraCyc. A biochemical pathway database for Arabidopsis. Plant Physiol 132:453-460
12. Aharoni A, de Vos CH, Verhoeven HA, Maliepaard CA, Kruppa G, Bino RJ, Goodenowe DB (2002) Nontargeted metabolome analysis by use of Fourier transform ion cyclotron mass spectrometry. OMICS J Integr Biol 6:217-234
13. Hirai MY, Fujiwara T, Awazuhara M, Kimura T, Noji M, Saito K (2003) Global expression profiling of sulfur-starved Arabidopsis by DNA macroarray reveals the role of O-acetyl-L-serine as a general regulator of gene expression in response to sulfur nutrition. Plant J 33:651-663
14. Hirai MY, Yano M, Goodenowe D, Kanaya S, Kimura T, Awazuhara M, Arita M, Fujiwara T, Saito K (2004) Integration of transcriptomics and metabolomics for understanding of global responses to nutritional stresses in Arabidopsis thaliana. Proc Natl Acad Sci USA 101:10205-10210
15. Hirai MY, Saito K (2004) Post-genomics approaches for the elucidation of plant adaptive mechanisms to sulphur deficiency. J Exp Bot 55:1871-1879
16. Saito K (2004) Functional genomics through integration of transcriptomics and metabolomics in Arabidopsis thaliana. Third International Conference on Plant Metabolomics, Ames, Iowa, 3-6 June 2004, Abstracts, p 19 (http://www.metabolomics.nl/)
17. Nikiforova V, Freitag J, Kempa S, Adamik M, Hesse H, Hoefgen R (2003) Transcriptome analysis of sulfur depletion in Arabidopsis thaliana: interlacing of biosynthetic pathways provides response specificity. Plant J 33:633-650
18. Borevitz JO, Xia Y, Blount J, Dixon RA, Lamb C (2000) Activation tagging identifies a conserved MYB regulator of phenylpropanoid biosynthesis. Plant Cell 12:2383-2393
19. Tohge T, Nishiyama Y, Hirai MY, Yano M, Nakajima J, Awazuhara M, Inoue E, Takahashi H, Goodenowe DB, Kitayama M, Noji M, Yamazaki A, Saito K (2005) Functional genomics by integrated analysis of metabolome and transcriptome of Arabidopsis plants overexpressing an MYB transcription factor. Plant J 42:218-235
20. Yamazaki M, Nakajima J-I, Yamanashi M, Sugiyama M, Makita Y, Springob K, Awazuhara M, Saito K (2003) Metabolomics and differential gene expression in anthocyanin chemo-varietal forms of Perilla frutescens. Phytochemistry 62:987-995
21. Yamazaki Y, Urano A, Sudo H, Kitajima M, Takayama H, Yamazaki M, Aimi N, Saito K (2003) Metabolite profiling of alkaloids and strictosidine synthase activity in camptothecin producing plants. Phytochemistry 62:461-470
22. Fiehn O, Kopka J, Trethewey RN, Willmitzer L (2000) Identification of uncommon plant metabolites based on calculation of elemental compositions using gas chromatography and quadrupole mass spectrometry. Anal Chem 72:3573-3580
23. Roessner U, Wagner C, Kopka J, Trethewey RN, Willmitzer L (2000) Simultaneous analysis of metabolites in potato tuber by gas chromatography-mass spectrometry. Plant J 23:131-142
24. Roessner U, Luedemann A, Brust D, Fiehn O, Linke T, Willmitzer L, Fernie AR (2001) Metabolic profiling allows comprehensive phenotyping of genetically or environmentally modified plant systems. Plant Cell 13:11-29
25. Fiehn O, Kopka J, Doermann P, Altman T, Trethewey RN, Willmitzer L (2000) Metabolite profiling for plant functional genomics. Nat Biotechnol 18:1157-1161
26. Fiehn O (2004) High quality automated analyses of Arabidopsis metabolic phenotypes: Ecotypes, environmental stress and crosses. Third International Conference on Plant Metabolomics, Ames, Iowa, 3-6 June 2004, Abstracts, p 8 (http://www.metabolomics.nl/)
27. Von Roepenack-Lahaye E, Degenkolb T, Zerjeski M, Franz M, Roth U, Wessjohann L, Schmidt J, Scheel D, Clemens S (2004) Profiling of Arabidopsis secondary metabolites by capillary liquid chromatography coupled to electrospray ionization quadrupole time-of-flight mass spectrometry. Plant Physiol 134:548-559
28. Ward JL, Harris C, Lewis J, Beale MH (2003) Assessment of ¹H NMR spectroscopy and multivariate analysis as a technique for metabolite fingerprinting of Arabidopsis thaliana. Phytochemistry 62:949-957
29. Kikuchi J, Shinozaki K, Hirayama T (2004) Stable isotope labeling of Arabidopsis thaliana for an NMR-based metabolomics approach. Plant Cell Physiol 45:1099-1104
30. Dixon RA (2001) Phytochemistry in the genomics and post-genomics eras. Phytochemistry 57:145-148
31. Graham TL (1991) A rapid, high resolution high performance liquid chromatography profiling procedure for plant and microbial aromatic secondary metabolites. Plant Physiol 95:584-593
Chapter 11: Lipidomics: Metabolic Analysis of Phospholipids
Ryo Taguchi
Department of Metabolome, Graduate School of Medicine, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-0033, Japan
1. Introduction
In the study of phospholipases, which have long been the subject of our research, we have found that soft ionization mass spectrometry by electrospray ionization (ESI-MS) is effective and useful in elucidating their substrate specificity at the level of individual molecular species of phospholipids and their physiological functions [1-3]. We therefore began a comprehensive analysis with ESI-MS. There are several advantages in applying ESI-MS to the analysis of phospholipids. First, ESI is a milder ionization method than matrix-assisted laser desorption/ionization (MALDI). It is therefore well suited to analyzing phosphoethers and lipid esters, which have relatively weak intramolecular bonds. Second, phospholipids can be easily ionized by ESI because of their phosphoryl group and their polar head group. Third, in the case of lipid analysis, most of the nonvolatile ions can be removed during extraction with organic solvents. This is convenient when the analysis needs to be done without these salts. Also, lipids, unlike peptides, include molecular species that differ from each other only in the number of unsaturated bonds. Liquid chromatography-mass spectrometry (LC/MS), which can take advantage of both the separation by high-performance liquid chromatography (HPLC) and the separation according to mass values, was found to be effective [4]. In addition, recent research has shown that high-resolution, high-mass-accuracy analysis by Fourier-transform ion cyclotron resonance mass spectrometry (FT-ICR-MS) is effective for lipid analyses because of its ability to distinguish each molecular species by its atomic composition [5]. We have attempted various types of experiments, such as combined analysis with HPLC, analysis with a change of ionization conditions in positive and negative ion modes [4], high-resolution analysis at the level of 1 to 2 ppm [5], tandem mass analysis with fragment data, and quantitative analyses with multiple reaction monitoring and selected reaction monitoring. Among the many experimental modes, an appropriate analytical method should be chosen according to the purpose and the accuracy that needs to be achieved in each analysis. We are now making an effort to construct a lipid database and a search engine that will be useful in analyzing the various mass data obtained from each analytical method.
2. Analysis by Mass Spectrometry for Lipid IVIetabolomes In our studies, we selected several different approaches to mass spectrometric analysis of lipids (Table 1) [4-7]. The most popular methods that have been used in metabolic analysis were selected ion monitoring (SIM) and multiple reaction monitoring (MRM). These methods were normally used in combination with HPLC as LC/MS. The individual metabolites were identified from their retention time and m/z value. In the case of MRM, essentially the combination with the detection of precursor ions and major fragment ions was used. Even in this analysis, ESI-MS made it possible to detect more than ten metabolites by a single LC analysis. MRM is commonly used in quantitative analysis by mass spectrometry. But both in SIM and MRM analyses, the target metabolites to be analyzed must be defined in advance, and data of their molecular masses and their fragments are required in advance to set the analytical conditions. On the other hand, comprehensive analysis by soft ionization is essentially used for crude mixtures containing many different metabolites, and these metabolites existing in the samples are expected to be identified as much as possible. Even in the absence of preliminary perception for a specific metabolite before mass analysis, the significant difference in profiling data of metabolites can be obtained by mass analyses. In this case, however, some focuses by an individual researcher in the specific category of metabolites may be effective in detecting an important factor with a low amount. For this purpose, a precursor ion scanning method and neutral loss scanning methods are used for comprehensive analysis of focused categories of metabolites with structural similarities (6). By using these analytical methods, the detection limits of individual metabolites are increased up to 100-fold. 
By focusing on a limited category of metabolites, the detection sensitivity is greatly improved, making it possible to detect minor but important metabolites. We are now attempting to optimize the collision conditions for individual metabolites in order to use these methods for detecting specified classes of phospholipids.
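The two group-specific scan modes described above can be emulated in software on a set of MS/MS spectra, which may help clarify what each mode selects. The following is a minimal sketch, not the authors' acquisition method: the `MsMsSpectrum` class, the tolerance, and the example spectra are hypothetical, and the diagnostic values (a phosphocholine fragment near m/z 184 and a neutral loss of about 141 for the phosphoethanolamine head group in positive-ion mode) are commonly cited values that are assumed here rather than taken from this chapter.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MsMsSpectrum:
    """One hypothetical MS/MS spectrum: a precursor and its fragment m/z values."""
    precursor_mz: float
    fragment_mzs: List[float]


def precursor_ion_hits(spectra, fragment_mz, tol=0.3):
    """Emulate precursor ion scanning: keep precursors that yield a given fragment."""
    return [s.precursor_mz for s in spectra
            if any(abs(f - fragment_mz) <= tol for f in s.fragment_mzs)]


def neutral_loss_hits(spectra, loss, tol=0.3):
    """Emulate neutral loss scanning: keep precursors that lose a given neutral mass."""
    return [s.precursor_mz for s in spectra
            if any(abs((s.precursor_mz - f) - loss) <= tol for f in s.fragment_mzs)]


# Hypothetical example spectra (values chosen for illustration only).
spectra = [
    MsMsSpectrum(760.6, [184.1, 577.5]),  # PC-like: shows the m/z 184 head-group fragment
    MsMsSpectrum(718.5, [577.5]),         # PE-like: 718.5 - 577.5 = 141.0
    MsMsSpectrum(850.7, [264.3]),         # matches neither criterion
]

print(precursor_ion_hits(spectra, 184.1))
print(neutral_loss_hits(spectra, 141.0))
```

Because only spectra that meet the class-specific criterion are retained, chemical background from unrelated compounds is filtered out, which is one way to read the sensitivity gain described in the text.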
Table 1. Several analytical methods for lipid metabolome by mass spectrometry
Comprehensive analysis focused on whole phospholipids
    LC/MS and cycle sequence
    High-resolution analysis by FT-ICR-MS
Group-specific analysis
    Neutral loss scanning
    Precursor ion scanning (class-specific or fatty acid-specific analysis)
Molecular-specific analysis
    SRM, selected reaction monitoring
    MRM, multiple reaction monitoring
    (quantitative analysis; combination of molecular-related ions and their fragments)
On the other hand, when MS/MS is not used, the combination of differences in elution time, as determined by HPLC separation, and mass resolution in terms of m/z values provides the key data for detecting individual metabolites. In this system, separation methods such as HPLC and capillary electrophoresis (CE) are used to reduce the number of metabolites eluted at the same retention time. Further, in FT-ICR-MS analysis, when a limited mass range of metabolites is selected, very small mass differences such as 0.001 can be effectively detected with a high resolution of 10^. Even with broadband analysis, a mass resolution of more than 100 000 and a mass accuracy of better than 2 ppm can be obtained [5]. At this resolution, the elemental composition of individual metabolites can be determined from their accurate mass values. Recently, untargeted metabolic analysis by FT-ICR-MS has been applied to metabolomics by several venture companies. We are planning to construct several different lipid databases and search tools for mass data obtained by different analytical methods, such as comprehensive analysis of whole phospholipids, focused analysis of a specific fatty acid or a specific polar head group, or identification of a single individual metabolite.
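The relation between mass difference, resolving power, and ppm accuracy mentioned above can be made concrete with a few lines of arithmetic. This is a simplified sketch under stated assumptions: resolving power is taken as m/Δm without regard to how peak width is defined, and the function names and the example m/z values are illustrative, not taken from the chapter.

```python
def ppm_error(measured_mz, theoretical_mz):
    """Mass error of a measurement in parts per million."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6


def required_resolving_power(mz, delta_mz):
    """Approximate resolving power (m / delta m) needed to separate two peaks."""
    return mz / delta_mz


def mass_window(mz, ppm):
    """Lower and upper m/z bounds for a given ppm tolerance."""
    half_width = mz * ppm * 1e-6
    return mz - half_width, mz + half_width


# Illustrative values: two peaks near m/z 800 that differ by 0.001.
print(required_resolving_power(800.0, 0.001))
# A 2 ppm window around m/z 800 spans about 0.0032 in total.
print(mass_window(800.0, 2.0))
```

Under these assumptions, a difference of 0.001 near m/z 800 calls for a resolving power on the order of several hundred thousand, which is why such separations are reported for a limited mass range rather than for broadband acquisition.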
3. Features of Lipid Analysis by ESI-MS
In ionization by ESI, the observed ions vary according to the polar head group of the phospholipid. In addition, neutral lipids such as triglycerides can also be effectively ionized as ammonium-adduct ions in the presence of ammonium formate. The m/z values in both positive and negative ion modes are very important for identifying phospholipid classes. The species of polar head group and the hydrophilic-hydrophobic balance greatly influence ionization efficiency. For quantification, it is therefore necessary to correct for differences in ionization due to the polar head group, fatty acyl chain length, and number of double bonds of individual metabolites. Using a triple-stage quadrupole MS with a 1-h LC/MS cycle sequence, most of the major phospholipids are roughly separated by polar head group (Fig. 1) [4]. In this analysis, in addition to normal detection of positive and negative ions, detection by in-source fragmentation at a higher collision energy was used. Individual molecular species of phospholipids were identified by pseudo-two-dimensional analysis of retention time and m/z value (Fig. 2).
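The pseudo-two-dimensional identification by retention time and m/z can be sketched as a lookup in which a peak is assigned only when both coordinates agree with a known species. This is a hedged illustration: the `Target` and `Peak` types, the tolerances, and in particular the retention times in the target list are hypothetical; the m/z values are nominal protonated masses used for illustration only.

```python
from typing import List, NamedTuple, Optional


class Target(NamedTuple):
    name: str
    mz: float
    rt_min: float  # expected retention time in minutes (hypothetical values below)


class Peak(NamedTuple):
    mz: float
    rt_min: float
    intensity: float


def assign_peak(peak: Peak, targets: List[Target],
                mz_tol: float = 0.5, rt_tol: float = 1.0) -> Optional[str]:
    """Return the name of the closest target within both tolerances, else None."""
    candidates = [t for t in targets
                  if abs(t.mz - peak.mz) <= mz_tol
                  and abs(t.rt_min - peak.rt_min) <= rt_tol]
    if not candidates:
        return None
    best = min(candidates,
               key=lambda t: (abs(t.mz - peak.mz), abs(t.rt_min - peak.rt_min)))
    return best.name


# Hypothetical target list; retention times are placeholders, not measured data.
targets = [
    Target("PC 34:1", 760.6, 22.0),
    Target("SM d18:1/16:0", 703.6, 25.0),
    Target("lysoPC 16:0", 496.3, 27.0),
]

print(assign_peak(Peak(760.5, 22.3, 1.0e5), targets))
print(assign_peak(Peak(760.5, 30.0, 1.0e5), targets))
```

Requiring agreement on both axes is what lets species with the same nominal mass but different head groups be told apart when they elute in different class-specific regions.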
Fig. 1. A two-dimensional map of phospholipids by mass spectrometry. PC, phosphatidylcholine; PE, phosphatidylethanolamine; SM, sphingomyelin; lysoPC, lyso phosphatidylcholine
[Figure 2: elution profiles plotted against retention time (min, 0 to 30); peak labels include MG, DG, PC, SM, PAF, lysoPC, lysoPE, and lysoPA, with retention times of 22.4, 25.1, and 27.4 min marked.]
Fig. 2. Elution profile of phospholipids by liquid chromatography-electrospray ionization mass spectrometry with a silica column. PC, phosphatidylcholine; PE, phosphatidylethanolamine; PI, phosphatidylinositol; PS, phosphatidylserine; PG, phosphatidylglycerol; SM, sphingomyelin; MG, monoacylglycerol; DG, diacylglycerol; TG, triacylglycerol; FA, fatty acid; PAF, platelet activating factor
4. Several Practical Approaches to Metabolic Analyses of Lipids
4.1. Comprehensive Analysis of Phospholipids from Whole Cells by LC/MS
The most abundant molecular species of phosphatidylcholine (PC), diacyl 16:0/18:1 PC, can be effectively identified with a total of 10^-10^ cells. However, more than 10^ cells are required for detecting much less abundant metabolites, such as those present at one thousandth of that level or less. In a typical analysis, a lipid mixture is extracted from 10^-10^ cells by the method of Bligh and Dyer. We analyzed the extract by liquid chromatography-electrospray mass spectrometry (LC/ESI-MS). For the identification of phospholipids, an effective database and search tool comparable to Mascot in proteomics does not yet exist. We are therefore constructing identification tools for individual metabolites from their mass spectra, using our own experimental data and expanded theoretical data containing molecular-related ions and their fragment ions, together with a search engine named "Lipid Search," which has been available on our web site since 2003.
4.2. Analysis of Phospholipids of Human Serum
In LC/ESI-MS analysis of phospholipids from human serum, lyso phosphatidylethanolamine (PE) and lyso phosphatidylcholine (PC) were analyzed at the level of their molecular species. Among the molecular species of lyso phosphatidic acid (PA), both 1-acyl and 2-acyl types were detected. Further, for cyclic PA, a small amount of the alkyl type was also observed. We confirmed that the neutral loss scanning method is very effective for detecting these small amounts of molecular species.
4.3. High-Resolution Analysis by FT-ICR-MS
An isotopic peak of PC containing one 13C atom was effectively separated and distinguished from a monoisotopic peak of sphingomyelin by practical broadband analysis with nano ESI-FT-ICR-MS. In FT-ICR-MS analysis, a mass difference of 0.01 amu can be easily detected at a resolution of 10^. As shown in Fig. 3, PC and sphingomyelin (SM) were identified without separation by HPLC. Similarly, in FT-ICR-MS analysis of PE from Caenorhabditis elegans, PE species with odd-numbered fatty acids were separated and identified from alkyl-acyl PE with even-numbered fatty acids [5]. Oxidized phospholipids are formed under oxidative stress, and these metabolites seem to be major components causing arteriosclerosis. Oxidized PCs from soybean were analyzed in the broadband analysis mode of nano ESI-FT-ICR-MS. 
Individual molecular species of oxidized PC, such as 34:3 diacyl PC with peroxide (+2O) (m/z 788.544) or 34:2 diacyl PC with peroxide (+2O) (m/z 790.560), were separated from their nonoxidized phospholipids [5]. With FT-ICR-MS, a resolution of more than 80 000 can be easily obtained even in broadband analysis mode. We assume that individual molecular species of PC and PE with mass differences of 0.01-0.05 can be identified by high-resolution analysis using FT-ICR-MS. This method is extremely effective for distinguishing diacyl, alkyl-acyl, and alkenyl-acyl subclasses with very close m/z values, within 0.04 of each other. Several recent papers have indicated the existence of significant amounts of phospholipid molecular species with odd-numbered acyl chains even in normal mammalian cells. For exact qualitative and quantitative determination of phospholipids with odd-numbered chains, both clear mass peak separation of individual molecular species with FT-ICR-MS and determination of fatty acid fragments by MS/MS analysis are needed. Furthermore, the highly accurate, high-resolution mass values obtained by FT-ICR-MS are also expected to reveal the existence of unexpected or unknown metabolites or derivatives.
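The accurate-mass reasoning in this section can be checked by computing monoisotopic masses from elemental compositions. The sketch below assumes standard monoisotopic atomic masses and the elemental formula C42H78NO8P for 34:3 diacyl PC (derived here from general PC chemistry, not quoted from the chapter); the function names are illustrative.

```python
# Monoisotopic atomic masses (standard values, rounded).
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.0030740,
    "O": 15.9949146,
    "P": 30.9737620,
}
PROTON = 1.00727646  # mass of H+ (hydrogen atom minus one electron)


def monoisotopic_mass(formula):
    """Sum atomic masses for a composition given as {element: count}."""
    return sum(MONOISOTOPIC[element] * count for element, count in formula.items())


def protonated_mz(formula):
    """m/z of the singly protonated ion [M+H]+."""
    return monoisotopic_mass(formula) + PROTON


# 34:3 diacyl PC (assumed formula) with two additional oxygen atoms (+2O).
pc_34_3 = {"C": 42, "H": 78, "N": 1, "O": 8, "P": 1}
pc_34_3_plus_2o = dict(pc_34_3, O=pc_34_3["O"] + 2)
print(round(protonated_mz(pc_34_3_plus_2o), 3))  # 788.544

# Exchanging one O for CH4 (as between a diacyl species and an alkyl-acyl
# species with one more carbon) shifts the mass by about 0.036.
ch4_minus_o = monoisotopic_mass({"C": 1, "H": 4}) - monoisotopic_mass({"O": 1})
print(round(ch4_minus_o, 4))  # 0.0364
```

The first result agrees with the m/z 788.544 quoted for the oxidized 34:3 species, and the second is one plausible reading of the "within 0.04" spacing between diacyl and alkyl-acyl subclasses.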
[Figure 3: FT-ICR mass spectrum, ion intensity plotted against m/z (approximately 808 to 816); labeled species include PC 38:4, PC 38:5, and SM 24:0.]
Fig. 3. High-resolution analysis and identification of phosphatidylcholine and sphingomyelin by Fourier-transform ion cyclotron resonance-mass spectrometry
4.4. Other Applications
By applying the methods described above, we have obtained, through collaborations, experimental results suggesting new functions of phospholipids. These include the identification of molecular species of phospholipids with very long fatty acyl chains [8], confirmation of the existence of cyclic PA in human serum albumin fractions [9], and determination of the substrate specificities of two phospholipases A1 [10,11]. Further, a study was conducted to analyze phospholipids localized in a specific membrane domain in order to elucidate the specific function of each molecular species. Regarding membrane localization, we determined the concentrations of specific phospholipid molecules in lipid droplets [12], which indicated that lipid droplets have a special membrane organization. By using these methods we were also able to find new biological substrates in addition to the known substrates determined from in vitro activities. By applying these techniques to cells highly expressing the enzymes of interest, in combination with some form of stimulation of precursor supply in the system, the key roles of lipid-metabolizing enzymes in physiological functions can be effectively elucidated. We would therefore like to emphasize the important role of metabolome analysis in the functional study of lipid-metabolizing enzymes.
5. A Search Tool for Lipid Metabolome
"Lipid Search" is a tool for identifying phospholipid molecular species from experimental mass data, and can be accessed from our home page (http://metabo.umin.ac.jp) [13]. The database for this search was constructed from essential core mass data of phospholipids commonly detectable in biological sources. Three different search windows are available for the identification of molecular species of phospholipids by "Lipid Search" (Fig. 4). The first window uses only molecular-related mass values in positive or negative ion mode. Here, the mass accuracy determines how reliable the identification results are. The mass tolerance in the search conditions is set according to the type of mass spectrometer used in each experiment. By pasting a user's data table of mass values and their corresponding ion intensities into the search table, the most probable lipid molecules and their corresponding compensated peak intensities can be obtained. Furthermore, the candidate molecules can be narrowed down by selecting a possible class of phospholipid in advance, using data obtained from the retention time of LC, or from precursor ion scanning or neutral loss scanning of the polar head group of phospholipids.
[Figure 4: screenshot of the "Lipid Search" LS/MS (Lipid Search for Mass Spectrometry) window, showing class selection checkboxes (PC, LPC, PAF, SM, PE, LPE, PI, LPI, PS, LPS, PG, LPG, PA, LPA, cPA, PMe, LPMe, PEt, LPEt, ALL), a precursor (m/z) and intensity input table, a tolerance (Da) field, and a polarity selector (positive or negative). Copyright (2003) Department of Metabolome, Graduate School of Medicine, the University of Tokyo.]
Fig. 4. Search window 1 in "Lipid Search" [13]. Identification results can be obtained by pasting a data table containing m/z values and ion intensities into this window. From May 2005, a revised version of Lipid Search that can be applied directly to raw mass data will be available
The second search window is for identification from mass profile data obtained by precursor ion scanning of the specific fragments of individual fatty acyl chains. Here, almost all mass peaks of the lipid molecular species containing the fatty acid of interest can be obtained immediately. The third window requires data detected by MS/MS. The mass data for this window were constructed by combining the m/z value of a molecular-related ion with the m/z values of its fragment ions. In the third search window, as in the second, the individual pairs of fatty acyl chains at the sn-1 and sn-2 positions can be obtained. Here, data for at least one fragment ion of a fatty acyl chain are required. The three windows of our search program can be selected from the main page of "Lipid Search."
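The kind of lookup performed by the first search window (m/z values, a mass tolerance, a polarity, and an optional class filter) can be illustrated with a toy example. This is a hypothetical sketch and not the actual "Lipid Search" implementation: the three-entry database, the function name `search`, and the simplification that ions are [M+H]+ in positive mode and [M-H]- in negative mode are all assumptions made for illustration.

```python
PROTON = 1.007276

# Hypothetical mini-database: (name, class, neutral monoisotopic mass).
LIPID_DB = [
    ("PC 34:1", "PC", 759.5778),
    ("PE 36:2", "PE", 743.5465),
    ("SM d18:1/16:0", "SM", 702.5676),
]


def search(peaks, polarity="positive", tolerance_da=0.3, classes=None):
    """Match (m/z, intensity) pairs against the database within a tolerance.

    Assumes [M+H]+ in positive mode and [M-H]- in negative mode, which is a
    simplification; real lipids may be observed as other adducts.
    """
    if polarity not in ("positive", "negative"):
        raise ValueError("polarity must be 'positive' or 'negative'")
    shift = PROTON if polarity == "positive" else -PROTON
    results = []
    for mz, intensity in peaks:
        for name, lipid_class, neutral_mass in LIPID_DB:
            if classes is not None and lipid_class not in classes:
                continue  # class pre-selection narrows the candidates
            if abs((neutral_mass + shift) - mz) <= tolerance_da:
                results.append((mz, name, intensity))
    return results


# Hypothetical pasted data table of m/z values and ion intensities.
peaks = [(760.58, 1000.0), (703.57, 500.0), (500.00, 10.0)]
print(search(peaks))
print(search(peaks, classes={"SM"}))
```

A narrower tolerance reduces false candidates but only makes sense when the instrument's mass accuracy supports it, which matches the text's point that the tolerance should follow the type of mass spectrometer used.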
6. Conclusions
At present, we are attempting to develop basic analytical methods, a database, and search tools for lipidomics, i.e., metabolomics focusing on lipids. We believe that our efforts to develop new analytical methods and identification tools will be successful only through collaboration with numerous cooperating researchers and with the application personnel of MS manufacturers. Analysis by ESI-MS can be applied to elucidate lipid mediators in cells, as well as to detect, quantify, and localize minor lipid metabolites in cells. Bioinformatics studies based on both proteome data for enzymes and receptors and metabolome data for lipids will make it possible to acquire new knowledge on the metabolic and signaling pathways of lipid metabolism.
References
1. Han X, Zupan LA, Hazen SL, Gross RW (1992) Semisynthesis and purification of homogeneous plasmenylcholine molecular species. Anal Biochem 200: 119-124
2. Murphy RC (1993) Mass spectrometry of lipids. Plenum Press, New York, pp 71-282
3. Kim HY, Salem N (1993) Liquid chromatography-mass spectrometry of lipids. Prog Lipid Res 32: 221-245
4. Taguchi R, Hayakawa J, Takeuchi Y, Ishida M (2000) Two-dimensional analysis of phospholipids by capillary liquid chromatography/electrospray ionization mass spectrometry. J Mass Spectrom 35: 953-966
5. Ishida M, Yamazaki T, Houjou T, Imagawa M, Harada A, Inoue K, Taguchi R (2004) High-resolution analysis by nano-electrospray ionization Fourier transform ion cyclotron resonance mass spectrometry for the identification of molecular species of phospholipids and their oxidized metabolites. Rapid Commun Mass Spectrom 18: 2486-2494
6. Houjou T, Yamatani K, Nakanishi H, Imagawa M, Shimizu T, Taguchi R (2004) Rapid and selective identification of molecular species in phosphatidylcholine and sphingomyelin by conditional neutral loss scanning and MS3. Rapid Commun Mass Spectrom 18: 3123-3130
7. Houjou T, Yamatani K, Imagawa M, Shimizu T, Taguchi R (2005) A shotgun tandem mass spectrometric analysis of phospholipids with normal-phase and/or reverse-phase liquid chromatography/electrospray ionization mass spectrometry. Rapid Commun Mass Spectrom 19: 654-666
8. Yokoyama K, Saitoh S, Ishida M, Yamakawa Y, Nakamura K, Inoue K, Taguchi R, Tokumura A, Nishijima M, Yanagida M, Setaka M (2001) Very-long-chain fatty acid-containing phospholipids accumulate in fatty acid synthase temperature-sensitive mutant strains of the fission yeast Schizosaccharomyces pombe fas2/lsd1. Biochim Biophys Acta 1532: 223-233
9. Kobayashi T, Tanaka-Ishii R, Taguchi R, Ikezawa H, Murakami-Murofushi K (1999) Existence of a bioactive lipid, cyclic phosphatidic acid, bound to human serum albumin. Life Sci 65: 2185-2191
10. Hosono H, Aoki J, Nagai Y, Bandoh K, Ishida M, Taguchi R, Arai H, Inoue K (2001) Phosphatidylserine-specific phospholipase A1 stimulates histamine release from rat peritoneal mast cells through production of 2-acyl-1-lysophosphatidylserine. J Biol Chem 276: 29664-29670
11. Sonoda H, Aoki J, Hiramatsu T, Ishida M, Bandoh K, Nagai Y, Taguchi R, Inoue K, Arai H (2002) A novel phosphatidic acid-selective phospholipase A1 that produces lysophosphatidic acid. J Biol Chem 277: 34254-34263
12. Tauchi-Sato K, Ozeki S, Houjou H, Taguchi R, Fujimoto T (2002) The surface of lipid droplets is a phospholipid monolayer with a unique fatty acid composition. J Biol Chem 277: 44507-44512
13. Lipid Search: http://lipidsearch.jp and http://metabo.umin.ac.jp
Chapter 12: Chemical Diagnosis of Inborn Errors of Metabolism and Metabolome Analysis of Urine by Capillary Gas Chromatography/Mass Spectrometry
Tomiko Kuhara
Division of Human Genetics, Medical Research Institute, Kanazawa Medical University, 1-1 Daigaku, Uchinada-machi, Kahoku-gun, Ishikawa 920-0293, Japan
1. Use of Metabolite Analysis to Evaluate Enzymatic Reactions: Enzyme Deficiency, Lack of Coenzyme, or Substrate Overload
Urinary metabolite analysis can be used to evaluate enzyme deficiency, lack of coenzymes, or substrate overload by comparing patient data with those of age-matched, healthy controls. In the absence of a special diet or drugs, metabolite analysis can detect, nearly comprehensively, enzyme dysfunctions caused by genetic abnormalities. Almost all mutations, known or unknown, common or uncommon, that result in a significant reduction in enzyme activity can be detected. Enzyme dysfunction can be due to an abnormal structure of an enzyme/apoenzyme, a reduced quantity of a normal enzyme/apoenzyme, or a lack of a coenzyme. It can also result from an abnormal regulatory gene, abnormal subcellular localization, or abnormal post-transcriptional or post-translational modification [1]. Urine contains numerous metabolites. Metabolome analysis of urine by the combined use of urease pretreatment, stable isotope dilution, gas chromatography/mass spectrometry (GC/MS), mass chromatography, and comparison with age-matched controls offers reliable data for the simultaneous screening or molecular diagnosis of over 130 inborn errors of metabolism (IEMs). These IEMs include hyperammonemias, lactic acidemias, organic acidemias, and IEMs of amino acids, pyrimidines, purines, carbohydrates, and other metabolites.
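Comparison with age-matched controls can be expressed numerically, for example as the number of standard deviations by which a patient's value lies above the control mean. The sketch below is only one possible way to do this and is an assumption, not the statistic used by the author: it works on log-transformed values, and the function names, the threshold of 3 SD, and the control values are hypothetical.

```python
import math
from statistics import mean, stdev


def sd_units_above_controls(patient_value, control_values):
    """Distance of a patient value from the control mean, in SD units (log10 scale)."""
    logs = [math.log10(v) for v in control_values]
    return (math.log10(patient_value) - mean(logs)) / stdev(logs)


def is_elevated(patient_value, control_values, threshold_sd=3.0):
    """Flag a metabolite as elevated if it exceeds the chosen SD threshold."""
    return sd_units_above_controls(patient_value, control_values) > threshold_sd


# Hypothetical creatinine-normalized levels in age-matched controls.
controls = [1.0, 2.0, 4.0]
print(is_elevated(32.0, controls))
print(is_elevated(2.0, controls))
```

Whatever statistic is chosen, the control group must match the patient's age, because the reference ranges of many urinary metabolites change markedly during infancy and childhood.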
2. The Role of Metabolome Analysis in Early, Rapid, and Differential Diagnosis
Emergency diagnostic laboratory evaluation should cover all differential diagnoses that are therapeutically relevant, and should always include ammonia, glucose, lactate, and acid-base status as well as a urine test for ketones. These tests are indispensable for planning and carrying out the first steps of metabolic emergency treatment, and their results should be available within 30 min [2]. Information on blood pH, ammonium, lactate, glucose, acid-base status, and ketones is important but does not identify the etiology of the patient's illness. There are more than 42 etiologies for hyperammonemia, more than 24 for lactic acidemia, and more than 22 for IEMs of methylmalonate, sulfur-containing amino acids, vitamin B12, and folate. Metabolome analysis of urine, by urease pretreatment of urine or of eluates from urine dried on filter paper, stable isotope dilution, and GC/MS, enables the simultaneous screening and molecular diagnosis of numerous IEMs. Early, rapid, and differential diagnosis is most effectively achieved by this procedure.
2.1. Differential Diagnosis of Hyperammonemia
As shown in Table 1, more than 42 etiologies give rise to hyperammonemia. Primary hyperammonemia is caused by disorders in any of six urea cycle enzymes and two membrane transport systems; disorders in the latter two are known as hyperornithinemia-hyperammonemia-homocitrullinuria (HHH) syndrome (MIM 258870) and lysinuric protein intolerance (LPI, MIM 247900) [3-5]. Of these eight disorders, deficiencies of carbamoylphosphate synthetase and N-acetylglutamate synthetase do not elevate the urinary levels of uracil and/or orotate [6]. In the remaining six disorders, the utilization of carbamoylphosphate is impaired, and carbamoylphosphate accumulates in the mitochondria. The activity of carbamoylphosphate synthase is 100 times lower in the cytosol than in the mitochondria, but carbamoylphosphate accumulation in the mitochondria causes an increase in the de novo synthesis of pyrimidine in the cytosol, inducing a marked increase in uracil and/or orotate. Thus, orotate and uracil are specific indicators for the screening and diagnosis of these six primary hyperammonemias. The lack of an increase in these two indicators, together with rather persistent hyperammonemia, hyperglutaminuria, pyroglutamic aciduria, and alaninuria, strongly suggests primary hyperammonemia caused by one of the first two urea cycle disorders (primary hyperammonemia in Table 1).
Ornithine transcarbamylase (EC 2.1.3.3) is a hepatic mitochondrial protein that catalyzes the formation of citrulline from ornithine and carbamoylphosphate. An ornithine transcarbamylase deficiency (MIM 311240) does not always show an increase in specific amino acids other than pyroglutamate, glutamine, alanine, or proline. Argininosuccinate synthase (EC 6.3.4.5), the rate-limiting enzyme in the urea cycle, is located in the hepatic cytosol. It catalyzes the formation of argininosuccinate from citrulline and aspartate. A deficiency of argininosuccinate synthase causes citrullinemia (MIM 215700). Argininosuccinate lyase (EC 4.3.2.1) catalyzes the formation of arginine. A deficiency of this enzyme (MIM 207900) causes argininosuccinic acidemia. Arginase (EC 3.5.3.1), a cytosolic enzyme, releases urea and ornithine from arginine. An arginase deficiency (MIM 207800) causes hyperargininemia. In the HHH syndrome, ornithine and homocitrulline increase, and in LPI, lysine markedly increases in the urine but not in the serum. Therefore, to prevent the misdiagnosis of LPI, examination of the urine is critical. For patients with orotic aciduria and/or uraciluria, the levels of citrulline, arginine, and homocitrulline are further examined using a conventional amino acid autoanalyzer or soft-ionization MS, such as fast atom bombardment or electrospray ionization (not GC/MS). In this way, patients are differentiated among ornithine transcarbamylase deficiency, citrullinemia, argininosuccinic aciduria, arginase deficiency, LPI, and HHH syndrome. In the urease-pretreatment procedure, orotate and uracil are determined sensitively and quantitatively by the stable isotope dilution method; lysine can be quantified, and ornithine measured semiquantitatively, using d4-lysine as an internal standard [7]. Recently, liver transplantation has become available as a treatment for IEMs. 
The urinary levels of orotate and uracil returned to normal after a female patient with ornithine transcarbamylase deficiency received a living-related liver transplant (not shown). Thus, the stable isotope dilution method is used not only to make a diagnosis but also to monitor and evaluate treatment, including liver transplantation, for disorders of carbamoylphosphate utilization. It would also be applicable to experiments using animal models of these diseases, especially in the development of new therapies, including gene therapy [8]. Secondary hyperammonemia is caused by several organic acidemias and by other IEMs that cause hepatic dysfunction, such as tyrosinemia or Wilson's disease. Eleven organic acidemias, galactosemia, and hereditary fructose intolerance can be chemically diagnosed [9-11]. Fatty acid oxidation defects associated with rather persistent 3-hydroxydicarboxylic aciduria can also be screened [12].
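The differential reasoning for primary hyperammonemia described in this section can be summarized as a small decision function. This is an illustrative sketch only and not a diagnostic tool: the marker names, the return codes, and especially the order in which the amino acid markers are checked are assumptions introduced here; the chapter states which markers distinguish the disorders but does not give an explicit algorithm.

```python
def classify_primary_hyperammonemia(elevated):
    """Suggest a candidate disorder from the set of elevated marker names.

    `elevated` is a set of strings naming markers found above the control
    range (hypothetical naming). Returns a short code for the candidate.
    """
    # No orotate/uracil increase points to the first two urea cycle disorders.
    if not ({"orotate", "uracil"} & elevated):
        return "CPS_or_NAGS"
    # Order of the following checks is an assumption made for this sketch.
    if "argininosuccinate" in elevated:
        return "ASL"  # argininosuccinate lyase deficiency
    if "citrulline" in elevated:
        return "ASS"  # argininosuccinate synthase deficiency (citrullinemia)
    if "arginine" in elevated:
        return "ARG"  # arginase deficiency
    if {"ornithine", "homocitrulline"} <= elevated:
        return "HHH"  # HHH syndrome
    if "lysine" in elevated:
        return "LPI"  # lysinuric protein intolerance (urinary lysine)
    # OTC deficiency does not always show a specific amino acid increase.
    return "OTC"


print(classify_primary_hyperammonemia({"orotate", "citrulline"}))
print(classify_primary_hyperammonemia(set()))
```

In practice the result of such a rule would only narrow the candidates; the text makes clear that confirmation relies on quantitative amino acid data and clinical context.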
Table 1. There are more than 42 etiologies in hyperammonemia
I. Primary hyperammonemia (8)
  (1) Urea cycle disorder
    • 1 Carbamoylphosphate synthase def.
    • 2 N-Acetylglutamate synthase def.
    ☆ 3 Ornithine transcarbamylase def.
    ☆ 4 Argininosuccinate synthase def.
    ☆ 5 Argininosuccinate lyase def.
    ☆ 6 Arginase def.
  (2) Membrane transport disorder
    ☆ 7 Lysinuric protein intolerance (LPI)
    ☆ 8 Hyperornithinemia-hyperammonemia-homocitrullinuria (HHH) syndrome
II. Secondary hyperammonemia (34)
  (1) Organic acidemia (11)
    ★ 1 Branched chain keto acid DH def. (MSUD)
    ★ 2 Isovaleryl-CoA DH def.
    ★ 3 Multiple carboxylase def.
    ★ 4 β-Methylcrotonyl-CoA carboxylase def.
    ★ 5 β-Ketothiolase def.
    ★ 6 Propionyl-CoA carboxylase def.
    ★ 7 Methylmalonyl-CoA mutase def.
    ★ 8 β-Hydroxy-β-methylglutaryl-CoA lyase def.
    ★ 9 Multiple acyl-CoA DH def.
    ★ 10 Dihydrolipoyl DH (E3) def.
    ★ 11 Tyrosinemia type I
  (2) Lactic acidemia (8)
    • 1 Pyruvate DH (E1) def.
    • 2 Dihydrolipoyl transacetylase (E2) def.
    • 3 Pyruvate carboxylase def.
    • 4 Oxidative phosphorylation defect (I-V) (5)
  (3) Fatty acid oxidation defects (7)
    ☆ TFP, ☆ LCHAD, • MCAD, • MCKT, VLCAD, CAT, CPTII
  (4) Hepatic dysfunction (inherited) (5)
    ★ 1 Galactosemia type I
    ★ 2 Hereditary fructose intolerance
    3 Glycogen storage disease type IV
    • 4 Citrin deficiency
    5 Hemochromatosis
  (5) Other causes (4)
    1 Transient neonatal hyperammonemia
    2 Hepatic failure
    3 Portacaval shunt
    4 Obstructive urinary tract infection
★, differential molecular diagnosis (DMD) [13]; ☆, screened (SC) [8]: orotate ↑ (6), polar 3-hydroxydicarboxylate ↑ (2); •, screened (SC) but not always [13]. def., deficiency; DH, dehydrogenase
2.2. Differential Diagnosis of Lactic Acidemia
In lactic acidemia, the serum lactate concentration is higher than 2 mM. More than 24 etiologies are known for lactic acidemia (Table 2). Primary lactic acidemia includes pyruvate dehydrogenase complex disorders, oxidative phosphorylation (OXPHOS) disorders, gluconeogenesis disorders, and membrane transport defects [13]. In OXPHOS disorders, mutations in both nuclear and mitochondrial DNA are involved, except for mutations that involve complex II. Nuclear DNA encodes 70 OXPHOS subunits. Mitochondrial DNA encodes 12 OXPHOS-associated subunits, 22 tRNAs, and 2 rRNA subunits. Mutations in mitochondrial DNA cause complex phenotypic expression, both metabolically and clinically. Mitochondrial DNA is susceptible to mutation, and mutations accumulate at a high incidence. Eight organic acidemias are secondary lactic acidemias. These organic acidurias are differentially and chemically diagnosed. Lactic acidemia and lactic aciduria are observed during hypoglycemic episodes in patients with gluconeogenesis disorders. In addition, the sensitive and specific indicators glycerol-3-phosphate and glycerol increase in fructose-1,6-diphosphatase deficiency. Glucose-6-phosphatase deficiency gives specific metabolic profiles during hypoglycemic episodes. All of the 24 etiologies can be screened, except for very mild lactic acidemia that causes no lactic aciduria. Differential molecular diagnosis is applicable to 8 or 10 disorders. 
During remission, the profiles of patients with gluconeogenesis disorders show no abnormality.
Table 2. There are more than 24 etiologies in lactic acidemia
I. Pyruvate dehydrogenase complex disorders (4)
    1 Pyruvate DH (E1) def.
    2 Dihydrolipoyl transacetylase (E2) def.
      Dihydrolipoyl DH (E3) def.
    3 Pyruvate DH phosphatase def.
    4 E3 binding protein def.
II. Oxidative phosphorylation (OXPHOS) (5) a
    1 Complex I
    2 Complex II
    3 Complex III
    4 Complex IV
    5 Complex V
III. Gluconeogenesis disorders (4)
    1 Pyruvate carboxylase def.
    2 Phosphoenolpyruvate carboxykinase def.
    ☆ 3 Fructose-1,6-bisphosphatase def.
    ☆ 4 Glucose-6-phosphatase def.
IV. Membrane transport (3)
    1 Pyruvate: mt
    2 NADH: mt
    3 Lactate: cytoplasmic
V. Organic acidurias (8)
    ★ 1 Branched chain keto acid DH def. (MSUD)
    ★ 2 Lipoamide DH (E3) def.
    ★ 3 Isovaleryl-CoA DH def.
    ★ 4 Multiple carboxylase def.
    ★ 5 β-Methylcrotonyl-CoA carboxylase def.
    ★ 6 β-Ketothiolase def.
    ★ 7 Propionyl-CoA carboxylase def.
    ★ 8 Methylmalonyl-CoA mutase def.
All screened (SC) except for very mild lactic acidemia and during remission in gluconeogenesis. Differential molecular diagnosis (DMD): 8 (★), but 10 (☆) during episode in gluconeogenesis. def, deficiency; DH, dehydrogenase
a Nuclear DNA encodes 70 OXPHOS subunits. mt DNA encodes 12 OXPHOS, 22 tRNA, and 2 rRNA subunits
2.3. Inborn Errors of Metabolism of Sulfur-Containing Amino Acids, Cobalamin, and Folate, and Methylmalonic Acidemia
There are more than 22 etiologies of methylmalonic aciduria and disorders of the metabolism of sulfur-containing amino acids, cobalamin, and folate. The trans-sulfuration pathway converts the sulfur atom of methionine into the sulfur atom of cysteine, and more methionine is produced by the methylation of homocysteine. Homocystinuria types I, II, and III are characterized by different etiologies, biochemical abnormalities, and therapeutic measures. In type I, a deficiency of cystathionine β-synthase (L-serine hydro-lyase [adding homocysteine]; EC 4.2.1.22), homocysteine increases, resulting in methionine overproduction. A simple treatment with pyridoxine for the pyridoxine-responsive type, or dietary restriction of methionine and supplementation with cystine for the pyridoxine-unresponsive type, greatly improves the outcome of affected infants [14]. In type II, defective remethylation is due to a deficiency of N5,10-methylenetetrahydrofolate reductase (5-methyltetrahydrofolate: [acceptor] oxidoreductase; EC 1.1.99.15) (MTHFR, EC 1.1.1.68); folate and betaine may have the advantage of lowering the homocysteine levels and increasing the methionine levels [15]. Recently, the importance of the role of folate and of early detection of type II patients has been stressed [16,17]. Type III is caused by a deficiency of N5-methyltetrahydrofolate homocysteine methyltransferase (S-adenosyl-L-methionine: L-homocysteine S-methyltransferase, EC 2.1.1.10), resulting in the defective synthesis of methylcobalamin and deoxyadenosylcobalamin. This condition, caused by a genetic mutation or by nutritional vitamin B12 deficiency, is accompanied by combined homocystinuria and methylmalonic aciduria [18]. 
If the neonatal screening method for homocystinuria targets methionine in filter-paper blood spots, type II is not detected, because it causes low or relatively normal levels of plasma methionine with moderate homocystinuria. Furthermore, isolated hypermethioninemia due to a deficiency of hepatic methionine adenosyltransferase (S-adenosylmethionine synthetase; ATP: L-methionine S-adenosyltransferase; EC 2.5.1.6) is detected as well; this condition is thought to be free of clinical symptoms in most cases, indicating that the accumulation of methionine in the body is not harmful. The simplified urease pretreatment can differentiate the three types of homocystinuria by the simultaneous measurement of methionine, homocystine, methylmalonate, uracil, and creatinine in filter-paper urine samples, when the respective stable isotope-labeled compounds are used [19]. The total ion current (TIC) and mass chromatograms of the trimethylsilyl derivatives of metabolites from a male patient with type I are shown in Figs. 1 and 2.
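Quantification with a stable isotope-labeled internal standard, followed by normalization to creatinine, reduces to two simple ratios. The following sketch shows the general principle of isotope dilution under stated assumptions and is not the author's exact calculation: the function names, the units, the `response_factor` parameter, and the example numbers are hypothetical.

```python
def isotope_dilution_amount(analyte_area, is_area, is_amount_nmol, response_factor=1.0):
    """Estimate analyte amount (nmol) from the analyte/internal-standard area ratio.

    `response_factor` corrects for any difference in detector response between
    the analyte and its labeled analogue; 1.0 assumes identical response.
    """
    return analyte_area / is_area * is_amount_nmol / response_factor


def per_creatinine(analyte_nmol, creatinine_umol):
    """Express the analyte relative to creatinine (nmol/umol equals mmol/mol)."""
    return analyte_nmol / creatinine_umol


# Hypothetical example: area ratio 2.0 with 5 nmol of labeled standard added.
amount = isotope_dilution_amount(analyte_area=2000.0, is_area=1000.0, is_amount_nmol=5.0)
print(amount)
print(per_creatinine(amount, creatinine_umol=2.0))
```

Because the labeled standard is added before sample preparation and behaves almost identically to the analyte, losses during pretreatment and derivatization largely cancel in the area ratio, and creatinine normalization compensates for differences in urine concentration between samples.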
[Figure 1: total ion chromatogram; x-axis, time (min).]
Fig. 1. Total ion chromatogram (TIC) of the trimethylsilyl derivatives of metabolites from the urine of a patient with homocystinuria type I during transient megaloblastic anemia due to folate and vitamin B12 deficiency. The major components of the peaks are: 1, glycine; 2, β-aminoisobutyrate; 3, phosphate and leucine; 4, erythritol; 5, threitol; 6, methionine and internal standard (IS); 7, tetronate; 8, creatinine and IS; A, orotate and 15N2-orotate (IS); 9, mannitol; 10, urate; 11, n-heptadecanoate (spiked); 12, cystine and IS; 13, pseudouridine; B, homocystine-d8 (IS); C, homocystine
[Figure 2: two panels of mass chromatograms; x-axes, time (min), approximately 9.2 to 10.0 and 13.5 to 14.2.]
Fig. 2. Part of the total ion current (TIC) and mass chromatograms of Fig. 1. a m/z 254 (256) and 357 (359) for orotate. b m/z 278 (282) and 128 (131) for homocystine
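A mass chromatogram of the kind shown in Fig. 2 is obtained by extracting, from each scan, the intensity recorded at a selected m/z. The sketch below illustrates this for m/z 254 and 256, on the assumption that the values in parentheses in the caption refer to the ions of the labeled internal standard; the scan data, the tolerance, and the function names are hypothetical, and summing intensities is only a crude stand-in for peak integration.

```python
def mass_chromatogram(scans, target_mz, tol=0.5):
    """Return (time, intensity) pairs for one m/z across all scans.

    `scans` is a list of (time_min, [(mz, intensity), ...]) tuples.
    """
    return [(time, sum(i for mz, i in peaks if abs(mz - target_mz) <= tol))
            for time, peaks in scans]


def peak_area_ratio(scans, analyte_mz, is_mz, tol=0.5):
    """Crude analyte/internal-standard ratio from summed intensities.

    Assumes evenly spaced scans, so that summed intensity is proportional to area.
    """
    analyte = sum(i for _, i in mass_chromatogram(scans, analyte_mz, tol))
    standard = sum(i for _, i in mass_chromatogram(scans, is_mz, tol))
    return analyte / standard


# Hypothetical scans around the elution time of orotate.
scans = [
    (9.4, [(254.0, 10.0), (256.0, 20.0)]),
    (9.5, [(254.0, 30.0), (256.0, 60.0), (357.0, 5.0)]),
    (9.6, [(254.0, 10.0), (256.0, 20.0)]),
]

print(mass_chromatogram(scans, 254.0))
print(peak_area_ratio(scans, 254.0, 256.0))
```

Monitoring a second ion pair for the same compound, as the caption does with m/z 357 and 359, gives an independent check that the peak is not due to a co-eluting interference.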
Chemical Diagnosis of Inborn Errors of Metabolism
Urinary metabolite levels determined using the simplified urease procedure were compared before and after treatment with folate in a patient who had temporarily developed megaloblastic anemia and whose serum folate had been below the normal range. With this treatment the megaloblastic anemia disappeared, the level of orotate decreased to the normal range, the methionine level doubled, and the homocystine level decreased markedly but remained distinctly higher than the control range. Thymidylate synthase, which catalyzes the conversion of deoxyuridine monophosphate (dUMP) to deoxythymidine monophosphate (dTMP), is folate-dependent, and pyrimidine biosynthesis is regulated by end-product inhibition. Folate deficiency thus causes impaired DNA synthesis, enhanced pyrimidine biosynthesis, megaloblastic anemia, and orotic aciduria. Folate supplementation significantly reduced the level of homocystine and dramatically increased that of methionine. Therefore, this simple yet sophisticated diagnostic procedure has proved useful for monitoring the biochemical and nutritional conditions of patients, especially for acquired deficiency of folate and vitamin B12, as well as for evaluating the efficacy of treatments [19].
3. Tailor-Made Medicine and Disease Diagnosis
3.1. Inborn Errors of Pyrimidine and Purine Metabolism
Inherited enzyme defects in the de novo synthesis of purines and pyrimidines or in their salvage and catabolism can cause alterations in cellular nucleotide patterns and the accumulation of normal or abnormal purines, pyrimidines, and their degradation products in body fluids. Twenty-four disorders have now been recognized. These defects manifest clinically with a broad spectrum of symptoms, including severe neurological abnormalities, fatal immunodeficiency, anemia, or urolithiasis. Inborn errors of purine and pyrimidine metabolism still present a diagnostic problem [20]. To date, several methods have been reported for this diagnosis, including high-performance liquid chromatography (HPLC), thin-layer chromatography, capillary electrophoresis, GC/MS, and nuclear magnetic resonance [21-26]. These methods, however, analyze only a part of the index metabolites in urine, cannot be used readily for quantification, lack specificity or sensitivity, or require further analysis for a differential diagnosis [27]. Recently, high-performance liquid chromatography-electrospray ionization-MS/MS (LC/ESI-MS/MS) methods have been
developed for this purpose, but they still cannot be used for the accurate and concurrent determination of several metabolites [28,29]. Thus, metabolome analysis should be employed to screen for these defects and to make a chemical diagnosis. Here, we describe a rapid and specific procedure for the chemical diagnosis of pyrimidine degradation disorders, Lesch-Nyhan syndrome, and adenine phosphoribosyltransferase deficiency using GC/MS.
3.2. Deficiencies of Pyrimidine Degradation
The pyrimidine nucleic acid bases, thymine (T) and uracil (U), are degraded by four successive enzyme reactions in humans, as shown in Fig. 3.
Fig. 3. The degradative pathway of pyrimidines. E1, dihydropyrimidine dehydrogenase; E2, dihydropyrimidinase; E3, β-ureidopropionase; [R(-)-3-amino-2-methylpropionate-pyruvate aminotransferase (D-3-aminoisobutyrate-pyruvate aminotransferase), β-alanine-pyruvate aminotransferase, 4-aminobutyrate aminotransferase]. R=H: uracil (1), 5,6-dihydrouracil (2), β-ureidopropionate (3), β-alanine (4); R=CH3: thymine (1), 5,6-dihydrothymine (2), β-ureidoisobutyrate (3), β-aminoisobutyrate (4); R=F: 5-fluorouracil (1), 5,6-dihydro-5-fluorouracil (2), α-fluoro-β-ureidopropionate (3), α-fluoro-β-alanine (4)
The degradation is catalyzed by dihydropyrimidine dehydrogenase (5,6-dihydropyrimidine:NADP+ oxidoreductase; DHPDH, EC 1.3.1.2), dihydropyrimidinase (5,6-dihydropyrimidine amidohydrolase; DHP, EC 3.5.2.2), β-ureidopropionase (UP, EC 3.5.1.6), and three aminotransferases. Many cases of DHPDH deficiency (MIM 274270) or DHP deficiency (MIM 222748) have been reported [30]. Clinical abnormalities in those with symptoms were variable and nonspecific [27,31]. There is no known treatment for either of these enzyme defects. Withdrawal of pyrimidine analogues from cancer chemotherapy regimens was critical for DHPDH deficiency [32,33]. The catabolic route (shown in Fig. 3) plays a significant role in the degradation of pyrimidine analogues in vivo. DHPDH is responsible for the breakdown of the widely used antineoplastic agent 5-fluorouracil (5-FU), given that more than 89.7% of administered 5-FU was excreted unchanged into the urine in DHPDH deficiency [34]. In
healthy humans, however, more than 87% of the drug was metabolized to 5-fluoro-β-alanine by this pathway [35]. Although the side effects of 5-FU in patients with a DHP deficiency have not been reported, a uracil-loading test indicated the possibility of such side effects [31]. Testing whether a cancer patient has a deficiency of pyrimidine metabolism before deciding on treatment could prevent devastating side effects. DHPDH deficiency, DHP deficiency, and UP deficiency are characterized, respectively, by markedly increased concentrations of U and T [36]; of 5,6-dihydrouracil (DHU) and 5,6-dihydrothymine (DHT), accompanied by moderately increased U and T; and of β-ureidopropionate (βUP) and β-ureidoisobutyrate (βUIB) in the urine [37,38]. Although several methods to screen for disorders of pyrimidine metabolism have been reported [39-41], they are time-consuming or lack specificity or sensitivity. Recently, a rapid and specific screening method that involves the use of urine and LC/ESI-MS/MS was described [28]; it analyzes U, T, 5-hydroxymethyluracil, and orotate, but not DHT, DHU, creatinine, or amino acids. More recently, a method to screen for all three disorders was developed, which simultaneously analyzes T, U, DHT, DHU, βUP, and βUIB, but not orotate or creatinine [29]. The diagnosis of these three pyrimidine degradation defects is attained by the measurement of pyrimidines, dihydropyrimidines, β-ureides, creatinine, and orotate by the simplified urease pretreatment and stable isotope dilution method [42]. Recovery of the targets and their coefficients of variation were satisfactory, and the values for healthy controls were determined [43]. The efficacy of the procedure was tested by using artificial urine specimens that simulated DHPDH deficiency and DHP deficiency [42]. Figure 4 shows the TIC chromatograms of trimethylsilyl (TMS) derivatives of metabolites from the urine of a patient with DHPDH deficiency and a patient with DHP deficiency.
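The metabolite patterns that distinguish the three degradation defects can be written down as a small rule table. This is a deliberately simplified sketch for illustration only: the three-level grading ("normal", "moderate", "marked"), the key names, and the function are assumptions of the sketch, and a real chemical diagnosis rests on quantified values and on the rest of the metabolic profile.

```python
NORMAL, MODERATE, MARKED = "normal", "moderate", "marked"


def suggest_pyrimidine_defect(levels):
    """Map graded urinary levels to the pattern described in the text.

    levels: dict with hypothetical keys "U", "T", "DHU", "DHT", "bUP", "bUIB";
    missing keys are treated as normal.
    """
    def marked(*names):
        return all(levels.get(name, NORMAL) == MARKED for name in names)

    # Check the most distal block first: a defect at a later step can also
    # raise the metabolites that lie upstream of it.
    if marked("bUP", "bUIB"):
        return "UP deficiency"
    if marked("DHU", "DHT"):
        return "DHP deficiency"
    if marked("U", "T"):
        return "DHPDH deficiency"
    return "no pyrimidine degradation defect suggested"


print(suggest_pyrimidine_defect({"U": MARKED, "T": MARKED}))
print(suggest_pyrimidine_defect(
    {"DHU": MARKED, "DHT": MARKED, "U": MODERATE, "T": MODERATE}))
```

Ordering the rules from the last enzyme step backwards is a design choice of the sketch; it keeps a moderate upstream increase from masking the primary block.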
Figure 5 shows the mass chromatograms for these patients with pyrimidine degradation disorders. Abnormal levels of the indicators were determined by using urine specimens from three patients with proven DHPDH deficiency and from six with proven DHP deficiency [43]. Because the levels of the indicators T, U, DHT, and DHU in the urine from healthy controls did not show a normal frequency distribution, the data were log10-transformed for statistical analysis, and the mean and standard deviation (SD) were obtained from the transformed distribution. Abnormality is expressed as n, meaning that the value lies n × SD above the mean. In DHPDH and DHP homozygotes, the abnormality n was above 5.9 for the lower target (U and DHU, respectively) and above 8 for the higher one (T and DHT, respectively). 5-Hydroxymethyluracil was distinctly detected in three cases with proven DHPDH deficiency. The value of n was 8.4-12 for DHT and 7.2-11 for DHU in the six
Fig. 4. Total ion current chromatograms of the trimethylsilyl derivatives of metabolites from the urine of a patient with dihydropyrimidine dehydrogenase (DHPDH) deficiency (a), and a patient with 5,6-dihydropyrimidine amidohydrolase (DHP) deficiency (b). Peaks are: 1, lactate-2; 2, d0- and d5-glycine-2; 3, urea-2; 4, phosphate-3; 5, 2,2-dimethylsuccinate (IS1); 6, 15N0- and 15N2-U-2; 7, T-2; 8, dihydrothymine (DHT); 9, dihydrouracil (DHU); 10, erythritol-4; 11, d0- and d3-methionine-2; 12, 5-oxoproline-2; 13 and 13', threonate-4 and erythronate-4; 14, d0- and d3-creatinine-3; 15, d0- and d5-phenylalanine-2; 16, 4-hydroxyphenylacetate-2; 17, hydroxymethyluracil-3; 18, d0- and d4-lysine-3; 19, 2-hydroxyundecanoate-2 (IS2); 20, 15N0- and 15N2-orotate-3; 21, citrate-4; 22, 4-hydroxyphenyllactate-3; 23, d0- and d4-lysine-4; 24, d0- and d4-tyrosine-3; 25, glucose-5; 26, N-acetyltyrosine-2; 27, heptadecanoate-1 (IS3). d0- and 15N0- denote the endogenous unlabeled metabolites
Fig. 5. Mass chromatograms of the trimethylsilyl derivatives of metabolites from the urine of a patient with DHPDH deficiency (1), a healthy control (2), a patient with DHP deficiency (3), and a healthy control (4). The ions targeted for DHPDH were [M]+ at m/z 329 for creatinine-3, [M-15]+ at m/z 241 for U-2, m/z 243 for 15N2-U-2 (IS), and [M-15]+ at m/z 255 and [M]+ at m/z 270 for T-2. The ion of m/z 255 due to [M-H]+ was also seen at U-2. The ions targeted for DHP were m/z 329 for creatinine-3, m/z 243 for 5,6-DHU-2, [M-15]+, m/z 271 for DHT, [M-H]+, m/z 243 for 15N2-U-2 (IS), m/z 241 for U-2, and m/z 255 for T-2. The ion of m/z 271, due to the isotope of [M]+, was also seen at T-2. To clearly show the difference between the control and the patients, the mass chromatogram intensity in 2 and 4 is expressed as twice that in 1 and 3, except for creatinine-3
cases with proven DHP deficiency. Thus, our determination of the abnormality n for the indicators in the urease-pretreatment procedure has sensitive diagnostic value, although no significant difference between the DHP carriers and controls was detected. During the pilot study of neonatal screening and diagnosis, an asymptomatic case of βUPase deficiency was detected [44]. The urease pretreatment and lack of fractionation enabled the recovery of the highly polar ureides. In the urine of this newborn, βUP and βUIB were markedly increased and DHU and DHT were moderately increased. Defects in the three steps of pyrimidine degradation can now be screened for diagnosis. No further analysis is required for differential diagnosis because sensitive, specific, and simultaneous quantification of T, U, DHU,
DHT, βUP, βUIB, orotate, methylcitrate, and creatinine can be made by using the method of stable isotope dilution, urease pretreatment, and GC/MS. The simultaneous finding that the other metabolite levels are all within the healthy control range strongly supports the diagnosis.
3.3. Lesch-Nyhan Syndrome
In the salvage pathway for purine nucleotide synthesis, hypoxanthine-guanine phosphoribosyltransferase (HPRT, EC 2.4.2.8) catalyzes the conversion of hypoxanthine and guanine to inosine monophosphate (IMP) and guanosine monophosphate (GMP), respectively. Lesch-Nyhan syndrome (LNS, MIM 308000) is caused by a severe or complete deficiency of HPRT [45]. Patients with LNS exhibit hyperuricemia, nephrolithiasis, and self-injurious behavior, but are frequently identified only after they present with gout or acute renal failure. The purine analogue azathioprine, the standard immunosuppressive agent used in renal transplantation, is first converted to 6-mercaptopurine, a first-line cancer therapeutic agent that is also an analogue of hypoxanthine. 6-Mercaptopurine is activated by HPRT to 6-mercaptopurine ribose phosphate. Thus, neither drug has an effect in patients with LNS [46]. It is critical to develop a sensitive yet specific method to detect and identify LNS patients as early as possible. Adoption of the stable isotope dilution and urease pretreatment procedure for the accurate quantification of hypoxanthine and xanthine has enabled the fast and accurate chemical diagnosis of LNS [47]. Figure 4 shows the metabolic profile of a patient with LNS under treatment with allopurinol, which is used because it inhibits xanthine oxidase. The urine specimens were sent via air-mail as dried urine on filter paper, and only 0.1 ml of the eluate from the paper was used for analysis. The most prominent abnormality was detected in the hypoxanthine levels. The value was 8.4-9.0 SD above the mean (7.9-9.0 per total creatinine).
The next most severe abnormality was in the xanthine levels, which were 4.0-6.1 SD above the mean (4-6.2). Guanine increased only slightly. Orotate also increased, to 3.9-5.7 SD above the mean (4.5-6.1), because of the treatment with allopurinol, which also inhibits orotidine monophosphate decarboxylase.
3.4. Adenine Phosphoribosyltransferase Deficiency
In the salvage pathway for purine nucleotide synthesis, adenine phosphoribosyltransferase (APRT, EC 2.4.2.7) catalyzes the synthesis of adenosine
monophosphate (AMP) from adenine and 5-phosphoribosyl-1-pyrophosphate (PRPP). APRT deficiency (MIM 102600) blocks the use of adenine, which is oxidized by xanthine oxidase via 8-hydroxyadenine to 2,8-dihydroxyadenine. Because 2,8-dihydroxyadenine is extremely insoluble, its accumulation in the kidney leads to the formation of urinary stones. It is important that the exact nature of the stones be recognized as early as possible after the initial presentation with urolithiasis. The clearest indicator of this disease is an accumulation of adenine and its oxidation products, 8-hydroxyadenine and 2,8-dihydroxyadenine. The urinary metabolites of a patient with APRT deficiency were examined using urease pretreatment, stable isotope dilution with labeled adenine, and GC/MS [8]. Only 100 µl of urine was used for the sample preparation.
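The abnormality n used in Sections 3.2 and 3.3, that is, the number of SDs by which a patient value lies above the control mean after log10 transformation, can be sketched as follows. This is a minimal illustration under stated assumptions: the function name and the control values are hypothetical, and the use of the sample SD (n - 1 denominator) is an assumption, since the text does not say which estimator was used.

```python
import math
import statistics


def abnormality_n(value, control_values):
    """Number of SDs by which log10(value) lies above the control mean.

    Mean and SD are computed on log10-transformed control values, as the text
    describes for indicators that are not normally distributed.
    """
    if value <= 0 or any(v <= 0 for v in control_values):
        raise ValueError("log10 transformation requires positive values")
    logs = [math.log10(v) for v in control_values]
    mean = statistics.mean(logs)
    sd = statistics.stdev(logs)  # sample SD; an assumption of this sketch
    return (math.log10(value) - mean) / sd


# Hypothetical control levels (per creatinine) and a hypothetical patient.
controls = [1.0, 10.0, 100.0]
print(abnormality_n(1000.0, controls))
```

Working on the log scale means that n reflects fold changes rather than absolute differences, so a tenfold increase counts the same whether the control mean is low or high.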
4. Mass Screening and Diagnosis by Metabolomics
4.1. Pilot Study for the Mass Screening of Newborns for 22 Target Diseases in Japan
Practical, sufficiently specific, and cost-effective neonatal screening programs are currently conducted using blood spots on filter paper. GC/MS is also used for the secondary screening or scrutiny of positive cases detected by current neonatal screening procedures. Chamberlin and Sweeley reported that urine on filter paper was generally more useful than blood spots on filter paper [48]. Indeed, most laboratories using GC/MS techniques for diagnosis have used and still use urine or urine on filter paper rather than blood, blood spots on filter paper, or serum. For mass screening, however, only a limited number of studies have used urine. In 1990 Tuchman et al. adapted GC/MS for use in a mass screening program for neuroblastoma in 3-week-old infants in Quebec, using the organic solvent extraction of urine from filter paper under acidic conditions, which was further extended to a screening program for 20 or more different metabolic conditions in 1991 [49,50]. A GC/MS/MS method for screening urine specimens for 10 organic acidurias measured fourteen markers within 10 min after solid-phase extraction and oximation/trimethylsilylation [51]. However, polar organic acids, such as methylcitrate (the most sensitive and reliable indicator for propionic acidemia), were not analyzed because they were not extractable by solid-phase extraction. Using a simplified diagnostic procedure with urease-pretreatment and GC/MS techniques, a joint pilot study of neonatal screening started in 1995
among four institutions in Japan: Kanazawa Medical University, Kurume University Medical School, Shimane Medical University, and Chiba Prefecture Children's Hospital [9,10]. This pilot study targets twenty-two IEMs and neuroblastoma at present (Table 3). After obtaining written informed consent from the parents, urine specimens were taken from neonates on day 5-7, when blood was taken for the Guthrie test, and sent for testing as dried urine on filter paper. By the end of January 2005, a total of 79,385 newborns had been examined and a total of 57 cases had been identified at the four institutions (Table 4). The incidence was one per 1390 at the four institutions.

Table 3. The 23 target diseases in the present pilot study
No.  Disease
1    Methylmalonic acidemia
2    Propionic acidemia
3    Isovaleric acidemia
4    β-Methylcrotonylglycinuria
5    β-Hydroxy-β-methylglutaric acidemia
6    β-Ketothiolase deficiency
7    Multiple carboxylase deficiency
       holocarboxylase synthetase deficiency
       biotinidase deficiency
8    Glutaric aciduria type I
9    Tyrosinemia
10   Primary hyperammonemias*
       ornithine transcarbamylase (OTC) deficiency
       citrullinemia
       argininosuccinic aciduria
       argininemia
11   Dicarboxylic aciduria (glutaric aciduria type II etc.)
12   Lactic aciduria
13   Glyceroluria
14   Alkaptonuria
15   Pyroglutamic aciduria
16   α-Aminoadipic-α-ketoadipic aciduria
17   Dibasic amino aciduria (cystinuria, lysinuria)
18   Hyperglycinemia
19   Maple syrup urine disease
20   Phenylketonuria, hyperphenylalaninemia
21   Galactosemia
22   Hypermethioninemia, homocystinuria
23   Neuroblastoma
*Four urea cycle disorders can be screened, as well as two other primary hyperammonemias (LPI, HHH syndrome; see section 2.1 in this chapter)
Table 4. Abnormal cases detected in four institutes by pilot studies of newborn urine screening (February 1995-January 2005)
No.  Disease                                  Total    Kanazawa(a)  Kurume(b)  Shimane(c)  Chiba
1    Propionic acidemia                       3        1            2          -           -
2    Methylmalonic aciduria                   12       4            8          -           -
3    Lactic aciduria: PDH (E1α) def           1        1            -          -           -
4    Maple syrup urine disease                1        -            1          -           -
5    OTC deficiency                           1        -            1          -           -
6    Citrullinemia                            1        -            1          -           -
7    Hyperphenylalaninemia                    1        -            -          1           -
8    Homocystinuria-Methylmalonic aciduria    2        -            2          -           -
9    Glyceroluria                             4        -            4          -           -
10   DPDH deficiency(d)                       1        -            1          -           -
11   DHP deficiency(e)                        1        -            1          -           -
12   β-Ureidopropionase deficiency            3        1            2          -           -
13   α-Aminoadipic/α-Ketoadipic aciduria      15       2            13         -           -
14   Hartnup disease                          2        1            1          -           -
15   Orotic aciduria                          1        1            -          -           -
16   Cystathioninuria                         1        1            -          -           -
17   Cystinuria                               2        2            -          -           -
18   Galactosemia/Tyrosinuria                 1        1            -          -           -
19   Tyrosinemia                              1        1            -          -           -
20   Neuroblastoma                            3        1            2          -           -
     Total with defect                        57       17           39         1           -
     Total screened                           79,385   18,719       54,146     4424        2096
     Incidence                                1/1390   1/1100       1/1390     1/4424      -
(a) Kanazawa Medical University
(b) Kurume University School of Medicine
(c) Shimane Medical University
(d) Dihydropyrimidine dehydrogenase
(e) Dihydropyrimidinase
Diseases including propionic acidemia and methylmalonic acidemia were found at an incidence of 0.09% (one per 1100) in neonatal mass screening carried out at our institute, Kanazawa Medical University. However, the incidence was as high as 2.5% for high-risk patients and 8.3% for high-risk newborns. A method to analyze amino acids and acylcarnitines in blood spots on filter paper by tandem mass spectrometry has been developed by Millington et al., Chace et al., and Rashed et al. [52-54]. This method was used recently for neonatal screening [55]. It detected IEMs with an incidence of 1 per 4300 infants in Pennsylvania and North Carolina [55]. In Japan, Professor Shigematsu of Fukui University found an incidence of IEMs of 1 per 7800 neonates using blood/tandem mass spectrometry (personal communication). Although that method permits high-speed analyses, it might be more appropriate to reserve it for screening rather than for chemical diagnosis. Propionic acidemia and methylmalonic aciduria were found to have a high incidence in Japan. In our procedure, more than 200 compounds are quantified to screen for 22 IEMs. Because sufficient information for a conclusive chemical diagnosis can be obtained in most cases, no further analysis is required. At present, 22 diseases are targeted. In the very near future, to meet the demands of the emerging field of tailor-made medicine, this metabolome approach will be used for the screening and diagnosis of 130 IEMs [1].
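The incidence figures quoted here and in Table 4 follow directly from the case and screening counts, as the short calculation below illustrates. The counts are those given in Table 4; the zero for Chiba is derived from the totals (57 - 17 - 39 - 1) rather than stated explicitly, and the helper name is an assumption of this sketch.

```python
# Counts from Table 4 (February 1995-January 2005).
SCREENED = {"Kanazawa": 18719, "Kurume": 54146, "Shimane": 4424, "Chiba": 2096}
CASES = {"Kanazawa": 17, "Kurume": 39, "Shimane": 1, "Chiba": 0}  # Chiba derived


def newborns_per_case(cases, screened):
    """Return the number of screened newborns per detected case, or None."""
    return None if cases == 0 else screened / cases


total_screened = sum(SCREENED.values())
total_cases = sum(CASES.values())

for institute in SCREENED:
    ratio = newborns_per_case(CASES[institute], SCREENED[institute])
    label = "no cases" if ratio is None else f"1 per {ratio:.0f}"
    print(f"{institute}: {label}")

overall = newborns_per_case(total_cases, total_screened)
percent_kanazawa = 100 * CASES["Kanazawa"] / SCREENED["Kanazawa"]
print(f"All four: 1 per {overall:.0f}; Kanazawa: {percent_kanazawa:.2f}%")
```

The unrounded ratios (about 1 per 1101 for Kanazawa and 1 per 1393 overall) agree with the rounded values of 1/1100 and 1/1390 reported in the table.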
4.2. Neonatal Screening and Diagnosis of Propionic Acidemia
Propionic acidemia (PCCD) is a rare but serious disease caused by inborn errors of propionyl coenzyme A (propionyl-CoA) metabolism. Propionic acidemia has a wide range of clinical manifestations, varying from severe neonatal ketoacidosis with the risk of major handicap or death, to an asymptomatic or mild disease that usually responds well to treatment and has a good long-term outcome. Recent studies have suggested that neurological deterioration occurs in patients even in the absence of ketosis or metabolic acidosis and that fatality is not limited to cases with neonatal onset but also includes the unusual late-onset ones. Even patients with relatively high residual PCC activity (11%) are known to develop serious clinical manifestations later. Therefore, PCCD should be considered in all newborn infants with unexplained neurological deterioration, even in the absence of ketosis and metabolic acidosis. As neurological abnormalities in newborns are often difficult to notice, sensitive yet specific screening during the neonatal period seems critical to assure a high quality of life for individuals with
PCCD. Propionyl-CoA is a catabolic intermediate derived from essential amino acids (isoleucine, valine, methionine, threonine), odd-chain fatty acids, and the cholesterol side chain. Propionyl-CoA is normally metabolized to D-methylmalonyl-CoA by biotin-dependent propionyl-CoA carboxylase (propionyl-CoA:carbon-dioxide ligase; PCC, EC 6.4.1.3). D-Methylmalonyl-CoA is racemized to L-methylmalonyl-CoA by methylmalonyl-CoA racemase (EC 5.1.99.1) and then isomerized to succinyl-CoA by L-methylmalonyl-CoA mutase (methylmalonyl-CoA CoA-carbonylmutase, EC 5.4.99.2). Propionic acidemia (MIM 232000 and 232050), originally described as ketotic hyperglycinemia, is an autosomal recessive IEM in which the activity of PCC is deficient or greatly reduced. Methylmalonic aciduria is an autosomal recessive IEM in which the activity of L-methylmalonyl-CoA mutase is deficient or greatly reduced (MIM 251000). It has four known etiologies; two involve an apomutase deficiency (mut0, mut-) and two involve defective adenosylcobalamin synthesis (cblA and cblB). Another inborn error in cobalamin metabolism causes combined methylmalonic acidemia and homocystinuria, which is detectable by the urease-pretreatment procedure. In an asymptomatic infant who was identified during a screening program using GC/MS analysis of urine, only the methylcitrate level was high, indicating a deficiency of propionyl-CoA carboxylase [56]. Methylcitrate was distinctly elevated at 4.8 SD above the normal mean relative to the 2-hydroxyundecanoate internal standard. The activity of PCC in lymphocytes was 7% of the control. Gene analysis detected a single mutation, TAT to TGT, resulting in a Y435C substitution in the β chain, in homozygous form. Propionylcarnitine is increased in the blood and urine of patients with PCCD or methylmalonic acidemia. These diseases are screened by evaluating the level of propionylcarnitine in the blood by tandem mass spectrometry [57,58].
However, our study raises a question about the sensitivity of the more practical tandem MS newborn screening for the detection of propionic acidemia. In the patient, methylcitrate was highly elevated and clearly diagnostic by the GC/MS of urine at 4 days of age. The parallel analysis of a dried blood spot at 4 days of age by tandem MS, however, showed only a borderline elevation of propionylcarnitine [56]. Methylcitrate is the most reliable indicator for PCCD, but it also increases in methylmalonic acidemia and multiple carboxylase deficiency. The differential diagnosis of disorders with elevated methylcitrate depends on the presence or absence of methylmalonate, 3-hydroxyisovalerate, or 3-methylcrotonylglycine, which can also be determined by the GC/MS screening method. In the urine analysis using GC/MS, methylmalonate is the target for detecting methylmalonic acidemia, and methylcitrate is the target for detecting PCCD. Both methylmalonate and methylcitrate are efficiently cleared by the kidney. In patients with PCCD, methylcitrate and 3-hydroxypropionate were detected in large amounts in the urine, but in the serum only 3-hydroxypropionate was detected, and only in small amounts [7,10]. Neonates with benign methylmalonic aciduria showed a marked increase in methylmalonate in the urine but not in the serum. Urease pretreatment without fractionation allows the highly sensitive identification of presymptomatic newborns with PCCD, and of patients during their remission.
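The differential diagnosis of elevated methylcitrate described above can be sketched as a small decision function. The text states only that the distinction depends on the presence or absence of methylmalonate, 3-hydroxyisovalerate, or 3-methylcrotonylglycine; the specific assignment of each marker to a disorder below follows general biochemical knowledge and is an assumption of this sketch, as are the function name and the set-based input.

```python
def differentiate_elevated_methylcitrate(elevated):
    """Suggest a disorder from the set of urinary metabolites found elevated.

    elevated: set of metabolite names above the control range (hypothetical
    input format). Returns a suggestion, not a diagnosis.
    """
    if "methylcitrate" not in elevated:
        return "methylcitrate not elevated"
    # Assumed mapping: methylmalonate points to methylmalonic acidemia.
    if "methylmalonate" in elevated:
        return "methylmalonic acidemia"
    # Assumed mapping: these two markers point to multiple carboxylase deficiency.
    if elevated & {"3-hydroxyisovalerate", "3-methylcrotonylglycine"}:
        return "multiple carboxylase deficiency"
    # Methylcitrate elevated without the other markers.
    return "propionic acidemia (PCCD)"


print(differentiate_elevated_methylcitrate({"methylcitrate"}))
print(differentiate_elevated_methylcitrate({"methylcitrate", "methylmalonate"}))
```

Because all of these markers are measured in the same GC/MS run, such a rule can be evaluated on a single urine profile without further analysis.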
5. Prenatal Diagnosis of Propionic Acidemia
Once an index case has been identified in a family, prenatal diagnosis of potentially affected fetuses is an important component of genetic counseling. Methylcitrate is increased in cell-free amniotic fluid when a fetus is affected and is a key indicator for the prenatal diagnosis of PCCD. Since the direct chemical analysis of methylcitrate in cell-free amniotic fluid using stable isotope dilution GC/MS was introduced by Naylor et al. and Sweetman et al., direct methods have been developed rapidly not only for PCCD but also for the other organic acidemias [59-65]. Methods involving cell culture are time-consuming and potentially unreliable due to the possible contamination of amniotic fluid with maternal cells [66]. The measurement of PCC activity in chorionic villi has also been reported to give false results due to contamination by maternal tissue. The levels of methylcitrate in cases where the fetus is not affected are as low as in control cases. However, the concentrations in the amniotic fluid of affected fetuses are 20 to 30 times higher than in the fluid of unaffected fetuses [67]. For the affected cases, 20 µl of amniotic fluid is enough for quantification by GC/MS with the selected ion monitoring (SIM) mode, and 100 µl is enough with the scanning mode (scan range m/z 250 to m/z 500). The recovery of methylcitrate from the amniotic fluid is as high as 91%, and the coefficient of variation is low in this procedure. Thus, the simplified urease pretreatment and stable isotope dilution method, which requires no fractionation, is rapid, highly sensitive, and more accurate than methods previously reported for the prenatal diagnosis of PCCD.
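Recovery and coefficient of variation, the two figures of merit quoted for this procedure, are conventionally computed as shown in the sketch below. The formulas are the standard spike-recovery and CV definitions; the function names and all numerical inputs are hypothetical and are not measurements from the study.

```python
import statistics


def recovery_percent(measured_spiked, measured_unspiked, amount_added):
    """Spike recovery: the share of an added amount that is found again."""
    if amount_added <= 0:
        raise ValueError("amount_added must be positive")
    return 100.0 * (measured_spiked - measured_unspiked) / amount_added


def coefficient_of_variation(values):
    """CV in percent: sample SD divided by the mean of replicate values."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)


# Hypothetical replicate measurements of methylcitrate (arbitrary units).
replicates = [9.0, 10.0, 11.0]
print(recovery_percent(measured_spiked=10.1, measured_unspiked=1.0,
                       amount_added=10.0))
print(coefficient_of_variation(replicates))
```

With a co-eluting labeled internal standard, incomplete recovery affects analyte and standard alike, so a recovery somewhat below 100% need not bias the isotope-dilution result.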
6. Conclusion
The simplified urease-pretreatment, GC/MS diagnostic procedure targets more than 200 compounds in biological samples, from lactate to homocystine, within 15 min. Capillary GC allows the most efficient separation
and gives very precise retention times for compounds of endogenous and exogenous origin. Mass spectrometry combined with GC, with full scan mass spectra and extracted ion chromatograms, enables the sensitive detection, accurate identification, and reliable quantification of these compounds. This procedure, technically practical yet comprehensive from the metabolic point of view, will provide a valuable tool for the screening and chemical diagnosis of more than 130 target diseases (Table 5). This procedure is also useful for detecting abnormal nutritional conditions in patients, such as an acquired vitamin deficiency or overload with a specific compound, and for the monitoring and evaluation of patients and animals receiving various treatments, including liver transplantation and gene therapy.

Table 5. Screening and chemical diagnosis can be made simultaneously for more than 130 inherited metabolic disorders
Diseases or metabolic disorders in:                 No. of disorders
1. Branched chain amino acids                       20
2. Primary hyperammonemia and citrin deficiency     9
3. Aromatic amino acids                             15
4. Sulfur-containing amino acids, folate, cbl       16
5. Membrane transport                               9
6. Gly, His, Pro, β-Ala                             17
7. Orn, Lys, Trp                                    7
8. Pyrimidine, purine                               8
9. Galactose, fructose, TCA cycle                   7
Total                                               108
Neuroblastoma                                       1
                                                    DMD
Primary lactic acidemia                             16 screened
Fatty acid oxidation                                5 screened
References
1. Kuhara T (2004) Gas chromatographic-mass spectrometric urinary metabolome analysis to study mutations of inborn errors of metabolism. Mass Spectrom Rev 2004; on-line, accessed Sep 16
2. Prietsch V, Lindner M, Zschocke J, Nyhan WL (2002) Emergency management of inherited metabolic diseases. J Inher Metab Dis 25:531-546
3. Brusilow SW, Horwich A (2001) Urea cycle enzymes. In: Scriver CR, Beaudet AL, Sly WS, Valle D (eds) The metabolic and molecular bases of inherited disease, 8th edn. McGraw-Hill, New York, pp 1909-1963
4. Valle D, Simell O (2001) The hyperornithinemias. In: Scriver CR, Beaudet AL, Sly WS, Valle D (eds) The metabolic and molecular bases of inherited disease, 8th edn. McGraw-Hill, New York, pp 1857-1895
5. Simell O (2001) Lysinuric protein intolerance and other cationic amino acidurias. In: Scriver CR, Beaudet AL, Sly WS, Valle D (eds) The metabolic and molecular bases of inherited disease, 8th edn. McGraw-Hill, New York, pp 4933-4956
6. Bachmann C, Colombo JP (1980) Diagnostic value of orotic acid excretion in heritable disorders of the urea cycle and in hyperammonemia due to organic acidurias. Eur J Pediatr 134:109-113
7. Kuhara T (2001) Diagnosis of inborn errors of metabolism using filter paper urine, urease treatment, isotope dilution and gas chromatography-mass spectrometry. J Chromatogr B 758:3-25
8. Kuhara T (2002) Diagnosis and monitoring of inborn errors of metabolism using urease-pretreatment of urine, isotope dilution, and gas chromatography-mass spectrometry. J Chromatogr B 781:497-517
9. Matsumoto I, Kuhara T (1996) A new chemical diagnostic method for inborn errors of metabolism by mass spectrometry. Mass Spectrom Rev 15:43-57
10. Kuhara T, Shinka T, Inoue Y, Zhen-Wei X, Ohse M, Yoshida I, Inokuchi T, Yamaguchi S, Takayanagi M, Matsumoto I (1999) Pilot study of gas chromatography-mass spectrometric screening of newborn urine for inborn errors of metabolism after treatment with urease. J Chromatogr B 731:141-147
11. Shinka T, Inoue Y, Peng H, Zhen-Wei X, Ohse M, Kuhara T (1999) Urine screening of five-day-old newborns: metabolic profiling of neonatal galactosuria. J Chromatogr B 732:469-477
12. Miyajima H, Orii KE, Shindo Y, Hashimoto T, Shinka T, Kuhara T, Matsumoto I, Shimizu H, Kaneko E (1997) Mitochondrial trifunctional protein deficiency associated with recurrent myoglobinuria in adolescence. Neurology 49:833-837
13. Robinson BH (2001) Lactic acidemia: disorders of pyruvate carboxylase and pyruvate dehydrogenase. In: Scriver CR, Beaudet AL, Sly WS, Valle D (eds) The metabolic and molecular bases of inherited disease, 8th edn. McGraw-Hill, New York, pp 2275-2295
14. Mudd SH, Levy HL, Skovby F (1995) Disorders of transsulfuration. In: Scriver CR, Beaudet AL, Sly WS, Valle D (eds) The metabolic and molecular bases of inherited disease, 7th edn. McGraw-Hill, New York, pp 1279-1327
15. Rosenblatt DS (1995) Inherited disorders of folate transport and metabolism. In: Scriver CR, Beaudet AL, Sly WS, Valle D (eds) The metabolic and molecular bases of inherited disease, 7th edn. McGraw-Hill, New York, pp 3111-3128
16. Guenther BD, Sheppard CA, Tran P, Rozen R, Matthews RG, Ludwig ML (1999) The structure and properties of methylenetetrahydrofolate reductase from Escherichia coli suggest how folate ameliorates human hyperhomocysteinemia. Nat Struct Biol 6:359-365
Chemical Diagnosis of Inborn Errors of Metabolism
189
17. Valevski AF, Bassan H, Korman SH, Lerman-Sagie T, Gutman A, Harel S (2000) Methylenetetrahydrofolate reductase deficiency: importance of early diagnosis. J Child Neurol 15:539-543 18. Fenton WA, Rosenberg LE (1995) Inherited disorders of cobalamin transport and metabolism Inherited disorders of folate transport and metabolism transsulfliration In: Scriver CR, Beaudet AL, Sly WS, Valle D (eds) The metabolic and molecular bases of inherited disease, 7th edn. McGraw-Hill, New York, pp 3129-3149 19. Kuhara T, Ohse M, Ohdoi C, Ishida S (2000) Differential diagnosis of homocystinuria by urease-treatment, isotope-dilution and gas chromatography-mass spectrometry. J Chromatogr B 742:59-70 20. Simmonds HA, Fau-banks LD, Duley JA, Marinaki A (2000) Genetic disorders of purine and pyrimidine metabolism: problems of diagnosis. CPD Bull Clin Biochem 2:13-18 21. Simmonds HA, Duley JA, Davies PM (1991) Analysis of purines and pyrimidines in blood, urine, and other physiological fluids. In: Hommes FA (ed) Techniques in diagnostic human biochemical genetics. A laboratory manual. Wiley-Liss, New York, pp 397-424 22. Van Gennip AH, Van Noordenburg-Huistra DY, De Bree PK, Wadman SK (1978) Two-dimensional thin-layer chromatography for the screening of disorders of purine and pyrimidine metabolism. Clin Chim Acta 86:7-20 23. Friedecky D, Adam T, Bartak P (2002) Capillary electrophoresis for detection of inherited disorders of purine and pyrimidine metabolism: a selective approach. Electrophoresis 23:565-571 24. Christensen E, Brandt NJ, Laxdal T (1987) Adenine phosphoribosyltransferase deficiency: a case diagnosed by GC-MS identification of 2,8-dihydroxyadenine in urinary crystals. J Inher Metab Dis 10:187-194 25. Duran M, Dorland L, Meuleman EEE, Allers P, Berger R (1997) Inherited defects of purine and pyrimidine metabolism: laboratory methods for diagnosis. J Inher Metab Dis 20:227-236 26. 
Wevers RA, Engelke UF, Moolenaar SH, Bräutigam C, De Jong JG, Duran R, De Abreu RA, Van Gennip AH (1999) IH-NMR spectroscopy of body fluids: inborn errors of purine and pyrimidine metaboHsm. Clin Chem 45:539-548 27. Van Gennip AH, Abeling NGGM, Vreken P, Van Kuilenburg AB (1997) Inborn errors of pyrimidine degradation: clinical, biochemical and molecular aspects. J Inher Metab Dis 20:203-213 28. Ito T, Van Kuilenburg AB, Bootsma AH, Haasnoot AJ, Van Cruchten A, Wada Y, Van Gennip AH (2000) Rapid screening of high-risk patients for disorders of purine and pyrimidine metabolism using HPLC-electrospray tandem mass spectrometry of Hquid urine or urine-soaked filter paper strips. Clin Chem 46:445-452 29. Van Lenthe H, Van Kuilenburg AB, Ito T, Bootsma AH, Van Cruchten A, Wada Y, Van Gennip AH (2000) Defects in pyrimidine degradation identified by HPLC-electrospray tandem mass spectrometry of urine specimens or urine-soaked filter paper strips. Clin Chem 46:1916-1922
190
T. Kuhara
30. Webster DR, Becroft DM0, Van Gennip AH, Van Kuilenbury AB (2001) Hereditary orotic aciduria and other disorders of pyrimidine metabolism. In: Scriver CR, Beaudet AL, Sly WS, Valle D (eds) The metabolic and molecular bases of inherited disease, 8th edn. McGraw-Hill, New York, pp 2663-2702 31. Sumi S, Imaeda M, Kidouchi K, Ohba S, Hamajima N, Kodama K, Togari H, Wada Y (1998) Population and family studies of dihydropyrimidinuria: prevalence, inheritance mode, and risk of fluorouracil toxicity. Am J Med Genet 78:336-340 32. Wei X, Mcleod HL, McMurrough J, Gonzalez FJ, Femandez-Salguero P (1996) Molecular basis of the human dihydropyrimidine dehydrogenase deficiency and 5-fluorouracil toxicity. J Clin Invest 98:610-615 33. Van Kuilenburg AB, Vreken P, Beex LV, Meinsma R, Van Lenthe H, De Abreu RA, Van Gennip AH (1998) Heterozygosity for a point mutation in a invariant splice donor site of dihydropyrimidine dehydrogenase and severe 5-fluorouracil related toxicity. Adv Exp Med Biol 431:293-298 34. Diasio RB, Beavers TL, Carpenter JT (1988) Familial deficiency of dihydropyrimidine dehydrogenase: biochemical basis for familial pyrimidinemia and sever 5-fluorouracil-induced toxicity. J Clin Invest 81:47-51 35. Heggie GD, Sommadossi JP, Cross DS, Huster WJ, Diasio RB (1987) Clinical pharmacokinetics of 5-fluorouracil and its metabolites in plasma, urine, and bile. Cancer Res 47:2203-2206 36. Bakkeren JAJM, De Abreu RA, Sengers CA, Gabreels FJM, Maas JM, Renier WO (1984) Elevated urine, blood and cerebrospinal fluid levels of uracil and thymine in a child with dihydrothymine dehydrogenase deficiency. Clin Chim Acta 140:247-256 37. Duran M, Rovers P, De Bree PK, Schreuder CH, Beukenhorst H, Dorland L, Berger R (1991) Dihydropyrimidinuria: a new inborn error of pyrimidine metabolism. J Inher Metab Dis 14:367-370 38. 
Van Gennip AH, Abeling NG, Elzinga-Zoetekouw L, Schölten LG, Van Cruchten A, Bakker HD (1989) Comparative study of thymine and uracil metabolism in healthy persons and in a patient with dihydropyrimidme dehydrogenase deficiency. Adv Exp Med Biol 253:111-118 39. Ohba S, Kidouchi K, Katoh T, Kibe T, Kobayashi M, Wada Y (1991) Automated determination of orotic acid, uracil and pseudouridine in urine by high-performance liquid chromatography with column switching. J Chromatogr 568:325-332 40. Van Gennip AH, Busch S, Elzinga L, Stroomer AE, Van Cruchten A, Schölten EG, Abeling NG (1993) Application of simple chromatographic methods for the diagnosis of defects in pyrimidine degradation. Clin Chem 39:380-385 41. Wadman SK, Beemer FA, De Bree PK, Duran M, Van Gennip AH, Ketting D, Van Sprang FJ (1984) New defects of pyrimidine metabolism. Adv Exp Med Biol 165:109-114 42. Kuhara T, Ohdoi C, Ohse M (2001) Simple gas chromatographic-mass spectrometric procedure for diagnosing pyrimidine degradation defects for prevention of severe anticancer side effects. J Chromatogr B 758:61-74
Chemical Diagnosis of Inborn Errors of Metabolism
191
43. Kuhara T, Ohdoi C, Ohse M, Van Kuilenburg ABP, Van Gennip AH, Sumi S, Ito T, Wada Y, Matsumoto I (2003) Rapid gas chromatographic-mass spectrometric diagnosis of dihydropyrimidine dehydrogenase deficiency and dihydropyrimidmase deficiency. J Chromatogr B 792:107-115 44. Ohse M, Matsuo M, Ishida A, Kuhara T (2002) Screening and diagnosis of ß-ureidopropionase deficiency by gas chromatographic/mass spectrometric analysis of urine. J Mass Spectrom 37:954-962 45. Jinnah HA, Friedmann T (2001) Lesch-Nyhan disease and its variants. In: Scriver CR, Beaudet AL, Sly WS, Valle D (eds) The metabolic and molecular bases of inherited disease, 8th edn. McGraw-Hill, New York, pp 2537-2570 46. Nyhan WL (1997) The recognition of Lesch-Nyhan syndrome as an inborn error of purine metabolism. J Inher Metab Dis 20:171-178 47. Ohdoi C, Nyhan WL, Kuhara T (2003) Chemical diagnosis of Lesch-Nyhan syndrome using gas chromatography and mass spectrometric detection. J Chromatogr B 792:123-130 48. Chamberlin BA, Sweeley CC (1987) Metabolic profiles of urinary organic acids recovered from absorbent filter paper. Clin Chem 33:572-576 49. Tuchman M, Lemieux B, Auray-Blais C, Robinson LL, Giguere R, MacCann MT, Woods WG (1990) Screening for neuroblastoma at 3 weeks of age: methods and preliminary results from the Quebec Neuroblastoma Screening Project. Pediatrics 86:765-773 50. Tuchman M, McCann MT, Johnson PE, Lemieux B (1991) Screenmg newboms for multiple organic acidurias in dried filter paper urine samples: method development. Pediatr Res 30:315-321 51. Hagen T, Korson MS, Sakamoto M, Evans JE (1999) A GC/MS/MS screening method for multiple organic acidemias from urine specimens. Clin Chim Acta 283:77-88 52. Millington DS, Norwood DL, Kodo N, Roe CR, Inoue F (1989) Application of fast atom bombardment with tandem mass spectrometry and liquid chromatography/mass spectrometry to the analysis of acylcamitines in human urine, blood, and tissue. Anal Biochem 180:331-339 53. 
Chace DH, Millington DS, Terada N, Kahler SG, Roe CR, Hofinan LF (1993) Rapid diagnosis of phenylketonuria by quantitative analysis for phenylalanine and tyrosine in neonatal blood spots by tandem mass spectrometry. Clin Chem 39:66-71 54. Rashed MS, Ozand PT, Bucknall MP, Little D (1995) Diagnosis of inborn errors of metabolism from blood spots by acylcamitines and amino acids profiling using automated electrospray tandem mass spectrometry. Pediatr Res 38:324-331 55. Naylor EW, Chace DH (1999) Automated tandem mass spectrometry for mass newborn screening for disorders in fatty acid, organic acid, and amino acid metabolism. J Child Neurol 14:S4-S8 56. Kuhara T, Ohse M, Inoue Y, Yorifiiji T, Sakura N, Mitsubuchi H, Endo F, Ishimatu J (2002) Gas chromatographic-mass spectrometric newborn screening for propionic acidaemia by targeting methylcitrate in dried filter-paper urine samples. J Inher Metab Dis 25:98-106
192
T.Kuhara
57. Millington DS, Kodo N, Norwood DL, Roe CR (1990) Tandem mass spectrometry: a new method for acylcamitine profiling with potential for neonatal screening for inborn errors of metabolism. J Inher Metab Dis 13:321-324 58. Rashed MS, Bucknall MP, Little D, Awad A, Jacob M, Alamoudi M, Alwattar M, Ozand PT (1997) Screening blood spots for inborn errors of metabolism by electrospray tandem mass spectrometry with a microplate batch process and a computer algorithm for automated flagging of abnormal profiles. Clin Chem 43:1129-1141 59. Naylor G, Sweetman L, Nyhan WL, Hombeck C, Griffiths J, Morch L, Brandange S (1980) Isotope dilution analysis of methylcitric acid in amniotic fluid for the prenatal diagnosis of propionic and methyhnalonic acidemia. Clin Chim Acta 107:175-183 60. Sweetman L (1984) Prenatal diagnosis of the organic acidurias. J Inher Metab Dis 7(suppll): 18-22 61. Fensom AH, Benson PF, Chalmers RA, Tracey BM, Watson D, King GS, Pettit BR, Rodeck CH (1984) Experience with prenatal diagnosis of propionic acidemia and methylmalonic aciduria. J Inher Metab Dis 7(suppl 2): 127-128 62. Kretschmer RE, Bachmann C (1988) Methylcitric acid determination in amniotic fluid by electron-impact mass fi-agmentography. J Clin Chem Clin Biochem 26:345-348 63. Holm J, Ponders L, Sweetman L (1989) Prenatal diagnosis of propionic and methylmalonic acidaemia by stable isotope dilution analysis of amniotic fluid. J Inher Metab Dis 12(suppl 2):271-273 64. Jakobs C (1989) Prenatal diagnosis of inherited metabolic disorders by stable isotope dilution GC-MS analysis of metabolites in amniotic fluid: review of four years experience. J Inher Metab Dis 12(suppl 2):267-270 65. Jakobs C, Ten Brink HJ, Stellaard F (1990) Prenatal diagnosis of inherited metabolic disorders by quantitation of characteristic metabolites in amniotic fluid: fact and future. Prenat Diagn 10:265-271 66. 
Buchanan PD, Kahler SG, Sweetman L, Nyhan WL (1980) Pitfalls in the prenatal diagnosis of propionic acidemia. Clin Genet 18:177-183 67. Inoue Y, Kuhara T (2002) Rapid and sensitive method for prenatal diagnosis of propionic acidemia using stable isotope dilution gas chromatography-mass spectrometry and urease pretreatment. J Chromatogr B 776:71-77
Part IV. Metabolome Informatics
Chapter 13: Introduction to the ARM Database: Database on Chemical Transformations in Metabolism for Tracing Pathways

Masanori Arita (1, 2, 3)

(1) Department of Computational Biology, Graduate School of Frontier Sciences, The University of Tokyo, and PRESTO, JST, Kashiwanoha 5-1-5 CB05, Kashiwa 277-8561, Japan
(2) Institute for Advanced Biosciences, Keio University, Nipponkoku 403-1, Daihouji, Tsuruoka, Yamagata 997-0017, Japan
(3) Computational Biology Research Center, National Institute of Advanced Industrial Science and Technology, Aomi 2-42, Koto-ku, Tokyo 135-0064, Japan
1. Introduction

Imagine that you are planning a train trip. You will need a railroad map to decide which trains to take and where to transfer. Likewise, a map of metabolic pathways, called a metabolic map, is necessary to trace pathways through the complex cellular metabolism (Fig. 1). The most famous is the map designed and maintained for more than 30 years by Boehringer Mannheim Corporation and subsequently by Roche Applied Sciences; on the roughly 1.5 × 2 m wall chart, thousands of molecular structures and their catalytic transformations are neatly positioned according to their biological roles. The map is also available in book form, in which every two-page spread describes a pathway fulfilling a particular biological role [1].

At a glance, a railroad map and a metabolic map look alike; both represent the "relations" between objects such as stations or metabolites. However, the two maps differ in several inherent ways. The first difference is the layout of the described objects: a railroad map represents the geographical locations of the stations, but a metabolic map has no such planar constraint. The layout of a metabolic map depends completely on its editors, and the same compound may be located at multiple positions (or pages) for easier understanding of biological function. The second difference is the way objects are linked. A railroad always connects one station with another, whereas a biochemical reaction most often transforms multiple substrates into multiple products. If interpreted using an analogy to a railroad map, what happens in metabolism is the rearrangement of coaches between trains at each station. In such a railroad system, researchers are trying to locate a coach that takes them to the destination without any transfer. Thus, tracing a metabolic pathway is more complicated than tracing a railway: the structural transformation of metabolites must be considered at each reaction step to guarantee that a particular structural moiety is correctly traced.

Computer-assisted tracing of metabolic pathways should have been a priority task for metabolic databases. Without this function, it is impossible to reconstruct metabolic pathways from genes or to predict alternative pathways. (Obviously, the result of each reconstruction or prediction must be verified to really constitute a pathway.) In other words, tracing pathways is the most fundamental requirement for metabolic databases. This chapter introduces the Atomic Reconstruction of Metabolism (ARM) software system, designed particularly for this purpose.
Fig. 1. Part of the metabolic map edited by Roche Applied Sciences. A digitized version is available at http://www.expasy.org/tools/pathways/
Table 1. Abbreviations and acronyms

ATP     Adenosine triphosphate
ADP     Adenosine diphosphate
AMP     Adenosine monophosphate
CoA     Coenzyme A
Ery4P   D-Erythrose 4-phosphate
FAD     Flavin adenine dinucleotide
Fru6P   D-Fructose 6-phosphate
GA3P    D-Glyceraldehyde 3-phosphate
NAD     Nicotinamide adenine dinucleotide
NADH    (Reduced) nicotinamide adenine dinucleotide
PPi     Pyrophosphate
Rbu5P   D-Ribulose 5-phosphate
Rib5P   D-Ribose 5-phosphate
Sed7P   D-Sedoheptulose 7-phosphate
Xyl5P   D-Xylulose 5-phosphate
2. The Difficulty of Tracing Metabolic Pathways

This section introduces several examples that show the difficulty of tracing metabolic pathways. Table 1 lists the acronyms for compounds used in the text.
2.1. Definition of Metabolic Pathways

In this chapter, a metabolic pathway (pathway for short) from metabolite X to Y is defined as a sequence of biochemical reactions through which at least one carbon or nitrogen atom in X reaches Y. Only carbon and nitrogen atoms are considered throughout this text. A metabolite Y is called reachable from X if there is a pathway from X to Y. This definition is consistent with traditionally used terms such as "biosynthetic pathway" or "degradation pathway."
2.2. Pentose Phosphate Pathway

The pentose phosphate pathway is primarily an anabolic pathway that provides the cell with 5-carbon sugars from glucose (Fig. 2). Let us consider three transferase reactions in the pathway. "EC" refers to the hierarchical classification (numbering) of enzyme-catalyzed reactions.

EC 2.2.1.1: Rib5P + Xyl5P = GA3P + Sed7P
EC 2.2.1.2: GA3P + Sed7P = Fru6P + Ery4P
EC 2.2.1.1: Fru6P + GA3P = Ery4P + Xyl5P
From the above reactions, the pathway from Xyl5P to Ery4P seems to proceed through either GA3P or Sed7P.

Xyl5P --(EC 2.2.1.1)--> GA3P --(EC 2.2.1.2)--> Ery4P
Xyl5P --(EC 2.2.1.1)--> Sed7P --(EC 2.2.1.2)--> Ery4P
However, neither is a correct pathway. Figure 3 shows the structural transformations in the EC 2.2.1.1 and EC 2.2.1.2 reactions. Since all carbon atoms in Xyl5P are transferred to Fru6P, none reaches Ery4P in the above two reaction steps. A correct pathway from Xyl5P to Ery4P requires at least three steps:

Xyl5P --(EC 2.2.1.1)--> GA3P --(EC 2.2.1.2)--> Fru6P --(EC 2.2.1.1)--> Ery4P
In Fig. 2, this pathway can be shown in two ways, because both Fru6P and GA3P appear twice in the figure. From this example, it is clear that superficial connectivity on a metabolic map does not always correspond to a metabolic pathway, and that the same pathway may be drawn in multiple ways on the traditional metabolic map.

Fig. 2. Pentose phosphate pathway
[Figure: structures of Rbu5P, Rib5P, Xyl5P, Sed7P, GA3P, Ery4P, and Fru6P, connected by ribulose-5-phosphate isomerase (EC 5.3.1.6), ribulose-5-phosphate 3-epimerase (EC 5.1.3.1), transketolase (EC 2.2.1.1), and transaldolase (EC 2.2.1.2)]
Fig. 3. Structural transformations in (a) EC 2.2.1.1 and (b) EC 2.2.1.2 reactions. The groupings indicate atomic mappings
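As a rough illustration of why the two-step routes fail, the following Python sketch follows individual carbon positions through the three reactions listed above. The position numbering (C1 at the carbonyl end of each sugar phosphate) and the mapping tables are assumptions based on standard textbook descriptions of transketolase and transaldolase, not data taken from the ARM database. The labels TK_a, TK_b, and TA and the function surviving_atoms are hypothetical names introduced only for this example.

```python
# Assumed carbon mappings, keyed by (reaction, substrate, product).
# Each value maps a carbon position in the substrate to its position in the product.
# TK_a: Rib5P + Xyl5P = GA3P + Sed7P   (EC 2.2.1.1)
# TA:   GA3P + Sed7P  = Fru6P + Ery4P  (EC 2.2.1.2)
# TK_b: Fru6P + GA3P  = Ery4P + Xyl5P  (EC 2.2.1.1)
MAPPINGS = {
    ("TK_a", "Xyl5P", "Sed7P"): {1: 1, 2: 2},
    ("TK_a", "Xyl5P", "GA3P"): {3: 1, 4: 2, 5: 3},
    ("TK_a", "Rib5P", "Sed7P"): {1: 3, 2: 4, 3: 5, 4: 6, 5: 7},
    ("TA", "Sed7P", "Fru6P"): {1: 1, 2: 2, 3: 3},
    ("TA", "Sed7P", "Ery4P"): {4: 1, 5: 2, 6: 3, 7: 4},
    ("TA", "GA3P", "Fru6P"): {1: 4, 2: 5, 3: 6},
    ("TK_b", "Fru6P", "Xyl5P"): {1: 1, 2: 2},
    ("TK_b", "Fru6P", "Ery4P"): {3: 1, 4: 2, 5: 3, 6: 4},
    ("TK_b", "GA3P", "Xyl5P"): {1: 3, 2: 4, 3: 5},
}


def surviving_atoms(start, n_atoms, steps):
    """Return the carbon positions of `start` that are still present
    after following `steps`, a list of (reaction, product) pairs."""
    # current position in the present metabolite -> original position in `start`
    current = {i: i for i in range(1, n_atoms + 1)}
    metabolite = start
    for reaction, product in steps:
        mapping = MAPPINGS.get((reaction, metabolite, product), {})
        current = {
            mapping[pos]: origin
            for pos, origin in current.items()
            if pos in mapping
        }
        metabolite = product
    return sorted(current.values())


via_ga3p = surviving_atoms("Xyl5P", 5, [("TK_a", "GA3P"), ("TA", "Ery4P")])
via_sed7p = surviving_atoms("Xyl5P", 5, [("TK_a", "Sed7P"), ("TA", "Ery4P")])
three_steps = surviving_atoms(
    "Xyl5P", 5, [("TK_a", "GA3P"), ("TA", "Fru6P"), ("TK_b", "Ery4P")]
)
print(via_ga3p)     # -> []
print(via_sed7p)    # -> []
print(three_steps)  # -> [3, 4, 5]
```

Under these assumed mappings, neither two-step route delivers any carbon of Xyl5P to Ery4P, whereas the three-step route carries carbons 3 to 5.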
2.3. Tricarboxylic Acid (TCA) Cycle

The TCA cycle in mitochondria is the central route for producing ATP through the oxidation of pyruvate. Several structurally symmetric molecules (fumarate, succinate, and citrate) are involved in the cycle. Because of their symmetry, the acetyl moiety of acetyl coenzyme A (CoA) can reach either the upper or the lower half of oxaloacetate (Fig. 4) [2]. Such symmetry in molecular structures should be considered in tracing pathways.

To detect symmetry in molecular structures, stereochemical information is essential. For example, xylitol is not stereochemically symmetric (its central carbon has chirality), whereas arabitol is (its central carbon is not chiral) (Fig. 5). Similarly, other structural information such as aromaticity and keto-enol tautomerism is necessary for detecting symmetry and therefore for tracing pathways. For example, a benzene ring is traditionally drawn with alternating single and double bonds (Fig. 5). Since the six carbon atoms in the ring are all equivalent, benzoic acid is symmetric (the carboxylic acid group is ionized in water and becomes symmetric).
Fig. 4. Tricarboxylic acid (TCA) cycle. NAD, nicotinamide adenine dinucleotide; FAD, flavin adenine dinucleotide; NADH, FADH, reduced NAD, FAD; CoA, coenzyme A
[Figure: cycle diagram with structures including pyruvate, acetyl CoA, oxaloacetate, α-ketoglutarate, succinyl CoA, succinate, and fumarate]
Fig. 5. Examples of chirality- or aromaticity-dependent symmetry. Arabitol and xylitol have different chiralities of carbon atoms, and only arabitol is stereochemically symmetric. Likewise, carboxylic acid is symmetric
Fig. 6. Amino-transfer reaction. The amino group of L-serine is transferred to L-alanine, and the rest of the structure to hydroxypyruvate. Since the structures of these molecules are all similar, it is impossible to predict the atomic correspondence from their structures alone
[Figure: structures of L-serine, pyruvate, hydroxypyruvate, and L-alanine]
2.4. Nitrogen Metabolism

The major part of the traditional metabolic map describes carbon metabolism, and the metabolism of other elements (e.g., nitrogen or sulfur) is comparatively less well treated. For example, the relationship of the urea cycle with other amino-transfer reactions is not explicitly shown in the metabolic map. Figure 6 shows a typical amino-transfer reaction in which the amino group of L-serine is transferred to pyruvate, yielding L-alanine. In nitrogen metabolism, this reaction constitutes a pathway between L-serine and L-alanine, whereas in carbon metabolism, the same reaction forms a pathway between L-serine and hydroxypyruvate. From this example, it is clear that the notion of pathways depends on the element under focus.

In summary, the following data are required to correctly trace metabolic pathways:

• Structures of metabolites, including information on stereochemistry, aromaticity, and tautomerization
• The atom-to-atom correspondence between metabolites in each catalytic transformation

These indispensable data for computing pathways are not provided by traditional metabolic databases. So far, metabolic databases have been designed to display prepared (or fixed) views to the users, and they do not support functions to search or reconstruct pathways on demand.
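The dependence of pathways on the element under focus can be sketched in a few lines of Python. The example below encodes only the amino-transfer reaction of Fig. 6 (EC 2.6.1.51), with one edge list per element; the table EDGES and the function reachable are hypothetical names introduced here, and working at the level of whole metabolites is a simplification that is adequate only because a single reaction step is involved.

```python
from collections import deque

# Element-specific substrate -> product correspondences for
# L-Serine + Pyruvate = Hydroxypyruvate + L-Alanine (EC 2.6.1.51).
EDGES = {
    "C": [("L-Serine", "Hydroxypyruvate"), ("Pyruvate", "L-Alanine")],
    "N": [("L-Serine", "L-Alanine")],
}


def reachable(element, source):
    """Return the metabolites that receive at least one atom of `element`
    from `source`, using a breadth-first search over the element's edges."""
    graph = {}
    for substrate, product in EDGES[element]:
        graph.setdefault(substrate, []).append(product)
    seen = {source}
    queue = deque([source])
    while queue:
        for nxt in graph.get(queue.popleft(), []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    seen.discard(source)
    return sorted(seen)


carbon_targets = reachable("C", "L-Serine")
nitrogen_targets = reachable("N", "L-Serine")
print(carbon_targets)    # -> ['Hydroxypyruvate']
print(nitrogen_targets)  # -> ['L-Alanine']
```

The same reaction therefore contributes different edges to the carbon network and to the nitrogen network.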
3. ARM Database

The Atomic Reconstruction of Metabolism (ARM) Database is designed to search for pathway models in metabolism [3].
3.1. What is a Pathway Model?

The ideal functionality of metabolic databases is often described as the prediction of unknown pathways or the design of new metabolic maps. Such ideas are generalized in the notion of "pathway models." A model is an abstract description of experimental observations. In the study of metabolism, a model may refer to a set of differential equations used in quantitative simulations, or a set of biochemical reaction formulas for qualitative simulations. The ability to select a model flexibly is an important prerequisite of biological simulation. Few software systems, however, have been designed for the flexible selection of models. For example, most software tools for metabolic simulation aim to reproduce biological mechanisms (by changing parameters) on a fixed model. Most optimization tools output only a limited number of optimal models. From a practical standpoint, it is crucial that they provide multiple options so that each user can select the most persuasive, elegant model to explain the given biological mechanism. Database designers should be aware of how model selection is supported in their databases. In metabolic studies, the fundamental step in model selection is tracing (or searching) pathways. The ARM database uses a graph representation of the entire metabolic network to achieve this process.

3.2. Graph Representation of the Metabolic Network

A "graph" is defined by a set of nodes and a set of edges, each connecting two nodes, as in a railroad map. For the basic terms of graph theory, refer to standard textbooks [4]. To transform metabolic reactions into a graph, the software tool in ARM first detects the structural correspondence in each reaction formula at the atomic scale. Figure 7 shows the transformation corresponding to the reactions in Figs. 3 and 6. The set of biochemical reactions is then transformed into a graph in which nodes and edges represent metabolites and their substructural correspondences, respectively. In this graph, any metabolic pathway forms a graph path. Note that the converse does not always hold. The software tool in ARM computes a metabolic pathway between two given compounds as follows.

The algorithm to search a pathway
1. Let an integer value k = 0.
2. Find the kth shortest graph path P between the given nodes.
3. If P does not represent a valid pathway, increment k and go to Step 2.
4. Output P.

Since not all graph paths are valid metabolic pathways, an algorithm to find the kth shortest path for any k is necessary [5].
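The search loop above can be sketched as follows. This is a minimal illustration under stated assumptions: the network, its metabolite names (S, A, B, C, T), weights, and atom maps are invented for the example, and the path enumeration is a plain best-first expansion of loop-free paths, not the k-shortest-paths algorithm cited as [5]. It yields paths in order of nondecreasing cost only when all edge weights are positive.

```python
import heapq
from itertools import count

# Hypothetical toy network: EDGES[u] is a list of (v, weight, atom_map),
# where atom_map sends atom positions in u to atom positions in v.
EDGES = {
    "S": [("A", 1.0, {1: 1, 2: 2}), ("B", 1.0, {3: 1})],
    "A": [("T", 1.0, {3: 1})],  # A hands only its atom 3 to T
    "B": [("C", 1.0, {1: 1})],
    "C": [("T", 1.0, {1: 2})],
}


def paths_by_cost(source, target):
    """Yield (cost, path) for loop-free graph paths, cheapest first."""
    tie = count()  # tie-breaker so the heap never compares paths
    heap = [(0.0, next(tie), [source])]
    while heap:
        cost, _, path = heapq.heappop(heap)
        node = path[-1]
        if node == target:
            yield cost, path
            continue
        for nxt, weight, _ in EDGES.get(node, []):
            if nxt not in path:
                heapq.heappush(heap, (cost + weight, next(tie), path + [nxt]))


def is_valid_pathway(path, n_atoms):
    """True if at least one atom of path[0] reaches path[-1]."""
    alive = set(range(1, n_atoms + 1))  # positions descended from the source
    for u, v in zip(path, path[1:]):
        atom_map = next(m for (w, _, m) in EDGES[u] if w == v)
        alive = {atom_map[p] for p in alive if p in atom_map}
        if not alive:
            return False
    return True


def find_pathway(source, target, n_atoms):
    """Return (k, cost, path) for the first valid pathway, or None."""
    for k, (cost, path) in enumerate(paths_by_cost(source, target)):
        if is_valid_pathway(path, n_atoms):
            return k, cost, path
    return None


result = find_pathway("S", "T", n_atoms=3)
print(result)  # -> (1, 3.0, ['S', 'B', 'C', 'T'])
```

In this toy network the shortest graph path (S, A, T) is rejected because no atom of S survives it, so the search moves on to the next path, mirroring Step 3.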
To check the validity of a pathway, reachability between the given compounds must be verified.

3.3. Digitization of Molecular Structures

Molecular structures are described in either the MOL or the SMILES format [6,7]. In the ARM database, molecular structures are represented as graphs where nodes and edges correspond to atoms and their bonds, respectively. Aromatic rings are detected by applying the Hueckel rule to all cyclic parts of the structure [8]. By this rule, nucleic acids and heme are found to contain aromatic rings.

The Hueckel Rule
A ring structure is aromatic if it is physically planar and contains (4n+2) π electrons (n: non-negative integer). Each double bond in the conjugation (alternating single and double bonds) contributes two π electrons, and each nitrogen, oxygen, or sulfur that is not doubly bonded contributes two π electrons.

A classic algorithm called the Morgan method is used to detect the symmetry of molecular graphs. In the Morgan method, each graph node is labeled with an integer value that is iteratively updated so that topologically equivalent nodes receive the same integer labels [9].

The Morgan Algorithm
1. For each graph node, initialize its label as its node degree.
2. For each node, update its label as the sum of all the integer labels of its adjacent nodes and its current label.
3. While the updating procedure results in a finer classification of nodes, repeat Step 2.
4. Consider nodes with the same integer labels as topologically equivalent.

The algorithm fails to analyze the symmetry of a graph in which all nodes share the same degree (i.e., a regular graph), because all nodes then receive the same label at every step. Theoretically, the problem of finding topological symmetry is related to the graph isomorphism problem [10]. For molecular structures in metabolism, however, such difficulty does not arise, and a variant of the Morgan method that considers atomic number and chirality suffices [11].
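The four steps of the Morgan algorithm translate almost directly into code. The sketch below is a plain version of the method as listed above, without the atomic-number and chirality extensions used in ARM; the function name morgan_classes and the adjacency-dictionary input are choices made for this example. The test graph is an unbranched five-carbon chain, the carbon skeleton shared by xylitol and arabitol.

```python
def morgan_classes(adjacency):
    """Group the nodes of an undirected graph into classes of
    topologically equivalent nodes using Morgan's iterative labeling.

    adjacency: dict mapping each node to a list of its neighbors.
    """
    # Step 1: the initial label of a node is its degree.
    labels = {node: len(nbrs) for node, nbrs in adjacency.items()}
    n_classes = len(set(labels.values()))
    while True:
        # Step 2: new label = own label + sum of the neighbors' labels.
        updated = {
            node: labels[node] + sum(labels[n] for n in nbrs)
            for node, nbrs in adjacency.items()
        }
        n_updated = len(set(updated.values()))
        # Step 3: stop as soon as the classification is no longer refined.
        if n_updated <= n_classes:
            break
        labels, n_classes = updated, n_updated
    # Step 4: nodes with equal labels are considered equivalent.
    classes = {}
    for node, label in labels.items():
        classes.setdefault(label, []).append(node)
    return sorted(sorted(members) for members in classes.values())


# Carbon skeleton C0-C1-C2-C3-C4 of a pentitol.
chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
pentitol_classes = morgan_classes(chain)
print(pentitol_classes)  # -> [[0, 4], [1, 3], [2]]
```

Because the plain method sees only topology, it reports the two ends of the chain as equivalent for both xylitol and arabitol; distinguishing the two requires the chirality-aware variant mentioned in the text.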
Fig. 7. Graphic representation of biochemical reactions. Metabolites and their (sub)structural relationships become nodes and edges, respectively. The figure shows carbon mappings
[Figure: three graphs, for EC 2.2.1.1 (Rib5P, Xyl5P, Sed7P, GA3P), EC 2.2.1.2 (Sed7P, GA3P, Ery4P, Fru6P), and EC 2.6.1.51 (L-serine, pyruvate, hydroxypyruvate, L-alanine)]
3.4. Drawing Molecular Structures

Since the SMILES format does not provide coordinate information for molecules, an algorithm to draw structures is required for their graphical display. Since most molecular structures in metabolism are simple, the software tool in ARM uses the following procedure.

The algorithm to draw structures
1. Find the biconnected components (rings) of the structure.
2. For each biconnected component:
   A) Find its cycle basis C so that the sum of the cycle sizes in C is the minimum over all cycle bases.
   B) Draw the largest cycle in C as an equilateral polygon.
   C) Draw the cycles adjacent to the most recently drawn cycle so that they spread outward.
3. Sort all the biconnected components, and consider the entire structure as a tree with the largest component at its root. From the root, place each component breadth-first, and connect them by drawing branches. Branches are always drawn outward.
4. After the layout of all components, detect any pairs of nodes that are too close.
5. For all edges on the shortest path between each detected close pair, adjust the length and the angle so that no node pair is too close.

3.5. Digitization of Enzymatic Reactions

If the mass balance of a metabolic reaction is equilibrated, all atoms on one side (say, substrates) must correspond one-to-one to the atoms on the other side (products). In the ARM database, such a structural relationship at the atomic scale is digitized as a set of atomic correspondents, i.e., atomic position-pairs between substrates and products. Atomic correspondents can be grouped for each substrate-product pair of molecules, as shown in Figs. 3 and 6. Each group is called an atomic mapping [3]. Structural comparison is applied molecule-wise to both sides of a reaction: first substrate and first product, second substrate and second product, and so on. In each comparison, the largest common substructure between two metabolites is computed and registered as one mapping.
Then, leftover structures are collected and compared again until all atoms are matched. For example, in the reaction

EC 2.2.1.1: Rib5P + Xyl5P = Sed7P + GA3P

Rib5P and Sed7P, and Xyl5P and GA3P, are compared with priority. For this purpose, all reaction formulas in the ARM database were rearranged so that the molecular orders on the left and right sides roughly correspond one-to-one.

Reactions may contain generic names such as "alcohol." Such abstract names were substituted with their corresponding specific names: e.g., methanol, ethanol, propanol, and butanol. For example, the reaction by alcohol dehydrogenase (EC 1.1.1.1):

(Original) alcohol + NAD = aldehyde + NADH

was rewritten as

(Curated) methanol + NAD = formaldehyde + NADH
          ethanol + NAD = acetaldehyde + NADH
In a polymerization reaction, it is inherently impossible to compute a one-to-one atomic correspondence because the polymer size is indeterminable. As a compromise, variable regions of polymers were substituted with their corresponding monomers or dimers in the ARM database. Consequently, the notation for polymers, typically described as (...)n, was changed to its corresponding monomeric or dimeric expression. For example, the reaction by DNA ligase (EC 6.5.1.1) was rewritten as follows:

(Original) ATP + (DNA)n + (DNA)m = AMP + PPi + (DNA)n+m
(Curated) ATP + DNA + H2O = AMP + DNA + PPi

For this reason, the function of ligases or proteases is not explicitly described in the ARM database, and consequently a water molecule (H2O) is artificially introduced to balance the number of atoms.

For the detection of a common substructure, the software tool in ARM uses the Morgan method. By applying the method for only a few steps, it is possible to detect topologically equivalent, small substructures. By extending computed substructures in this manner, the software tool detects the largest common substructure between two molecules under comparison [3].
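To make the notions of atomic correspondents and atomic mappings concrete, the following sketch stores correspondents as position pairs, checks that they are one-to-one over all atoms, and groups them by substrate-product pair. The data layout and the function group_into_mappings are hypothetical and do not reproduce the ARM file format; the carbon positions for transketolase are the same textbook-based assumption as before, restricted to carbon atoms.

```python
from collections import defaultdict


def group_into_mappings(substrates, products, correspondents):
    """Verify a one-to-one atom correspondence and group it into mappings.

    substrates, products: dict of molecule name -> number of traced atoms.
    correspondents: list of ((substrate, position), (product, position)).
    Returns a dict keyed by (substrate, product) with lists of position pairs.
    """
    left = [(mol, i) for mol, n in substrates.items() for i in range(1, n + 1)]
    right = [(mol, i) for mol, n in products.items() for i in range(1, n + 1)]
    used_left = [s for s, _ in correspondents]
    used_right = [p for _, p in correspondents]
    # Every atom on each side must occur exactly once.
    if sorted(used_left) != sorted(left) or sorted(used_right) != sorted(right):
        raise ValueError("correspondents are not one-to-one over all atoms")
    mappings = defaultdict(list)
    for (s_mol, s_pos), (p_mol, p_pos) in correspondents:
        mappings[(s_mol, p_mol)].append((s_pos, p_pos))
    return dict(mappings)


# EC 2.2.1.1: Rib5P + Xyl5P = Sed7P + GA3P, carbon atoms only (assumed numbering).
substrates = {"Rib5P": 5, "Xyl5P": 5}
products = {"Sed7P": 7, "GA3P": 3}
correspondents = (
    [(("Rib5P", i), ("Sed7P", i + 2)) for i in range(1, 6)]
    + [(("Xyl5P", i), ("Sed7P", i)) for i in (1, 2)]
    + [(("Xyl5P", i), ("GA3P", i - 2)) for i in (3, 4, 5)]
)
mappings = group_into_mappings(substrates, products, correspondents)
for pair, atoms in mappings.items():
    print(pair, atoms)
```

With this input the ten carbon correspondents fall into three atomic mappings: Rib5P to Sed7P, Xyl5P to Sed7P, and Xyl5P to GA3P.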
4. Pathway Statistics

The ARM database provides data for the basic metabolism in Escherichia coli, Bacillus subtilis, and Saccharomyces cerevisiae. The numbers of metabolite structures and biochemical reactions are about 2500 and 2000, respectively. Together they constitute one large graph representing the entire metabolism [3]. Hereafter the network is referred to as the metabolic graph. For flexible network analysis, any positive weight can be assigned to edges in the metabolic graph. Because pathways are computed by the k-shortest paths algorithm, changing the edge weights lets us search pathways for different purposes: typical examples include pathways preferring or avoiding specified metabolites, or pathways containing as many specified reaction types as possible. The function to trace specific atoms (carbon, nitrogen, or sulfur) using the mapping information at the atomic scale is also implemented.

4.1. Analysis of E. coli metabolism

About 850 annotations for E. coli metabolic genes accounted for roughly 1000 reaction formulas in about 600 EC enzyme sub-subclasses [12,13]. These reactions were converted into 1230 atomic mappings, without duplication, among 906 metabolites for carbon and nitrogen metabolism. Of the 1230 mappings, 1179 accounted for carbon-carbon relationships among 905 metabolites (the only excluded metabolite was ammonia) [14]. Since most reactions contain three or more mappings, roughly 1000 reactions would yield about 3000 mappings if every mapping were unique; the much smaller count of 1230 indicates the presence of frequently used atomic mappings. One such example is the mapping between ATP and ADP, which appears in most phosphorylation reactions.

With our current dataset, each metabolite exhibits fewer than 1.5 patterns of structural transformation on average. Table 2 shows the metabolites whose structures are most diversely transformed in currently identified biochemical reactions. Carbon dioxide is the most variously transformed metabolite because it is involved in all decarboxylation reactions. Next in order of the number of transformations are acetyl CoA and pyruvate, which play central roles in metabolism.
ATP is also highly convertible because it participates not only in phosphorylation but also in adenylation to form FAD or NAD. S-Adenosyl methionine is convertible because it transfers a methyl group to different acceptors. In contrast, Table 3 shows the metabolites that frequently appear in biochemical reactions. Water, orthophosphate, and other inorganics without carbon atoms are excluded from Table 3. Most of the high-ranking metabolites are cofactors. The ranks of L-glutamate and keto-glutarate are also high because of the amino-transfer reactions in which they participate. On the other hand, central metabolites such as acetyl CoA and pyruvate are not highly ranked.
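The edge-weighting idea introduced at the beginning of this section can be illustrated with a small sketch. It is not the ARM implementation: the source uses a k-shortest paths algorithm [5], whereas the code below uses plain Dijkstra search for a single shortest path, and the toy graph, the function names `shortest_path` and `penalize`, and the penalty factor are all assumptions made for illustration.

```python
import heapq

def shortest_path(graph, source, target):
    """Dijkstra's algorithm on a dict-of-dicts graph with positive edge weights."""
    queue = [(0.0, source, [source])]
    settled = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == target:
            return cost, path
        if node in settled:
            continue
        settled.add(node)
        for neighbor, weight in graph.get(node, {}).items():
            if neighbor not in settled:
                heapq.heappush(queue, (cost + weight, neighbor, path + [neighbor]))
    return float("inf"), []

def penalize(graph, metabolite, factor):
    """Copy the graph, multiplying the weight of every edge that enters `metabolite`."""
    return {
        u: {v: (w * factor if v == metabolite else w) for v, w in edges.items()}
        for u, edges in graph.items()
    }

# Hypothetical toy metabolic graph with two routes from A to D.
toy = {
    "A": {"B": 1.0, "C": 1.0},
    "B": {"D": 1.0},
    "C": {"E": 1.0},
    "E": {"D": 1.0},
}

default_cost, default_path = shortest_path(toy, "A", "D")
# Making edges into B ten times heavier steers the search away from B.
avoid_cost, avoid_path = shortest_path(penalize(toy, "B", 10.0), "A", "D")
```

With uniform weights the search returns the two-step route through B; after the penalty it returns the longer route through C and E. Preferring a metabolite would work the same way with a factor below 1.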
Table 2. Top 20 of the most diversely transformed metabolites

Rank  Metabolite
1     CO2
2     Acetyl-CoA
3     Pyruvate
4     CoA
5     ATP
6     S-Adenosyl-L-methionine
7     L-Glutamate
8     D-Galactose
9     D-5-Phospho-ribosyl 1-diphosphate
10    D-Glucose
11    D-Fructose 6-phosphate
12    L-Aspartate
13    Dihydroxy-acetone phosphate
14    UDP-D-Galactose
15    Acetate
16    Glycine
17    Oxaloacetate
18    L-Serine
19    UDP-N-Acetyl-D-glucosamine
20    Keto-glutarate
Table 3. Top 20 of the most frequently appearing metabolites

Rank  Metabolite
1     ATP
2     NAD
3     NADH
4     ADP
5     CO2
6     NAD phosphate
7     CoA
8     NADH phosphate
9     L-Glutamate
10    AMP
11    Pyruvate
12    Keto-glutarate
13    Acetyl-CoA
14    D-Galactose
15    Acetate
16    L-Methionine
17    D-5-Phospho-ribosyl 1-diphosphate
18    L-Homocysteine
19    D-Glucose
20    Phosphoenol pyruvate
Table 4. Top 20 metabolites ranked by the score of the number of appearances divided by the number of conversions

Rank  Metabolite
1     NADH
2     NAD phosphate
3     NADH phosphate
4     NAD
5     ADP
6     L-Homocysteine
7     Protein N(pi)-phospho-L-histidine
8     ATP
9     L-Glutamine
10    L-Methionine
11    Protein L-histidine
12    AMP
13    Inosine diphosphate
14    S-Adenosyl-L-homocysteine
15    Pyridoxal
16    L-Glutamate
17    L-Tetrahydrofolate
18    Deoxy ATP
19    CoA
20    Flavin mononucleotide
Fig. 8. The distribution of pathway lengths in the basic metabolism of Escherichia coli. The average path length is around 8. Biochemical reactions are considered directed, based on the Biochemical Pathways wall-chart by Roche Applied Sciences. (Histogram; horizontal axis: pathway length, 1 to 25; vertical axis ticks from 0 to 35000.)
A class of metabolites called cofactors can be characterized by focusing on the pattern of structural transformations. That is, we can define cofactors as frequently appearing metabolites whose structures are little transformed. Table 4 shows the metabolites ranked by the score obtained by dividing the number of appearances by the number of structural transformations. In Table 4, acetyl CoA and pyruvate do not appear, and cofactors occupy the high ranks. Among the high-ranking metabolites, protein (phospho)-L-histidine functions as a cofactor in sugar transfers, and methionine and homocysteine function as cofactors in methyl transfers.

These statistics depend on the chosen set of biochemical reactions, and the precise rankings in Tables 2-4 do not in themselves give much biological insight. Still, such global analysis suggests characteristics of the entire network. For example, metabolic networks were reported to be small-world networks. A small-world network is defined by three properties: (i) most nodes (metabolites) have a low connection degree and the degree distribution follows a power law; (ii) high-degree nodes, called hubs, dominate the network, and most nodes are clustered around hubs; and (iii) the average path length (i.e., the average of the shortest path length over all pairs of nodes in the network) remains at the theoretical minimum, that of a random graph. Many naturally developed networks are known to satisfy these properties [15]. According to recent reports, the metabolic network of bacteria forms a small world with an average path length (AL) of about 3 [16,17]. This means that any pair of metabolites is connected through a few reaction steps. However, previous analyses did not verify the reachability of atoms through reaction steps, and therefore included unrealistic pathways through which no atoms are actually transferred. Such overestimation of connectivity introduces false connections and consequently reduces the AL.
When the AL of pathways that conserve at least one carbon atom is computed, the distribution of pathway lengths becomes as shown in Fig. 8. The distribution may slightly differ depending on the reversibility of reactions, but the AL cannot be as small as 3. Moreover, since each reaction represents a unique biochemical step, its function cannot be compensated by others. For these reasons, it is unlikely that a metabolic network satisfies characteristics of a small-world network [14].
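The effect of false connections on the AL can be reproduced on a toy example. The sketch below is only an illustration under stated assumptions: both graphs are hypothetical, the node H stands for a cofactor hub through which no carbon atom actually passes, and the average is taken over ordered pairs of distinct nodes for which a directed path exists.

```python
from collections import deque

def average_path_length(graph):
    """Mean shortest-path length over ordered pairs (u, v), u != v, with v reachable from u."""
    total = 0
    pairs = 0
    for source in graph:
        # Breadth-first search gives shortest distances when every step counts as 1.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for nxt in graph.get(node, ()):
                if nxt not in dist:
                    dist[nxt] = dist[node] + 1
                    queue.append(nxt)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else float("nan")

# Hypothetical chain in which carbon atoms are conserved at every step.
carbon_graph = {"A": ["B"], "B": ["C"], "C": ["D"], "D": ["E"], "E": []}

# Same chain plus a cofactor hub H that links A to E without transferring carbon.
naive_graph = {"A": ["B", "H"], "B": ["C"], "C": ["D"], "D": ["E"], "E": [], "H": ["E"]}

al_carbon = average_path_length(carbon_graph)
al_naive = average_path_length(naive_graph)
```

In this toy case the hub shortcut lowers the average, which is the same direction of bias as described above for analyses that do not trace atoms.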
5. Future Applications

5.1. Degradation Pathways of Xenobiotics

As environmental issues draw increasing attention, bacteria that can degrade persistent organic pollutants such as polychlorinated biphenyls (PCBs) are receiving more attention. An on-line database that specializes in degradation pathways of xenobiotics already exists [18]. One major theme in future biotechnology is the systematic prediction of degradation pathways achieved by combinations of microorganisms or by simple genetic engineering. The hurdles for this prediction are that (i) the target compound is unspecified and that (ii) prediction of uncharacterized reactions is necessary. The former problem can be solved by radially computing all pathways from a given compound. The latter problem is more challenging. Since catalytic transformations in uncharacterized pathways are assumed to be similar to those in known ones, structure-based classification of reactions is indispensable. One solution is to automatically apply basic reaction steps, such as oxidation or reduction, classified in this manner, and to test all possible biochemical transformations.
5.2. Prediction of Drug Metabolism

A similar approach can be applied to the prediction of drug metabolism or drug degradation. The difficulty is that a drug molecule is not always broken down before excretion but may instead form a conjugate (e.g., glucuronide conjugation). To cope with such structural changes, the application of known, basic reaction steps is informative. The software tool in ARM can apply such characteristic conversions to compute all possible degradation pathways, but the conversions listed in textbooks are too general to be used for this purpose. For example, a monoamine oxidase can substitute an amino group with a hydroxyl group, but the substitution is not applicable to all amino groups. Therefore, more specific degradation conditions need to be clarified, preferably for each species under investigation, to achieve realistic prediction using the ARM database.
6. Conclusion

The ARM database provides a unique function to digitize the atom-to-atom correspondences between metabolites in identified biochemical reactions. Currently, no other metabolic database can trace biochemical pathways at the atomic scale. The technique used can be extended to other pathway searches in which molecular structures are transformed stepwise. Such applications include the design of bio-processes for useful materials such as amino acids or plastics. In a train network, the primary criterion for selecting a route is the distance (i.e., shortest paths are chosen). In the evolutionary process of metabolic networks, however, this criterion does not seem to have been applied much: quite a few metabolic pathways include redundant steps and are not optimized in terms of length or robustness. Systems biology of metabolic networks is still at its earliest stage, and will become more important in future biological research.
References

1. Michal G (ed) (1999) Biochemical pathways. Wiley/Spektrum Akademischer Verlag, New York
2. Alberts B, Bray D, Johnson A, Lewis J, Raff M, Roberts K, Walter P (1998) Essential cell biology. Garland, New York
3. Arita M (2003) In silico atomic tracing by substrate-product relationships in Escherichia coli intermediary metabolism. Genome Res 13(11):2455-2466
4. Bondy JA, Murty USR (1976) Graph theory with applications. Elsevier North-Holland, Amsterdam
5. Eppstein D (1998) Finding the k shortest paths. SIAM J Comput 28(2):652-673
6. MOL Format: MDL Information Systems. Description downloadable from http://www.mdli.com
7. SMILES Format: Daylight Chemical Information Systems. Description downloadable from http://www.daylight.com
8. Vollhardt KPC, Schore NE (1998) Organic chemistry: structure and function, 3rd edn. Freeman, New York
9. Wipke WT, Dyott TM (1974) Stereochemically unique naming algorithm. J Am Chem Soc 96:4834-4842
10. Koebler J, Schoening U, Toran J (1993) The graph isomorphism problem. Birkhauser, Boston
11. Arita M (2000) Metabolic reconstruction using shortest paths. Simulation Pract Theory 8(2):109-125
12. Karp PD, Riley M, Saier M, Paulsen IT, Paley S, Pellegrini-Toole A (2002) The EcoCyc database. Nucleic Acids Res 30(1):56-58
13. Ouzounis CA, Karp PD (2000) Global properties of the metabolic map of Escherichia coli. Genome Res 10(4):568-576
14. Arita M (2004) The metabolic world of Escherichia coli is not small. Proc Natl Acad Sci USA 101(6):1543-1547
15. Strogatz SH (2001) Exploring complex networks. Nature 410(6825):268-276
16. Jeong H, Tombor B, Albert R, Oltvai ZN, Barabasi AL (2000) The large-scale organization of metabolic networks. Nature 407(6804):651-654
17. Fell DA, Wagner A (2000) The small world of metabolism. Nat Biotechnol 18(11):1121-1122
18. Ellis LB, Hou BK, Kang W, Wackett LP (2003) The University of Minnesota Biocatalysis/Biodegradation Database. Nucleic Acids Res 31(1):262-265
Chapter 14: The Genome-Based E-CELL Modeling (GEM) System

Kazuharu Arakawa, Yohei Yamada, Kosaku Shinoda, Yoichi Nakayama, and Masaru Tomita

Laboratory for Bioinformatics, Institute for Advanced Biosciences, Keio University, Endo 5322, Fujisawa 252-8520, Japan
1. Introduction

Rapid advances in sequencing technologies have brought forth a vast amount of genome data. The GenBank database at the National Center for Biotechnology Information now distributes more than 100 complete genomes [1]. This fast accumulation of data is also seen in all fields of molecular biology, including the transcriptome, proteome, and metabolome. Given this large amount of data, understanding the cell requires a "systems biology" approach in order to view its dynamic behavior as a complex system [2]. Several successful attempts at this through cell simulation have already been reported [3-5]. Dynamic simulation provides an integrative view through systematic modeling of reaction networks within a cell from quantitative data. Each reaction must be expressed in accurate rate equations with precise parameters, which are difficult to obtain as a complete set. Therefore, large-scale modeling of cells in silico demands a novel, high-throughput approach. If successfully integrated, the large amount of available genome sequences, transcript and expression data, enzyme reaction data, metabolic pathway maps, and data on metabolites in cells will create a strong base for a whole-cell model. In this chapter we discuss a genome-based large-scale modeling approach, implemented as the Genome-based E-CELL Modeling System (GEM System) on the generic bioinformatics analysis workbench, G-language Genome Analysis Environment (G-language GAE) [6]. The GEM System enables automatic generation of a cell-wide metabolic pathway model, ready for pseudodynamic simulation on the E-CELL simulation environment, based on the input genome sequence. A Graphical User Interface (GUI) is provided via G-language GAE for easy manipulation.
2. Integration of Public Databases

In whole-cell modeling, integration of different biological databases is a challenging task because the target field is broad and the subject of each database is not the same. For example, it is difficult to link a proteome database to a transcriptome database because this step requires a reference from protein to mRNA. Moreover, even if the subject is the same, the names of genes and proteins are often ambiguous and are thus difficult to match. However, because of the rapid advance in sequencing technology and the abundance of genome sequence data, most databases contain links to the genome sequences regardless of the subject, so that it is possible to link the large amount of biological information by following the Central Dogma. Automating this integration process by bioinformatics realizes high-throughput extraction of data. The GEM System contains a set of internal databases that integrates public biological databases such as EMBL, SWISS-PROT, KEGG, ARM, BRENDA, and WIT [7-12]. Two main databases exist: one is the "Variable" database, which contains the unified names of genes, metabolites, and enzymes, and the other is the "Process" database, which stores enzyme reaction stoichiometry that is checked for atomic consistency using the chemical formulae of the substrates. After matching the genome sequence to the corresponding enzymes, the entire metabolic network is reconstructed and generated from these two databases (Fig. 1).
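The atomic consistency check mentioned for the "Process" database can be sketched as follows. This is a minimal illustration, not the GEM System code: the function names are hypothetical, the formula parser handles only simple formulae without parentheses or charges, and the example uses the neutral formulae of the hexokinase reaction (glucose + ATP → glucose 6-phosphate + ADP).

```python
import re
from collections import Counter

def parse_formula(formula):
    """Count atoms in a simple formula such as 'C6H12O6' (no parentheses, no charges)."""
    counts = Counter()
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(number) if number else 1
    return counts

def side_total(side):
    """Sum atom counts over one side of a reaction given as (formula, coefficient) pairs."""
    total = Counter()
    for formula, coefficient in side:
        for element, n in parse_formula(formula).items():
            total[element] += coefficient * n
    return total

def is_balanced(substrates, products):
    """True when every element occurs equally often on both sides."""
    return side_total(substrates) == side_total(products)

# Hexokinase reaction written with neutral formulae.
substrates = [("C6H12O6", 1), ("C10H16N5O13P3", 1)]   # glucose, ATP
products = [("C6H13O9P", 1), ("C10H15N5O10P2", 1)]    # glucose 6-phosphate, ADP

balanced = is_balanced(substrates, products)
# Dropping ADP from the product side should be flagged as inconsistent.
unbalanced = is_balanced(substrates, products[:1])
```

A check of this kind only detects stoichiometric errors in the stored equations; it says nothing about whether the reaction is assigned to the right enzyme.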
3. Prediction of Coding Regions

The GEM System accepts a genome sequence as its input and automatically generates a simulation model based on the sequence data. Although information about the location of each coding region is available if the input genome sequence is in an annotated database format such as GenBank or EMBL, genomes sequenced in-house or those in raw sequence formats must first be scanned for potential open reading frames. In the GEM System, Glimmer2 is employed for prokaryotes and GlimmerM for eukaryotes for this purpose [13,14]. Glimmer is known to have a high rate of false positives but a low rate of false negatives, and because the subsequent steps in the GEM System filter out the false-positive hits, it is well suited to this step.
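To make the notion of scanning a raw sequence for open reading frames concrete, a deliberately naive sketch is given below. It is not Glimmer, which relies on interpolated Markov models [13,14]; it simply looks for ATG ... stop stretches in the three forward reading frames, and the function name, the `min_codons` threshold, and the test sequence are assumptions for illustration.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(sequence, min_codons=3):
    """Return (start, end) index pairs of forward-strand ORFs, with end exclusive.

    An ORF here runs from the first ATG in a frame to the next in-frame stop codon;
    min_codons is the minimum number of codons before the stop, ATG included.
    """
    sequence = sequence.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(sequence) - 2, 3):
            codon = sequence[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None
    return orfs

# Hypothetical 19-base test sequence containing one short ORF.
example = "CCATGAAATTTGGGTAACC"
orfs = find_orfs(example)
orf_sequences = [example[a:b] for a, b in orfs]
```

A scanner like this produces many false positives on real genomes, which is acceptable only when later steps filter the candidates, as the text notes for the GEM System.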
Fig. 1. Public biological databases are stored internally in a relational database for quick access and data retrieval. From this internal database, a database containing consistent metabolite names and another database containing consistent enzyme stoichiometry are derived as the Variable and Process databases. The simulation model is generated using these internal databases
Fig. 2. Genes are matched to their enzyme products through three levels of matching. Level 1 directly refers to the annotation in the input genome database file, and Level 2 performs homology search against the SWISS-PROT database for those left unmatched. Lastly, all remaining genes are searched by orthology in Level 3
4. Matching Genes to Enzymes

In our top-down approach, enzymes are matched not from the annotation but from the sequence itself. Current approaches in modeling usually rely on sequence searches by homology; however, homology-only methods are usually insufficient in functional genomics. Lester et al. suggest that the similarity obtained by interspecies homology search for orthologs is "generally poor" [15]. We conducted a multiple alignment test for amino acid sequences in SWISS-PROT having the same EC number, and also found that sequences from distant species have very weak similarity even though their biological functions are analogous (data not shown). However, homology search is effective between related species, and between sequences having high similarity. Therefore, genome annotation techniques usually screen open reading frames by homology, and then combine several methods to annotate sequences with low homology. In the GEM System, the required function is to connect an amino acid sequence to stoichiometric equations. Searching for orthology and identifying the EC number for the sequence can achieve this, since the reaction mechanism is conserved between orthologs. Among the several databases of orthologous genes, we use the "cognitor" program provided with the expert-curated COG database in the GEM System [16-18]. The "cognitor" program assigns a COG id to a given amino acid sequence, and a COG id is allocated to represent one gene product. We assigned COG ids to all SWISS-PROT entries, and found that, of the monomer enzymes to which EC numbers are assigned, 83.3% have a one-to-one relationship between the EC number and the COG id. It is worth noting that the remaining 16.7% are not completely random, but have about a one-to-three relationship on average, which can be narrowed to a one-to-one relationship by combining homology information. Therefore a sequence can be matched to an EC number using a hybrid method of homology and orthology.
Using the above approach, three levels of matching sequences to stoichiometric equations are implemented, as shown in Fig. 2. To obtain the most reliable matches, level 1 uses the reference in the EMBL genome database that links a gene sequence to a SWISS-PROT accession number, and level 2 performs a BLASTP homology search against the SWISS-PROT database with relatively strict cutoff values [19]. In the GEM System, the default e-value cutoff is 1e-05. When the matched SWISS-PROT entry does not contain the "CATALYTIC ACTIVITY" line with the stoichiometric equation, the system searches for SWISS-PROT accession numbers in the same cluster category in the WIT database. This provides a list of analogous enzymes with their stoichiometry.
For sequences with low similarity, an orthology search is performed at level 3 of the system. About 70% of the sequences of most microbes already have COG ids assigned in the PTT database, and the unassigned genes go through the online "cognitor" and the offline "dignitor" programs. Then a BLASTP homology search is performed against SWISS-PROT sequences assigned to the same COG id, to obtain the best one-to-one match. A WIT cluster search is additionally performed for stoichiometric equations where necessary. At this level of search, the KEGG Enzyme database can also be used directly from the EC number matched by orthology. All of the information retrieved above is checked in the pathway databases for connectivity. Because every gene is matched to an enzyme product, the system cannot distinguish isozymes from enzymes with multiple subunits; moreover, there may be false negatives during the matching process. The stoichiometry of isozymes responsible for a specialized part of a pathway is also difficult to obtain. Therefore, all extracted stoichiometry is checked for connectivity based on KEGG and BioCyc reference pathways. The resulting output file is compiled into an E-CELL Simulation Environment (ERI) format file, ready for large-scale qualitative simulation of the entire cell. The significance of these methods is that the only required information is the genome sequence, so that models can be constructed for any organism whose complete genome is available, regardless of the accuracy and progress of its annotation.
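The three matching levels can be summarized as a simple cascade. The sketch below is a simplification under stated assumptions: plain dictionaries stand in for the EMBL cross-references, the BLASTP results, and the COG assignments; each level returns an EC number directly rather than a SWISS-PROT accession and its stoichiometry; the WIT cluster search is omitted; and all gene identifiers, the COG label, and the pairing of genes with EC numbers are hypothetical.

```python
def match_gene(gene_id, annotation_xrefs, blast_hits, gene_to_cog, cog_to_ec,
               evalue_cutoff=1e-5):
    """Return (EC number, level) for a gene, or (None, None) if all three levels fail."""
    # Level 1: cross-reference already present in the annotated genome entry.
    if gene_id in annotation_xrefs:
        return annotation_xrefs[gene_id], 1
    # Level 2: best homology hit whose e-value does not exceed the cutoff.
    hits = [hit for hit in blast_hits.get(gene_id, []) if hit[1] <= evalue_cutoff]
    if hits:
        best = min(hits, key=lambda hit: hit[1])
        return best[0], 2
    # Level 3: orthology, via the COG id assigned to the gene.
    cog = gene_to_cog.get(gene_id)
    if cog in cog_to_ec:
        return cog_to_ec[cog], 3
    return None, None

# Hypothetical lookup tables (stand-ins for database queries).
annotation_xrefs = {"geneA": "2.7.1.1"}
blast_hits = {
    "geneB": [("5.3.1.9", 1e-30), ("2.7.1.11", 1e-3)],  # (EC number, e-value)
    "geneC": [("4.1.2.13", 0.5)],                         # too weak for level 2
}
gene_to_cog = {"geneC": "COG_X"}
cog_to_ec = {"COG_X": "4.1.2.13"}

results = {
    gene: match_gene(gene, annotation_xrefs, blast_hits, gene_to_cog, cog_to_ec)
    for gene in ["geneA", "geneB", "geneC", "geneD"]
}
```

Ordering the levels from most to least reliable means that a weaker method is consulted only for genes the stronger ones leave unmatched, which mirrors the flow in Fig. 2.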
5. Hybrid Algorithm

The reaction list generated by the GEM System is a list of enzyme stoichiometry for static simulation, and cannot be applied to the study of dynamic behavior. A key to this problem is the hybrid algorithm of dynamic and static simulation methods (for details see Chapter 15 by Y. Nakayama). With this hybrid algorithm, when all rate-limiting reactions are dynamically represented, every other reaction can be statically represented with the same accuracy as a completely dynamic simulation. This algorithm reduces the parameters necessary for the dynamic simulation by as much as 80%. The static part of the simulation, in other words 80% of the dynamic cell model, can then be generated directly from the GEM System stoichiometry using the hybrid algorithm. Moreover, using the G-language GAE functions, it is possible to predict the rate-limiting enzymes. Allosteric enzymes are often responsible for the rate-limiting step, and known allosteric enzymes can be identified by homology searches against protein databases such as SWISS-PROT. Then the literature search module in the GEM System can be automated to search for the parameters necessary to model the rate-limiting reactions dynamically. Obviously, the whole modeling process cannot be automated and many parameters can only be obtained by biological experiments; nevertheless, the GEM System with the hybrid algorithm will greatly reduce the cost of large-scale modeling.
6. Results and the Modeling Environment

Using the GEM System package, several whole-cell simulation models have been generated, including a virtual Escherichia coli model consisting of 2264 reactions of 1682 metabolites, which is the largest model ever created with E-CELL (see Fig. 3) [20]. It is worth noting that the whole conversion process takes only about 30 seconds on a personal computer when a GenBank or EMBL format genome sequence is supplied to the GEM System, and that the whole process can be controlled through a friendly GUI (see Fig. 4).
Fig. 3. Dynamic changes in quantities of metabolites are visualized with the .e (dot e) viewer, which receives the simulation results from the E-CELL Simulation Environment. The metabolic pathway model generated from the Escherichia coli K-12 MG1655 genome consists of 2264 reactions and 1682 metabolites
Fig. 4. The GEM System is equipped with a user-friendly graphic interface for easy access and manipulation. Generation of a simulation model is an automated procedure requiring only the genome sequence as its input. Users can modify numerous parameters of the system through this interface, which is powered by the G-language GAE

Although this simulation model is only qualitative and cannot represent dynamic behavior, the GEM System is flexible enough to build on the generated simulation model. Because the generated information contains links to the biological databases used in the GEM System, additional information can be loaded onto the simulation. One example of such an extension to a model is protein localization. It is necessary to understand the specific localization of proteins, especially in eukaryotes, where organelles make up the complex system of the cell. In e-Rice, a virtual cell model of rice, enzyme reactions are separated into the compartments of cytoplasm, chloroplast, cytoskeleton, endoplasmic reticulum, extracell, Golgi apparatus, lysosome, mitochondria, nucleus, peroxisome, plasma membrane, and vacuole according to the classification by Chou et al. [21]. The GEM System can also take advantage of the G-language GAE analysis functions for nucleotide and peptide sequences. Tissue-specific expression of genes can be modeled by mapping the cDNA and EST sequences available in the UniGene database (http://www.ncbi.nih.gov/UniGene/) using the cDNA Analysis System (CASYS) of G-language GAE. CASYS contains an automated mapping module, with which the GEM System realizes specific modeling of the cells of a certain tissue. The wealth of microarray data available on the Internet is another source for this purpose. The GEM System also includes a gene expression prediction module for the estimation of enzyme quantity, a text search and retrieval module for literature search over Medline (http://www.ncbi.nih.gov/entrez/), a pathway checking module for finding substrates without input, and a pathway viewer using "a Java applet for visualizing protein-protein interaction" and BioLayout [22,23]. Shown in Fig. 5 is the glycolysis pathway generated from the Escherichia coli genome.
Fig. 5. As graphically represented using "a Java applet for visualizing protein-protein interaction," the Escherichia coli model generated by the GEM System effectively reconstructs metabolic pathways of the organism based on the genome sequence. Shown here is the glycolysis pathway. Generated pathways are also connected by the internal pathway check procedure
7. Discussion

The GEM System realizes rapid and automatic generation of a large-scale static metabolic pathway simulation model from the genome sequence. This can serve as the draft for a dynamic model using the hybrid algorithm, if the rate-limiting enzymes are modeled with dynamic rate equations. This enables observation and study of the dynamic behavior of cells in silico, which is useful not only for the study of life, but also for metabolic engineering and pharmaceutical experiments. When a dynamic model of a virtual cell with its enzyme reactions is achieved, the next goal is to model the signal transduction pathways and gene expression, which will include transcription, translation, and degradation processes. The GEM System is a strong platform for this purpose because it is built on G-language GAE. Direct access to G-language GAE and references to the biological databases provide a link to genome analyses. Required parameters may be calculated directly from the genome sequence as the simulation runs, and this will truly become simulation from the genome.
References

1. Benson DA, Karsch-Mizrachi I, Lipman DJ, Ostell J, Wheeler DL (2003) GenBank. Nucleic Acids Res 31:23-27
2. Kitano H (2002) Computational systems biology. Nature 420:206-210
3. Takahashi K, Yugi K, Hashimoto K, Yamada Y, Pickett CJF, Tomita M (2002) Computational challenges in cell simulation. IEEE Intelligent Systems 17:64-71
4. Tomita M, Hashimoto K, Takahashi K, Shimizu TS, Matsuzaki Y, Miyoshi F, Saito K, Tanida S, Yugi K, Venter JC, Hutchison CA (1999) E-CELL: software environment for whole-cell simulation. Bioinformatics 15:72-84
5. Tomita M (2001) Whole-cell simulation: a grand challenge of the 21st century. Trends Biotechnol 19:205-210
6. Arakawa K, Mori K, Ikeda K, Matsuzaki T, Kobayashi Y, Tomita M (2003) G-language Genome Analysis Environment: a workbench for nucleotide sequence data mining. Bioinformatics 19:305-306
7. Brooksbank C, Camon E, Harris MA, Magrane M, Martin MJ, Mulder N, O'Donovan C, Parkinson H, Tuli MA, Apweiler R, Birney E, Brazma A, Henrick K, Lopez R, Stoesser G, Stoehr P, Cameron G (2003) The European Bioinformatics Institute's data resources. Nucleic Acids Res 31:43-50
8. Boeckmann B, Bairoch A, Apweiler R, Blatter MC, Estreicher A, Gasteiger E, Martin MJ, Michoud K, O'Donovan C, Phan I, Pilbout S, Schneider M (2003) The SWISS-PROT protein knowledgebase and its supplement TrEMBL in 2003. Nucleic Acids Res 31:365-370
9. Kanehisa M, Goto S, Kawashima S, Nakaya A (2002) The KEGG databases at GenomeNet. Nucleic Acids Res 30:42-46
10. Arita M (2000) Metabolic reconstruction using shortest paths. Simulat Pract Theory 8:109-125
11. Schomburg I, Chang A, Hofmann O, Ebeling C, Ehrentreich F, Schomburg D (2002) BRENDA: a resource for enzyme data and metabolic information. Trends Biochem Sci 27:54-56
12. Overbeek R, Larsen N, Pusch GD, D'Souza M, Selkov E Jr, Kyrpides N, Fonstein M, Maltsev N, Selkov E (2000) WIT: integrated system for high-throughput genome sequence analysis and metabolic reconstruction. Nucleic Acids Res 28:123-125
13. Salzberg SL, Delcher AL, Kasif S, White O (1998) Microbial gene identification using interpolated Markov models. Nucleic Acids Res 26:544-548
14. Delcher AL, Harmon D, Kasif S, White O, Salzberg SL (1999) Improved microbial gene identification with GLIMMER. Nucleic Acids Res 27:4636-4641
15. Lester PJ, Hubbard SJ (2002) Comparative bioinformatics analysis of complete proteomes and protein parameters for cross-species identification in proteomics. Proteomics 2:1392-1405
16. Koonin EV, Tatusov RL, Galperin MY (1998) Beyond complete genomes: from sequence to structure and function. Curr Opin Struct Biol 8:355-363
17. Tatusov RL, Koonin EV, Lipman DJ (1997) A genomic perspective on protein families. Science 278:631-637
18. Tatusov RL, Natale DA, Garkavtsev IV, Tatusova TA, Shankavaram UT, Rao BS, Kiryutin B, Galperin MY, Fedorova ND, Koonin EV (2001) The COG database: new developments in phylogenetic classification of proteins from complete genomes. Nucleic Acids Res 29:22-28
19. Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ (1990) Basic local alignment search tool. J Mol Biol 215:403-410
20. Blattner FR, Plunkett G 3rd, Bloch CA, Perna NT, Burland V, Riley M, Collado-Vides J, Glasner JD, Rode CK, Mayhew GF, Gregor J, Davis NW, Kirkpatrick HA, Goeden MA, Rose DJ, Mau B, Shao Y (1997) The complete genome sequence of Escherichia coli K-12. Science 277:1453-1474
21. Chou KC, Elrod DW (1999) Protein subcellular location prediction. Protein Eng 12:107-118
22. Mrowka R (2001) A Java applet for visualizing protein-protein interaction. Bioinformatics 17:669-671
23. Enright AJ, Ouzounis CA (2001) BioLayout: an automatic graph layout algorithm for similarity visualization. Bioinformatics 17:853-854
Chapter 15: Hybrid Dynamic/Static Method for Large-Scale Simulation of Metabolism and its Implementation to the E-CELL System

Yoichi Nakayama

Institute for Advanced Biosciences, Keio University, Nipponkoku 403-1, Daihouji, Tsuruoka, Yamagata 997-0017, Japan
1. Introduction

A significant problem in the development of dynamic metabolic simulation models is the lack of data on the dynamic characteristics of reactions. Using the Genome-based E-CELL Modeling (GEM) system (Chapter 14) and other databases, it has become possible to express metabolic pathways and the stoichiometry of the reactions they comprise. However, the construction of dynamic models requires differential equations, and their parameters, that represent the dynamic characteristics of reactions. This chapter introduces a new simulation method called the hybrid dynamic/static simulation method, and explains how it reduces the dynamic data required. Cell simulation is the reconstruction of intracellular reactions on a computer, based on quantitative data, in an attempt to analyze the systematic characteristics of the cell and predict unknown pathways and mechanisms. We have developed a general simulation software package called the E-CELL system for reproducing the entire set of processes within a cell [1-3]. This system is flexibly structured to enable not only dynamic simulation based on rate equations, but also simulation in virtually any formalism, including the stochastic algorithms of Gillespie [4] and StochSim [5], as well as the S-System [6] and generalized mass action (GMA). We have constructed a simulation model of normal erythrocytes using the E-CELL system, and this model has been employed to analyze glucose-6-phosphate dehydrogenase (G6PD) deficiency. The results of this simulation indicated that the pathways for glutathione synthesis partially compensate for the deficiency in reduced glutathione associated with a loss of G6PD [7]. This example shows that, in order to construct a general-purpose simulation model that enables the prediction of abnormal conditions such as enzyme deficiency, it is essential to cover all the metabolic pathways.
2. Principles of the Hybrid Dynamic/Static Simulation Method

In general, enzyme rate equations are used in continuous and dynamic simulation models. Various types of these ordinary differential equations exist, corresponding to the relevant reaction mechanisms, but their common characteristic is that they all calculate the reaction rate from the amounts of the substances concerned. The substances concerned are proteins that catalyze reactions, such as enzymes, together with substrates, products, and effectors. In addition, a rate equation requires parameters such as the rate constants contained in the equation. The Michaelis-Menten equation, a typical rate equation, is still used in many cell simulations, but it was originally devised in the field of enzymology for analyzing the initial velocity, a state in which only substrates and no products exist. In dynamic simulation of living cells, which undergo various changes in state, the conditions assumed by this equation (that the amount of enzyme-substrate complex is always at a steady state and that the reaction is irreversible) become inappropriate in most cases. With the Michaelis-Menten equation, the effect of the equilibrium between substrate and products cannot be taken into account if the reaction is reversible. The development of an accurate simulation model applicable for general purposes therefore requires examining the reaction mechanism of each enzyme, formulating a detailed equation, for example by the King-Altman method [8], the MWC model [9], or the KNF model [10], and then approximating each rate constant from experimental data. The biggest issues in constructing a simulation model are now the rate equation, which depends on the reaction mechanism, the parameters used in this equation, and the concentrations of metabolites.
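The limitation of the irreversible Michaelis-Menten equation described above can be seen by comparing it with the standard reversible one-substrate, one-product form. The sketch below is an illustration only; the function names and all parameter values are arbitrary assumptions, not values from any model in this chapter.

```python
def irreversible_mm(s, vmax, km):
    """Irreversible Michaelis-Menten rate: depends on the substrate only."""
    return vmax * s / (km + s)

def reversible_mm(s, p, vf, vr, ks, kp):
    """Reversible Michaelis-Menten rate for S <-> P.

    vf, vr: maximal forward and reverse velocities; ks, kp: constants for S and P.
    """
    return (vf * s / ks - vr * p / kp) / (1.0 + s / ks + p / kp)

# Arbitrary illustrative parameters.
s, vf, vr, ks, kp = 1.0, 10.0, 5.0, 1.0, 1.0

# With no product present the two equations agree (the initial-velocity condition).
rate_irrev = irreversible_mm(s, vf, ks)
rate_rev_no_product = reversible_mm(s, 0.0, vf, vr, ks, kp)

# As product accumulates, only the reversible form slows down; the net rate is
# zero at equilibrium, where vf * s / ks equals vr * p / kp.
rate_rev_at_equilibrium = reversible_mm(s, 2.0, vf, vr, ks, kp)
```

The irreversible form keeps returning the same rate however much product is present, which is why it cannot represent the approach to equilibrium in a running cell model.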
A high-throughput method that measures metabolite concentrations en masse by capillary electrophoresis/mass spectrometry (CE/MS) is being established. For the kinetic properties of enzymes, however, the enzyme purification involved remains a major problem, and no high-throughput assay has yet been established. To address these problems from the perspective of the simulation algorithm, we have developed the hybrid dynamic/static simulation method. This method combines metabolic flux analysis [11-14], a static method developed in the field of metabolic engineering, with conventional dynamic simulation. Metabolic flux analysis is built on stoichiometric matrices, which describe the relationships among metabolites by stoichiometric coefficients alone. The principle of this simulation is as follows: a matrix is written using the stoichiometric coefficients and the individual fluxes (reaction rates), a matrix is added for the influx to and efflux from the system, and the fluxes are calculated on the assumption that the system is at a steady state (Fig. 1). This static method was combined with a dynamic model constructed using rate equations.
Hybrid Dynamic/Static Method for Simulation
[Figure 1 panel. Reaction X: S1 → 2 S2; Reaction Y: S2 → S3; Reaction Z: 2 S3 → 3 S1; Influx: → A; Efflux: C →. Legend: stoichiometric system, solid arrow; dynamic system, dotted arrow.]
Fig. 1. Example of a metabolic flux model. The mass conservation law is deliberately violated in this example model in order to avoid an under-determined system. Solid lines, fluxes calculated using stoichiometric matrices; dotted lines, fluxes calculated using rate equations. S1 to S5 are metabolic intermediates.
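As a minimal sketch of the steady-state calculation, consider a small hypothetical network (it is not the network of Fig. 1) in which the influx and one efflux are known and the two internal fluxes are unknown. The network, the function name, and the numbers are assumptions made for illustration only.

```python
import numpy as np

# Hypothetical network:
#   influx u -> A,   v1: A -> B,   v2: A -> (by-product),   efflux w: B ->
# Rows are the internal metabolites A and B; columns are the unknown fluxes v1, v2.
S_unknown = np.array([[-1.0, -1.0],   # A is consumed by v1 and v2
                      [ 1.0,  0.0]])  # B is produced by v1


def steady_state_fluxes(influx_u, efflux_w):
    """Solve S_unknown @ v + b = 0, where b holds the known influx/efflux terms."""
    b = np.array([influx_u, -efflux_w])   # known contributions to dA/dt and dB/dt
    return np.linalg.solve(S_unknown, -b)


v1, v2 = steady_state_fluxes(influx_u=3.0, efflux_w=1.0)
print(v1, v2)
```

Because the number of unknown fluxes equals the number of balanced metabolites and the matrix is nonsingular, a direct solve is sufficient in this case.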
Fig. 2. Summary of hybrid method
First, the influxes and effluxes that serve as input to the metabolic flux model (static module) are obtained from the results calculated by the dynamic module (rate equations), and from these all the fluxes within the static module are determined (Fig. 2). The results of the static module are passed on through the concentrations of the metabolites on the dynamic/static boundary. These boundary metabolites change through integration of the differential values (reaction rates) calculated by the dynamic and static modules, and their values are then used by the dynamic module in the next integration time step. The results of the static module are thus transmitted to the dynamic module via the boundary metabolites, and the results of the dynamic module are in turn used in the calculation of the static module, so that the two sets of results affect each other. The static module is calculated at every integration time step of the dynamic module. In other words, a large number of static calculations, like still images, are played in rapid succession so that they resemble a quasi-dynamic picture. The dynamic module, built from ordinary differential equations, is itself broken down into individual frames at the integration stage, as in a movie or a TV show. If the static calculations are matched to these frames, the result should therefore look the same to the user as a dynamic one. This is the principle of the hybrid method [8].
The actual matrix calculation is somewhat problematic. Operating the static module requires calculating an inverse matrix. In general, therefore, as shown by the example in Fig. 1, the module cannot be solved unless the numbers of reactions and substances are the same (the matrix must be a regular, i.e., nonsingular, matrix).
However, ways of overcoming this limitation have already been developed in the field of metabolic engineering. In systems that have more substances than reactions (over-determined systems), an exact solution cannot always be obtained; in such cases it is common practice to use the Moore-Penrose pseudo-inverse matrix [15]. In contrast, when there are fewer substances than reactions (under-determined systems), the equations are insufficient to determine a unique solution, and linear programming, an optimization method, is generally used. In the hybrid method, however, the calculation is performed in synchrony with the integration time step of the dynamic model, so a faster method of solution is needed. We therefore decided to use the minimum-norm solution rather than linear programming. This method defines the fluxes under ideal conditions, and the single point in the solution space closest to them is taken as the solution. Just as linear programming requires constraints and an objective function, the minimum-norm method requires a vector of fluxes under ideal conditions.
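The two non-square cases can be sketched with NumPy as follows. This is an illustration under stated assumptions, not the E-CELL implementation: the function names, the reference vector `v_ref` (standing in for the fluxes under ideal conditions), and the example numbers are all hypothetical.

```python
import numpy as np


def least_squares_fluxes(S, b):
    """Over-determined case: Moore-Penrose least-squares solution of S v = b."""
    return np.linalg.pinv(S) @ b


def closest_to_reference(S, b, v_ref):
    """Under-determined case: the solution of S v = b nearest to v_ref.

    v_ref plays the role of the fluxes under ideal conditions; the correction
    pinv(S) @ (b - S v_ref) is the minimum-norm step onto the solution space.
    """
    return v_ref + np.linalg.pinv(S) @ (b - S @ v_ref)


# Hypothetical under-determined example: one balance, two unknown fluxes
# (v1 + v2 must equal an influx of 3).
S = np.array([[1.0, 1.0]])
b = np.array([3.0])
v_ref = np.array([2.0, 2.0])   # assumed "ideal" flux distribution
v = closest_to_reference(S, b, v_ref)
print(v)
```

Both functions reduce to a single matrix-vector product once the pseudo-inverse is known, which is why this kind of solution is cheap enough to repeat at every integration step.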
With the method described above, the only data needed to construct the statically expressed part of a model are the metabolic pathway map, the stoichiometric coefficients of all reactions, and the influxes and effluxes of the system. In other words, except for reactions catalyzed by rate-limiting enzymes, it is not necessary to determine the rate equation, its parameters, or even the metabolite concentrations. The number of enzyme kinetic analyses that must be performed by biochemical experiments can therefore be reduced.
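The coupling between the two modules described in this section can be sketched on a toy pathway. Everything in the example is an assumption made for illustration: the pathway, the rate laws, the parameter values, and the explicit Euler integrator are hypothetical and do not reproduce the erythrocyte model or the E-CELL code.

```python
import numpy as np

# Toy pathway (hypothetical):  A --v_in--> B --r1--> C --r2--> D --v_out-->
# v_in and v_out are "dynamic" (rate equations); r1 and r2 are "static".
# B and C are assumed to stay at steady state; D is a boundary metabolite.
S_static = np.array([[-1.0,  0.0],    # B: consumed by r1
                     [ 1.0, -1.0]])   # C: produced by r1, consumed by r2
S_dynamic = np.array([[1.0, 0.0],     # B: produced by v_in
                      [0.0, 0.0]])    # C: untouched by dynamic reactions


def dynamic_rates(a, d, vmax=1.0, km=0.5, k_out=0.2):
    """Rate equations of the dynamic module (parameter values are made up)."""
    return np.array([vmax * a / (km + a), k_out * d])


def static_fluxes(v_dyn):
    """Steady state of B and C: S_static @ r + S_dynamic @ v_dyn = 0."""
    return -np.linalg.pinv(S_static) @ (S_dynamic @ v_dyn)


def simulate(a0=5.0, d0=0.0, dt=0.01, steps=1000):
    a, d, exported = a0, d0, 0.0
    for _ in range(steps):
        v_in, v_out = dynamic_rates(a, d)
        r1, r2 = static_fluxes(np.array([v_in, v_out]))
        # Explicit Euler step; the static fluxes are recomputed at every step.
        a += dt * (-v_in)
        d += dt * (r2 - v_out)      # boundary metabolite: static in, dynamic out
        exported += dt * v_out
    return a, d, exported


a, d, exported = simulate()
print(a, d, exported)
```

Note that the concentrations of B and C never appear: only the stoichiometry of the static reactions is needed, while the boundary metabolite D carries the result of the static module back to the dynamic rate equation for v_out.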
3. Performance and Error Rate of the Hybrid Method
To test the performance of the hybrid method, we used a dynamic erythrocyte model at steady state in which all reactions were described by rate equations based on data from biochemical experiments. A hybrid model was prepared by replacing part of the pentose phosphate pathway in this model with a static description (Fig. 3), and the steady states of the two models were compared first. The two behaved extremely similarly, with a maximum error of 0.0745%. Moreover, only 5 of the 43 reactions showed any error (0.00677% to 0.0745%) (Fig. 4). To compare the dynamic behaviors, a perturbation that transiently trebled fructose-1,6-diphosphate was then applied; the dynamic behaviors were virtually the same, and the errors were about the same as those in the steady state (Fig. 5). In another study using a virtual model, the hybrid calculations matched the dynamic ones completely. A study in which a perturbation was applied did reveal a very slight error, but this was attributed to the absence of time within the static module: in the dynamic model it takes a short time for the downstream reactions to respond to a perturbation-induced change, whereas in the hybrid model the perturbation is transmitted downstream instantaneously when it passes through the static module, so the response time is shorter. However, provided that the number of static reactions is not excessive, the calculation error arising from this method can be considered sufficiently small compared with the accuracy of the experimental data used for model construction, and no errors occur that would be problematic in terms of simulation accuracy.
Fig. 3. Erythrocyte hybrid model. The reactions replaced by the static module are shown as red lines. ATP, adenosine triphosphate; ADP, adenosine diphosphate; GLC, glucose; G6P, glucose 6-phosphate; F6P, fructose 6-phosphate; FDP, fructose 1,6-diphosphate; GA3P, glyceraldehyde 3-phosphate; DHAP, dihydroxyacetone phosphate; GSH/GSSG, glutathione; NADP(H), nicotinamide adenine dinucleotide phosphate; GL6P, gluconolactone 6-phosphate; GO6P, gluconate 6-phosphate; X5P, xylulose 5-phosphate; RU5P, ribulose 5-phosphate; R5P, ribose 5-phosphate; S7P, sedoheptulose 7-phosphate; E4P, erythrose 4-phosphate; HK, Hexokinase; PGI, Phosphoglucoisomerase; PFK, Phosphofructokinase; ALD, Aldolase; TPI, Triose phosphate isomerase; GSHox, Glutathione oxidative stress; GSSGR, Glutathione reductase; G6PDH, Glucose 6-phosphate dehydrogenase; 6PGLase, 6-Phosphogluconolactonase; 6PGOLDH, 6-Phosphogluconate dehydrogenase; R5PI, Ribose 5-phosphate isomerase; X5PI, Xylulose 5-phosphate isomerase; TK1, Transketolase I; TK2, Transketolase II; TA, Transaldolase
Fig. 4. Comparison of erythrocyte models in steady states. Solid line, dynamic model; dotted line, hybrid model
Fig. 5. Comparison of erythrocyte models in the perturbation test. Horizontal axis, time (sec). Solid line, dynamic model; dotted line, hybrid model
In addition, it is to be expected that the overall behavior would not be expressed accurately if the hybrid method did not describe the rate-limiting process dynamically. This was confirmed by the results of the studies using the virtual model and the erythrocyte model. Figure 6 shows the time courses of metabolites, starting from steady-state data, when G6PD, which is known as a rate-limiting enzyme, was moved to the static module of the erythrocyte model. In the dynamic model all the lines would be flat but, as is evident, completely different behaviors are shown. In general, enzyme activity in a rate-limiting process is kept low by regulation, which slows the reaction rate and creates a bottleneck reaction in the metabolic pathway. In addition to this example, however, virtual model studies have clearly shown that bottleneck reactions are also formed under conditions of equilibrium. A state of equilibrium occurs in nonenzymatic processes or when enzymes have a very high activity, and these results suggest that a bottleneck can occur with reactions in this state. Thus, it has become clear that the hybrid method has the following two limitations: (1) dozens of consecutive reactions should not be expressed as a static module, and (2) a rate-limiting process should not be included in the static module. Nevertheless, for metabolic pathways that are composed principally of non-rate-limiting processes, it has become possible to reduce the number of enzyme kinetic analyses and quantitative analyses of metabolites.
Fig. 6. Behavior when the rate-limiting enzyme G6PD is included in the static module. Vertical axis, molecules; horizontal axis, time (seconds). E-GSHox, Glutathione oxidative stress; E-GSSGR, Glutathione reductase; E-6PGLase, 6-Phosphogluconolactonase; E-6PGOLDH, 6-Phosphogluconate dehydrogenase
4. Problems in Applying the Hybrid Method
The biggest obstacle to applying the hybrid method is the identification of rate-limiting enzymes. We searched the literature for cases that have been analyzed experimentally to date, and found unexpectedly few descriptions of rate-limiting enzymes or key enzymes. High-throughput methods to identify or predict them are therefore essential. The methods we are exploring involve predicting allosteric enzymes from sequence motifs or homology using the GEM system described in a previous chapter, and searching the BRENDA database [16] for cases in which effector metabolites have a regulatory effect. However, doubts remain about predicting rate-limiting enzymes in this way. Conventional explanations have used the term "rate-limiting enzyme," a generic biological expression that refers to an enzyme acting as a pacemaker to control the rate of a series of reactions; many such enzymes have been thought to be regulatory enzymes such as allosteric enzymes. However, metabolic control analysis (MCA) [17], which has recently come into use, has revealed that regulatory enzymes do not control the flux through a pathway as much as conventionally thought. Metabolic control analysis is a form of sensitivity analysis created for analyzing metabolic control at or near the steady state. It consists primarily of the flux control coefficient (FCC), which expresses the response of the flux (J) to changes in enzyme activity (E) as in eq. (1), and the elasticity coefficient (EC), which expresses the response of the reaction rate (v) to changes in metabolite concentration (C) as in eq. (2).
$$C^{J}_{E} = \frac{dJ}{dE}\,\frac{E}{J} \qquad (1)$$
$$\varepsilon^{v}_{C} = \frac{\partial v}{\partial C}\,\frac{C}{v} \qquad (2)$$
This type of analysis is applied not only to computer simulations but also to biochemical experiments, in which FCCs are actually measured. Such analyses have shown that regulatory enzymes often have a low FCC and a high EC. On the other hand, reactions with an FCC close to 1 are thought to be rate-limiting steps. It is therefore considered important to determine the FCCs in the pathway targeted for modeling. Among other ways of overcoming the problem of identifying bottleneck reactions, determining them directly through biochemical experiments is the most effective. For example, reaction rates could be calculated from time-series data of the changes in metabolite concentration or from tracer studies using radioactive isotopes.
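In a simulation, the flux control coefficient of eq. (1) can be estimated by perturbing one enzyme activity at a time. The sketch below does this for a hypothetical two-step pathway with linear kinetics, for which the steady-state flux has a closed form; the pathway, the function names, and the enzyme activities are assumptions made for illustration.

```python
def steady_state_flux(e1, e2, x0=1.0):
    """Flux J through X0 -> S -> (sink) with v1 = e1*(x0 - s) and v2 = e2*s.

    Setting v1 == v2 gives s = e1*x0/(e1 + e2), hence J = e1*e2*x0/(e1 + e2).
    """
    return e1 * e2 * x0 / (e1 + e2)


def flux_control_coefficient(flux, enzymes, index, rel_step=1e-6):
    """Central-difference estimate of (dJ/dE) * (E/J) for one enzyme."""
    up = list(enzymes)
    down = list(enzymes)
    h = enzymes[index] * rel_step
    up[index] += h
    down[index] -= h
    dJ_dE = (flux(*up) - flux(*down)) / (2.0 * h)
    return dJ_dE * enzymes[index] / flux(*enzymes)


enzymes = [1.0, 9.0]   # hypothetical activities: step 1 is the slow one
fcc = [flux_control_coefficient(steady_state_flux, enzymes, i) for i in range(2)]
print(fcc)   # approximately [0.9, 0.1]
```

In this toy case the slow first step has an FCC close to 1 and would have to remain in the dynamic module, whereas the fast second step is a candidate for the static module.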
5. Future Prospects
Another obstacle to constructing a hybrid model is the need for the rate equations, parameters, and related substance concentrations of the bottleneck reactions. Without doubt, this method has made it possible to reduce the information needed to construct a dynamic model. However, kinetic analyses remain essential for the one-sixth to one-quarter of enzymes that form bottleneck reactions. Another known formalism for dynamic models is the S-System [6] (S stands for synergistic), shown in eq. (3). The S-System assumes that every metabolite may control the production and consumption of each metabolite, and expresses the influence of the metabolite concentrations on the rate of change as power-law terms. Because power laws are used, it is possible in theory to approximate any rate equation. However, because the search space in the optimization for parameter determination is too wide, parameters of very poor accuracy will probably be obtained when one attempts to express a state of dynamic change.
$$\frac{dX_i}{dt} = \alpha_i \prod_{j=1}^{n} X_j^{g_{ij}} - \beta_i \prod_{j=1}^{n} X_j^{h_{ij}} \qquad (3)$$
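The power-law form of the S-System is straightforward to evaluate and integrate. The following sketch uses a hypothetical two-metabolite chain; the rate constants, the kinetic orders, and the explicit Euler integrator are assumptions chosen only to keep the example small, not values from any published model.

```python
def s_system_rhs(x, alpha, beta, g, h):
    """dX_i/dt = alpha_i * prod_j X_j**g[i][j] - beta_i * prod_j X_j**h[i][j]."""
    n = len(x)
    dxdt = []
    for i in range(n):
        production = alpha[i]
        degradation = beta[i]
        for j in range(n):
            production *= x[j] ** g[i][j]
            degradation *= x[j] ** h[i][j]
        dxdt.append(production - degradation)
    return dxdt


def integrate(x0, alpha, beta, g, h, dt=0.001, steps=40000):
    """Explicit Euler integration; adequate for this small, non-stiff toy case."""
    x = list(x0)
    for _ in range(steps):
        dx = s_system_rhs(x, alpha, beta, g, h)
        x = [xi + dt * dxi for xi, dxi in zip(x, dx)]
    return x


# Hypothetical chain: supply -> X1 -> X2 -> removal,
# with X2 inhibiting the supply step (negative kinetic order).
alpha = [2.0, 1.0]
beta = [1.0, 1.0]
g = [[0.0, -0.5], [0.5, 0.0]]   # kinetic orders of the production terms
h = [[0.5, 0.0], [0.0, 0.5]]    # kinetic orders of the degradation terms
print(integrate([1.0, 1.0], alpha, beta, g, h))
```

Even this two-variable example has twelve parameters (two alpha, two beta, and eight kinetic orders), which gives a sense of how quickly the search space grows when every metabolite is allowed to influence every term.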
Metabolic models have been developed using the S-System, but because of the problems described above these models were designed to reproduce dynamic behavior only near the steady state. However, applying the hybrid method considerably reduces the number of enzymes targeted for optimization and, furthermore, these enzymes are the ones that form bottleneck reactions. In this situation, if the search space is limited to effector metabolites, using the BRENDA database for example, reasonably accurate parameters could be obtained from time-series data of metabolite concentrations when the dynamic state is expressed using GMA, the basic form of the S-System, shown in eq. (4).
$$\frac{dX_i}{dt} = \sum_{k=1}^{p} \gamma_{ik} \prod_{j=1}^{n} X_j^{f_{ijk}} \qquad (4)$$
6. Conclusion
There are bright prospects for the modeling of metabolic systems. However, in most cells, except those in which no gene expression occurs at all, such as human erythrocytes, metabolism is controlled by genes. That is, the maximum activity of each enzyme is regulated by gene expression and changes dynamically according to circumstances. Moreover, because some groups of enzyme genes are not expressed under normal conditions, parts of some pathways are effectively absent; only under specific conditions is the enzyme group associated with such a pathway expressed and operative. It is therefore unlikely that a metabolic pathway map created by data mining from genome sequences, using pathway databases and the GEM system, could be applied in its present form. In other words, detailed cell simulation is unlikely to be achieved through metabolic modeling alone; rather, it will probably require combining the metabolic model with models of gene expression and of the signal transduction that controls it. Although the principles differ from those of the hybrid method for metabolic systems, we are currently developing a new method to enable modeling of gene expression systems and signal transduction pathways.
References
1. Tomita M, Hashimoto K, Takahashi K, Shimizu T, Matsuzaki Y, Miyoshi F, Saito K, Tanida S, Yugi K, Venter C (1999) E-CELL: software environment for whole cell simulation. Bioinformatics 15:72-84
2. Takahashi K, Ishikawa N, Sadamoto Y, Sasamoto H, Ohta S, Shiozawa A, Miyoshi F, Naito Y, Nakayama Y, Tomita M (2003) E-CELL2: multi-platform E-CELL simulation system. Bioinformatics 19:1727-1729
3. Takahashi K, Yugi K, Hashimoto K, Yamada Y, Pickett C, Tomita M (2002) Computational challenges in cell simulation. IEEE Intelligent Systems 17:64-71
4. Gillespie T (1977) Exact stochastic simulation of coupled chemical reactions. J Phys Chem 81:2340-2361
5. Le Novere N, Shimizu S (2001) StochSim: modelling of stochastic biomolecular processes. Bioinformatics 17:575-576
6. Savageau M (1976) Biochemical systems analysis: a study of function and design in molecular biology. Addison-Wesley, Reading, MA
7. Nakayama Y, Kinoshita A, Tomita M (2005) Dynamic simulation of red blood cell and its application to pathological analysis. Theor Biol Med Model; in press
8. King L, Altman C (1956) The King-Altman method for deriving kinetic equations. J Phys Chem 60:1375-1378
9. Monod J, Wyman J, Changeux J (1965) On the nature of allosteric transitions: a plausible model. J Mol Biol 12:88-118
10. Koshland D, Nemethy G, Filmer D (1966) Comparison of experimental binding data and theoretical models in proteins containing subunits. Biochemistry 5:365-385
11. Aiba S, Matsuoka M (1979) Identification of metabolic model: citrate production from glucose by Candida lipolytica. Biotechnol Bioeng 21:1373-1386
12. Henriksen C, Christensen L, Nielsen J, Villadsen J (1996) Growth energetics and metabolic fluxes in continuous cultures of Penicillium chrysogenum. J Biotechnol 45:149-164
13. Jorgensen H, Nielsen J, Villadsen J (1995) Metabolic flux distributions in Penicillium chrysogenum during fed-batch cultivations. Biotechnol Bioeng 46:117-131
14. van Gulik W, Heijnen J (1995) A metabolic network stoichiometry analysis of microbial growth and product formation. Biotechnol Bioeng 48:681-698
15. Albert A (1972) Regression and the Moore-Penrose pseudo inverse. Academic, New York
16. Schomburg I, Chang A, Schomburg D (2002) BRENDA, enzyme data and metabolic information. Nucleic Acids Res 30:147-149. http://www.brenda.uni-koeln.de/
17. Cornish-Bowden A (1995) Metabolic control analysis in theory and practice. Adv Mol Cell Biol 11:21-64
Part V. Metabolomics and Medical Sciences
Chapter 16: Metabolomics and Medical Sciences Takaaki Nishioka Division of Applied Biosciences, Graduate School of Agriculture, Kyoto University, Oiwake-cho, Kitashirakawa, Sakyo-ku, Kyoto 606-8502, Japan
1. Introduction
Once the genomic DNA of a living organism has been sequenced, studies on the genes, mRNAs, and proteins of that organism and of other organisms are soon stimulated. In these studies mRNAs and proteins are directly linked to the genes that encode their sequences, and the links are verified experimentally by contemporary analytical approaches such as transcriptomics and proteomics. Genome analysis also promotes studies on metabolism. The KEGG database (http://www.genome.jp) updates its metabolites, enzyme reactions, and metabolic pathway networks daily by integrating newly determined genome data. These items in the database are directly linked to enzyme proteins that are deduced from the corresponding genes on genomes. The ARM (Chapter 13) and GEM (Chapter 14) systems also automatically generate the metabolic pathway network of a target organism. In the metabolism database and the generating systems, however, metabolites and enzyme reactions are predicted on the basis of knowledge about individual enzyme reactions that has been accumulated through biochemical studies. Genome analysis hardly promotes studies on metabolites themselves, which are the biochemical aspect of enzyme reactions and metabolic pathway networks. Whereas mRNAs and proteins are the direct products of genes, metabolites are indirect products. With the exception of enzyme genes, it is quite difficult to find a relation between metabolites and genes because of the multiplicity of metabolic pathway networks, which is discussed in the following sections. This was one of the two reasons why metabolomics has been applied less to basic and applied biological sciences than the other "omics" sciences. The other reason was the limited availability of chemical methods for analyzing metabolites, which vary widely in chemical structure and physicochemical properties.
Nowadays the advantages of metabolomics over the other omics sciences have become apparent with the development of analytical methods, databases, and tools for prediction and simulation, as discussed in the previous chapters of this book. In this chapter the novel definition and properties of metabolism, and possible applications of metabolite profiles to drug discovery and medicine, are discussed. Because of the unique properties of metabolism, metabolomics has outstanding potential as a new tool in the life sciences.
2. Novel Definitions of Metabolism and Metabolic Pathway Network
Previously, metabolism was defined as the intracellular chemical reactions that produce the chemical substances and energy that sustain life. Through post-genome scientific studies, we have become aware that life is based on both chemical substances and genomic information. This adds a novel definition: metabolism is the system in which chemical substances and genomic information interact with each other in the network of chemical reactions. Transcriptomics and proteomics have produced tremendous amounts of comprehensive data on life. In addition, analysis of the experimental data by bioinformatics has generated clusters and networks composed of many links connecting genes, proteins, and chemical substances. Such analysis confronts a serious problem: the biological meanings of most links are biochemically unknown or unconfirmed (Fig. 1a). Even if we detect a protein-protein interaction between two proteins, we know little about the biochemical meaning and molecular mechanism of the interaction or the biological events that it produces. Metabolic pathway networks are also composed of links, each defined as a transformation of chemical structure between two metabolites together with an enzyme reaction. Metabolites and chemical mechanisms are the same across biological species. For any given pair of metabolites we can tell, on the basis of biochemistry, whether a link between them is possible, and if it is, we can even predict an enzyme reaction for the link by using pathway prediction tools such as KEGG and ARM. The chemical mechanism underlying a link is also conserved. The metabolic pathway network is thus chemically well defined. By referring to metabolic pathway networks, the biological meanings of transcriptomics and proteomics data become interpretable (Fig. 1b).
Fig. 1. Omics data and networks. a Analyses using microarrays, proteomics, and protein-protein interactions have generated tremendous amounts of data. Analysis of the data by bioinformatics has also produced many complex clusters and networks that are composed of links or relations. For only a few of such links can we identify the biological meaning, that is, why the members are clustered or interact. Referring to other clusters and networks, if any, would help to identify other undefined links. However, such clusters and networks actually share few links between them. b Referring to metabolic pathway networks is much more helpful for finding the biological meanings of such undefined links because every link on a metabolic pathway network is chemically defined. Clusters and networks would be linkable via metabolic pathway networks
3. Properties of Metabolism
Metabolism has the following two properties: multiplicity in the metabolic pathway network and robustness of the metabolite profile.
3.1. Multiplicity in Metabolic Pathway Network
The metabolic pathway network contains the following three types of multiplicity (Fig. 2). (1) Multiple links: When a metabolite is located at a branching point of a metabolic pathway, it has three or more links (Fig. 2a). Any metabolic intermediate has at least two links, one as a product of the preceding enzyme reaction and one as a substrate of the following reaction (Fig. 2b). In addition, enzyme reactions often synthesize a metabolite by combining two structural parts derived from two precursor metabolites. This makes the actual network more complex than the network drawn on metabolic maps. For example, aspartate is a product of an aminotransferase reaction (EC 2.6.1.1) that combines the carbon skeleton of oxaloacetate with the amino group of glutamate. As a result, aspartate has two links in this reaction (Fig. 2e). However, most metabolic maps show only one of them, omitting the transfer of the amino group from glutamate (Fig. 2f) (Chapter 13). (2) Isozymes: Two or more paralogous enzymes often catalyze the same transformation (Fig. 2c). (3) Bypasses: Two metabolites are multiply linked through two or more metabolic pathways (Fig. 2d). These multiplicities are essential for Bacillus subtilis, which utilizes diverse carbon-containing compounds as carbon sources. Bacillus subtilis provides a specific pathway for each carbon source, through which the carbon source can enter central energy metabolism; for example, gluconate gains access via the Entner-Doudoroff pathway. Microorganisms are tolerant of mutational inactivation of enzymes. Systematic experiments in which B. subtilis genes were inactivated one at a time by mutation revealed that, among the approximately 4100 genes of the bacterium, only 271 are indispensable for sustaining bacterial life [1]. This is mostly due to the multiplicity provided by isozymes, because about 50% of B. subtilis genes have paralogues [2].
However, it is remarkable that most genes involved in glycolytic and Entner-Doudoroff pathways, and the pathways of menaquinone and isoprenoid biosynthesis, are essential and other vital cell processes are encoded by nonredundant genes. The presence of paralogues might thus allow the cell to respond to changing environmental conditions rather than to protect against mutational losses of enzyme activities. In Escherichia coli, 250 genes among 4409 are essential (Profiling of E. coli Chromosome, http://shigen.lab.nig.ac.jp/ecoli/pec/index.jsp).
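The notions of links and isozymes used in this section map naturally onto a small graph structure. The sketch below counts links per metabolite and groups isozymes for a hand-written list of transformations based on the aminotransferase example; the list, the enzyme labels marked as hypothetical, and the function names are illustrative assumptions and are not extracted from KEGG or any other database.

```python
from collections import defaultdict

# Each transformation is (substrate, product, enzyme). The entries are a tiny
# hand-written illustration, not an extract from any database.
transformations = [
    ("oxaloacetate", "aspartate", "EC 2.6.1.1"),   # carbon skeleton
    ("glutamate", "aspartate", "EC 2.6.1.1"),      # amino group
    ("glutamate", "2-oxoglutarate", "EC 2.6.1.1"),
    ("aspartate", "asparagine", "hypothetical enzyme A"),
    ("aspartate", "asparagine", "hypothetical enzyme B"),  # isozyme
]


def links_per_metabolite(edges):
    """Map each metabolite to the set of distinct metabolites it is linked to."""
    neighbours = defaultdict(set)
    for substrate, product, _enzyme in edges:
        neighbours[substrate].add(product)
        neighbours[product].add(substrate)
    return neighbours


def isozyme_groups(edges):
    """Group enzymes by the transformation they catalyze; keep groups of two or more."""
    groups = defaultdict(set)
    for substrate, product, enzyme in edges:
        groups[(substrate, product)].add(enzyme)
    return {pair: names for pair, names in groups.items() if len(names) > 1}


links = links_per_metabolite(transformations)
print(sorted(links["aspartate"]))
print(isozyme_groups(transformations))
```

Recording the amino-group transfer as its own edge is what gives aspartate its second link in the aminotransferase reaction; a map that stores only the carbon-skeleton edge would report one link fewer.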
Fig. 2. Multiplicity on metabolic pathway network. a Black circles are metabolites. In this chapter a link connecting two metabolites is defined as a transformation of chemical structures between two metabolites; as discussed in e and f, a transformation does not always correspond to an enzyme reaction. Metabolite A has eight links and is located on more than two metabolic pathways. b Every metabolite has at least two links. c Isozymes (isozyme 1, isozyme 2, ..., isozyme n) catalyze the same transformation. d In a bypass, metabolites A and B are indirectly linked by a series of transformations through other metabolites. e Aspartate (A) has two links, from oxaloacetate (O) to aspartate and from glutamate (G) to aspartate, whose transformations are catalyzed by an aminotransferase (EC 2.6.1.1). Metabolite 2 is 2-oxoglutarate. f On conventional metabolic maps, the link shown by a dotted line is omitted
3.2. Metabolite Profiles are Globally Robust, but Locally Variable with Environmental Perturbations
The second property of metabolism is the robustness of the metabolite profile. A good example is given in Chapter 9. Bacillus subtilis cells were cultured on each of five different carbon sources, and their metabolite profile was analyzed when they reached the maximum growth rate in each culture. The observed maximum growth rates were the same on the five carbon sources. The observed metabolite profiles were also similar to each other, even though the cells grew on different carbon sources. Bacillus subtilis is most likely provided with a predetermined metabolite profile that is optimized for the maximum growth rate and that is used as a template for the regulation of metabolism (Fig. 3a). The bacterial cells regulate their metabolism so that the metabolite profile matches the template under variable environmental conditions. This is an example of the robustness of the metabolite profile. Because every living organism possesses a species-specific metabolic pathway network, the predetermined metabolite profiles might also differ from species to species. Each phenotype in plants possesses a distinctive metabolite profile [3]. However, in spite of this global regulation, external or internal perturbations such as food conditions or genetic modifications locally disturb the predetermined metabolite profile. In the above case of the B. subtilis cells (Chapter 9), growth on different carbon sources disturbed the profile at the starting metabolites. Such local disturbances of the profile are specific and sensitive to internal and external perturbations. This is the basis of metabolomics as a tool for clinical diagnosis and drug development.
Fig. 3. Metabolite profile as a template. a This is a schematic drawing to give an idea of a template of a metabolite profile, where circles are metabolites and their size corresponds to the intracellular amount of metabolites. Each living organism has species-specific, predetermined metabolite profiles as templates. Depending on living conditions, it selects one of them and regulates its metabolism to keep a metabolite profile to match one of the templates. b In spite of the regulation of metabolism, external or internal perturbations locally disturb the metabolite profile. Metabolites, M1 and M2, are decreased or increased from those in the template a
Against a perturbation that exceeds the range of the global regulation, B. subtilis has evolved a strategy of replacing the predetermined metabolite profile with another predetermined one (Chapter 2). As the culture proceeded, the cells consumed the glucose in the culture medium until it was depleted, which was an excessive perturbation and finally induced spore formation. The intracellular amounts of the metabolites decreased concomitantly with the consumption of glucose. In the metabolite profiles, the depletion of glucose first decreased the amounts of metabolites in the upper half of the glycolytic pathway, then those in the whole of the glycolytic pathway, and finally those in the tricarboxylic acid cycle and the amino acids, replacing the metabolite profile of the maximum growth rate with that of spore formation.
4. Metabolite Profiles Are Biomarkers
Biomarkers have been defined as a set of metabolites that are characteristic of an internal or external perturbation; for example, organic acids in urine are used as biomarkers to detect some diseases in patients (Chapter 12). As discussed in Chapters 2 to 8, analytical methods have greatly increased the number of metabolites that can be measured and shortened the time needed to analyze biological samples. Here I propose that biomarkers should be metabolite profiles rather than sets of metabolites. Metabolite profiles are much more informative for defining unidentified diseases and for studying the molecular mechanisms of known and unidentified diseases. An example is the metabolite profile of spore formation in B. subtilis (Chapter 2). In this profile, in which most metabolites decreased, only two metabolites, tryptophan and succinyl-CoA, increased; these two changed in the direction opposite to the other observed metabolites. The increases in the two metabolites are characteristic of spore formation, but so are the decreases in the other metabolites. The metabolite profile contains all the metabolites, both changed and unchanged, and is much more informative for studying the biochemical mechanisms of spore formation than a set of metabolites. Most profiling tools analyze metabolomics data as if metabolites were independent, random samples. This is incorrect, because metabolites are not independent of each other but are constrained within a metabolic pathway network.
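One simple way to read a whole profile against a template, rather than a preselected set of metabolites, is to express every metabolite as a log ratio and then look for metabolites that depart from the profile-wide trend. The sketch below is only an illustration of that idea: the metabolite amounts are invented, the threshold is arbitrary, and the median-based comparison is an assumption of this example rather than a method described in this book.

```python
import math


def log2_ratios(profile, template):
    """log2(observed / template) for every metabolite present in both profiles."""
    return {m: math.log2(profile[m] / template[m])
            for m in template if m in profile}


def local_disturbances(profile, template, threshold=1.0):
    """Metabolites deviating from the profile-wide median shift by more than threshold.

    Subtracting the median separates a global change (most metabolites
    decreasing together) from the few that move against the trend.
    """
    ratios = log2_ratios(profile, template)
    ordered = sorted(ratios.values())
    mid = len(ordered) // 2
    if len(ordered) % 2:
        median = ordered[mid]
    else:
        median = 0.5 * (ordered[mid - 1] + ordered[mid])
    return {m: r - median for m, r in ratios.items() if abs(r - median) > threshold}


# Invented amounts in arbitrary units, for illustration only.
template = {"G6P": 100.0, "F6P": 40.0, "pyruvate": 80.0, "citrate": 60.0, "Trp": 10.0}
observed = {"G6P": 50.0, "F6P": 20.0, "pyruvate": 40.0, "citrate": 30.0, "Trp": 20.0}
print(local_disturbances(observed, template))
```

The global decrease is only visible because all metabolites are kept in the comparison; a marker set containing the flagged metabolite alone would lose that context.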
T. Nishioka
5. Metabolomics as a Tool of Drug Development

Drug design is composed of two steps: lead finding and lead optimization. A lead is a compound with a new chemical structure that has only weak drug activity, often accompanied by toxic side effects. Lead optimization improves the drug activity of a lead compound by modifying its chemical structure. A lead compound is optimized rationally by identifying the site of its biochemical action and its target protein, revealing the mode of protein-drug interaction, and predicting the in vitro activity of drug candidates by analyzing the relationships between their chemical structures and drug activities. In contrast, lead finding is not rational but still relies on chance, using random screening of synthetic compounds and natural products for a certain drug activity. A lead substance often has an activity so weak that it is undetectable by conventional in vivo assays that use phenotypes as an indication of drug activity, or it produces only a slow improvement of symptoms. Such weak drug activities will emerge more sensitively and characteristically in metabolite profiles. As a tool for searching for lead compounds, metabolite profiles will replace conventional biological assays. Detecting side effects or toxicities of drug candidates is one purpose of animal and clinical tests. However, unexpected serious side effects that were undetectable by these tests have often emerged after a drug has come onto the market. Metabolite profiling has the potential to detect such covert and unfavorable biological effects, which are not detectable by conventional tests but emerge as local disorders in metabolite profiles.
6. Applications of Metabolomics to Medicine

Metabolomics could be applied to medicine as a tool for diagnosis, follow-up, and pathophysiological analysis. The metabolic aspect of disease is an abnormal, irreversible disorder in the human metabolite profile (Fig. 3). Most clinical diagnoses measure enzyme activities in the patient's blood. However, enzyme activity data are not sufficient to detect or predict disorders in a patient's metabolite profile. Suppression or activation of an enzyme activity does not always result in the accumulation or reduction of its substrates and products, because of the multiplicity of metabolic pathway networks and the robustness of metabolite profiles. Metabolome analysis is essential for finding disorders in a patient's metabolite profile and should be added as a clinical tool for diagnosis.
Metabolomics and Medical Sciences
When metabolite profiles from patients with various diseases are compiled, we can use them as templates for diagnosis; if a metabolite profile from a patient matched one of the templates, we could identify the patient's disease. Such metabolite profiles will also help us study the biochemical mechanisms of diseases when they are combined with enzyme activity data. Combined analysis identifies the enzyme reactions and metabolites that are disordered, and is indispensable for analyzing the effects of physiological and environmental perturbations on the metabolite profile of a patient. Each metabolite in a metabolic pathway network often has more than ten links. Among these, the predominant link that regulates the intracellular amount of the metabolite should be identified. Practical methods for finding predominant links include combined analysis of metabolomics with transcriptomics (Chapter 9) and measurement and analysis of metabolite profiles under various external perturbations [4]. Simulation of living cells based on systems biology, for example with E-CELL, would be a rational method for interpreting metabolite profiles quantitatively (Chapter 15) [5]. Loss of enzyme activity through genetic or chemical perturbation is critical in metabolic pathways that have neither a bypass nor an isozyme (Ml in Fig. 3b). It causes a lack of essential metabolites downstream in the pathway and results in critical disorders in the predetermined metabolite profile. This is easily detected and could be medically treated. However, the loss of one or even several enzyme activities in the middle areas of a metabolic pathway network causes no serious problem, because bypass pathways soon compensate for the depletion and accumulation of metabolites by supplying and catabolizing them (Fig. 4). As the depletions and accumulations spread over the network, however, they can cause critical disorders in the metabolite profile.
These global disorders are thought to be the origin of lifestyle-related diseases such as cerebral apoplexy, myocardial infarction, hyperlipemia, and hypertension. Collecting the metabolite profiles of the patients and monitoring their metabolite profiles during follow-up is therefore necessary.
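The use of compiled profiles as diagnostic templates, described earlier in this section, might be sketched as follows. This is a minimal illustration under stated assumptions: the template names, metabolite values, the choice of Pearson correlation as the similarity measure, and the `min_similarity` cutoff are all hypothetical. The text speaks of a patient profile agreeing with a template; because exact agreement is unlikely with real measurements, the sketch substitutes a similarity threshold.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0
    return cov / (sd_x * sd_y)


def best_template(patient, templates, min_similarity=0.8):
    """Return (name, similarity) of the closest template.

    The name is None when no template reaches min_similarity.
    Assumes every template covers all metabolites in the patient profile.
    """
    metabolites = sorted(patient)
    patient_values = [patient[m] for m in metabolites]
    best_name, best_score = None, -math.inf
    for name, template in templates.items():
        score = pearson(patient_values, [template[m] for m in metabolites])
        if score > best_score:
            best_name, best_score = name, score
    if best_score < min_similarity:
        return None, best_score
    return best_name, best_score


# Hypothetical templates and patient profile (invented log2 fold changes).
templates = {
    "template_A": {"lactate": 2.0, "pyruvate": 1.5, "citrate": -0.5, "alanine": 1.0},
    "template_B": {"lactate": -0.2, "pyruvate": 0.1, "citrate": 1.8, "alanine": -1.0},
}
patient = {"lactate": 1.8, "pyruvate": 1.2, "citrate": -0.3, "alanine": 0.9}

name, score = best_template(patient, templates)
print(name, round(score, 3))
```

Returning no match when every template falls below the cutoff reflects the case of an unidentified disease, for which a new template would have to be compiled. The same comparison could be repeated on profiles collected during follow-up.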
7. Future Development of Metabolomics

Analytical tools such as gas chromatography/mass spectrometry, liquid chromatography/mass spectrometry, and capillary electrophoresis/mass spectrometry have progressed. We are now able to measure the metabolite profiles of patients and of animal models whose metabolic pathway networks have been genetically perturbed by mutations or RNA interference. The development of standard protocols for the extraction and analysis of metabolites from human and animal tissues is necessary for the acquisition of reproducible metabolomics data and for the accumulation and compilation of the data. Analysis of the relationships between metabolite profiles and internal perturbations is required to provide references for diagnosis, medical treatment, and drug design. Metabolomics must become generally available in hospitals as a diagnostic tool as soon as further technical developments reduce the size and price of analytical tools as well as the cost of a sample analysis.
Fig. 4. Model of lifestyle-related diseases. Crosses are the sites where enzyme reactions are inactivated by internal or external perturbations. When the sites of inactivation are few, they do not affect the metabolite profile because of the multiplicity and regulation of metabolism. When they accumulate and spread over a metabolic pathway network, they cause irreversible, critical disorders in the metabolite profile. The sites of disorder differ from patient to patient depending on their lifestyle.
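The model in Fig. 4 can be illustrated with a toy network. The sketch below is an assumption-laden simplification: the five-reaction network and the helper `reachable` are hypothetical, and simple graph reachability stands in for the much richer behavior of fluxes and regulation in a real metabolic pathway network. It shows only the qualitative point that one inactivated reaction in a region with a bypass is tolerated, whereas accumulated inactivations, or a single one where no bypass exists, cut off the downstream metabolite.

```python
from collections import deque


def reachable(reactions, source, target, inactivated=frozenset()):
    """Breadth-first search from source to target over the active reactions.

    reactions   -- iterable of (substrate, product) pairs
    inactivated -- set of (substrate, product) pairs treated as lost
    """
    graph = {}
    for substrate, product in reactions:
        if (substrate, product) not in inactivated:
            graph.setdefault(substrate, []).append(product)
    seen, queue = {source}, deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            return True
        for neighbor in graph.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return False


# Hypothetical network: A reaches D by two routes (via B or via C),
# while D -> E is a single step with no bypass.
reactions = [("A", "B"), ("B", "D"), ("A", "C"), ("C", "D"), ("D", "E")]

one_site = reachable(reactions, "A", "E", {("A", "B")})
two_sites = reachable(reactions, "A", "E", {("A", "B"), ("C", "D")})
no_bypass = reachable(reactions, "A", "E", {("D", "E")})
print(one_site, two_sites, no_bypass)
```

With one inactivated site the bypass through C keeps E reachable; with inactivations on both routes, or at the step without a bypass, it does not.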
References

1. Kobayashi K, Ehrlich SD, Albertini A, et al (2003) Essential Bacillus subtilis genes. Proc Natl Acad Sci USA 100:4678-4683
2. Kunst F, Ogasawara N, Moszer I, et al (1997) The complete genome sequence of the gram-positive bacterium Bacillus subtilis. Nature 390:249-256
3. Fiehn O (2002) Metabolomics – the link between genotypes and phenotypes. Plant Mol Biol 48:155-171
4. Sato S, Soga T, Nishioka T, Tomita M (2004) Simultaneous determination of the main metabolites in rice leaves using capillary electrophoresis mass spectrometry and capillary electrophoresis diode array detection. Plant J 40:151-163
5. Ishii N, Soga T, Nishioka T, Tomita M (2005) Metabolome analysis and metabolic simulation. Metabolomics 1: in press